
Model Hierarchy

Four model tiers built from the same corpus, filtered by color domain. Same data, different edges.

The Four Models

BlackRainbow (Full Spectrum)

The complete model. All 11 colors, all 63,174 pairs. Knows offense, defense, infrastructure, governance, and everything between. This is the general-purpose operator copilot.

Color filter: ALL
Use case: Full-spectrum engagements, purple team operations, training pipeline validation.

Shinobit (Attack)

The blade. Filtered to offensive colors only. Leaner, faster, sharper on attack chains. No defensive knowledge, no governance. Pure offense.

Color filter: Red, Orange, Yellow
Use case: Penetration testing, red team operations, exploit development, credential attacks.

Onibit (Detect)

The shield. Filtered to defensive and governance colors. Detection engineering, log analysis, incident response. Sees what Shinobit does, from the other side.

Color filter: Blue, Grey
Use case: Detection rule authoring, threat hunting, SOC operations, compliance mapping.

Immortal Blade (Purple Team)

The flip. Red and Blue combined, no noise from other domains. Understands both the attack and the detection. Purpose-built for purple team exercises where the operator needs to think in both directions simultaneously.

Color filter: Red + Blue (combined, not separated)
Use case: Purple team engagements, detection gap analysis, adversary emulation with detection validation.
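The four tier filters above amount to a simple split of one corpus. A minimal sketch, assuming each training pair carries a color tag; the tier names mirror the document, but the pair format and filter table are hypothetical illustrations, not the actual pipeline:

```python
# Hypothetical sketch: building per-tier training sets by color tag.
# A pair here is (color, prompt, response); the real corpus format may differ.
TIER_FILTERS = {
    "blackrainbow": None,                        # None = keep all colors
    "shinobit": {"red", "orange", "yellow"},     # attack
    "onibit": {"blue", "grey"},                  # detect + governance
    "immortal-blade": {"red", "blue"},           # purple team
}

def filter_corpus(pairs, tier):
    """Return the subset of (color, prompt, response) pairs for a tier."""
    colors = TIER_FILTERS[tier]
    if colors is None:
        return list(pairs)
    return [p for p in pairs if p[0] in colors]

corpus = [("red", "q1", "a1"), ("blue", "q2", "a2"), ("grey", "q3", "a3")]
print(len(filter_corpus(corpus, "immortal-blade")))  # → 2 (red + blue pairs)
```

The same data, different edges: every tier is a projection of the one corpus, so corpus fixes propagate to all four models on the next retrain.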

Base Model

All four tiers start from Qwen2.5-7B-Instruct.

Selection criteria:

  • 7B parameter count fits in 32GB VRAM with QLoRA overhead
  • Strong instruction-following baseline
  • Permissive license for commercial fine-tuning
  • Good performance on code and technical content
  • 128K context window
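The 32GB-VRAM claim checks out on a back-of-envelope estimate. The figures below are rough approximations (a ~7.6B-parameter model, NF4 weights at ~0.5 bytes/param, adapters assumed to be ~1% of base parameters), not measured values:

```python
# Approximate QLoRA memory footprint for a ~7.6B-parameter model.
# All figures are rough estimates for illustration only.
params = 7.6e9
base_4bit_gb = params * 0.5 / 1e9       # NF4 base weights: ~0.5 bytes/param
lora_params = 0.01 * params             # LoRA adapters: ~1% of base (assumption)
lora_gb = lora_params * 2 / 1e9         # bf16 adapter weights
optimizer_gb = lora_params * 8 / 1e9    # AdamW states (fp32 m and v), adapters only
total_gb = base_4bit_gb + lora_gb + optimizer_gb
print(f"~{total_gb:.1f} GB before activations")
```

Even with activations and gradient buffers on top, the static footprint leaves generous headroom on a 32GB card, which is why a full-size optimizer state over the frozen base (which alone would blow past 32GB) is never needed.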

Training Method

QLoRA (Quantized Low-Rank Adaptation):

| Parameter | Value |
|---|---|
| Rank (r) | 64 |
| Alpha | 128 |
| Dropout | 0.05 |
| Target modules | All linear layers |
| Quantization | 4-bit NormalFloat (NF4) |
| Optimizer | AdamW |
| Learning rate | 2e-4 |
| Scheduler | Cosine |
| Epochs | 3 |
| Batch size | Per-GPU micro-batch 1, gradient accumulation 8 |
| Max sequence length | 4096 |

The r=64/alpha=128 configuration (alpha = 2x rank) provides strong adaptation without catastrophic forgetting of the base model's general capabilities.
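Two relationships in the table are worth making explicit. This is standard LoRA/trainer arithmetic, not project-specific code:

```python
# LoRA scales its weight update by alpha / r; the table's alpha = 2 * r
# gives a fixed scaling factor of 2.
r, alpha = 64, 128
lora_scaling = alpha / r

# The trainer's effective batch size is micro-batch * gradient-accumulation steps.
micro_batch, grad_accum = 1, 8
effective_batch = micro_batch * grad_accum

print(lora_scaling, effective_batch)  # → 2.0 8
```

Gradient accumulation is what lets a micro-batch of 1 (needed to fit 4096-token sequences in VRAM) still train with an effective batch of 8.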

Training Hardware

| Node | GPU | VRAM | Role |
|---|---|---|---|
| gpu-node-1 | RTX 5090 | 32GB | Primary training, QLoRA fine-tuning |
| gpu-node-2 | RTX 5090 | 32GB | Parallel training, evaluation runs |
| mlx-node | M4 Pro | 64GB unified | MLX training for 7B-13B models |

Typical training run for the full BlackRainbow model (63K pairs, 3 epochs) takes ~10 hours on a single RTX 5090.
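For capacity planning, the quoted run time implies a rough throughput. This is simple arithmetic from the numbers above, not a benchmark:

```python
# Implied throughput of the ~10-hour full-corpus BlackRainbow run.
pairs, epochs, hours = 63_174, 3, 10
examples_seen = pairs * epochs        # 189,522 training examples total
per_hour = examples_seen / hours      # ~18,950 examples/hour
per_second = per_hour / 3600          # ~5.3 examples/second
print(f"{per_second:.1f} examples/s")
```

Filtered tiers (Shinobit, Onibit, Immortal Blade) train on smaller subsets, so their runs finish proportionally faster at the same throughput.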

Deployment

GGUF Quantization

After training, LoRA adapters are merged back into the base model, then quantized to GGUF format for Ollama deployment:

| Quantization | Size | Quality | Use Case |
|---|---|---|---|
| Q5_K_M | ~5.1GB | Higher fidelity | Primary inference, evaluation |
| Q4_K_M | ~4.4GB | Good balance | Fast inference, resource-constrained |
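The table's sizes line up with a bits-per-weight estimate. Approximate: Q5_K_M averages roughly 5.5 bits/weight, Q4_K_M roughly 4.8, and the ~7.6B parameter count is rounded, so expect the result to be within a few hundred MB of the actual files:

```python
# Rough GGUF file-size estimate from average bits per weight.
params = 7.6e9

def gguf_size_gb(bits_per_weight):
    return params * bits_per_weight / 8 / 1e9

q5 = gguf_size_gb(5.5)   # ~5.2 GB, close to the table's ~5.1GB
q4 = gguf_size_gb(4.8)   # ~4.6 GB, close to the table's ~4.4GB
print(f"Q5_K_M ~{q5:.1f} GB, Q4_K_M ~{q4:.1f} GB")
```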

Ollama Deployment

Models are deployed as Ollama modelfiles:

FROM ./blackrainbow-v08.Q5_K_M.gguf

PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER num_ctx 4096

SYSTEM """You are BlackRainbow, a security assurance domain expert..."""

The model section in blackrainbow.yaml controls which model the framework uses:

model:
  provider: ollama
  model: blackrainbow-v08
  host: http://localhost:11434
  temperature: 0.3

Version History

Models are versioned sequentially. Each version represents a corpus expansion, hyperparameter change, or base model upgrade. The corpus grows monotonically. Models are retrained from scratch on each version, not incrementally fine-tuned.

v01 → v02 → ... → v08 (current)

Every version is validated by the operator before deployment. No model ships without human evaluation.