# Model Hierarchy
Four model tiers built from the same corpus, filtered by color domain. Same data, different edges.
## The Four Models
### BlackRainbow (Full Spectrum)
The complete model. All 11 colors, all 63,174 pairs. Knows offense, defense, infrastructure, governance, and everything in between. This is the general-purpose operator copilot.
Color filter: ALL
Use case: Full-spectrum engagements, purple team operations, training pipeline validation.
### Shinobit (Attack)
The blade. Filtered to offensive colors only. Leaner, faster, sharper on attack chains. No defensive knowledge, no governance. Pure offense.
Color filter: Red, Orange, Yellow
Use case: Penetration testing, red team operations, exploit development, credential attacks.
### Onibit (Detect)
The shield. Filtered to defensive and governance colors. Detection engineering, log analysis, incident response. Sees what Shinobit does, from the other side.
Color filter: Blue, Grey
Use case: Detection rule authoring, threat hunting, SOC operations, compliance mapping.
### Immortal Blade (Purple Team)
The flip. Red and Blue combined, no noise from other domains. Understands both the attack and the detection. Purpose-built for purple team exercises where the operator needs to think in both directions simultaneously.
Color filter: Red + Blue (combined, not separated)
Use case: Purple team engagements, detection gap analysis, adversary emulation with detection validation.
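The tier derivation above can be sketched as a simple tag filter over the pair corpus. This is a minimal illustration, not the framework's actual pipeline; the record shape and `color` field name are assumptions.

```python
# Sketch: deriving the four model tiers from one shared corpus.
# Record shape and the "color" field are illustrative assumptions.
CORPUS = [
    {"color": "Red",    "prompt": "...", "response": "..."},
    {"color": "Blue",   "prompt": "...", "response": "..."},
    {"color": "Grey",   "prompt": "...", "response": "..."},
    {"color": "Orange", "prompt": "...", "response": "..."},
]

TIERS = {
    "blackrainbow":   None,                         # None = keep every color
    "shinobit":       {"Red", "Orange", "Yellow"},  # offense only
    "onibit":         {"Blue", "Grey"},             # defense + governance
    "immortal-blade": {"Red", "Blue"},              # purple team, combined
}

def filter_tier(corpus, tier):
    colors = TIERS[tier]
    if colors is None:
        return list(corpus)
    return [pair for pair in corpus if pair["color"] in colors]

print(len(filter_tier(CORPUS, "immortal-blade")))  # 2: the Red and Blue pairs
```

Same data, different edges: each tier is a pure subset, so nothing is authored per-model.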
## Base Model
All four tiers start from Qwen2.5-7B-Instruct.
Selection criteria:
- 7B parameter count fits in 32GB VRAM with QLoRA overhead
- Strong instruction-following baseline
- Permissive license for commercial fine-tuning
- Good performance on code and technical content
- 128K context window
## Training Method
QLoRA (Quantized Low-Rank Adaptation):
| Parameter | Value |
|---|---|
| Rank (r) | 64 |
| Alpha | 128 |
| Dropout | 0.05 |
| Target modules | All linear layers |
| Quantization | 4-bit NormalFloat (NF4) |
| Optimizer | AdamW |
| Learning rate | 2e-4 |
| Scheduler | Cosine |
| Epochs | 3 |
| Batch size | Per-GPU micro-batch 1, gradient accumulation 8 (effective batch size 8) |
| Max sequence length | 4096 |
The r=64/alpha=128 configuration (alpha = 2x rank) provides strong adaptation without catastrophic forgetting of the base model's general capabilities.
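To see what r=64 costs in trainable parameters: a LoRA adapter on a `d_in × d_out` linear layer factorizes the weight update as `B @ A` and adds `r × (d_in + d_out)` weights. A back-of-the-envelope sketch — the hidden size below is illustrative, not Qwen2.5-7B's exact per-module shapes:

```python
def lora_params(d_in, d_out, r=64):
    # A is (r, d_in), B is (d_out, r): the adapter trains
    # r * d_in + d_out * r weights instead of d_in * d_out.
    return r * (d_in + d_out)

# Illustrative square projection at hidden size 4096 (assumption);
# real target-module shapes vary per layer and per projection.
per_proj = lora_params(4096, 4096)
print(per_proj)  # 64 * 8192 = 524288 trainable weights per such layer

# Effective batch from the table: micro-batch 1 x accumulation 8.
effective_batch = 1 * 8
```

Summed over all targeted linear layers this stays a small fraction of the 7B base, which is why the adapters fit alongside the 4-bit base model in 32GB VRAM.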
## Training Hardware
| Node | GPU | VRAM | Role |
|---|---|---|---|
| gpu-node-1 | RTX 5090 | 32GB | Primary training, QLoRA fine-tuning |
| gpu-node-2 | RTX 5090 | 32GB | Parallel training, evaluation runs |
| mlx-node | M4 Pro | 64GB unified | MLX training for 7B-13B models |
Typical training run for the full BlackRainbow model (63K pairs, 3 epochs) takes ~10 hours on a single RTX 5090.
## Deployment
### GGUF Quantization
After training, LoRA adapters are merged back into the base model, then quantized to GGUF format for Ollama deployment:
| Quantization | Size | Quality | Use Case |
|---|---|---|---|
| Q5_K_M | ~5.1GB | Higher fidelity | Primary inference, evaluation |
| Q4_K_M | ~4.4GB | Good balance | Fast inference, resource-constrained |
### Ollama Deployment
Models are deployed as Ollama Modelfiles:

```
FROM ./blackrainbow-v08.Q5_K_M.gguf
PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER num_ctx 4096
SYSTEM """You are BlackRainbow, a security assurance domain expert..."""
```
The `model` section in `blackrainbow.yaml` controls which model the framework uses:

```yaml
model:
  provider: ollama
  model: blackrainbow-v08
  host: http://localhost:11434
  temperature: 0.3
```
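Those settings map onto an Ollama `/api/generate` request in the obvious way. A sketch of building that request — the field names follow Ollama's documented REST API, but `build_request` and the hard-coded config dict are illustrative, not the framework's actual client:

```python
# Sketch: mapping the yaml settings onto an Ollama /api/generate call.
# Config is hard-coded here; the framework loads it from blackrainbow.yaml.
config = {
    "provider": "ollama",
    "model": "blackrainbow-v08",
    "host": "http://localhost:11434",
    "temperature": 0.3,
}

def build_request(config, prompt):
    # Returns (url, JSON-serializable payload) for a non-streaming call.
    return (
        config["host"] + "/api/generate",
        {
            "model": config["model"],
            "prompt": prompt,
            "stream": False,
            "options": {"temperature": config["temperature"]},
        },
    )

url, payload = build_request(config, "Summarize the engagement scope.")
print(url)  # http://localhost:11434/api/generate
```

Swapping models is then a one-line yaml change; the request shape stays the same.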
## Version History
Models are versioned sequentially. Each version represents a corpus expansion, hyperparameter change, or base model upgrade. The corpus grows monotonically. Models are retrained from scratch on each version, not incrementally fine-tuned.
v01 → v02 → ... → v08 (current)
Every version is validated by the operator before deployment. No model ships without human evaluation.