# Unsloth Fine-Tuning Pipeline
Out of the box, IBM Granite 4 models are highly intelligent generalists. To make them elite cybersecurity operators, AuroraSOC employs an advanced fine-tuning pipeline powered by Unsloth.
This document explains our fine-tuning philosophy, how the datasets are prepared, and how models are generated.
## The Power of Unsloth and LoRA
Training Large Language Models traditionally requires massive GPU clusters and weeks of compute time. We bypass this by utilizing LoRA (Low-Rank Adaptation) and the Unsloth library.
LoRA does not change the core knowledge of the model (the billions of pre-trained parameters). Instead, it injects tiny, secondary matrices into the network. During training, only these tiny matrices are updated.
Analogy: Imagine the base Granite model is a brilliant neurosurgeon with an encyclopedic knowledge of medicine. LoRA is a small handbook on "Operating in a War Zone." The handbook doesn't teach the surgeon medicine; it teaches them how to apply their existing knowledge in a very specific, high-stress scenario.
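The low-rank trick can be sketched numerically. The following is a minimal NumPy illustration (not the actual training code): the frozen weight `W` is augmented with a low-rank product `B @ A`, so only `r * (d_in + d_out)` parameters are trainable instead of `d_in * d_out`.

```python
import numpy as np

d_in, d_out, r = 512, 512, 8  # layer sizes and LoRA rank (illustrative values)
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-initialized

x = rng.standard_normal(d_in)

# Forward pass: base output plus the low-rank correction
y = W @ x + B @ (A @ x)

# Because B starts at zero, the adapter is a no-op before any training happens
assert np.allclose(y, W @ x)

full = d_in * d_out
lora = r * (d_in + d_out)
print(f"trainable params: {lora} vs {full} ({100 * lora / full:.1f}%)")
```

With rank 8 on a 512-by-512 layer, the adapter trains roughly 3% of the parameters the full matrix would require, which is why LoRA fits on modest hardware.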
Unsloth is a custom optimization library that rewrites the core mathematical operations of the training process in Triton (a low-level GPU language). By doing so, it delivers 2x faster training speeds and uses 70% less VRAM compared to standard Hugging Face pipelines.
This allows you to train an AuroraSOC specialist agent on a single consumer GPU in a matter of hours.
## Pipeline Overview
The pipeline executes in three stages:
- Dataset Preparation (`prepare_datasets.py`)
- LoRA Fine-Tuning (`finetune_granite.py`)
- Model Export (`finetune_granite.py` / `serve_model.py`)
Configuration for the entire pipeline is defined in a single file: `training/configs/granite_soc_finetune.yaml`.
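As a hypothetical illustration of how such a config might be organized (the field names here are invented for this sketch and may not match the real schema), a per-agent section could override the shared defaults:

```yaml
# Hypothetical sketch only -- the authoritative schema lives in granite_soc_finetune.yaml
base_model: unsloth/granite-4.0-h-tiny
lora:
  r: 64
  alpha: 128
agent_profiles:
  security_analyst:
    domains: [mitre_attack, sigma]
  orchestrator:
    base_model: unsloth/granite-4.0-h-small   # larger base model for coordination
    lora:
      r: 128
```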
## 1. Dataset Preparation
Our agents learn from real-world security data. `training/scripts/prepare_datasets.py` is responsible for building this knowledge base.
It downloads, normalizes, and injects data from:
- MITRE ATT&CK Framework: Teaching TTP (Tactics, Techniques, and Procedures) mapping.
- Sigma Rules: Teaching log and behavior detection.
- Atomic Red Team: Teaching incident response based on adversary emulation.
- NVD/CVE: Teaching vulnerability assessment and CVSS scoring.
- Synthetic Scenarios: For domains that lack public datasets (e.g., Web Security and Cloud Security), the script programmatically generates high-quality synthetic WAF and CloudTrail alerts.
### Running the dataset build
**Crucial Concept:** The `--output-dir` argument is a strict path contract. The fine-tuning script will look for `soc_train.jsonl` precisely in the directory you define here.

```shell
python training/scripts/prepare_datasets.py --output-dir training/data
```
The script splits the data into a training set and an evaluation set, writing both as conversational JSON Lines that match the Granite chat template.
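A record in that file might look like the following. This is a hypothetical example; the exact field names and contents depend on the Granite chat template and the script's output.

```python
import json

# Hypothetical soc_train.jsonl record in conversational format
# (role names follow the common chat-template convention; the real
# script's exact fields may differ).
record = {
    "messages": [
        {"role": "system", "content": "You are a SOC security analyst..."},
        {"role": "user", "content": "Alert: powershell.exe spawned by winword.exe ..."},
        {"role": "assistant", "content": "This maps to MITRE ATT&CK T1566.001 ..."},
    ]
}

line = json.dumps(record)   # one JSON object per line -> JSON Lines
parsed = json.loads(line)
print(parsed["messages"][2]["role"])  # -> assistant
```

Each line is one complete conversation, which is what lets the trainer mask the prompt turns and compute loss only on the assistant turn (see below).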
## 2. LoRA Fine-Tuning
The `training/scripts/finetune_granite.py` script drives the actual learning process.
It reads the `granite_soc_finetune.yaml` configuration, loads the `unsloth/granite-4.0-h-tiny` base model, and begins training on the dataset.
### System Prompting & Masking
Our training data includes the System Prompt, the User Alert, and the Assistant Response. However, we only calculate loss (the penalty for being "wrong") on the Assistant Response.
We do this because we don't want the model learning how to generate the prompt itself; we only want it to learn how to produce the correct analytical response when given that prompt.
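A minimal sketch of that masking, assuming the common Hugging Face convention that label `-100` is ignored by the cross-entropy loss (the helper name and token ids below are illustrative):

```python
# Sketch of prompt masking: tokens belonging to the system/user turns get
# label -100 (ignored by cross-entropy); only assistant tokens contribute loss.
IGNORE_INDEX = -100

def mask_labels(input_ids, assistant_start):
    """Copy input_ids as labels, masking everything before the assistant turn."""
    labels = list(input_ids)
    for i in range(min(assistant_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Toy token ids: [system + user prompt ...][assistant response ...]
input_ids = [11, 12, 13, 14, 201, 202, 203]
labels = mask_labels(input_ids, assistant_start=4)
print(labels)  # -> [-100, -100, -100, -100, 201, 202, 203]
```

The model still *sees* the full prompt in `input_ids`; it simply receives no gradient signal for reproducing it.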
### Per-Agent Specialization
The configuration defines `agent_profiles` (e.g., `security_analyst`, `threat_hunter`). The script filters the training data for each specific domain and generates a unique LoRA adapter.
The Orchestrator agent is a special case: because it requires superior reasoning capability to coordinate the fleet, the configuration dynamically overrides its base model, specifying `unsloth/granite-4.0-h-small` (a larger model size) and an increased LoRA rank (`r=128`).
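The per-profile filtering can be pictured as follows. This is a sketch under stated assumptions: the `domain` field, profile names, and domain tags are invented for illustration, not taken from the real scripts.

```python
# Sketch: split one combined dataset into per-agent subsets, assuming each
# record carries a "domain" tag. Profile names and domains are illustrative.
agent_profiles = {
    "security_analyst": {"domains": {"mitre_attack", "sigma"}},
    "threat_hunter": {"domains": {"atomic_red_team"}},
}

dataset = [
    {"domain": "sigma", "messages": "..."},
    {"domain": "atomic_red_team", "messages": "..."},
    {"domain": "mitre_attack", "messages": "..."},
]

subsets = {
    name: [r for r in dataset if r["domain"] in profile["domains"]]
    for name, profile in agent_profiles.items()
}
print({name: len(rows) for name, rows in subsets.items()})
# -> {'security_analyst': 2, 'threat_hunter': 1}
```

Each subset then drives one training run, producing one adapter per agent on top of the shared base model.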
### Running Training
```shell
python training/scripts/finetune_granite.py \
  --config training/configs/granite_soc_finetune.yaml \
  --data-dir training/data
```
## 3. Model Export
Once training concludes, the LoRA adapter cannot be served on its own; it must first be merged with the base model.
The `finetune_granite.py` script handles this automatically via the `export_model` function, outputting formats based on your target serving backend:
- Merged FP16 (`granite_soc_merged_16bit`): merges the LoRA adapter and base model into standard PyTorch float16 weights. This format is required for vLLM production serving.
- GGUF (`granite_soc_gguf`): heavily quantizes (compresses) the merged model so it can be served by Ollama.
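To see concretely what "heavily quantizes" means, here is a toy 8-bit round trip in NumPy: each block of float weights is stored as int8 values plus one shared scale. This is loosely analogous to GGUF's Q8_0 scheme, not the actual file format.

```python
import numpy as np

def q8_roundtrip(weights):
    """Quantize a block of float weights to int8 with one shared scale, then dequantize."""
    scale = np.abs(weights).max() / 127.0          # one scale per block
    q = np.round(weights / scale).astype(np.int8)  # 8-bit storage: ~4x smaller than float32
    return q, scale, q.astype(np.float32) * scale

rng = np.random.default_rng(0)
block = rng.standard_normal(32).astype(np.float32)
q, scale, restored = q8_roundtrip(block)

# Rounding error is bounded by half the scale, so the model stays close to FP16 quality
err = np.abs(block - restored).max()
print(f"max reconstruction error {err:.4f} at scale {scale:.4f}")
```

The real GGUF quantizer applies this idea block by block across every weight tensor, which is what lets Ollama run the merged model on CPU-class hardware.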
You can then use the packaging script to finalize deployment to Ollama:
```shell
# Push the GGUF to Ollama with the correct Modelfile
python training/scripts/serve_model.py ollama \
  --gguf training/checkpoints/granite_soc_gguf/unsloth.Q8_0.gguf \
  --name granite-soc:latest
```
Your model is now an autonomous SOC analyst, ready for action.