Prerequisites & System Requirements

Before training Granite models for AuroraSOC, ensure your environment meets the hardware and software requirements below. The requirements vary depending on whether you train locally, in Docker, or on Google Colab.

Hardware Requirements

Local GPU Training

| Component | Minimum | Recommended | Notes |
| --- | --- | --- | --- |
| GPU | NVIDIA with 8 GB VRAM | NVIDIA with 16+ GB VRAM | Must be CUDA-capable (compute capability ≥ 7.0) |
| GPU Models | RTX 3060 (12 GB), T4 (16 GB) | RTX 4090 (24 GB), A100 (40/80 GB) | Consumer GPUs work fine for micro/tiny models |
| RAM | 16 GB | 32 GB | Dataset processing is memory-intensive |
| Disk | 30 GB free | 50+ GB free | Model checkpoints + datasets |
| CPU | 4 cores | 8+ cores | Data loading is CPU-bound |

VRAM Requirements by Model Variant

Approximate VRAM requirements per model variant; the training figures assume 4-bit or 8-bit quantization (QLoRA) with Unsloth optimizations:

| Model | Training VRAM (4-bit) | Training VRAM (8-bit) | Inference VRAM (GGUF Q8_0) |
| --- | --- | --- | --- |
| granite-4.0-micro | ~4 GB | ~6 GB | ~1.5 GB |
| granite-4.0-h-micro | ~6 GB | ~8 GB | ~2 GB |
| granite-4.0-h-tiny | ~8 GB | ~12 GB | ~3 GB |
| granite-4.0-h-small | ~14 GB | ~20 GB | ~5 GB |

**Free GPU Option**

If you don't have a local GPU, use Google Colab's free T4 (16 GB VRAM). See the Colab Training guide for details.
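As a quick sanity check, the VRAM table above can be encoded in a few lines of Python to see which variants fit a given GPU. The helper below is illustrative only (the function and dictionary names are not part of the training pipeline); the figures mirror the 4-bit training column.

```python
# Hypothetical helper: which Granite variants fit a given VRAM budget?
# Values mirror the approximate 4-bit training column of the table above.
TRAINING_VRAM_4BIT_GB = {
    "granite-4.0-micro": 4,
    "granite-4.0-h-micro": 6,
    "granite-4.0-h-tiny": 8,
    "granite-4.0-h-small": 14,
}

def variants_that_fit(vram_gb: float) -> list[str]:
    """Return model variants whose approximate 4-bit training VRAM fits."""
    return [m for m, need in TRAINING_VRAM_4BIT_GB.items() if need <= vram_gb]

print(variants_that_fit(16))  # a free Colab T4 can train all four variants
```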

Docker Training

Same GPU requirements as local training, plus:

  • Docker ≥ 24.0
  • NVIDIA Container Toolkit (nvidia-docker2 or nvidia-container-toolkit)
  • Docker Compose ≥ 2.20

Google Colab

No local hardware needed. Colab provides:

  • Free tier: T4 GPU (16 GB VRAM), 12 GB RAM, ~78 GB disk
  • Colab Pro: A100 (40 GB VRAM), 50+ GB RAM
  • Colab Pro+: A100 (80 GB VRAM), persistent sessions

Software Requirements

Local Training (Bare Metal)

# Required
Python >= 3.11
CUDA >= 12.1 (with matching cuDNN)
Git

# Install via pip/uv
pip install unsloth torch transformers trl datasets pyyaml httpx

The full Python dependencies are listed in pyproject.toml under the training group of [project.optional-dependencies]:

# Install all training dependencies
pip install -e ".[training]"
# Or using uv (faster)
uv pip install -e ".[training]"
# Or via Make
make train-install

Verifying CUDA

Before training, verify your CUDA setup:

# Check NVIDIA driver
nvidia-smi

# Check CUDA version
nvcc --version

# Check PyTorch sees the GPU
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'GPU: {torch.cuda.get_device_name(0)}')"
python -c "import torch; print(f'VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB')"

Expected output:

CUDA available: True
GPU: NVIDIA GeForce RTX 4090
VRAM: 24.0 GB
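The three one-liners above can also be combined into a single check. This is a sketch (the cuda_report function is illustrative, not part of the pipeline) that degrades gracefully when PyTorch or a GPU is missing instead of raising:

```python
# Consolidated CUDA check (sketch). Reports a reason instead of crashing
# when PyTorch isn't installed or no GPU is visible.
def cuda_report() -> dict:
    try:
        import torch
    except ImportError:
        return {"cuda_available": False, "reason": "torch not installed"}
    if not torch.cuda.is_available():
        return {"cuda_available": False, "reason": "no CUDA device visible"}
    props = torch.cuda.get_device_properties(0)
    return {
        "cuda_available": True,
        "gpu": torch.cuda.get_device_name(0),
        "vram_gb": round(props.total_memory / 1e9, 1),
    }

print(cuda_report())
```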

Verifying Unsloth

python -c "from unsloth import FastLanguageModel; print('Unsloth OK')"

If this fails with a Mamba-related error, install the Mamba kernels:

pip install --no-build-isolation mamba_ssm==2.2.5 causal_conv1d==1.5.2

**Mamba Kernel Compilation**

The first time Mamba kernels are installed, they compile from source. This takes ~10 minutes and requires a working CUDA toolkit. On Colab, this happens automatically in the first notebook cell.

Docker Training

# Verify Docker + NVIDIA runtime
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

If this fails, install the NVIDIA Container Toolkit:

# Ubuntu/Debian
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Ollama (For Deployment)

After training, models are served via Ollama. Install it if you haven't already:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Verify
ollama --version

# Pull base Granite model (for comparison/fallback)
ollama pull granite4:8b

Or use the automated setup script:

./scripts/setup_local.sh

Directory Structure

The training pipeline uses this directory layout:

training/
├── configs/
│   ├── granite_soc_finetune.yaml          # Main training config
│   └── Modelfile.granite-soc              # Ollama Modelfile template
├── data/
│   ├── raw/                               # Downloaded source data
│   │   ├── mitre_attack/
│   │   ├── mitre_car/
│   │   ├── sigma_rules/
│   │   └── atomic_red_team/
│   ├── domain/                            # Per-agent domain splits
│   │   ├── security_analysis.jsonl
│   │   ├── threat_hunting.jsonl
│   │   └── ...
│   ├── soc_train.jsonl                    # Combined training data
│   └── soc_eval.jsonl                     # Evaluation data
├── checkpoints/                           # Training outputs
│   ├── granite_soc_lora/                  # LoRA adapters
│   ├── granite_soc_merged_16bit/          # Merged FP16 (for vLLM)
│   └── granite_soc_gguf/                  # GGUF files (for Ollama)
├── notebooks/
│   └── AuroraSOC_Granite4_Finetune.ipynb  # Colab notebook
└── scripts/
    ├── prepare_datasets.py                # Data preparation
    ├── finetune_granite.py                # Training pipeline
    ├── evaluate_model.py                  # Model evaluation
    ├── serve_model.py                     # Model deployment
    └── train_all_agents.py                # Batch per-agent training
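A small preflight script can confirm this layout exists before kicking off a run. The snippet below is a hypothetical example (neither the script nor the EXPECTED list is part of the repository); the paths mirror the tree above, and the root defaults to training/.

```python
from pathlib import Path

# Hypothetical preflight check: confirm the expected training layout exists.
# Paths mirror the directory tree above; adjust root if your checkout differs.
EXPECTED = [
    "configs/granite_soc_finetune.yaml",
    "data/raw",
    "data/domain",
    "checkpoints",
    "scripts/prepare_datasets.py",
    "scripts/finetune_granite.py",
]

def missing_paths(root: str = "training") -> list[str]:
    """Return the expected paths that are absent under root."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]

if __name__ == "__main__":
    missing = missing_paths()
    print("Layout OK" if not missing else f"Missing: {missing}")
```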

Environment Variables

Training scripts use these environment variables (all optional with sensible defaults):

| Variable | Default | Description |
| --- | --- | --- |
| GRANITE_MODEL_NAME | unsloth/granite-4.0-h-tiny | Base model for training |
| GRANITE_LORA_R | 64 | LoRA rank |
| GRANITE_LORA_ALPHA | 64 | LoRA alpha scaling |
| GRANITE_BATCH_SIZE | 2 | Per-device batch size |
| GRANITE_GRAD_ACCUM | 4 | Gradient accumulation steps |
| GRANITE_EPOCHS | 3 | Number of training epochs |
| GRANITE_LR | 2e-4 | Learning rate |
| GRANITE_MAX_SEQ_LENGTH | 4096 | Maximum sequence length |
| HF_TOKEN | (none) | Hugging Face token (only needed for Hub push) |

These can also be set in the YAML config file (training/configs/granite_soc_finetune.yaml), which takes precedence.
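That precedence (YAML config over environment variable over built-in default) can be sketched in a few lines. The resolve helper and DEFAULTS dict below are illustrative, not the pipeline's actual code, and the YAML file is represented as an already-parsed dict:

```python
import os

# Sketch of the precedence described above: YAML > env var > default.
# Key names match the environment-variable table; values are illustrative.
DEFAULTS = {
    "GRANITE_LORA_R": "64",
    "GRANITE_EPOCHS": "3",
    "GRANITE_LR": "2e-4",
}

def resolve(key: str, yaml_cfg: dict) -> str:
    """Resolve a setting: YAML config wins, then the environment, then the default."""
    if key in yaml_cfg:
        return str(yaml_cfg[key])
    return os.environ.get(key, DEFAULTS[key])

os.environ["GRANITE_EPOCHS"] = "5"
print(resolve("GRANITE_EPOCHS", {}))                     # env wins: 5
print(resolve("GRANITE_EPOCHS", {"GRANITE_EPOCHS": 2}))  # YAML wins: 2
print(resolve("GRANITE_LR", {}))                         # default: 2e-4
```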

Next Steps

Once your environment is ready:

  1. Dataset Preparation — Download and process SOC training data
  2. Local GPU Training — Start training on your hardware