
LLM provider configuration

What this page is

How AuroraSOC supports hosted LLM providers (DeepSeek, Gemini, OpenAI, Anthropic) in addition to local Ollama/vLLM: the configuration surface, where API keys are stored, the runtime admin API, and how a switch propagates.

Why it exists this way

Architecture §3.5 commits to a provider abstraction where agents can run against local or hosted inference keyed on sensitivity. Some hosts cannot run a local model at all, and trusted deployments may prefer a managed endpoint. ADR 022 records the decision: native BeeAI providers as first-class backends, runtime configuration via an admin API, and Vault-first key storage with an encrypted-database fallback.

How it works

Backends

ServingBackend (in granite/__init__.py) now includes deepseek, gemini, and anthropic alongside ollama, vllm, and openai. Each resolves to a BeeAI model reference carrying the provider's native prefix, so ChatModel.from_name loads the right LiteLLM adapter:

  • deepseek:deepseek-v4-flash
  • gemini:gemini-2.5-pro
  • anthropic:claude-sonnet-4-6
  • openai:gpt-4o (also used for any OpenAI-compatible endpoint via a base URL)

The vLLM auto-probe (ADR 005) is unchanged and stays vLLM-only; the new providers are selected explicitly.
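
The prefix convention for the hosted providers can be sketched as follows. This is a minimal illustration, not the code in the granite package: HOSTED_PREFIXES and model_reference are hypothetical names, and local backends are deliberately left out because this page does not state their prefixes.

```python
# Hypothetical sketch: build "<provider>:<model>" references for hosted backends.
# HOSTED_PREFIXES and model_reference are illustrative names, not AuroraSOC APIs.
HOSTED_PREFIXES = {
    "deepseek": "deepseek",
    "gemini": "gemini",
    "anthropic": "anthropic",
    "openai": "openai",
}


def model_reference(backend: str, model: str) -> str:
    """Return the prefixed model reference for a hosted backend."""
    if backend not in HOSTED_PREFIXES:
        raise ValueError(f"not a hosted backend covered by this sketch: {backend}")
    if not model:
        raise ValueError(f"No model configured for {backend}")
    return f"{HOSTED_PREFIXES[backend]}:{model}"


if __name__ == "__main__":
    print(model_reference("deepseek", "deepseek-v4-flash"))
    print(model_reference("anthropic", "claude-sonnet-4-6"))
```

The resulting string is the kind of value the page describes passing to ChatModel.from_name; that call is omitted here so the sketch runs without BeeAI installed.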

Settings (env)

All keys use the LLM_ prefix and live in config/settings/llm.py:

LLM_BACKEND=deepseek # ollama | vllm | openai | deepseek | gemini | anthropic | auto

LLM_DEEPSEEK_API_KEY=...
LLM_DEEPSEEK_MODEL=deepseek-v4-flash
LLM_DEEPSEEK_ORCHESTRATOR_MODEL=deepseek-v4-pro
LLM_DEEPSEEK_BASE_URL=https://api.deepseek.com

LLM_GEMINI_API_KEY=...
LLM_GEMINI_MODEL=gemini-2.5-pro # Gemini takes no base_url

LLM_ANTHROPIC_API_KEY=...
LLM_ANTHROPIC_MODEL=claude-sonnet-4-6

# OpenAI / OpenAI-compatible endpoints use the existing compatible vars:
LLM_OPENAI_COMPATIBLE_BASE_URL=https://api.openai.com/v1
LLM_OPENAI_COMPATIBLE_MODEL=gpt-4o
LLM_OPENAI_COMPATIBLE_API_KEY=...

Env values are the boot defaults. Keys entered at runtime override them in the running process and persist to the secret store.
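
The precedence rule above (env as boot default, runtime key wins) might look like the sketch below. The helper names env_default and effective_key are hypothetical, and the runtime keys are modelled as a plain dict purely for illustration; the real settings class in config/settings/llm.py may be structured differently.

```python
import os
from typing import Mapping, Optional


def env_default(provider: str, field: str,
                environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Read LLM_<PROVIDER>_<FIELD>; return None when unset or empty."""
    return environ.get(f"LLM_{provider.upper()}_{field.upper()}") or None


def effective_key(provider: str, runtime_keys: Mapping[str, str],
                  environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """A key entered at runtime overrides the env boot default."""
    return runtime_keys.get(provider) or env_default(provider, "api_key", environ)


if __name__ == "__main__":
    env = {"LLM_DEEPSEEK_API_KEY": "env-key"}
    print(effective_key("deepseek", {}, env))                      # boot default
    print(effective_key("deepseek", {"deepseek": "runtime"}, env))  # runtime wins
```

For OpenAI and OpenAI-compatible endpoints the variables carry the OPENAI_COMPATIBLE infix, so under this sketch the provider argument would be "openai_compatible".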

Key storage

core/secret_store.py writes provider keys Vault-first (KV-v2 at aurorasoc/llm-providers/<provider>) and falls back to a Fernet-encrypted row in the llm_provider_credentials table (Alembic 028) when Vault is not configured. The Fernet key is derived from AURORA_SECRET_KEY if set, otherwise from the mandatory JWT_SECRET_KEY. The API never returns raw keys; it exposes only key_present, key_last4, and the storage source.

On boot, apply_stored_keys_to_settings() loads stored keys into the live settings so the granite layer sees them.
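
To make the seed dependency concrete, here is a stdlib-only sketch of deriving a Fernet-format key and of the redacted view the API exposes. The SHA-256 derivation is an assumption (this page does not name the derivation function), the "source" field name is a guess at how the storage source is labelled, and derive_fernet_key and public_view are hypothetical helpers.

```python
import base64
import hashlib
from typing import Mapping, Optional


def derive_fernet_key(environ: Mapping[str, str]) -> bytes:
    """Derive a Fernet-format key: 32 bytes, url-safe base64 encoded.

    Assumption: SHA-256 over the seed; the real derivation may differ.
    """
    seed = environ.get("AURORA_SECRET_KEY") or environ["JWT_SECRET_KEY"]
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest)


def public_view(api_key: Optional[str], source: str) -> dict:
    """Redacted shape for API responses; the raw key is never included."""
    return {
        "key_present": bool(api_key),
        "key_last4": api_key[-4:] if api_key else None,
        "source": source,  # assumed field name for the storage source
    }


if __name__ == "__main__":
    old = derive_fernet_key({"JWT_SECRET_KEY": "seed-one"})
    new = derive_fernet_key({"JWT_SECRET_KEY": "seed-two"})
    print(old != new)  # a changed seed yields a different encryption key
    print(public_view("sk-example-1234", "vault"))
```

Because the encryption key is a pure function of the seed, rotating AURORA_SECRET_KEY (or JWT_SECRET_KEY when the former is unset) leaves previously encrypted database rows undecryptable.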

Runtime API

The llm_providers router (api/routers/llm_providers.py) exposes admin endpoints (all mutations require users:manage):

GET /api/v1/system/llm/providers # inventory + active backend + key presence
PUT /api/v1/system/llm/config # { provider, model?, orchestrator_model?, base_url? }
PUT /api/v1/system/llm/providers/{p}/key # { api_key }
DELETE /api/v1/system/llm/providers/{p}/key
POST /api/v1/system/llm/providers/{p}/test # one-token reachability probe
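
A typical sequence (store a key, switch the active provider, probe it) could be scripted as below. This is a sketch under assumptions: the host name is a placeholder, bearer-token authentication is assumed rather than documented here, and admin_request is a hypothetical helper. The requests are only constructed, not sent.

```python
import json
import urllib.request
from typing import Optional

BASE = "https://aurorasoc.example.internal"  # hypothetical host
TOKEN = "<admin-token>"                      # placeholder; auth scheme is assumed


def admin_request(method: str, path: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the admin API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {TOKEN}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(f"{BASE}{path}", data=data,
                                  headers=headers, method=method)


set_key = admin_request("PUT", "/api/v1/system/llm/providers/deepseek/key",
                        {"api_key": "<your-key>"})
switch = admin_request("PUT", "/api/v1/system/llm/config",
                       {"provider": "deepseek", "model": "deepseek-v4-flash"})
probe = admin_request("POST", "/api/v1/system/llm/providers/deepseek/test")

# To execute: urllib.request.urlopen(set_key), then switch, then probe.
```

Storing the key before switching avoids a window in which the active backend has no credential.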

Every mutation invalidates the shared ChatModel pool (reset_chat_model_pool) and the backend-resolution cache (reset_resolution_cache) and is written to the hash-chained user audit log. In-process inference (chat, investigation workflows) adopts the change immediately; agent A2A server processes adopt it on their next factory build or restart. Fleet-wide hot propagation is deferred to a later wave.
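
Why in-process callers pick up a switch immediately while separate agent processes do not can be illustrated with a toy pool. Only the name reset_chat_model_pool comes from this page; the module-level dict, get_chat_model, and the factory argument are hypothetical stand-ins, not the actual implementation.

```python
from typing import Callable, Dict

# Hypothetical per-process pool: each OS process has its own copy of this dict.
_pool: Dict[str, object] = {}


def get_chat_model(ref: str, factory: Callable[[str], object]) -> object:
    """Return the pooled model for ref, building it on first use."""
    if ref not in _pool:
        _pool[ref] = factory(ref)
    return _pool[ref]


def reset_chat_model_pool() -> None:
    """Drop pooled models so the next lookup rebuilds with current settings."""
    _pool.clear()


if __name__ == "__main__":
    builds = []

    def factory(ref: str) -> object:
        builds.append(ref)
        return object()

    get_chat_model("deepseek:deepseek-v4-flash", factory)
    get_chat_model("deepseek:deepseek-v4-flash", factory)  # served from the pool
    reset_chat_model_pool()                                  # simulated mutation
    get_chat_model("deepseek:deepseek-v4-flash", factory)  # rebuilt
    print(len(builds))
```

In this model the reset only touches the calling process's state, which matches the behaviour described above: an A2A server running in another process keeps its existing models until it builds a new factory or restarts.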

Choosing a provider

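| Backend   | Key | Base URL | Data leaves host | Best for                                   |
|-----------|-----|----------|------------------|--------------------------------------------|
| ollama    | no  | local    | no               | dev workstations, air-gapped sites         |
| vllm      | no  | local    | no               | GPU production (default)                   |
| deepseek  | yes | optional | yes              | low-cost hosted reasoning                  |
| gemini    | yes | none     | yes              | long-context, multimodal                   |
| openai    | yes | optional | yes              | broad model choice / compatible endpoints  |
| anthropic | yes | optional | yes              | hardest investigations, tool use           |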

What goes wrong and how do you fix it

  • No model configured for <provider>. Set the specialist model in Settings or the LLM_<PROVIDER>_MODEL env var before testing.
  • Key stored but agents still keyless after restart. Confirm that AURORA_SECRET_KEY (or JWT_SECRET_KEY, if the former is unset) is stable across restarts; a changed seed cannot decrypt the database fallback. Vault-stored keys are unaffected.
  • Gemini rejects a base URL. The Gemini adapter takes no base_url; the runtime never sends one. Leave it blank.