franki routes across all configured providers and falls back automatically when one is rate-limited. You can also add any OpenAI-compatible endpoint.
Ultra-fast inference. Recommended models: llama-3.3-70b-versatile, llama-3.1-8b-instant. Get a key at console.groq.com/keys
Generous free quota with long context. Recommended models: gemini-2.5-flash, gemini-2.0-flash. Get a key at aistudio.google.com/apikey
Access hundreds of models. Free options: meta-llama/llama-3.3-70b:free, google/gemini-2.5-flash:free. Get a key at openrouter.ai/keys
Run models entirely on your machine. Recommended: llama3, codellama, qwen2.5-coder. Use /ollama to pick from installed models.
Claude models. Recommended: claude-sonnet-4-6, claude-haiku-4-5-20251001. Get a key at console.anthropic.com/settings/api-keys
European models with strong JSON mode. Recommended: mistral-small-latest, mistral-large-latest. Get a key at console.mistral.ai/api-keys
Extremely fast inference on custom silicon. Recommended: llama-3.3-70b. Get a key at cloud.cerebras.ai
Broad model selection. Recommended: meta-llama/Meta-Llama-3.3-70B-Instruct-Turbo. Get a key at api.together.ai/settings/api-keys
Strong long-context and reasoning. Recommended: command-r-plus-08-2024, command-r7b-12-2024. Get a key at dashboard.cohere.com/api-keys
Enterprise Azure OpenAI deployments. Supports gpt-4o and gpt-4o-mini. Configure your endpoint during franki init.
Any OpenAI-compatible endpoint. Point it at your own deployment, NIM, LM Studio, or proxy.
# Open the interactive model and provider manager /model # Switch by provider name or model name (partial match works) /model gemini-2.5-flash /model groq # Use the explicit provider/model format /model groq/llama-3.3-70b-versatile # List installed Ollama models and pick one /ollama # Prefer local providers such as Ollama first franki config set local_first true