Skip to main content
Skip to main content
AI Models by kiwi

Current API Models, Clearly Tiered

Free API keys can call auto and hrLLM at 40 requests per hour. PRO unlocks direct access to the rest of the new lineup, while legacy Kiwi models remain visible as deprecated compatibility entries.

Legacy Kiwi aliases are deprecated and marked with EOL: 09.03.2026. They stay visible for lineage and transition planning, but the current public lineup centers on auto, hrLLM, and the new direct PRO models.

Recommended Free Croatian Model

FreeRecommended
hrLLM
Our Croatian-first model for teams and individuals who need dependable Croatian language quality instead of generic multilingual approximations.
LLM.kiwi logo

hrLLM is our Croatian-first model. It writes and answers only in grammatically correct Croatian and is being actively tuned because Croatian is still poorly covered by most general-purpose models.

Built specifically for Croatian instead of treating it as a low-priority multilingual edge case.

Keeps tone, inflection, and sentence structure cleaner than general-purpose models on Croatian prompts.

Recommended free model for Croatian-first API and dashboard workflows.

Public API access

Direct model ID: hrllm

Free API keys: 40 requests/hour

Recommended for Croatian-first products, assistants, and writing workflows.

Open hrLLM page

Current API Lineup

These are the current public-facing models in the lineup. hrLLM is the recommended free Croatian model, while the other direct models are available as PRO.

FreeRecommended
hrLLM
hrllm
LLM.kiwi logo

api.llm.kiwi

Croatian-first model for writing and answering in grammatically correct Croatian.

Best used for: Croatian customer support, formal business writing, public-sector communication, and education content.

Based on: hrllm

Open model page
PRO
DeepSeekR1
DeepSeekR1
LLM.kiwi logo

api.llm.kiwi

Reasoning-heavy Pro model for deeper analysis, technical planning, and multi-step problem solving.

Best used for: Complex reasoning, technical architecture, advanced debugging plans, and step-heavy analytical work.

Based on: DeepSeekR1

Open model page
PRO
Qwen3-1.4B
Qwen3-1.4B
LLM.kiwi logo

api.llm.kiwi

Compact Pro model for quick reasoning, drafting, and lightweight production tasks.

Best used for: Fast general chat, structured drafting, lightweight copilots, and low-latency automations.

Based on: Qwen3-1.4B

Open model page
PRO
SmolLM2-1.7B
SmolLM2-1.7B
LLM.kiwi logo

api.llm.kiwi

Small Pro model tuned for efficient text work, simple assistants, and lean automation.

Best used for: Short-form generation, compact task agents, headline variants, and simple classification-style prompts.

Based on: SmolLM2-1.7B

Open model page
PRO
starcoder2-7b
starcoder2-7b
LLM.kiwi logo

api.llm.kiwi

Coding-first Pro model for implementation, refactors, and repository-aware engineering help.

Best used for: Code generation, repository edits, bug fixing, refactors, and engineering assistance workflows.

Based on: starcoder2-7b

Open model page

Deprecated Models

Deprecated models remain listed for continuity, migration, and provider lineage. They are intentionally greyed out and clearly marked with their EOL date.

DeprecatedEOL: 09.03.2026
Kiwi Codestral
kiwi-codestral
Nexusflow logo

Kiwi Code Frontier

High-throughput coding lane tuned for enterprise repositories and API services.

Best used for: Backend implementation, SQL-heavy services, and test-driven code generation.

Based on: Kiwi Codestral lane

Open model page
DeprecatedEOL: 09.03.2026
Kiwi DeepSeek V3
kiwi-deepseek-v3
Nexusflow logo

Kiwi Code Frontier

Coding-focused model lane for advanced implementation, refactors, and debugging.

Best used for: Large codebase edits, architectural refactors, and deep debugging workflows.

Based on: Kiwi DeepSeek V3 lane

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Forge
kiwi-forge
Mistral AI logo

Mistral AI

Stable all-purpose instruction model for consistent team outputs.

Best used for: General business writing, reusable templates, and dependable delivery.

Based on: Mistral 7B Instruct v0.1

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Gem
kiwi-gem
Google logo

Google

Efficient compact model from Google's Gemma family.

Best used for: Quick copy variants, concise outlines, and fast idea expansion.

Based on: Gemma 2B Instruct LoRA

Open model page
DeprecatedEOL: 09.03.2026
Kiwi GLM Flash
kiwi-glm-4.6v-flash
GLM

Kiwi Frontier

Fast multimodal lane for lightweight reasoning and visual-text blended prompts.

Best used for: Rapid assistants, concise drafting, and image-aware prompt pipelines.

Based on: Kiwi GLM 4.6V Flash lane

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Llama Turbo
kiwi-llama-3.1-8b-turbo
Meta logo

Kiwi Frontier

Turbo reasoning lane for high-context conversation and tool-compatible outputs.

Best used for: Long-context technical Q&A, API assistants, and dynamic copilots.

Based on: Kiwi Llama 3.1 8B Turbo lane

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Ministral 8B
kiwi-ministral-8b
Mistral AI logo

Kiwi Frontier

Multimodal-capable Pro lane tuned for fast reasoning and robust instruction following.

Best used for: Mixed text/image workflows, compact automation agents, and rapid product features.

Based on: Kiwi Ministral 8B lane

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Nano
kiwi-nano
Microsoft logo

Microsoft

Fast lightweight assistant for short tasks and quick checks.

Best used for: Simple prompts, short rewrites, and rapid iteration loops.

Based on: Phi-2

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Orbit
kiwi-orbit
Meta logo

Meta Llama

Balanced model based on Meta Llama family for dependable dialog tasks.

Best used for: General chat, assistant workflows, and robust business Q&A.

Based on: Llama 2 7B Chat LoRA

Open model page
DeprecatedEOL: 09.03.2026
Kiwi OSS 20B
kiwi-gpt-oss-20b
OSS

Kiwi Frontier

Open-model lane for balanced general tasks and flexible experimentation.

Best used for: General workflows, iterative prompts, and broad assistant behavior tuning.

Based on: Kiwi OSS 20B lane

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Pro
kiwi-pro
LLM.kiwi logo

LLM.kiwi Core

Reasoning-first assistant for technical and strategic deliverables.

Best used for: Deep technical guidance, architecture writing, and detailed analysis.

Based on: Kiwi Core Reasoning

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Smart
kiwi-smart
LLM.kiwi logo

LLM.kiwi Core

Balanced general-purpose model for reliable daily production work.

Best used for: Blog content drafts, marketing copy, and structured Q&A.

Based on: Kiwi Core Balanced

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Sprint
kiwi-sprint
Microsoft logo

Microsoft

Low-latency assistant optimized for speed and practical action.

Best used for: Fast answers, tactical task lists, and lightweight workflow support.

Based on: Phi-2

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Starlight
kiwi-starlight
Meta logo

Meta Llama

High-signal assistant tuned for clarity and practical structure.

Best used for: Clear explainers, comparison writeups, and concise plans.

Based on: Llama 3.1 8B Instruct FP8

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Tempest
kiwi-tempest
Mistral AI logo

Mistral AI

Instruction-heavy profile powered by Mistral family models.

Best used for: Complex instruction following, workflows, and technical drafting.

Based on: Mistral 7B Instruct v0.2 LoRA

Open model page
DeprecatedEOL: 09.03.2026
Kiwi Ultra
kiwi-ultra
LLM.kiwi logo

Cloudflare Workers AI

Highest depth assistant for long-form reasoning and premium output quality.

Best used for: Executive briefs, long-form strategy, and advanced reasoning tasks.

Based on: Workers AI runtime default (ultra quality profile)

Open model page

Access and Usage Limits

These are the model-access highlights users need most often. The complete reference stays in the docs.

Public API chat
POST https://api.llm.kiwi/v1/chat/completions

Free: 40 requests/hour for auto and hrLLM

PRO unlocks direct access to the new advanced models with higher sustained throughput.

Public model catalog
GET https://api.llm.kiwi/v1/models

192 requests/minute per IP

Cache-friendly endpoint for model discovery and compatibility metadata.

Dashboard internal chat
POST /api/internal/chat

36 requests/minute per signed-in user + IP

hrLLM additionally uses a tighter free-tier hourly model limit.

Playground sessions
POST /api/playground/chat

24 requests/minute per signed-in user

hrLLM additionally uses a tighter free-tier hourly model limit.