Models overview
Available models for Berget AI serverless inference
All models run on Berget AI's European infrastructure and are accessed via an OpenAI-compatible API. Pricing is per million tokens.
Text
General-purpose instruction-following models for chat, reasoning, and structured output.
Llama 3.1 8B Instruct
meta-llama/Llama-3.1-8B-Instruct · 128k tokens
Input: €0.20 / M tokens · Output: €0.20 / M tokens
Llama 3.3 70B Instruct
meta-llama/Llama-3.3-70B-Instruct · 128k tokens
Input: €0.90 / M tokens · Output: €0.90 / M tokens
Mistral Small 3.2 24B
mistralai/Mistral-Small-3.2-24B-Instruct-2506 · 128k tokens
Input: €0.30 / M tokens · Output: €0.30 / M tokens
GPT-OSS 120B
openai/gpt-oss-120b · 128k tokens
Input: €0.30 / M tokens · Output: €0.90 / M tokens
GLM 4.7 FP8
zai-org/GLM-4.7-FP8 · 200k tokens
Input: €0.70 / M tokens · Output: €2.50 / M tokens
Embedding
Vector representations of text for semantic search, clustering, and retrieval-augmented generation.
Multilingual-E5-large-instruct
intfloat/multilingual-e5-large-instruct
€0.03 / M tokens
Multilingual-E5-large
intfloat/multilingual-e5-large
€0.03 / M tokens
Reranking
Scores and reorders a list of documents by relevance to a query, useful for improving retrieval quality in RAG pipelines.
Speech-to-text
Transcribes audio to text, with specialised models for Swedish and Norwegian alongside a multilingual option.
KB-Whisper-Large
KBLab/kb-whisper-large
€3.00 / 1,000 min
NB-Whisper Large
NbAiLab/nb-whisper-large
€3.00 / 1,000 min
Whisper Large v3
openai/whisper-large-v3
€3.00 / 1,000 min