Model Specifications

Voxtral Model Family

Choose the right model for your use case. All models feature automatic language detection, built-in Q&A capabilities, and Apache 2.0 licensing.

Voxtral Large v1.0 (Recommended)

Our flagship model, with a 32K-token context window and the highest accuracy in the family.

Accuracy: 95.2%
Context: 32K tokens
Languages: 15+
Parameters: 7B

Key Features

32K-token context window (40+ minutes of audio)
Built-in Q&A and summarization
Function calling from voice
Automatic language detection
Real-time processing support

Benchmark Results

Common Voice: 95.2% (vs 91.7% for Whisper large-v3)
FLEURS: 93.8% (vs 89.4% for Whisper large-v3)
LibriSpeech: 96.1% (vs 94.2% for Whisper large-v3)
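Accuracy gaps of a few points are easier to compare as a relative reduction in errors. The sketch below assumes that "accuracy" here means 100% minus an error rate (such as word error rate), which this page does not state. The function name is illustrative only.

```python
def relative_error_reduction(accuracy: float, baseline_accuracy: float) -> float:
    """Fraction of the baseline's errors removed, given accuracies in percent.

    Assumes accuracy = 100 - error rate, which the page does not confirm.
    """
    error = 100.0 - accuracy
    baseline_error = 100.0 - baseline_accuracy
    if baseline_error <= 0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - error) / baseline_error


# (Voxtral Large, Whisper large-v3) figures copied from the benchmark results above.
RESULTS = {
    "Common Voice": (95.2, 91.7),
    "FLEURS": (93.8, 89.4),
    "LibriSpeech": (96.1, 94.2),
}

for name, (voxtral, whisper) in RESULTS.items():
    reduction = relative_error_reduction(voxtral, whisper)
    print(f"{name}: {reduction:.0%} fewer errors than the baseline")
```

Under that assumption, the 3.5-point gap on Common Voice corresponds to roughly 42% fewer errors than the baseline.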


Voxtral Base v1.0

Balanced performance and efficiency for most use cases

Accuracy: 92.8%
Context: 16K tokens
Languages: 15+
Parameters: 1.5B

Key Features

16K-token context window (20+ minutes of audio)
Built-in Q&A and summarization
Automatic language detection
Optimized for edge deployment

Benchmark Results

Common Voice: 92.8% (vs 89.1% for Whisper base)
FLEURS: 90.5% (vs 86.2% for Whisper base)
LibriSpeech: 93.4% (vs 91.8% for Whisper base)


Voxtral Small v1.0

Lightweight model optimized for mobile and edge devices

Accuracy: 89.4%
Context: 8K tokens
Languages: 10+
Parameters: 244M

Key Features

8K-token context window (10+ minutes of audio)
Automatic language detection
Mobile and edge optimized
Low memory footprint

Benchmark Results

Common Voice: 89.4% (vs 85.7% for Whisper small)
FLEURS: 86.2% (vs 82.1% for Whisper small)
LibriSpeech: 90.8% (vs 88.3% for Whisper small)
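The context windows listed for the three models map to rough audio-length limits of about 40, 20, and 10 minutes. Below is a minimal sketch of picking the smallest model that fits a recording. It treats the minute figures on this page as conservative limits and takes "K" as 1,000 tokens. Only `voxtral-large` appears as an identifier on this page; `voxtral-base` and `voxtral-small` are assumed by analogy.

```python
from typing import Optional

# (model id, context window in tokens, approximate audio minutes), taken from the
# specifications on this page. "voxtral-base" and "voxtral-small" are assumed
# identifiers; only "voxtral-large" is shown in the Quick Start commands.
MODELS = [
    ("voxtral-small", 8_000, 10),
    ("voxtral-base", 16_000, 20),
    ("voxtral-large", 32_000, 40),
]


def pick_model(audio_minutes: float) -> Optional[str]:
    """Return the smallest model whose stated audio limit covers the recording."""
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    for model_id, _context_tokens, max_minutes in MODELS:
        if audio_minutes <= max_minutes:
            return model_id
    return None  # longer than any stated limit; the audio would need to be split


def tokens_per_minute(context_tokens: int, max_minutes: int) -> float:
    """Rough token budget per minute of audio implied by the listed figures."""
    return context_tokens / max_minutes
```

All three models imply the same budget of about 800 tokens per minute of audio, so a 15-minute recording would exceed Voxtral Small's stated limit but fit within Voxtral Base's.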


Supported Languages

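All models support automatic language detection across the languages below. Voxtral Large and Voxtral Base list 15+ languages and Voxtral Small lists 10+, so Small may not cover every language here.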

English

Spanish

French

Portuguese

Hindi

German

Dutch

Italian

Chinese

Japanese

Korean

Russian

Arabic

Turkish

Polish

Quick Start

Get started with Voxtral models using our API or a self-hosted deployment.

API Usage

curl -X POST https://api.mistral.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@your_audio.wav" \
  -F "model=voxtral-large" \
  -F "response_format=verbose_json"
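The same request can be sketched in Python with only the standard library. The URL, form fields, and bearer header are taken from the curl command above. The helper names are illustrative, and the shape of the `verbose_json` response is not documented on this page, so the sketch returns the decoded JSON as-is.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.mistral.ai/v1/audio/transcriptions"  # from the curl example


def build_transcription_request(audio: bytes, filename: str, api_key: str,
                                model: str = "voxtral-large",
                                response_format: str = "verbose_json",
                                ) -> urllib.request.Request:
    """Build (but do not send) a multipart POST equivalent to the curl command."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, mirroring -F "model=..." and -F "response_format=...".
    for name, value in (("model", model), ("response_format", response_format)):
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    # The audio file part, mirroring -F "file=@your_audio.wav".
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=b"".join(parts),
                                  headers=headers, method="POST")


def transcribe(path: str, api_key: str) -> dict:
    """Send the request and return the decoded JSON body (schema not documented here)."""
    with open(path, "rb") as f:
        request = build_transcription_request(f.read(), os.path.basename(path), api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it makes it possible to inspect the headers and body without a network call or a valid API key.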

Self-hosted Deployment

docker run -p 8000:8000 \
  -v "$(pwd)/models:/models" \
  voxtral/voxtral-server:latest \
  --model voxtral-large
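After starting the container, it can help to wait until the published port accepts connections before sending audio. The sketch below checks only TCP reachability on port 8000, the port taken from the `docker run` command. This page does not describe a health endpoint, so a successful connection does not guarantee that the model has finished loading. The function name is illustrative.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8000,
                  timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port():
        print("Port 8000 is accepting connections")
    else:
        print("Timed out waiting for port 8000")
```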

Ready to Try Voxtral?

Experience the power of our speech AI models with our interactive demo or start building with our API.
