FunASR Blog

2026-06-17

Transcribe Long Audio with FunASR: Hours in One Call

Whisper works in 30 s windows; FunASR ingests audio of any length via built-in VAD: 13 min transcribed in 4.3 s (186x).

2026-06-17

Real-Time Streaming Speech-to-Text with FunASR

Low-latency (~600 ms) streaming ASR using chunked input with a cache, plus the 2-pass (streaming + offline) best practice.

2026-06-17

Beyond Transcription: Language, Emotion & Audio Events with SenseVoice

Transcription, language, emotion, and audio events in one pass: a 4-in-1 output that Whisper cannot produce.

2026-06-17

Speaker Diarization with FunASR: Who Spoke When

Transcription, speaker labels, and timestamps in one generate() call. An alternative to pyannote + Whisper, with no Hugging Face gated access required.

2026-06-16

FunASR vs Whisper: Real Chinese ASR Benchmark

Measured on 184 Chinese files (H100): SenseVoice 169.6x, 7.81% CER — full speed & accuracy data.

2026-06-16

Fun-ASR-Nano Guide: 800M End-to-End ASR LLM

Flagship — 31 languages, 7 dialects, hotwords, streaming, with tested code.

2026-06-16

SenseVoice Deployment Guide: 15x Faster Than Whisper

Multilingual ASR in 3 lines of code, with language/emotion/event tags, VAD, and GPU or CPU inference.

More: Quickstart · Models