# Typhoon ASR Documentation
## About The Model

Typhoon ASR Real-Time:

- Web Playground - Try it instantly in your browser. Great for casual users.
## 🎧 Supported File Types

`.wav`, `.mp3`, `.flac`, `.ogg`, `.opus`
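
If you are routing user uploads to the API, it can help to check the file extension before sending the request. The sketch below is not part of any Typhoon package; `is_supported` is a hypothetical helper that only reflects the list above.

```python
from pathlib import Path

# Extensions listed above; not an official constant from any Typhoon package.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".opus"}

def is_supported(audio_path: str) -> bool:
    """Return True if the file extension is one of the supported types."""
    return Path(audio_path).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported("meeting.mp3"))   # True
print(is_supported("meeting.aiff"))  # False
```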
## 🔌 Option 1: Use Typhoon API

Typhoon’s hosted API is OpenAI-compatible. This is the fastest way to integrate ASR without setting up infrastructure.

You will need a Typhoon API key, which you can get for free at our web playground.
### Example:

See OpenAI’s API Doc for Transcription.
```python
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR_API_KEY>",
    base_url="https://api.opentyphoon.ai/v1",
)

def transcribe_audio_file(audio_file_path):
    """Transcribe an audio file using the Typhoon ASR API."""
    try:
        with open(audio_file_path, 'rb') as audio_file:
            transcription = client.audio.transcriptions.create(
                file=audio_file,
                model="typhoon-asr-realtime"
            )
        return transcription
    except Exception as e:
        print(f"Error transcribing audio: {e}")
        return None

audio_file_path = "path/to/your/audio.wav"
transcription = transcribe_audio_file(audio_file_path)
if transcription:
    print(f"Transcription: {transcription.text}")
    print(f"Usage: {transcription.usage}")
```
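
If you have several recordings to process, one option is to loop over a local folder and reuse the `transcribe_audio_file` helper above. This is a minimal sketch; the `recordings/` directory and the `transcribe_folder` function are assumptions for illustration, not part of the API.

```python
from pathlib import Path

# Hypothetical batch helper that reuses transcribe_audio_file() defined above.
def transcribe_folder(folder="recordings"):
    results = {}
    for path in sorted(Path(folder).glob("*.wav")):
        transcription = transcribe_audio_file(str(path))
        if transcription:
            results[path.name] = transcription.text
    return results

for name, text in transcribe_folder().items():
    print(f"{name}: {text}")
```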
## Reference

| Model ID | Size | Description | Rate Limits | Release Date |
|---|---|---|---|---|
| typhoon-asr-realtime | 114M | Streaming ASR | 100 reqs/minute | 2025-09-08 |
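
Bursty workloads can hit the 100 requests/minute limit. Below is a minimal retry sketch; `transcribe_with_retry` is a hypothetical wrapper, and it assumes throttling surfaces as `openai.RateLimitError`, the same way it does in the OpenAI SDK.

```python
import time

import openai
from openai import OpenAI

client = OpenAI(api_key="<YOUR_API_KEY>", base_url="https://api.opentyphoon.ai/v1")

# Hypothetical wrapper: retry with a fixed backoff when a request is throttled.
def transcribe_with_retry(audio_file_path, max_retries=3, backoff_seconds=30):
    for attempt in range(max_retries):
        try:
            with open(audio_file_path, "rb") as audio_file:
                return client.audio.transcriptions.create(
                    file=audio_file,
                    model="typhoon-asr-realtime",
                )
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(backoff_seconds)
```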
## 🖥️ Option 2: Self-Hosting with Python Package

For developers who want to run the model locally (CPU or GPU). No API key required.
### Install package

```bash
pip install typhoon-asr
```
### Example: Local Usage

```python
from typhoon_asr import transcribe

# Basic transcription
result = transcribe("audio.wav")
print(result['text'])

# With word timestamps
result = transcribe("audio.wav", with_timestamps=True)
for ts in result['timestamps']:
    print(f"[{ts['start']:.2f}s - {ts['end']:.2f}s] {ts['word']}")

# Specify device (CPU/GPU/auto)
result = transcribe("audio.wav", device="cuda")
print(result['text'])
```
### API Reference (Self-Host Mode)

```python
transcribe(
    input_file,
    model_name="scb10x/typhoon-asr-realtime",
    with_timestamps=False,
    device="auto"
)
```
#### Parameters:

- `input_file` (str) – Path to audio file
- `model_name` (str) – Hugging Face model identifier (default: `scb10x/typhoon-asr-realtime`)
- `with_timestamps` (bool) – Return word timestamps (default: `False`)
- `device` (str) – `"auto"`, `"cpu"`, `"cuda"`
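
For reference, here is the same call with every keyword argument written out explicitly. The values mirror the documented defaults, except `with_timestamps`, which is enabled here for illustration.

```python
from typhoon_asr import transcribe

# All keyword arguments spelled out; values match the documented defaults
# apart from with_timestamps.
result = transcribe(
    "audio.wav",
    model_name="scb10x/typhoon-asr-realtime",
    with_timestamps=True,
    device="auto",
)
print(result['text'])
```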
#### Returns (dict):

- `text` – Transcribed text
- `timestamps` – Word timestamps (if enabled)
- `processing_time` – Processing duration in seconds
- `audio_duration` – Input audio length in seconds
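
One way to sanity-check local performance is to compute a real-time factor from the two duration fields above. The snippet below only assumes the result dict documented in this section.

```python
from typhoon_asr import transcribe

# Rough throughput check using the documented return fields.
result = transcribe("audio.wav")

rtf = result['processing_time'] / result['audio_duration']
print(f"Processed {result['audio_duration']:.1f}s of audio "
      f"in {result['processing_time']:.1f}s (RTF {rtf:.2f})")
```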
## Requirements

- Python ≥ 3.8
- CUDA (optional, for GPU acceleration)
## Related Link

See our GitHub repo for more example code, including a fine-tuning example: https://github.com/scb-10x/typhoon-asr
## License

Apache Software License 2.0