Qwen3 ASR Flash - Speech-to-Text
High-performance speech-to-text conversion by Alibaba Cloud, optimized for short audio files. Qwen3 ASR Flash supports 27 languages and Chinese dialects, with noise reduction and emotion recognition.
Note: Audio files must first be uploaded using the Asset API before transcription. The audioUrl parameter should contain the path returned from the Asset API upload.
Endpoint
POST https://api.1min.ai/api/features
Request Headers
| Field | Value |
|---|---|
| API-KEY | <api-key> |
| Content-Type | application/json |
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Feature type identifier. Must be SPEECH_TO_TEXT |
| model | string | Yes | AI model identifier. Must be qwen3-asr-flash |
| promptObject | object | Yes | Configuration object containing all transcription parameters |
Prompt Object Parameters
| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| audioUrl | string | Yes | Path to the audio file (uploaded via the Asset API). Maximum file size: 10 MB. Supported formats: aac, amr, avi, aiff, flac, flv, m4a, mkv, mp3, mpeg, ogg, opus, wav, webm, wma, wmv | - |
| language | string | No | Input language code (see Language Options below) | - |
| enable_itn | boolean | No | Enable Inverse Text Normalization (applicable to Chinese and English only) | false |
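The size and format limits in the audioUrl row can be checked locally before uploading. The following is a minimal sketch, not part of the API: check_audio_file is a hypothetical helper, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Limits copied from the audioUrl row above.
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is interpreted as 10 MiB
SUPPORTED_FORMATS = {
    "aac", "amr", "avi", "aiff", "flac", "flv", "m4a", "mkv",
    "mp3", "mpeg", "ogg", "opus", "wav", "webm", "wma", "wmv",
}


def check_audio_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Compare the extension case-insensitively and without the leading dot.
    ext = Path(name).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems


print(check_audio_file("meeting.mp3", 2_000_000))
print(check_audio_file("notes.txt", 20_000_000))
```

The check only looks at the file name and size, so it cannot tell whether the audio content itself is valid.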
Language Options
Specify the language of the audio to improve recognition accuracy. If the language is uncertain or includes multiple languages, leave this parameter unspecified for automatic detection.
| Language Code | Language |
|---|---|
| zh | Chinese (Mandarin, Sichuanese, Minnan, Wu) |
| yue | Cantonese |
| en | English |
| ja | Japanese |
| de | German |
| ko | Korean |
| ru | Russian |
| fr | French |
| pt | Portuguese |
| ar | Arabic |
| it | Italian |
| es | Spanish |
| hi | Hindi |
| id | Indonesian |
| th | Thai |
| tr | Turkish |
| uk | Ukrainian |
| vi | Vietnamese |
| cs | Czech |
| da | Danish |
| fil | Filipino |
| fi | Finnish |
| is | Icelandic |
| ms | Malay |
| no | Norwegian |
| pl | Polish |
| sv | Swedish |
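One way to apply the language rule is to build the request body with a small helper that omits language entirely when it is unknown, so automatic detection is used. This is a sketch under stated assumptions: build_request is a hypothetical name, and rejecting unlisted codes on the client side is a design choice made here, not documented API behavior.

```python
# Language codes copied from the table above.
LANGUAGE_CODES = {
    "zh", "yue", "en", "ja", "de", "ko", "ru", "fr", "pt",
    "ar", "it", "es", "hi", "id", "th", "tr", "uk", "vi",
    "cs", "da", "fil", "fi", "is", "ms", "no", "pl", "sv",
}


def build_request(audio_url, language=None, enable_itn=False):
    """Build the JSON body for a SPEECH_TO_TEXT request (hypothetical helper)."""
    if language is not None and language not in LANGUAGE_CODES:
        raise ValueError(f"unknown language code: {language!r}")
    prompt = {"audioUrl": audio_url, "enable_itn": enable_itn}
    # Leave "language" out when it is unknown so the service auto-detects it.
    if language is not None:
        prompt["language"] = language
    return {
        "type": "SPEECH_TO_TEXT",
        "model": "qwen3-asr-flash",
        "promptObject": prompt,
    }


print(build_request("development/audios/2025_01_15_18_30_45_001_meeting.mp3"))
```

The returned dictionary can be passed as the JSON body of the POST request.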
Code Examples
cURL
curl -X POST "https://api.1min.ai/api/features" \
  -H "API-KEY: <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "SPEECH_TO_TEXT",
    "model": "qwen3-asr-flash",
    "promptObject": {
      "audioUrl": "development/audios/2025_01_15_18_30_45_001_meeting.mp3",
      "language": "en",
      "enable_itn": true
    }
  }'
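JavaScript
fetch('https://api.1min.ai/api/features', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-KEY': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    type: 'SPEECH_TO_TEXT',
    model: 'qwen3-asr-flash',
    promptObject: {
      audioUrl: 'development/audios/2025_01_15_18_30_45_001_meeting.mp3',
      language: 'en',
      enable_itn: true
    }
  })
})
  .then(response => {
    // Surface HTTP errors instead of trying to parse an error body as a result
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  })
  .then(data => console.log(data))
  .catch(error => console.error(error));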
Python
import requests

url = "https://api.1min.ai/api/features"
headers = {
    "Content-Type": "application/json",
    "API-KEY": "YOUR_API_KEY"
}
data = {
    "type": "SPEECH_TO_TEXT",
    "model": "qwen3-asr-flash",
    "promptObject": {
        "audioUrl": "development/audios/2025_01_15_18_30_45_001_meeting.mp3",
        "language": "en",
        "enable_itn": True
    }
}

# Send the request and raise an exception on an HTTP error status
response = requests.post(url, headers=headers, json=data, timeout=60)
response.raise_for_status()
print(response.json())
Response Format
{
  "aiRecord": {
    "uuid": "acddedab-4525-44d7-a507-0cdcdf51c773",
    "userId": "f944dd01-da40-405c-b698-708269eb9664",
    "teamId": "7d530bad-357c-4f6e-b627-16f043c9a16b",
    "model": "qwen3-asr-flash",
    "type": "SPEECH_TO_TEXT",
    "metadata": null,
    "rating": null,
    "feedback": null,
    "conversationId": null,
    "status": "SUCCESS",
    "createdAt": "2026-01-12T09:39:12.997Z",
    "aiRecordDetail": {
      "promptObject": {
        "audioUrl": "development/audios/2026_01_12_16_27_19_160_812932.wav",
        "language": "en",
        "enable_itn": false
      },
      "resultObject": [
        "Welcome to Google's Read TTS Flash, offering natural and expressive multilingual speech synthesis."
      ],
      "responseObject": {}
    },
    "additionalData": null,
    "temporaryUrl": ""
  }
}
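In the response above, the transcription sits in aiRecord.aiRecordDetail.resultObject as a list of strings. A minimal sketch of pulling it out follows; extract_transcript is a hypothetical helper, and two points are assumptions rather than documented behavior: that any status other than SUCCESS means the text is unusable, and that multiple entries can be joined with a space.

```python
def extract_transcript(response: dict) -> str:
    """Return the transcribed text from a parsed response body (hypothetical helper)."""
    record = response["aiRecord"]
    # Assumption: only "SUCCESS" indicates a usable result.
    if record.get("status") != "SUCCESS":
        raise RuntimeError(f"transcription not successful: {record.get('status')}")
    segments = record["aiRecordDetail"]["resultObject"]
    # Assumption: multiple entries, if present, are joined with a single space.
    return " ".join(segments)


# Trimmed-down body with the same shape as the example response.
sample = {
    "aiRecord": {
        "status": "SUCCESS",
        "aiRecordDetail": {"resultObject": ["Hello there.", "Second segment."]},
    }
}
print(extract_transcript(sample))
```

With the requests example, the parsed body from response.json() can be passed to this function directly.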