Qwen3 ASR Flash - Speech-to-Text

Qwen3 ASR Flash is a speech-to-text model from Alibaba Cloud, optimized for short audio files. It supports 27 languages plus Chinese dialects, with noise reduction and emotion recognition.

Note: Audio files must first be uploaded using the Asset API before transcription. The audioUrl parameter should contain the path returned from the Asset API upload.

Endpoint

POST https://api.1min.ai/api/features

Request Headers

| Field | Value |
| --- | --- |
| API-KEY | <api-key> |
| Content-Type | application/json |

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Feature type identifier. Must be SPEECH_TO_TEXT |
| model | string | Yes | AI model identifier. Must be qwen3-asr-flash |
| promptObject | object | Yes | Configuration object containing all transcription parameters |

Prompt Object Parameters

| Parameter | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| audioUrl | string | Yes | Path to audio file (uploaded via Asset API). Maximum file size: 10MB. Supported formats: aac, amr, avi, aiff, flac, flv, m4a, mkv, mp3, mpeg, ogg, opus, wav, webm, wma, wmv | - |
| language | string | No | Input language code (see Language Options below) | - |
| enable_itn | boolean | No | Enable Inverse Text Normalization, which converts spoken form to written form (applicable to Chinese and English only) | false |
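
The size and format limits in the table can be checked on the client before a file is uploaded. The following is a minimal sketch, assuming that "10MB" means 10 × 1024 × 1024 bytes and that the format can be read from the file extension; `check_audio` is a hypothetical helper, not part of the API.

```python
import os

# Limits copied from the audioUrl row of the table.
# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_AUDIO_BYTES = 10 * 1024 * 1024
SUPPORTED_FORMATS = {
    "aac", "amr", "avi", "aiff", "flac", "flv", "m4a", "mkv",
    "mp3", "mpeg", "ogg", "opus", "wav", "webm", "wma", "wmv",
}


def check_audio(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Assumption: the file extension reflects the actual audio format.
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a local file, `check_audio(path, os.path.getsize(path))` would run both checks. The server may apply stricter or different rules, so this only catches the obvious cases early.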

Language Options

Specify the language of the audio to improve recognition accuracy. If the language is uncertain or includes multiple languages, leave this parameter unspecified for automatic detection.

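| Language Code | Language |
| --- | --- |
| zh | Chinese (Mandarin, Sichuanese, Minnan, Wu) |
| yue | Cantonese |
| en | English |
| ja | Japanese |
| de | German |
| ko | Korean |
| ru | Russian |
| fr | French |
| pt | Portuguese |
| ar | Arabic |
| it | Italian |
| es | Spanish |
| hi | Hindi |
| id | Indonesian |
| th | Thai |
| tr | Turkish |
| uk | Ukrainian |
| vi | Vietnamese |
| cs | Czech |
| da | Danish |
| fil | Filipino |
| fi | Finnish |
| is | Icelandic |
| ms | Malay |
| no | Norwegian |
| pl | Polish |
| sv | Swedish |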
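
Since the `language` parameter is optional, a client can validate a user-supplied code against the list above and fall back to automatic detection when nothing is given. This is a sketch only; `resolve_language` is a hypothetical helper, and trimming and lower-casing the input is an assumption about what callers might pass, not documented API behavior.

```python
# The 27 language codes listed in the table above.
LANGUAGE_CODES = {
    "zh", "yue", "en", "ja", "de", "ko", "ru", "fr", "pt",
    "ar", "it", "es", "hi", "id", "th", "tr", "uk", "vi",
    "cs", "da", "fil", "fi", "is", "ms", "no", "pl", "sv",
}


def resolve_language(code):
    """Return a validated language code, or None to request automatic detection."""
    if code is None or not code.strip():
        # Leaving the parameter unspecified lets the service detect the language.
        return None
    # Assumption: callers may pass codes with stray whitespace or upper case.
    normalized = code.strip().lower()
    if normalized not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized
```

When the function returns `None`, the `language` key should be left out of `promptObject` entirely rather than sent as an empty value.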

Code Examples

curl -X POST "https://api.1min.ai/api/features" \
  -H "API-KEY: <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "SPEECH_TO_TEXT",
    "model": "qwen3-asr-flash",
    "promptObject": {
      "audioUrl": "development/audios/2025_01_15_18_30_45_001_meeting.mp3",
      "language": "en",
      "enable_itn": true
    }
  }'
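
The same request can be sent from Python using only the standard library. The sketch below mirrors the curl command; `build_request` and `transcribe` are hypothetical names, and the 60-second timeout is an arbitrary assumption rather than a documented limit.

```python
import json
import urllib.request

API_URL = "https://api.1min.ai/api/features"


def build_request(api_key, audio_url, language=None, enable_itn=False):
    """Build the POST request without sending it."""
    prompt = {"audioUrl": audio_url, "enable_itn": enable_itn}
    if language is not None:
        # Omit the key entirely to let the service auto-detect the language.
        prompt["language"] = language
    payload = {
        "type": "SPEECH_TO_TEXT",
        "model": "qwen3-asr-flash",
        "promptObject": prompt,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def transcribe(api_key, audio_url, **options):
    """Send the request and return the decoded JSON response body."""
    request = build_request(api_key, audio_url, **options)
    # Assumption: 60 seconds is long enough for short audio files.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or test without network access. `urlopen` raises `urllib.error.HTTPError` on non-2xx responses, which a caller would normally catch.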

Response Format

{
  "aiRecord": {
    "uuid": "acddedab-4525-44d7-a507-0cdcdf51c773",
    "userId": "f944dd01-da40-405c-b698-708269eb9664",
    "teamId": "7d530bad-357c-4f6e-b627-16f043c9a16b",
    "model": "qwen3-asr-flash",
    "type": "SPEECH_TO_TEXT",
    "metadata": null,
    "rating": null,
    "feedback": null,
    "conversationId": null,
    "status": "SUCCESS",
    "createdAt": "2026-01-12T09:39:12.997Z",
    "aiRecordDetail": {
      "promptObject": {
        "audioUrl": "development/audios/2026_01_12_16_27_19_160_812932.wav",
        "language": "en",
        "enable_itn": false
      },
      "resultObject": [
        "Welcome to Google's Read TTS Flash, offering natural and expressive multilingual speech synthesis."
      ],
      "responseObject": {}
    },
    "additionalData": null,
    "temporaryUrl": ""
  }
}
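
In the sample response, the transcribed text sits in `aiRecord.aiRecordDetail.resultObject` as a list of strings. A minimal sketch for pulling it out follows; `extract_transcript` is a hypothetical helper. It assumes that any `status` other than `SUCCESS` should be treated as a failure and that multiple list entries, if they ever occur, can be joined with spaces. Neither behavior is stated in the reference above.

```python
def extract_transcript(response):
    """Return the transcript text from a decoded response dictionary."""
    record = response["aiRecord"]
    status = record.get("status")
    # Assumption: only "SUCCESS" indicates a usable result.
    if status != "SUCCESS":
        raise RuntimeError(f"transcription not successful: status={status!r}")
    segments = record["aiRecordDetail"]["resultObject"]
    # Assumption: multiple entries are consecutive pieces of one transcript.
    return " ".join(segments)


# Usage with a trimmed copy of the sample response shown above.
sample = {
    "aiRecord": {
        "status": "SUCCESS",
        "aiRecordDetail": {
            "resultObject": [
                "Welcome to Google's Read TTS Flash, offering natural "
                "and expressive multilingual speech synthesis."
            ],
        },
    }
}
print(extract_transcript(sample))
```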