Telephony - Speech to Text

Transcribe telephony recordings and other telecommunication audio to text using a model tuned for telephone systems and communication networks. Typical sources include call center recordings, PBX recordings, and other telecommunication infrastructure.

Note: Audio files must first be uploaded using the Asset API before transcription. The audioUrl parameter should contain the path returned from the Asset API upload.

Supported Models

telephony: Specialized for telephony audio with enhanced processing for telecommunication system characteristics

Endpoint

POST https://api.1min.ai/api/features

Request Headers

Field	Value
API-KEY	`<api-key>`
Content-Type	`application/json`

Supported Audio Formats

MP3 - MPEG Audio Layer III
WAV - Waveform Audio File Format (preferred for telephony)
M4A - MPEG-4 Audio
FLAC - Free Lossless Audio Codec
MP4 - MPEG-4 Part 14 (audio only)
WEBM - WebM Audio
OGG - Ogg Vorbis
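
A client can check a file's extension against the list above before uploading it. The following is a minimal sketch: `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical names, not part of the API, and the check looks only at the file extension, not at the actual audio encoding.

```python
from pathlib import Path

# Extensions taken from the format list above (lowercase, with leading dot).
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".webm", ".ogg"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the filename's extension is in the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_audio("call.wav")` is true, while a file ending in `.aac` would be rejected by this check.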

Language Support

The API supports various languages including:

en-US - English (US)
en-GB - English (UK)
vi-VN - Vietnamese
es-ES - Spanish
fr-FR - French
de-DE - German
it-IT - Italian
pt-PT - Portuguese
ru-RU - Russian
ja-JP - Japanese
ko-KR - Korean
zh-CN - Chinese (Simplified)
ar-SA - Arabic

Note: For a complete list of supported languages and their language codes, refer to the Google Cloud Speech-to-Text documentation.

Code Example

Parameters

Parameter	Type	Required	Description
`type`	string	Yes	Feature type, must be "SPEECH_TO_TEXT"
`model`	string	Yes	Model identifier, use "telephony"
`promptObject.audioUrl`	string	Yes	Path to audio file (uploaded via Asset API)
`promptObject.language`	string	Yes	Language code for transcription (e.g., "en-US", "vi-VN")
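
The parameters in the table map onto a small JSON body. As a rough sketch, a helper like the one below could assemble it and reject empty values before any request is sent. The function name `build_transcription_request` is hypothetical, and the emptiness checks are a client-side assumption rather than documented API behavior.

```python
import json


def build_transcription_request(audio_url: str, language: str) -> dict:
    """Assemble the request body described in the parameter table."""
    if not audio_url:
        raise ValueError("audio_url is required (path returned by the Asset API)")
    if not language:
        raise ValueError("language is required, for example 'en-US'")
    return {
        "type": "SPEECH_TO_TEXT",   # fixed feature type
        "model": "telephony",       # fixed model identifier
        "promptObject": {
            "audioUrl": audio_url,
            "language": language,
        },
    }


payload = build_transcription_request(
    "audios/2025_02_20_16_26_08_652_New_Recording.m4a", "en-US"
)
print(json.dumps(payload, indent=2))
```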

Code Examples

cURL

curl --location 'https://api.1min.ai/api/features' \
  --header 'API-KEY: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "SPEECH_TO_TEXT",
    "model": "telephony",
    "promptObject": {
      "audioUrl": "audios/2025_02_20_16_26_08_652_New_Recording.m4a",
      "language": "en-US"
    }
  }'

JavaScript

// Send the transcription request and log the parsed JSON response.
fetch('https://api.1min.ai/api/features', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-KEY': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    type: 'SPEECH_TO_TEXT',
    model: 'telephony',
    promptObject: {
      audioUrl: 'audios/2025_02_20_16_26_08_652_New_Recording.m4a',
      language: 'en-US'
    }
  })
})
  .then((response) => response.json())
  .then((result) => console.log(result))
  .catch((error) => console.error(error));

Python

import requests

url = "https://api.1min.ai/api/features"
headers = {
    "Content-Type": "application/json",
    "API-KEY": "YOUR_API_KEY",
}

data = {
    "type": "SPEECH_TO_TEXT",
    "model": "telephony",
    "promptObject": {
        "audioUrl": "audios/2025_02_20_16_26_08_652_New_Recording.m4a",
        "language": "en-US",
    },
}

# Send the request, fail loudly on HTTP errors, then print the JSON body.
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()
print(response.json())

Response Format

{
  "success": true,
  "data": {
    "transcription": "Welcome to our customer service line. Your call is important to us. Please hold while we connect you to the next available agent. Thank you for your patience.",
    "duration": "00:03:45",
    "language": "en-US",
    "confidence": 0.93,
    "telephony_metadata": {
      "call_quality": "standard",
      "codec": "G.711",
      "sample_rate": "8kHz"
    }
  }
}
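
Once the response has been parsed from JSON, the fields shown above can be read defensively. The sketch below assumes the response has exactly the shape of the sample. The helper name `extract_transcription`, the `needs_review` flag, and the 0.8 confidence threshold are illustrative assumptions, and the shape of error responses is not documented in this section.

```python
def extract_transcription(response_body: dict, min_confidence: float = 0.8) -> dict:
    """Pull the transcript and a review flag out of a parsed response."""
    if not response_body.get("success"):
        raise RuntimeError("Transcription request was not successful")

    data = response_body.get("data", {})
    confidence = data.get("confidence")
    return {
        "text": data.get("transcription", ""),
        "language": data.get("language"),
        "confidence": confidence,
        # Flag the transcript when confidence is missing or below the threshold.
        "needs_review": confidence is None or confidence < min_confidence,
    }
```

A caller might pass `response.json()` from the Python example to this helper and route transcripts with `needs_review` set to a manual check.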

Use Cases

Call Center Operations: Transcribe customer service and support calls
PBX Recordings: Convert private branch exchange system recordings
VoIP Communications: Transcribe Voice over IP call recordings
Telecommunication Analytics: Analyze call patterns and customer interactions
Quality Assurance: Monitor and evaluate call center performance
Compliance Recording: Document regulatory compliance calls
Conference Bridge: Transcribe multi-party conference calls
IVR Integration: Convert interactive voice response system interactions

Tips for Best Results

Upload First: Use the Asset API to upload your telephony recording before transcription
Telephony Quality: Works well with standard telephony audio quality (8 kHz sampling)
Format Compatibility: WAV format is preferred for telephony recordings
Language Selection: Choose the correct language for optimal accuracy
Call Quality: Standard telephony codecs (G.711, G.729) are well supported
Processing Time: Processing time varies with call duration and complexity
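
Before uploading a WAV recording, it can help to confirm its sample rate and duration locally. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM WAV files only, so a WAV file that stores G.711 or another compressed codec would need a different tool. The helper name `describe_wav` and the keys of the returned dictionary are hypothetical.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic properties from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        frames = wav_file.getnframes()
        return {
            "sample_rate_hz": sample_rate,
            "channels": wav_file.getnchannels(),
            "duration_seconds": frames / sample_rate,
            # 8000 Hz is the standard telephony sampling rate noted in the tips.
            "is_telephony_rate": sample_rate == 8000,
        }
```

Longer durations reported here suggest longer processing times, in line with the last tip.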
