Phone Call - Speech to Text

Convert phone call recordings and other telephony audio to text using a model tuned for the audio quality and characteristics of phone calls. Suited to customer service calls, business communications, and phone interviews.

Note: Audio files must first be uploaded using the Asset API before transcription. The audioUrl parameter should contain the path returned from the Asset API upload.

Supported Models

  • phone_call: Specialized for phone call audio with enhanced processing for telephony audio characteristics

Endpoint

POST https://api.1min.ai/api/features

Request Headers

Field           Value
API-KEY         <api-key>
Content-Type    application/json

Supported Audio Formats

  • MP3 - MPEG Audio Layer III
  • WAV - Waveform Audio File Format (common for phone recordings)
  • M4A - MPEG-4 Audio
  • FLAC - Free Lossless Audio Codec
  • MP4 - MPEG-4 Part 14 (audio only)
  • WEBM - WebM Audio
  • OGG - Ogg Vorbis
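
As a convenience, a client can check a file's extension against the list above before uploading it. The following is a minimal sketch, not part of the API: the helper name `is_supported_audio` and the constant `SUPPORTED_EXTENSIONS` are hypothetical, and the check looks only at the file name, not at the actual audio encoding.

```python
from pathlib import PurePosixPath

# Extensions derived from the format list above (assumed to map 1:1 to file suffixes).
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".webm", ".ogg"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file's extension appears in the supported-format list."""
    return PurePosixPath(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

A name-based check like this only catches obvious mistakes early; the service itself remains the authority on what it accepts.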

Language Support

The API supports various languages including:

  • en-US - English (US)
  • en-GB - English (UK)
  • vi-VN - Vietnamese
  • es-ES - Spanish
  • fr-FR - French
  • de-DE - German
  • it-IT - Italian
  • pt-PT - Portuguese
  • ru-RU - Russian
  • ja-JP - Japanese
  • ko-KR - Korean
  • zh-CN - Chinese (Simplified)
  • ar-SA - Arabic

Note: For a complete list of all supported languages and their language codes, please refer to the Google Cloud Text-to-Speech documentation.
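
A client-side check of the language code can catch typos before a request is sent. The sketch below is an illustration only: `LISTED_LANGUAGES` contains just the codes shown on this page (the full supported set is larger, per the note above), and `check_language` is a hypothetical helper name.

```python
# Only the codes listed on this page; the service may accept more.
LISTED_LANGUAGES = {
    "en-US", "en-GB", "vi-VN", "es-ES", "fr-FR", "de-DE", "it-IT",
    "pt-PT", "ru-RU", "ja-JP", "ko-KR", "zh-CN", "ar-SA",
}


def check_language(code: str) -> str:
    """Normalise case (e.g. 'en-us' -> 'en-US') and confirm the code is listed."""
    lang, sep, region = code.strip().partition("-")
    normalised = f"{lang.lower()}-{region.upper()}" if sep else lang.lower()
    if normalised not in LISTED_LANGUAGES:
        raise ValueError(f"language code not in the listed set: {code!r}")
    return normalised
```

If you need a language outside this short list, extend the set rather than removing the check.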

Parameters

Parameter               Type    Required  Description
type                    string  Yes       Feature type, must be "SPEECH_TO_TEXT"
model                   string  Yes       Model identifier, use "phone_call"
promptObject.audioUrl   string  Yes       Path to audio file (uploaded via Asset API)
promptObject.language   string  Yes       Language code for transcription (e.g., "vi-VN", "en-US")
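
The dotted names in the table describe a nested JSON object: `audioUrl` and `language` live inside `promptObject`. A minimal sketch of assembling that body follows; `build_payload` is a hypothetical helper, and the empty-value checks are a client-side assumption based on every field being marked as required.

```python
def build_payload(audio_url: str, language: str) -> dict:
    """Assemble the request body described in the parameter table."""
    if not audio_url:
        raise ValueError("promptObject.audioUrl is required")
    if not language:
        raise ValueError("promptObject.language is required")
    return {
        "type": "SPEECH_TO_TEXT",   # fixed value for this feature
        "model": "phone_call",      # the only model listed on this page
        "promptObject": {
            "audioUrl": audio_url,  # path returned by the Asset API upload
            "language": language,   # e.g. "en-US" or "vi-VN"
        },
    }
```

Passing the result to `json.dumps` yields the same JSON body used in the cURL example.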

Code Examples

curl --location 'https://api.1min.ai/api/features' \
  --header 'API-KEY: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "SPEECH_TO_TEXT",
    "model": "phone_call",
    "promptObject": {
      "audioUrl": "audios/2025_02_20_16_26_08_652_New_Recording.m4a",
      "language": "en-US"
    }
  }'
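
The same request can be sketched in Python using only the standard library. This is an illustrative equivalent of the cURL command, not an official client: `build_request` and `transcribe` are hypothetical names, the 60-second timeout is an arbitrary assumption, and the response is returned as raw text because its schema is not described in this section.

```python
import json
import urllib.request

API_URL = "https://api.1min.ai/api/features"


def build_request(api_key: str, audio_url: str, language: str = "en-US") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    body = json.dumps({
        "type": "SPEECH_TO_TEXT",
        "model": "phone_call",
        "promptObject": {"audioUrl": audio_url, "language": language},
    }).encode("utf-8")
    headers = {"API-KEY": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def transcribe(api_key: str, audio_url: str, language: str = "en-US", timeout: float = 60) -> str:
    """Send the request and return the raw response body as text."""
    request = build_request(api_key, audio_url, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Separating request construction from sending makes the request easy to inspect or log before any network call is made.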

Use Cases

  • Customer Service: Transcribe customer support calls for quality assurance and training
  • Sales Calls: Convert sales conversations into searchable text for CRM systems
  • Phone Interviews: Create transcripts of recruitment interviews and research calls
  • Conference Calls: Document business meetings and conference calls
  • Legal Depositions: Transcribe legal proceedings conducted over the phone
  • Telehealth: Convert medical consultations to text for patient records
  • Market Research: Transcribe phone surveys and market research interviews
  • Call Center Analytics: Analyze customer interactions and sentiment

Tips for Best Results

  1. Upload First: Use the Asset API to upload your phone call recording before transcription
  2. Audio Quality: Phone call quality can vary - ensure the recording is as clear as possible
  3. Format Compatibility: WAV and MP3 are commonly used formats for phone recordings
  4. Language Selection: Choose the language code that matches the language spoken on the call for best accuracy
  5. Speaker Separation: The model can identify different speakers in the conversation
  6. Background Noise: Minimize background noise during phone recordings when possible

Error Handling

Common error scenarios and solutions:

  • File not found: Ensure the audio file was uploaded via Asset API first
  • Invalid audioUrl: Verify the path matches exactly what was returned from Asset API upload
  • Language not supported: Check that the language code is in the supported list
  • Poor audio quality: Phone recordings are often lower quality than other audio - consider enhancing the audio before uploading
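
One way to surface these scenarios in client code is to wrap the request and attach the checklist above to any HTTP failure. The sketch below makes several assumptions: `safe_call` and `CHECKLIST` are hypothetical names, `send` is any zero-argument callable that performs the request with `urllib`, and no specific status codes or error body formats are assumed because this section does not document them.

```python
import urllib.error

# Reminders taken from the error scenarios listed above.
CHECKLIST = (
    "Was the audio file uploaded via the Asset API first?",
    "Does audioUrl match the path returned by the upload exactly?",
    "Is the language code in the supported list?",
)


def safe_call(send):
    """Run send() and convert urllib failures into a plain dict."""
    try:
        return {"ok": True, "body": send()}
    except urllib.error.HTTPError as err:
        # The server answered with an error status; show the checklist.
        return {"ok": False, "status": err.code, "reason": str(err.reason), "checklist": CHECKLIST}
    except urllib.error.URLError as err:
        # No HTTP response at all (DNS failure, refused connection, timeout).
        return {"ok": False, "status": None, "reason": str(err.reason), "checklist": ()}
```

`HTTPError` is caught first because it is a subclass of `URLError`; reversing the order would hide the status code.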

Response