Medical Dictation - Speech to Text

Convert medical dictation to text, with specialized handling of medical terminology, clinical abbreviations, and healthcare documentation formats. Typical inputs are physician dictation, clinical notes, patient records, and other medical documentation.

Note: Upload audio files with the Asset API before requesting transcription. The promptObject.audioUrl parameter must contain the path returned by the Asset API upload.

Supported Models

  • medical_dictation: Specialized for medical dictation with enhanced medical terminology recognition and clinical formatting

Endpoint

POST https://api.1min.ai/api/features

Request Headers

Field          Value
API-KEY        <api-key>
Content-Type   application/json

Supported Audio Formats

  • MP3 - MPEG Audio Layer III
  • WAV - Waveform Audio File Format
  • M4A - MPEG-4 Audio
  • FLAC - Free Lossless Audio Codec
  • MP4 - MPEG-4 Part 14 (audio only)
  • WEBM - WebM Audio
  • OGG - Ogg Vorbis
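
As a client-side convenience, the file extension can be checked against the list above before uploading. This is a minimal sketch: the helper name `is_supported_audio` is hypothetical, and it only inspects the extension, not the actual audio encoding, which the service may validate differently.

```python
from pathlib import PurePosixPath

# Extensions taken from the "Supported Audio Formats" list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4", ".webm", ".ogg"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a listed audio format.

    Hypothetical helper: checks the extension only, case-insensitively.
    """
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_audio("dictation.m4a")` is true, while a file with no extension or an unlisted one such as `.aac` is rejected before any request is made.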

Language Support

The API supports various languages including:

  • en-US - English (US)
  • en-GB - English (UK)
  • vi-VN - Vietnamese
  • es-ES - Spanish
  • fr-FR - French
  • de-DE - German
  • it-IT - Italian
  • pt-PT - Portuguese
  • ru-RU - Russian
  • ja-JP - Japanese
  • ko-KR - Korean
  • zh-CN - Chinese (Simplified)
  • ar-SA - Arabic

Note: For a complete list of supported languages and their language codes, refer to the Google Cloud Speech-to-Text documentation.
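
A language code can be normalized and compared with the codes listed above before it is sent. The sketch below is illustrative only: `normalize_language` and `is_listed_language` are hypothetical names, the set contains only the codes shown on this page (the service may accept more), and it assumes codes are expected in the casing shown, such as "en-US".

```python
# Codes copied from the list above; the full supported set may be larger.
LISTED_LANGUAGES = {
    "en-US", "en-GB", "vi-VN", "es-ES", "fr-FR", "de-DE", "it-IT",
    "pt-PT", "ru-RU", "ja-JP", "ko-KR", "zh-CN", "ar-SA",
}


def normalize_language(code: str) -> str:
    """Normalize casing of a language-region code, e.g. 'en-us' -> 'en-US'."""
    lang, _, region = code.strip().partition("-")
    return f"{lang.lower()}-{region.upper()}" if region else lang.lower()


def is_listed_language(code: str) -> bool:
    """Return True if the normalized code appears in the list on this page."""
    return normalize_language(code) in LISTED_LANGUAGES
```

A code that is not in this set is not necessarily unsupported; it only means it should be checked against the complete language list.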

Parameters

Parameter              Type    Required  Description
type                   string  Yes       Feature type, must be "SPEECH_TO_TEXT"
model                  string  Yes       Model identifier, use "medical_dictation"
promptObject.audioUrl  string  Yes       Path to audio file (uploaded via Asset API)
promptObject.language  string  Yes       Language code for transcription (e.g., "en-US", "vi-VN")
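
The parameters above map onto a nested JSON body. The following sketch shows one way to assemble it; the function name `build_payload` is hypothetical, and the empty-value checks are a client-side assumption rather than documented server behavior.

```python
def build_payload(audio_url: str, language: str) -> dict:
    """Assemble the request body from the four required parameters.

    Hypothetical helper: "type" and "model" are fixed to the values this
    page documents; audioUrl and language go inside promptObject.
    """
    if not audio_url:
        raise ValueError("promptObject.audioUrl is required")
    if not language:
        raise ValueError("promptObject.language is required")
    return {
        "type": "SPEECH_TO_TEXT",
        "model": "medical_dictation",
        "promptObject": {
            "audioUrl": audio_url,  # path returned by the Asset API upload
            "language": language,   # e.g. "en-US" or "vi-VN"
        },
    }
```

Serializing the result with `json.dumps` yields the same body structure used in the request examples on this page.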

Code Examples

curl --location 'https://api.1min.ai/api/features' \
  --header 'API-KEY: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "SPEECH_TO_TEXT",
    "model": "medical_dictation",
    "promptObject": {
      "audioUrl": "audios/2025_10_21_08_22_58_741_medical_dictation.m4a",
      "language": "en-US"
    }
  }'
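
The same request can be sketched in Python using only the standard library. The names `make_request` and `transcribe` are hypothetical, the 60-second timeout is an arbitrary choice, and because this page does not document the response schema, the parsed JSON is returned unchanged.

```python
import json
import urllib.request

API_URL = "https://api.1min.ai/api/features"


def make_request(api_key: str, audio_url: str, language: str = "en-US") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({
        "type": "SPEECH_TO_TEXT",
        "model": "medical_dictation",
        "promptObject": {"audioUrl": audio_url, "language": language},
    }).encode("utf-8")
    # urllib normalizes header capitalization; HTTP header names are
    # case-insensitive, so "API-KEY" is still matched by the server.
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def transcribe(api_key: str, audio_url: str, language: str = "en-US") -> dict:
    """Send the request and return the parsed JSON response as-is."""
    req = make_request(api_key, audio_url, language)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Splitting request construction from sending makes it possible to inspect the URL, headers, and body without network access.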

Use Cases

  • Clinical Notes: Transcribe physician dictation into structured clinical documentation
  • Patient Records: Convert dictated patient information and medical histories
  • Discharge Summaries: Create comprehensive discharge documentation from dictation
  • Operative Reports: Transcribe surgical procedure descriptions and outcomes
  • Diagnostic Reports: Convert radiology, pathology, and laboratory result dictation
  • Progress Notes: Document patient care updates and treatment plans
  • Prescription Dictation: Accurately transcribe medication orders and instructions
  • Medical Letters: Create referral letters and communication between healthcare providers

Tips for Best Results

  1. Upload First: Use the Asset API to upload your audio file before transcription
  2. Clear Dictation: Speak clearly and at a moderate pace for best results
  3. Medical Terminology: The model is optimized for medical terminology and abbreviations
  4. Structured Format: Works best with organized dictation following clinical documentation patterns
  5. Language Selection: Choose the correct language for accurate medical terminology recognition
  6. Audio Quality: Ensure clear, noise-free audio for optimal transcription accuracy
  7. Consistent Speaker: Single speaker dictation produces the most accurate results

Error Handling

Common error scenarios and solutions:

  • File not found: Ensure the audio file was uploaded via Asset API first
  • Invalid audioUrl: Verify the path matches exactly what was returned from Asset API upload
  • Language not supported: Check that the language code is in the supported list
  • Poor audio quality: Medical dictation requires clear audio for accurate transcription
  • Multiple speakers: Model is optimized for single speaker dictation scenarios

Response

The API returns a JSON response with the transcribed text from the medical dictation, formatted for clinical documentation with proper medical terminology recognition.