How to Use the AudioToTextAI Transcription API

The AudioToTextAI API lets developers integrate speech-to-text transcription directly into their applications. This guide covers authentication, key endpoints, and practical examples to get you started.

Getting Your API Key

To use the API, you need an API key. Generate one from your AudioToTextAI account dashboard under Settings > API Keys. Keep your API key secure and never expose it in client-side code.

Authentication

Include your API key in the Authorization header of every request:

Authorization: Bearer YOUR_API_KEY
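
As a minimal sketch of the header above in Python, the helper below builds the Authorization header from a key read out of the environment. The environment variable name AUDIOTOTEXTAI_API_KEY and the function name auth_headers are illustrative assumptions, not part of the API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header in the Bearer format shown above."""
    if not api_key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {api_key}"}


# Hypothetical variable name; reading the key from the environment keeps it
# out of source control and out of client-side code.
api_key = os.environ.get("AUDIOTOTEXTAI_API_KEY", "YOUR_API_KEY")
headers = auth_headers(api_key)
```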

Transcription Endpoints

Submit a File for Transcription

POST /api/v1/transcribe/
Content-Type: multipart/form-data

Parameters:
- file: Audio/video file (required)
- language: Language code, e.g., "en" (optional; auto-detected if omitted)
- diarize: true/false (optional, default false)
- timestamps: true/false (optional, default true)
- summary: true/false (optional, default false)

Submit a URL for Transcription

POST /api/v1/transcribe/url/
Content-Type: application/json

{
    "url": "https://example.com/audio.mp3",
    "language": "en",
    "diarize": true,
    "timestamps": true
}

Get Transcription Results

GET /api/v1/transcribe/<uuid>/
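
Fetching results might look like the sketch below. BASE_URL is again a placeholder assumption and the function names are illustrative; the UUID is percent-encoded before it is placed in the path.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder, not the real host


def build_result_request(api_key: str, transcription_uuid: str) -> urllib.request.Request:
    """Build the GET request for one transcription's results."""
    # Percent-encode the UUID so unexpected characters cannot alter the path.
    safe_uuid = urllib.parse.quote(transcription_uuid, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/transcribe/{safe_uuid}/",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_result(api_key: str, transcription_uuid: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_result_request(api_key, transcription_uuid)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```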

List Supported Languages

GET /api/v1/languages/

Response Format

A completed transcription response has the following shape:

{
    "uuid": "abc123-...",
    "status": "completed",
    "transcript_text": "Full transcript text here...",
    "segments": [
        {
            "start": 0.0,
            "end": 2.5,
            "text": "Hello everyone.",
            "speaker": "Speaker 1"
        }
    ],
    "summary": "Meeting summary here...",
    "language": "en",
    "duration": 3600.0
}
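
To illustrate working with this structure, the sketch below turns the segments array into a readable, timestamped transcript. It assumes only the field names shown in the example response and treats speaker as optional; the Segment class and helper names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    end: float
    text: str
    speaker: Optional[str] = None  # treated as optional in this sketch


def parse_segments(response: dict) -> list[Segment]:
    """Convert the raw 'segments' array into Segment objects."""
    return [
        Segment(float(s["start"]), float(s["end"]), s["text"], s.get("speaker"))
        for s in response.get("segments", [])
    ]


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS, dropping the fractional part."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(response: dict) -> str:
    """Produce one line per segment: [HH:MM:SS] Speaker: text."""
    lines = []
    for segment in parse_segments(response):
        prefix = f"[{format_timestamp(segment.start)}]"
        if segment.speaker:
            prefix += f" {segment.speaker}:"
        lines.append(f"{prefix} {segment.text}")
    return "\n".join(lines)
```

For the example response above, render_transcript returns the single line "[00:00:00] Speaker 1: Hello everyone."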

Handling Asynchronous Processing

Transcription is asynchronous. When you submit a file, the API returns a UUID immediately. Poll the results endpoint with that UUID and check the status field, which takes one of the following values:

  • pending: File received, waiting to be processed
  • processing: Transcription in progress
  • completed: Results are ready
  • failed: An error occurred
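
The polling loop implied by these statuses can be sketched as follows. The fetch argument is any zero-argument callable that returns the decoded results response; the interval and timeout defaults are arbitrary assumptions, not documented limits.

```python
import time
from typing import Callable


def wait_for_result(fetch: Callable[[], dict], interval: float = 5.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the status is 'completed', raising on 'failed' or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError("transcription failed")
        # 'pending' and 'processing' both mean: wait and ask again.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)
```

The sleep parameter is injectable so the loop can be exercised in tests without actually waiting.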

For production workflows, use webhooks instead of polling. Configure a callback URL in your account settings to receive a POST request when transcription is complete.

Error Handling

The API returns standard HTTP status codes:

  • 200: Success
  • 400: Bad request (invalid parameters)
  • 401: Unauthorized (invalid API key)
  • 402: Insufficient credits
  • 404: Transcription not found
  • 429: Rate limited
  • 500: Server error
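
One way to act on these codes in client code is sketched below. The ApiError class and helper names are illustrative, and treating 429 and 500 as the retryable codes is a common convention assumed here rather than something the API specifies.

```python
# Descriptions taken from the status code list above.
STATUS_MESSAGES = {
    400: "Bad request (invalid parameters)",
    401: "Unauthorized (invalid API key)",
    402: "Insufficient credits",
    404: "Transcription not found",
    429: "Rate limited",
    500: "Server error",
}

# Assumption: only rate limiting and server errors are worth retrying.
RETRYABLE_STATUSES = {429, 500}


class ApiError(Exception):
    """Raised for any non-200 response; carries the HTTP status code."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status


def check_status(status: int) -> None:
    """Return quietly on 200 and raise ApiError for anything else."""
    if status == 200:
        return
    raise ApiError(status, STATUS_MESSAGES.get(status, "Unexpected status"))


def is_retryable(status: int) -> bool:
    """Tell the caller whether repeating the request could help."""
    return status in RETRYABLE_STATUSES
```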

Best Practices

  • Use webhooks for production workloads instead of polling
  • Implement exponential backoff for retries
  • Store the UUID returned from submission to retrieve results later
  • Validate file formats before uploading to avoid unnecessary API calls
  • Monitor your credit balance and enable auto top-up for uninterrupted service
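
The exponential backoff recommended above can be sketched as follows. The base delay, cap, and retry count are arbitrary assumptions, and the randomized "full jitter" variant is one common choice among several.

```python
import random
import time
from typing import Callable, Iterator


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 60.0,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield one delay per retry: a random share of min(cap, base * 2**attempt)."""
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        yield ceiling * rng()  # full jitter spreads out simultaneous clients


def call_with_retries(call: Callable[[], object],
                      should_retry: Callable[[Exception], bool],
                      max_retries: int = 5, base: float = 1.0, cap: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> object:
    """Run call(), retrying with backoff while should_retry(exc) is true."""
    delays = backoff_delays(max_retries, base, cap)
    while True:
        try:
            return call()
        except Exception as exc:
            delay = next(delays, None)
            # Give up once retries are exhausted or the error is not retryable.
            if delay is None or not should_retry(exc):
                raise
            sleep(delay)
```

A caller would typically pass a should_retry function that accepts only rate-limit and server errors, so that a bad request or an invalid key fails immediately instead of being retried.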

For complete API documentation and additional code examples, visit the API documentation page.

