ArchiveAI — API Reference

Authentication

API token

All API requests must include the token:

X-API-Token: <token>

A missing or incorrect token returns 401 Unauthorized. This applies to every endpoint in this document except POST /api/drain, which authenticates with the space's HuggingFace token instead (see Drain mode).

Gemini API key

The Space has a Gemini API key configured server-side, and that key is used automatically for translation, post-edit, and summary. You do not need to pass gemini_api_key unless you want to override the server-side key for a particular job.

Submit a job

POST https://garchenarchive-archiveai.hf.space/api/jobs
Content-Type: multipart/form-data

Returns immediately with a job_id. Processing runs in the background.

Parameters

Parameter	Type	Default	Description
`file`	file	—	Audio, video, `.srt`, `.txt`, or `.json` (see Supported file types)
`do_stt`	boolean	`true`	Run Speech-to-Text
`do_translation`	boolean	`false`	Run Translation
`do_tts`	boolean	`false`	Run Text-to-Speech
`do_summary`	boolean	`false`	Generate a summary
`language`	string	`"Both"`	STT language — `"English"`, `"Tibetan"`, `"Tibetan (Base)"`, or `"Both"`
`selected_speakers`	JSON string	`"[]"`	Speaker names to keep, e.g. `'["Rinpoche"]'`. Empty = all speakers
`speaker_threshold`	float	`0.5`	Speaker similarity threshold, 0.0–1.0
`use_gemini_post_edit`	boolean	`false`	Correct STT output via Gemini
`gemini_model`	string	`"gemini-2.5-flash"`	See Models
`min_clip_duration`	int	`3`	Minimum segment length in seconds
`max_clip_duration`	int	`30`	Maximum segment length in seconds
`target_language`	string	`"English"`	Translation target language
`gemini_api_key`	string	`""`	Optional — override the server-side Gemini API key
`voice_label`	string	`"Female: Sarah"`	TTS voice (see Voices)
`prose_speed`	float	`1.0`	TTS speaking speed for prose, 0.5–1.0
`mantra_speed`	float	`0.75`	TTS speaking speed for mantras, 0.5–1.0
`webhook_url`	string	`""`	HTTPS URL to POST results to on completion or failure. Must use `https://` and resolve to a public IP (private/loopback addresses are rejected with 422)
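
Because the request is multipart/form-data, every parameter travels as a string. The sketch below shows one way to turn Python values into form fields and to check the documented numeric ranges before submitting. The helper names `to_form_fields` and `check_ranges` are hypothetical, and the lowercase `true`/`false` spelling is taken from the curl example rather than from the server code.

```python
import json

# Ranges copied from the parameter table above.
RANGES = {
    "speaker_threshold": (0.0, 1.0),
    "prose_speed": (0.5, 1.0),
    "mantra_speed": (0.5, 1.0),
}


def check_ranges(options: dict) -> None:
    """Raise ValueError if a numeric option falls outside its documented range."""
    for name, (low, high) in RANGES.items():
        if name in options and not low <= options[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")


def to_form_fields(options: dict) -> dict:
    """Convert Python values into the string form fields sent with the request."""
    fields = {}
    for name, value in options.items():
        if isinstance(value, bool):
            # Booleans are sent as "true"/"false", as in the curl example.
            fields[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # selected_speakers is documented as a JSON string.
            fields[name] = json.dumps(list(value))
        else:
            fields[name] = str(value)
    return fields
```

For example, `to_form_fields({"do_stt": True, "selected_speakers": ["Rinpoche"]})` yields the strings `true` and `["Rinpoche"]`.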

Response — 202 Accepted

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

curl example

curl -X POST https://garchenarchive-archiveai.hf.space/api/jobs \
  -H "X-API-Token: <token>" \
  -F "file=@/path/to/audio.mp3" \
  -F "do_stt=true" \
  -F "do_translation=true" \
  -F "target_language=English" \
  -F "webhook_url=https://yourserver.com/webhooks/archiveai"
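
The same submission can be sketched in Python using only the standard library. The helpers `encode_multipart`, `build_submit_request`, and `submit` are hypothetical names, and sending the file part as `application/octet-stream` is an assumption; the server may or may not inspect the part's content type.

```python
import json
import urllib.request
import uuid

API_URL = "https://garchenarchive-archiveai.hf.space/api/jobs"


def encode_multipart(fields: dict, filename: str, file_bytes: bytes):
    """Build a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode() + str(value).encode() + b"\r\n")
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumption, see lead-in
    )
    parts.append(file_header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_submit_request(token: str, filename: str, file_bytes: bytes, fields: dict):
    """Prepare (but do not send) the POST /api/jobs request."""
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"X-API-Token": token, "Content-Type": content_type},
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed body, e.g. job_id and status."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the body and headers without touching the network.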

Poll job status

GET https://garchenarchive-archiveai.hf.space/api/jobs/{job_id}

Response — queued or running

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "running",
  "source": "api",
  "created_at": 1713312000.0,
  "updated_at": 1713312015.3,
  "has_webhook": true,
  "params": {
    "do_stt": true,
    "do_translation": true,
    "do_tts": false,
    "do_summary": false,
    "language": "Both",
    "target_language": "English",
    "gemini_api_key": "***",
    "..."  : "..."
  }
}

Response — done

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "done",
  "source": "api",
  "created_at": 1713312000.0,
  "updated_at": 1713312087.6,
  "has_webhook": true,
  "params": { "...": "..." },
  "result": {
    "segments": [
      {
        "source": "SPEAKER_00: Original transcribed text.",
        "target": "Translated text.",
        "timestamp": "00:00:01,000 --> 00:00:04,500"
      }
    ],
    "summary": "Summary text, or null if not requested.",
    "srt_content": "1\n00:00:01,000 --> 00:00:04,500\nOriginal transcribed text.\n\n",
    "audio_wav_base64": "<base64-encoded WAV, or null if TTS not requested>",
    "audio_sample_rate": 24000
  }
}
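
A client will typically want the audio as raw bytes rather than base64. The following is a minimal sketch of unpacking a done response; `unpack_result` is a hypothetical helper, and it relies only on the field names shown in the example above.

```python
import base64


def unpack_result(job: dict) -> dict:
    """Extract and decode the artefacts of a job whose status is "done"."""
    if job.get("status") != "done":
        raise ValueError(f"job status is {job.get('status')!r}, expected 'done'")
    result = job["result"]
    audio_b64 = result.get("audio_wav_base64")
    return {
        "segments": result.get("segments", []),
        "summary": result.get("summary"),      # None if no summary was requested
        "srt": result.get("srt_content"),
        # None if TTS was not requested, otherwise the WAV file contents.
        "wav_bytes": base64.b64decode(audio_b64) if audio_b64 else None,
        "sample_rate": result.get("audio_sample_rate"),
    }
```

The returned `wav_bytes` can be written straight to a `.wav` file opened in binary mode.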

Response — failed

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "source": "api",
  "created_at": 1713312000.0,
  "updated_at": 1713312020.1,
  "has_webhook": false,
  "params": { "...": "..." },
  "error": "STT produced no segments: ..."
}

Response — 404

Returned if the job ID is unknown or has expired (jobs are kept for 1 hour). Treat as a permanent failure and resubmit.
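
Putting the four cases together, a polling loop might look like the sketch below. `fetch` is a hypothetical callable that performs the GET and returns `(status_code, parsed_json)`; the 5-second interval and 1-hour timeout are illustrative assumptions, not values the API prescribes.

```python
import time


class JobExpired(Exception):
    """The server answered 404: the job ID is unknown or has expired."""


def poll_until_finished(fetch, job_id, interval=5.0, timeout=3600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the job is done or failed and return the final job object."""
    deadline = clock() + timeout
    while True:
        status_code, body = fetch(job_id)
        if status_code == 404:
            # Documented as a permanent failure: the caller should resubmit.
            raise JobExpired(job_id)
        if status_code == 401:
            raise PermissionError("missing or incorrect X-API-Token")
        if status_code == 200 and body["status"] in ("done", "failed"):
            return body
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

The caller still has to check whether the returned object has status done (with a result) or failed (with an error).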

Webhook

If webhook_url was provided, the server POSTs the same JSON body as the poll response to that URL when the job reaches done or failed. The Content-Type is application/json. Your endpoint should return any 2xx status.
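
A receiving endpoint can be quite small. The sketch below uses only the standard library; `handle_payload`, `WebhookHandler`, `RECEIVED`, and `serve` are hypothetical names. It speaks plain HTTP, so it assumes a TLS-terminating reverse proxy in front of it to satisfy the https:// requirement, and port 8080 is arbitrary.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

RECEIVED = {}  # job_id -> final job object, kept in memory for illustration


def handle_payload(raw: bytes) -> int:
    """Store a webhook body and return the HTTP status code to answer with."""
    try:
        job = json.loads(raw)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return 400
    if not isinstance(job, dict) or job.get("status") not in ("done", "failed"):
        return 400
    RECEIVED[job.get("job_id")] = job
    return 204  # any 2xx status acknowledges the delivery


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = handle_payload(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```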

The URL is validated at submission time: it must use https:// and its hostname must resolve to a public IP address. Invalid URLs are rejected immediately with 422 Unprocessable Entity.
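
To catch an unusable URL before submitting, a client can run a similar check locally. This is a sketch of the two stated rules only; the server's actual validation code is not shown in this document and may differ. `webhook_url_problem` is a hypothetical helper.

```python
import ipaddress
import socket
from typing import Optional
from urllib.parse import urlsplit


def webhook_url_problem(url: str) -> Optional[str]:
    """Return a description of why the URL would be rejected, or None if it looks fine."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return "must use https://"
    if not parts.hostname:
        return "missing hostname"
    try:
        infos = socket.getaddrinfo(parts.hostname, parts.port or 443,
                                   proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return "hostname does not resolve"
    for info in infos:
        address = ipaddress.ip_address(info[4][0])
        # is_global is False for private, loopback, link-local and reserved ranges.
        if not address.is_global:
            return f"resolves to non-public address {address}"
    return None
```

A local pass does not guarantee acceptance, since DNS may resolve differently from the server's network.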

List all jobs

GET https://garchenarchive-archiveai.hf.space/api/jobs

Returns all jobs in the in-memory store, newest first. Intended for operator monitoring.

Query parameters

Parameter	Description
`status`	Optional. Filter by status: `queued`, `running`, `done`, or `failed`

curl example

# All jobs
curl https://garchenarchive-archiveai.hf.space/api/jobs \
  -H "X-API-Token: <token>"

# Running jobs only
curl "https://garchenarchive-archiveai.hf.space/api/jobs?status=running" \
  -H "X-API-Token: <token>"

Response — 200 OK

Array of job objects. To keep the response lightweight, done jobs carry a result_summary instead of the full result.

[
  {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "done",
    "source": "api",
    "created_at": 1713312000.0,
    "updated_at": 1713312087.6,
    "has_webhook": true,
    "params": {
      "do_stt": true,
      "do_translation": true,
      "do_tts": false,
      "language": "Both",
      "gemini_api_key": "***",
      "...": "..."
    },
    "result_summary": {
      "segment_count": 125,
      "has_audio": false,
      "has_srt": true,
      "summary": null
    }
  },
  {
    "job_id": "661f9511-f30c-52e5-b827-557766551111",
    "status": "running",
    "source": "api",
    "created_at": 1713312100.0,
    "updated_at": 1713312110.0,
    "has_webhook": false,
    "params": { "...": "..." }
  }
]

source is "api" for jobs submitted via the REST API and "ui" for jobs submitted through the Gradio interface (which will have params: null).
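
For monitoring, the list response can be reduced to a few counters. The sketch below builds the filtered URL and summarises a decoded response; `list_jobs_url` and `overview` are hypothetical helpers that rely only on the fields shown above.

```python
from collections import Counter
from urllib.parse import urlencode

JOBS_URL = "https://garchenarchive-archiveai.hf.space/api/jobs"
VALID_STATUSES = ("queued", "running", "done", "failed")


def list_jobs_url(status=None) -> str:
    """Return the list URL, optionally filtered by one of the documented statuses."""
    if status is None:
        return JOBS_URL
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}")
    return f"{JOBS_URL}?{urlencode({'status': status})}"


def overview(jobs: list) -> dict:
    """Count jobs by status and source, and total the segments of done jobs."""
    return {
        "by_status": dict(Counter(job["status"] for job in jobs)),
        "by_source": dict(Counter(job["source"] for job in jobs)),
        # Only done jobs carry a result_summary; others contribute 0.
        "total_segments": sum(
            (job.get("result_summary") or {}).get("segment_count", 0) for job in jobs
        ),
    }
```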

Space status

GET https://garchenarchive-archiveai.hf.space/api/status

Returns the count of currently active (queued or running) jobs across both the API and the Gradio UI, and whether the space is in drain mode.

curl example

curl https://garchenarchive-archiveai.hf.space/api/status \
  -H "X-API-Token: <token>"

Response — 200 OK

{
  "active_jobs": 1,
  "draining": false
}

Drain mode

POST https://garchenarchive-archiveai.hf.space/api/drain

Signals the space to stop accepting new jobs. In-flight jobs continue to completion. Used by push_to_hf.sh before a deployment to avoid interrupting users.

Auth: requires Authorization: Bearer <HF_TOKEN> (the space's HuggingFace token, not the API token). Returns 503 if HF_TOKEN is not configured on the space.

Once set, drain mode persists until the space restarts (i.e. a new deployment clears it automatically).

curl example

curl -X POST https://garchenarchive-archiveai.hf.space/api/drain \
  -H "Authorization: Bearer <HF_TOKEN>"

Response — 200 OK

{
  "draining": true,
  "active_jobs": 2
}

While draining, POST /api/jobs returns 503 with {"error": "draining"} and the Gradio UI shows an error immediately rather than queuing work.
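
A deployment script can combine the drain and status endpoints: request drain mode, then wait for active_jobs to reach zero. The sketch below illustrates that idea and is not a description of what push_to_hf.sh actually does. `post_drain` and `get_status` are hypothetical callables that perform the HTTP calls and return the parsed JSON; the interval and check limit are arbitrary.

```python
import time


def drain_and_wait(post_drain, get_status, interval=10.0, max_checks=60,
                   sleep=time.sleep) -> bool:
    """Enter drain mode and wait for in-flight jobs; True once none remain."""
    first = post_drain()  # e.g. {"draining": true, "active_jobs": 2}
    if not first.get("draining"):
        raise RuntimeError("space did not enter drain mode")
    active = first["active_jobs"]
    checks = 0
    while active > 0:
        if checks >= max_checks:
            return False  # gave up while jobs were still running
        sleep(interval)
        active = get_status()["active_jobs"]
        checks += 1
    return True
```

Returning False instead of raising lets the caller decide whether to deploy anyway or abort.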

Reference

Segment object

{
  "source": "SPEAKER_00: Original text",
  "target": "Translated text (empty string if not yet translated)",
  "timestamp": "00:00:01,000 --> 00:00:05,000"
}

timestamp follows SRT format. May be an empty string for plain-text inputs.
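
When segment times are needed as numbers, the timestamp string can be parsed as in the sketch below. `parse_timestamp` is a hypothetical helper; it returns None for the empty-string case described above.

```python
import re

# One SRT time stamp: HH:MM:SS,mmm
_STAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _STAMP.fullmatch(stamp.strip())
    if not match:
        raise ValueError(f"not an SRT time stamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_timestamp(timestamp: str):
    """Return (start_seconds, end_seconds), or None for an empty timestamp."""
    if not timestamp:
        return None
    start, arrow, end = timestamp.partition("-->")
    if not arrow:
        raise ValueError(f"missing '-->' in {timestamp!r}")
    return _to_seconds(start), _to_seconds(end)
```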

Supported file types

Extension	Handled as
`.mp3`, `.wav`, `.m4a`, `.mp4`, `.mov`, `.mkv`, `.avi`, `.webm`	Audio/video — passed to STT
`.srt`	Subtitle file — parsed into segments
`.txt`	Plain text — each line becomes a segment
`.json`	Segment array — must match the segment schema above
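
A client can use the same table to decide how a file will be handled and to prepare a `.json` input. In the sketch below, `input_kind` and `segments_to_json` are hypothetical helpers. Two assumptions are made that the table does not confirm: that extension matching is case-insensitive, and that every segment must carry all three keys.

```python
import json
from pathlib import Path

MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mov", ".mkv", ".avi", ".webm"}
SEGMENT_KEYS = {"source", "target", "timestamp"}


def input_kind(filename: str) -> str:
    """Return "media", "srt", "txt" or "json" according to the table above."""
    extension = Path(filename).suffix.lower()  # assumption: case-insensitive match
    if extension in MEDIA_EXTENSIONS:
        return "media"
    if extension in (".srt", ".txt", ".json"):
        return extension[1:]
    raise ValueError(f"unsupported file type: {extension or filename!r}")


def segments_to_json(segments: list) -> str:
    """Serialise a segment array for upload as a .json file."""
    for index, segment in enumerate(segments):
        missing = SEGMENT_KEYS - segment.keys()
        if missing:  # assumption: all three keys are required
            raise ValueError(f"segment {index} is missing {sorted(missing)}")
    # ensure_ascii=False keeps Tibetan text readable in the output file.
    return json.dumps(segments, ensure_ascii=False)
```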

Models

Value	Notes
`gemini-2.5-flash`	Default — fast, good quality
`gemini-2.5-pro`	Higher quality, slower
`gemini-3-flash-preview`	Preview
`gemini-3.1-pro-preview`	Preview

If a model fails, the app automatically tries each remaining model in the order listed above before giving up.
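
One reading of this fallback rule is sketched below: the requested model is attempted first, then the others in table order. This is an illustration of the described behaviour, not the app's code; `attempt_order`, `run_with_fallback`, and the `call` argument are hypothetical.

```python
# Model names in the order listed in the table above.
MODELS = [
    "gemini-2.5-flash",
    "gemini-2.5-pro",
    "gemini-3-flash-preview",
    "gemini-3.1-pro-preview",
]


def attempt_order(preferred: str) -> list:
    """Preferred model first, then the remaining models in listed order."""
    if preferred not in MODELS:
        raise ValueError(f"unknown model: {preferred}")
    return [preferred] + [model for model in MODELS if model != preferred]


def run_with_fallback(preferred: str, call):
    """Try call(model) for each model in turn; return (model, result) on first success."""
    errors = {}
    for model in attempt_order(preferred):
        try:
            return model, call(model)
        except Exception as exc:  # any failure moves on to the next model
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")
```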

Voices

`voice_label` value	Description
`Female: Sarah`	Default
`Female: Heart`
`Female: Alice`
`Female: Emma`
`Male: Adam`
`Male: Onyx`
`Male: Daniel`
`Male: George`