Claude committed
Commit 4c2ca9c · unverified · 1 Parent(s): 6736e17

docs: comprehensive English README
Complete rewrite of README.md in English covering:
- Project overview and IIIF-native architecture
- Features list and architecture diagram
- Quick start (Docker + local dev)
- Usage workflow (corpus → ingest → AI → review → export)
- Corpus profiles table with layers
- AI providers configuration
- Full API reference (28 endpoints)
- Data model (PageMaster structure)
- IIIF-native vs file upload modes explained
- Project structure tree
- Testing, deployment (HuggingFace + self-hosted)

https://claude.ai/code/session_01UB4he7RdRPHLvNjky4X8Sw

Files changed (1)
  1. README.md +255 -63
README.md CHANGED
@@ -10,107 +10,299 @@ pinned: false
  # IIIF Studio

- Generic platform for generating augmented scholarly editions of digitized
- heritage documents: medieval manuscripts, incunabula, cartularies, archives,
- charters, papyri. Any document type, any era, any language.

  ---
- ## Repository structure

  ```
- iiif-studio/
- ├── backend/            # FastAPI API + Python pipeline
- │   ├── app/
- │   │   ├── api/v1/     # REST endpoints (/api/v1/...)
- │   │   ├── models/     # SQLAlchemy tables (async SQLite)
- │   │   ├── schemas/    # Pydantic v2 models
- │   │   └── services/   # ingest / image / ai / export / search
- │   ├── tests/          # pytest suite (563 tests)
- │   └── pyproject.toml
- ├── frontend/           # React + TypeScript + Vite (retro design)
- ├── profiles/           # 4 corpus profile JSON files
- ├── prompts/            # prompt templates per profile
- ├── infra/              # docker-compose (local dev)
- ├── Dockerfile          # multi-stage image (frontend + backend)
- └── data/               # runtime artifacts (not versioned)
  ```

  ---
- ## Running locally (Docker)

  ```bash
- # 1. Clone the repository
- git clone https://github.com/<org>/iiif-studio && cd iiif-studio

- # 2. Set the environment variables
- cp .env.example .env   # then fill in the keys in .env

- # 3. Start the service
  docker compose -f infra/docker-compose.yml up --build

- # 4. Verify
- curl http://localhost:7860/api/v1/profiles
  ```

- The API is available at `http://localhost:7860`. Interactive Swagger
- documentation is available at `http://localhost:7860/docs`.
-
- ---
-
- ## Running the tests

  ```bash
  cd backend
  pip install -e ".[dev]"
- pytest tests/ -v --cov=app
  ```

- Expected result: **563 passed, 3 skipped**.

  ---
- ## Available profiles

- | Profile | Description |
- |---------|-------------|
- | `medieval-illuminated` | Illuminated medieval manuscripts (diplomatic OCR, iconography, commentary) |
- | `medieval-textual` | Textual medieval manuscripts (OCR, translation, scholarly commentary) |
- | `early-modern-print` | Early printed books (incunabula, 16th–18th-century books) |
- | `modern-handwritten` | Modern handwritten documents (cursive, archives, charters) |

- ```bash
- # List the profiles via the API
- curl http://localhost:7860/api/v1/profiles
  ```

  ---
- ## AI providers

- The backend automatically detects which providers are available from the
- environment variables that are set. There is no global `AI_PROVIDER`
- selector; the model is chosen per corpus from the admin interface.

- | Provider | Environment variable |
- |----------|----------------------|
- | Google AI Studio | `GOOGLE_AI_STUDIO_API_KEY` |
- | Vertex AI (API key) | `VERTEX_API_KEY` |
- | Vertex AI (service account) | `VERTEX_SERVICE_ACCOUNT_JSON` |
- | Mistral AI | `MISTRAL_API_KEY` |

- At least **one** key is required for the pipeline to work.

- Keys must **never** appear in the code, in commits, or in the Docker image.
- On HuggingFace Spaces, set them under **Settings → Repository secrets**.

  ---
- ## HuggingFace Spaces deployment

- This repository is configured for HuggingFace Spaces (Docker SDK, port 7860).
- Processing artifacts (images, master JSON files, XML exports) are stored on
- HuggingFace Datasets, not in the Docker image.

- See `.huggingface/README.md` for the Space-specific configuration.
  # IIIF Studio

+ A generic platform for generating AI-augmented scholarly editions from digitized heritage documents — medieval manuscripts, incunabula, cartularies, archives, charters, papyri. Any document type, any era, any language.
+
+ IIIF Studio ingests images from any [IIIF](https://iiif.io/)-compliant server, analyzes them with multimodal AI (Google Gemini, Mistral), and produces structured scholarly data: diplomatic OCR, layout detection, translations, commentaries, and iconographic analysis — all exportable as ALTO XML, METS, and IIIF Presentation 3.0 manifests.
+
+ **Images are never stored locally.** The platform streams them from origin servers using the IIIF Image API, storing only the AI-generated metadata (~5 KB per page instead of ~50 MB).

  ---
+ ## Features
+
+ - **IIIF-native architecture** — images streamed from origin servers (Gallica, BnF, Bodleian, etc.) with tiled deep zoom via OpenSeadragon
+ - **Multi-provider AI** — Google AI Studio, Vertex AI, Mistral AI. Model selected per corpus, auto-detected from environment
+ - **Profile-driven analysis** — 4 built-in corpus profiles (medieval illuminated, medieval textual, early modern print, modern handwritten), each with tailored prompts and active layers
+ - **Structured output** — layout regions with bounding boxes, diplomatic OCR, translations (FR/EN), scholarly and public commentary, iconographic analysis, uncertainty tracking
+ - **Standards-compliant export** — IIIF Presentation 3.0 manifests (with Image Service for tiled zoom), ALTO XML, METS XML, ZIP bundles
+ - **Human-in-the-loop** — editorial correction interface with versioned history and rollback
+ - **Full-text search** — accent-insensitive search across OCR text, translations, and iconographic tags
+
31
+ ---
32
+
33
+ ## Architecture
34
 
35
  ```
36
+ ┌──────────────────────────────┐
37
+ │ IIIF Image Servers │ Gallica, BnF, Bodleian, ...
38
+ (origin — images stay here)│
39
+ ──────────────┬───────────────┘
40
+ IIIF Image API
41
+ ──────────┼──────────┐
42
+
43
+ ───▼───┐ ┌───▼───┐ ┌───▼───┐
44
+ Backend│ │Viewer │ │Tiled │
45
+ │ (AI) │display│ zoom │
46
+ bytes │ │ │ │
47
+ │in RAM │ │ │ │
48
+ ───┬───┘ └───────┘ └───────┘
49
+
50
+ ───▼─────────────────────┐
51
+ │ Local storage │ JSON only (~5 KB/page)
52
+ │ master.json + ai_raw.json│ No images on disk
53
+ │ + SQLite metadata │
54
+ └──────────────────────────┘
55
  ```

+ ### Tech stack
+
+ | Layer | Technology |
+ |-------|------------|
+ | Backend | Python 3.11+, FastAPI, Uvicorn |
+ | Database | SQLite via SQLAlchemy 2.0 async + aiosqlite |
+ | Validation | Pydantic v2 |
+ | AI providers | Google Gemini (google-genai SDK), Mistral AI |
+ | Image viewer | OpenSeadragon (IIIF tiled zoom) |
+ | Frontend | React 18, TypeScript, Vite, Tailwind CSS, React Router |
+ | Exports | lxml (ALTO/METS XML), IIIF Presentation 3.0 |
+ | Deployment | Docker (HuggingFace Spaces) |
+
  ---
 
72
+ ## Quick start
73
+
74
+ ### Docker (recommended)
75
 
76
  ```bash
77
+ git clone https://github.com/maribakulj/IIIF-Studio.git && cd IIIF-Studio
 
78
 
79
+ # Configure at least one AI provider key
80
+ cp .env.example .env
81
+ # Edit .env and add your API key(s)
82
 
83
+ # Build and run
84
  docker compose -f infra/docker-compose.yml up --build
85
 
86
+ # Open http://localhost:7860
 
87
  ```
88
 
89
+ ### Local development
 
 
 
 
 
90
 
91
  ```bash
92
+ # Backend
93
  cd backend
94
  pip install -e ".[dev]"
95
+ uvicorn app.main:app --reload --port 7860
96
+
97
+ # Frontend (separate terminal)
98
+ cd frontend
99
+ npm install
100
+ npm run dev
101
  ```
102
 
103
+ The API is available at `http://localhost:7860/api/v1/`. Interactive Swagger docs at `http://localhost:7860/docs`.
104
 
105
  ---

+ ## Usage workflow

+ 1. **Create a corpus** — select a profile matching your document type
+ 2. **Ingest pages** — provide a IIIF manifest URL, direct image URLs, or upload files
+ 3. **Select an AI model** — choose a provider and model from the detected options
+ 4. **Run the pipeline** — the AI analyzes each page: layout detection, OCR, translation, commentary
+ 5. **Review and correct** — use the Editor to validate, correct OCR, and adjust regions
+ 6. **Export** — download a IIIF manifest, ALTO XML, METS XML, or a ZIP bundle
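
The steps above can be sketched as API calls. This is a hedged illustration: the endpoint paths follow the API reference in this README, but the corpus id `1` and the exact JSON payload shapes are placeholders. Requests are composed, not sent.

```python
import json
import urllib.request

BASE = "http://localhost:7860/api/v1"

def build_request(method, path, payload=None):
    """Compose a request against the IIIF Studio API (built, not sent here)."""
    data = json.dumps(payload).encode() if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE + path, data=data, headers=headers, method=method)

# 1. Create a corpus (field names from the API reference: slug + title + profile)
create = build_request("POST", "/corpora", {
    "slug": "demo", "title": "Demo corpus", "profile": "medieval-textual"})

# 2. Ingest pages from a IIIF manifest (the "url" field is an assumption)
ingest = build_request("POST", "/corpora/1/ingest/iiif-manifest", {
    "url": "https://example.org/iiif/manifest.json"})

# 3-4. Run the AI pipeline on every page of the corpus
run = build_request("POST", "/corpora/1/run")
```

Review (step 5) happens in the Editor UI; exports (step 6) are plain `GET` requests against the export endpoints listed below.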

+ ---
+
+ ## Corpus profiles
+
+ Profiles control which analysis layers are active, which prompt templates are used, and what uncertainty thresholds apply.
+
+ | Profile | Script | Languages | Key layers |
+ |---------|--------|-----------|------------|
+ | `medieval-illuminated` | Caroline | Latin, French | OCR, translation, iconography, commentary, material notes |
+ | `medieval-textual` | Gothic | Latin, French | OCR, translation, scholarly commentary |
+ | `early-modern-print` | Print | French, Latin | OCR, summary |
+ | `modern-handwritten` | Cursive | French | OCR, summary |
+
+ Custom profiles can be added as JSON files in the `profiles/` directory with matching prompt templates in `prompts/`.
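
A custom profile could look roughly like the following. This is a hypothetical sketch only: the field names (`id`, `script`, `languages`, `layers`, `prompts`, `uncertainty_threshold`) are assumptions based on the attributes the table above describes, not the actual schema — check the built-in files in `profiles/` for the real structure.

```json
{
  "id": "my-custom-profile",
  "script": "Gothic",
  "languages": ["Latin", "French"],
  "layers": ["ocr", "translation", "commentary"],
  "prompts": "prompts/my-custom-profile/",
  "uncertainty_threshold": 0.7
}
```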

+ ---
+
+ ## AI providers
+
+ The backend auto-detects available providers from environment variables. No global selector — the model is chosen per corpus from the admin interface.
+
+ | Provider | Environment variable | Notes |
+ |----------|----------------------|-------|
+ | Google AI Studio | `GOOGLE_AI_STUDIO_API_KEY` | Free tier, good for development |
+ | Vertex AI (API key) | `VERTEX_API_KEY` | Production, pay-per-use |
+ | Vertex AI (service account) | `VERTEX_SERVICE_ACCOUNT_JSON` | Institutional deployments |
+ | Mistral AI | `MISTRAL_API_KEY` | Alternative provider |
+
+ At least **one** key is required for the pipeline to function. Keys must **never** appear in code, commits, or Docker images.
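
A minimal sketch of how detection from environment variables can work (illustrative, not the backend's actual code — the variable names come from the table above, the provider identifiers are made up):

```python
import os

# Credential variable per provider, as documented in the table above.
PROVIDER_ENV_VARS = {
    "google-ai-studio": "GOOGLE_AI_STUDIO_API_KEY",
    "vertex-api-key": "VERTEX_API_KEY",
    "vertex-service-account": "VERTEX_SERVICE_ACCOUNT_JSON",
    "mistral": "MISTRAL_API_KEY",
}

def detect_providers(env=os.environ):
    """Return the providers whose credential variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]
```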

+ ---
+
+ ## API reference
+
+ All endpoints are prefixed with `/api/v1/`. Full OpenAPI docs available at `/docs`.
+
+ ### Corpus management
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | `GET` | `/corpora` | List all corpora |
+ | `POST` | `/corpora` | Create a corpus (slug + title + profile) |
+ | `GET` | `/corpora/{id}` | Get a corpus |
+ | `DELETE` | `/corpora/{id}` | Delete a corpus (cascades) |
+ | `GET` | `/corpora/{id}/manuscripts` | List manuscripts in a corpus |
+
+ ### Ingestion
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | `POST` | `/corpora/{id}/ingest/iiif-manifest` | Ingest from a IIIF manifest URL |
+ | `POST` | `/corpora/{id}/ingest/iiif-images` | Ingest from direct image URLs |
+ | `POST` | `/corpora/{id}/ingest/files` | Upload image files |
+
+ ### AI pipeline
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | `GET` | `/providers` | List detected AI providers |
+ | `GET` | `/providers/{type}/models` | List models for a provider |
+ | `PUT` | `/corpora/{id}/model` | Set the AI model for a corpus |
+ | `POST` | `/corpora/{id}/run` | Run the pipeline on all pages |
+ | `POST` | `/pages/{id}/run` | Run the pipeline on a single page |
+ | `GET` | `/jobs/{id}` | Check job status |
+ | `POST` | `/jobs/{id}/retry` | Retry a failed job |
+
+ ### Pages and content
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | `GET` | `/pages/{id}` | Page metadata |
+ | `GET` | `/pages/{id}/master-json` | Full page master (canonical JSON) |
+ | `GET` | `/pages/{id}/layers` | List annotation layers |
+ | `POST` | `/pages/{id}/corrections` | Apply editorial corrections |
+ | `GET` | `/pages/{id}/history` | Version history |
+ | `GET` | `/search?q=` | Full-text search across all pages |
+
+ ### Export
+ | Method | Endpoint | Description |
+ |--------|----------|-------------|
+ | `GET` | `/manuscripts/{id}/iiif-manifest` | IIIF Presentation 3.0 manifest |
+ | `GET` | `/manuscripts/{id}/mets` | METS XML |
+ | `GET` | `/pages/{id}/alto` | ALTO XML |
+ | `GET` | `/manuscripts/{id}/export.zip` | ZIP bundle (manifest + METS + ALTO) |
+
+ ---
+
+ ## Data model
+
+ Each analyzed page produces a `master.json` — the canonical source of truth for all exports.
+
+ ```
+ PageMaster
+ ├── image       → IIIF service URL, canvas dimensions, provenance
+ ├── layout      → regions with bounding boxes [x, y, w, h] in absolute pixels
+ ├── ocr         → diplomatic text, confidence, uncertain segments
+ ├── translation → French, English
+ ├── summary     → short + detailed
+ ├── commentary  → public, scholarly, sourced claims with certainty levels
+ ├── extensions  → profile-specific data (iconography, materiality, etc.)
+ ├── processing  → provider, model, prompt version, timestamp
+ └── editorial   → status (machine_draft → validated → published), version
  ```

+ Bounding boxes follow the convention `[x, y, width, height]` in absolute pixels of the original image. Coordinates are automatically scaled from AI analysis space to full canvas dimensions.
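
The scaling step can be sketched as follows (an illustration of the convention, not the platform's actual implementation): a box detected on the analysis derivative is mapped back to absolute pixels of the original canvas.

```python
def scale_bbox(bbox, analysis_size, canvas_size):
    """Scale an [x, y, w, h] box from analysis space to full canvas pixels."""
    ax, ay = analysis_size          # (width, height) of the AI analysis image
    cx, cy = canvas_size            # (width, height) of the original canvas
    sx, sy = cx / ax, cy / ay       # per-axis scale factors
    x, y, w, h = bbox
    return [round(x * sx), round(y * sy), round(w * sx), round(h * sy)]
```

For example, a box found on a 1500×1000 derivative of a 3000×2000 canvas is simply doubled on both axes.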
217
+
218
  ---
219
 
220
+ ## IIIF-native image handling
221
+
222
+ IIIF Studio operates in two modes:
223
 
224
+ ### IIIF-native mode (default for manifest/URL ingestion)
225
+ - Images are **never downloaded or stored** locally
226
+ - At ingestion: IIIF Image Service URL and canvas dimensions are extracted from the manifest
227
+ - At analysis: a 1500px derivative is fetched in memory via the IIIF Image API (`{service}/full/!1500,1500/0/default.jpg`), sent to the AI, then discarded
228
+ - In the viewer: OpenSeadragon loads `info.json` from the IIIF server for native tiled deep zoom
229
+ - Storage per page: **~5 KB** (JSON metadata only)
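
The derivative URL pattern quoted above, assembled explicitly (a sketch; the helper name is ours, not the codebase's). The `!1500,1500` size keeps the image within a 1500px box while preserving aspect ratio, per the IIIF Image API confined-size syntax:

```python
def derivative_url(service, box=1500):
    """Build the IIIF Image API URL for an in-memory analysis derivative."""
    # {service}/{region}/{size}/{rotation}/{quality}.{format}
    return f"{service.rstrip('/')}/full/!{box},{box}/0/default.jpg"
```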
+
+ ### File upload mode (for non-IIIF sources)
+ - Uploaded images are stored locally in `data/corpora/{slug}/`
+ - Derivatives (1500px) and thumbnails (256px) are created on disk
+ - Storage per page: **~50 MB** (images + JSON)
+
+ ---
237
 
238
+ ## Project structure
239
+
240
+ ```
241
+ IIIF-Studio/
242
+ ├── backend/
243
+ │ ├── app/
244
+ │ │ ├── main.py # FastAPI entry point
245
+ │ │ ├── config.py # Pydantic settings from env vars
246
+ │ │ ├── api/v1/ # REST endpoints
247
+ │ │ ├── models/ # SQLAlchemy ORM models
248
+ │ │ ├── schemas/ # Pydantic v2 schemas (canonical)
249
+ │ │ └── services/
250
+ │ │ ├── ai/ # Provider factory, analyzer, prompt loader
251
+ │ │ ├── ingest/ # IIIF fetcher, service detection
252
+ │ │ ├── image/ # Normalizer (in-memory + legacy disk)
253
+ │ │ └── export/ # ALTO, METS, IIIF manifest generators
254
+ │ ├── tests/ # 585 tests (pytest + pytest-asyncio)
255
+ │ └── pyproject.toml
256
+ ├── frontend/
257
+ │ ├── src/
258
+ │ │ ├── App.tsx # React Router (/, /admin, /reader, /editor)
259
+ │ │ ├── lib/api.ts # Typed API client
260
+ │ │ ├── pages/ # Home, Reader, Editor, Admin
261
+ │ │ └── components/ # Viewer (OpenSeadragon), retro UI system
262
+ │ └── package.json
263
+ ├── profiles/ # 4 corpus profile JSON files
264
+ ├── prompts/ # 9 prompt templates organized by profile
265
+ ├── Dockerfile # Multi-stage build (Node + Python)
266
+ ├── infra/docker-compose.yml # Local development
267
+ └── .env.example # Environment variable template
268
+ ```

  ---

+ ## Testing
+
+ ```bash
+ cd backend
+ pip install -e ".[dev]"
+ pytest tests/ -v --cov=app
+ ```
+
+ Expected result: **585 passed, 3 skipped**.
+
+ All AI calls are mocked in tests — no API keys required to run the test suite.
+
+ ---
285
+
286
+ ## Deployment
287
+
288
+ ### HuggingFace Spaces
289
+
290
+ This repository is configured for [HuggingFace Spaces](https://huggingface.co/spaces) with Docker SDK on port 7860. AI keys are stored as Space secrets (Settings → Repository secrets).
291
+
292
+ The CI pipeline (`.github/workflows/`) runs tests on every push and auto-deploys to HuggingFace Spaces on merge to `main`.
293
+
294
+ ### Self-hosted
295
+
296
+ ```bash
297
+ docker build -t iiif-studio .
298
+ docker run -p 7860:7860 \
299
+ -e GOOGLE_AI_STUDIO_API_KEY=your_key \
300
+ -v ./data:/app/data \
301
+ iiif-studio
302
+ ```
303
+
304
+ ---
305
 
306
+ ## License
 
 
307
 
308
+ [Apache License 2.0](LICENSE)