aigencydev committed on
Commit 6b1435e · verified · 1 Parent(s): 002bf75

Initial release — AIGENCY V4 model card v1.0

  1. README.md +542 -0
README.md ADDED
@@ -0,0 +1,542 @@
+ ---
+ license: other
+ license_name: aigency-commercial
+ license_link: https://aigency.dev/license
+ language:
+ - tr
+ - en
+ library_name: aigency-api
+ pipeline_tag: text-generation
+ tags:
+ - turkish
+ - multimodal
+ - sovereign
+ - frontier-adjacent
+ - aigency
+ - ecloud
+ - production
+ inference: false
+ extra_gated_heading: AIGENCY V4 is offered via API
+ extra_gated_description: |
+   Model weights are not distributed on HuggingFace. AIGENCY V4 is accessible
+   via the eCloud production API at https://aigency.dev. This page is a
+   reference card describing architecture, evaluation methodology, and
+   benchmark results, and links to a live demo Space.
+ model-index:
+ - name: AIGENCY V4
+   results:
+   - task:
+       type: text-generation
+       name: Code generation
+     dataset:
+       type: openai_humaneval
+       name: HumanEval (pass@1)
+     metrics:
+     - type: pass@1
+       value: 84.15
+       name: pass@1
+       verified: false
+   - task:
+       type: text-generation
+       name: Code generation extended
+     dataset:
+       type: humaneval-plus
+       name: HumanEval+ (pass@1)
+     metrics:
+     - type: pass@1
+       value: 79.88
+       name: pass@1
+       verified: false
+   - task:
+       type: text-generation
+       name: Code generation
+     dataset:
+       type: mbpp
+       name: MBPP (sanitized)
+     metrics:
+     - type: pass@1
+       value: 84.82
+       name: pass@1
+       verified: false
+   - task:
+       type: text-generation
+       name: Code generation extended
+     dataset:
+       type: mbpp-plus
+       name: MBPP+
+     metrics:
+     - type: pass@1
+       value: 78.04
+       name: pass@1
+       verified: false
+   - task:
+       type: text-generation
+       name: Mathematical reasoning
+     dataset:
+       type: gsm8k
+       name: GSM8K
+     metrics:
+     - type: accuracy
+       value: 94.62
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Multitask language understanding
+     dataset:
+       type: cais/mmlu
+       name: MMLU (stratified n=1000)
+     metrics:
+     - type: accuracy
+       value: 80.10
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Multitask language understanding (Pro)
+     dataset:
+       type: TIGER-Lab/MMLU-Pro
+       name: MMLU-Pro (n=1000)
+     metrics:
+     - type: accuracy
+       value: 50.20
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Scientific reasoning
+     dataset:
+       type: ai2_arc
+       name: ARC-Challenge
+     metrics:
+     - type: accuracy
+       value: 94.88
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Graduate-level QA
+     dataset:
+       type: idavidrein/gpqa
+       name: GPQA Diamond
+     metrics:
+     - type: accuracy
+       value: 37.88
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Truthfulness
+     dataset:
+       type: truthful_qa
+       name: TruthfulQA MC1
+     metrics:
+     - type: accuracy
+       value: 76.38
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Instruction following
+     dataset:
+       type: google/IFEval
+       name: IFEval (strict)
+     metrics:
+     - type: accuracy
+       value: 80.22
+       name: strict-prompt-level
+       verified: false
+   - task:
+       type: text-generation
+       name: Commonsense reasoning
+     dataset:
+       type: hellaswag
+       name: HellaSwag (n=1000)
+     metrics:
+     - type: accuracy
+       value: 88.60
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Coreference reasoning
+     dataset:
+       type: winogrande
+       name: WinoGrande XL
+     metrics:
+     - type: accuracy
+       value: 74.66
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Turkish reading comprehension
+     dataset:
+       type: facebook/belebele
+       name: Belebele-TR (Turkish)
+     metrics:
+     - type: accuracy
+       value: 87.33
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Turkish extractive QA
+     dataset:
+       type: tquad
+       name: TQuAD (F1 ≥ 0.5)
+     metrics:
+     - type: f1
+       value: 82.40
+       name: F1 ≥ 0.5
+       verified: false
+   - task:
+       type: text-generation
+       name: Turkish multitask understanding
+     dataset:
+       type: tr-mmlu
+       name: TR-MMLU
+     metrics:
+     - type: accuracy
+       value: 70.80
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Turkish natural-language inference
+     dataset:
+       type: xnli
+       name: XNLI-TR
+     metrics:
+     - type: accuracy
+       value: 73.40
+       name: accuracy
+       verified: false
+   - task:
+       type: text-generation
+       name: Turkish grammar
+     dataset:
+       type: tr-grammar-synthetic
+       name: TR Grammar (synthetic 50/50)
+     metrics:
+     - type: accuracy
+       value: 79.00
+       name: accuracy
+       verified: false
+   - task:
+       type: image-text-to-text
+       name: Multimodal QA
+     dataset:
+       type: MMMU
+       name: MMMU (val, n=30)
+     metrics:
+     - type: accuracy
+       value: 53.33
+       name: accuracy
+       verified: false
+   - task:
+       type: image-text-to-text
+       name: Chart QA
+     dataset:
+       type: HuggingFaceM4/ChartQA
+       name: ChartQA (relaxed)
+     metrics:
+     - type: accuracy
+       value: 67.68
+       name: relaxed accuracy
+       verified: false
+   - task:
+       type: image-text-to-text
+       name: Document QA
+     dataset:
+       type: lmms-lab/DocVQA
+       name: DocVQA (ANLS ≥ 0.5)
+     metrics:
+     - type: accuracy
+       value: 79.17
+       name: ANLS ≥ 0.5
+       verified: false
+   - task:
+       type: image-text-to-text
+       name: Visual mathematical reasoning
+     dataset:
+       type: AI4Math/MathVista
+       name: MathVista (testmini)
+     metrics:
+     - type: accuracy
+       value: 34.13
+       name: accuracy
+       verified: false
+ ---
+
+ # AIGENCY V4
+
+ > **Sovereign, fully independent, multimodal, 128B parameters.**
+ > A globally competitive Turkish-first AI model: world-leading on Turkish
+ > reading comprehension and natural-language inference, frontier-level on
+ > grade-school math and scientific reasoning, KVKK-resident.
+
+ [**🇹🇷 Türkçe README**](#türkçe) · [**🇬🇧 English README**](#english) · [**📄 Whitepaper (EN)**](https://github.com/ecloud-bh/aigency-v4-whitepaper/blob/main/AIGENCY-V4-Whitepaper-EN.pdf) · [**📄 Whitepaper (TR)**](https://github.com/ecloud-bh/aigency-v4-whitepaper/blob/main/AIGENCY-V4-Whitepaper-TR.pdf) · [**🌐 Try the demo**](https://huggingface.co/spaces/aigencydev/AIGENCY-V4-Demo) · [**🔗 API**](https://aigency.dev)
+
+ ---
+
+ ## English
+
+ ### Model summary
+
+ **AIGENCY V4** is the multimodal successor to AIGENCY V3, developed by
+ **eCloud Yazılım Teknolojileri** and released to production in Q2 2026.
+ The model retains V3's four sovereignty principles (zero external parameter
+ dependency, sovereign data residency, transparent architectural documentation,
+ and Turkish morphological context fidelity) and adds a sovereign 8B-parameter
+ vision encoder for image, document, chart, and visual-math understanding.
+
+ | Property | Value |
+ |---|---|
+ | **Total parameters** | 128B (120B core + 8B vision encoder) |
+ | **Architecture** | Sovereign decoder-only transformer + side vision encoder |
+ | **Optimisations** | Adaptive LoRA+, Selective Layer Collapse, Localised MoE, 4-bit block quantization, chunked attention |
+ | **Context window** | 278K tokens (HBM 3-tier: STM 4k / ITM 64k / LTM 278k) |
+ | **Active inference memory** | ~6.5 GB GPU under 4-bit quantization |
+ | **Languages** | Turkish (primary), English |
+ | **Modalities** | Text, image (one image per request, 30 MB max, image/* MIME) |
+ | **Release version** | 1.0 production |
+ | **Release date** | April 2026 |
+ | **Licence** | API-only commercial (see https://aigency.dev/license) |
+
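The image limits in the Modalities row (one image per request, 30 MB maximum, `image/*` MIME type) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the AIGENCY API: the function name `validate_image_request` is hypothetical, "30 MB" is assumed to mean 30 MiB, and the MIME type is guessed from the file extension only.

```python
import mimetypes
import os

# Assumption: the card's "30 MB" is read as 30 MiB; the API may count bytes differently.
MAX_IMAGE_BYTES = 30 * 1024 * 1024


def validate_image_request(paths):
    """Return a list of problems; an empty list means the stated limits are met."""
    problems = []
    if len(paths) > 1:
        problems.append("only one image per request is allowed")
    for path in paths:
        # Guess the MIME type from the file extension (no content sniffing).
        mime, _ = mimetypes.guess_type(path)
        if mime is None or not mime.startswith("image/"):
            problems.append(f"{path}: MIME type {mime!r} is not image/*")
        if os.path.getsize(path) > MAX_IMAGE_BYTES:
            problems.append(f"{path}: larger than {MAX_IMAGE_BYTES} bytes")
    return problems
```

A caller would send the request only when the returned list is empty; the server-side checks remain authoritative.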
+ ### Distribution
+
+ **Weights are not distributed.** AIGENCY V4 is accessed exclusively through
+ the eCloud production API at `https://aigency.dev/api/v2`. This page provides
+ the architectural specification, the evaluation methodology, and the full
+ benchmark results. To try the model interactively, use the
+ [demo Space](https://huggingface.co/spaces/aigencydev/AIGENCY-V4-Demo). For
+ production access, see [aigency.dev](https://aigency.dev).
+
+ ### Evaluation
+
+ A single-session evaluation was run against the production API on
+ **27 April 2026**: **13,344 real API calls** across **22 distinct
+ benchmarks**. Every result is reported with a Wilson 95% confidence interval
+ and an open dataset identifier; where a benchmark was subsampled, the
+ subsample is deterministic (seed=42).
+
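For readers who want to recompute the intervals, the Wilson score interval for a binomial proportion can be sketched as below. This is the textbook formula, not code taken from the evaluation harness; the function name is illustrative, and the count of 138 passes out of 164 is inferred from the reported HumanEval score of 0.8415.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# HumanEval: 138 of 164 problems passed (inferred from the reported 0.8415).
low, high = wilson_interval(138, 164)
print(f"[{low:.3f}, {high:.3f}]")  # prints [0.778, 0.889]
```

Unlike the simpler normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly for small n, which matters for the 24- and 30-item multimodal subsamples.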
+ #### Tier 1: Critical benchmarks (full set)
+
+ | Benchmark | Score | Wilson 95% CI | n (scored/total) | Errors |
+ |---|---|---|---|---|
+ | HumanEval (pass@1) | **0.8415** | [0.778, 0.889] | 164/164 | 0 |
+ | IFEval (strict) | **0.8022** | [0.767, 0.834] | 541/541 | 1 |
+ | GPQA Diamond | 0.3788 | [0.314, 0.448] | 198/198 | 0 |
+ | Belebele-TR | **0.8733** | [0.850, 0.893] | 900/900 | 0 |
+ | ARC-Challenge | **0.9488** | [0.935, 0.960] | 1172/1172 | 0 |
+ | TruthfulQA MC1 | **0.7638** | [0.734, 0.792] | 817/817 | 0 |
+ | GSM8K | **0.9462** | [0.933, 0.957] | 1319/1319 | 0 |
+
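The card does not state how many completions were sampled per HumanEval problem. Assuming one completion per problem, pass@1 is simply the fraction of problems whose completion passes the tests. The sketch below uses the standard unbiased pass@k estimator, which reduces to that fraction when one sample is drawn; the function names are illustrative and the per-problem counts are made up to match the reported score.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: n samples drawn, c of them passed."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k):
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Assumption: one sample per problem, 138 of 164 problems solved.
results = [(1, 1)] * 138 + [(1, 0)] * 26
print(round(benchmark_pass_at_k(results, k=1), 4))  # prints 0.8415
```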
+ #### Tier 2: Mid-volume
+
+ | Benchmark | Score | Wilson 95% CI | n (scored/total) |
+ |---|---|---|---|
+ | MMLU (stratified) | **0.8010** | [0.775, 0.825] | 1000/1000 |
+ | MMLU-Pro | 0.5020 | [0.471, 0.533] | 1000/1000 |
+ | HellaSwag | **0.8860** | [0.865, 0.904] | 1000/1000 |
+ | WinoGrande XL | 0.7466 | [0.722, 0.770] | 1267/1267 |
+ | HumanEval+ (extended) | **0.7988** | [0.731, 0.853] | 164/164 |
+ | MBPP (sanitized) | **0.8482** | [0.799, 0.887] | 257/257 |
+ | MBPP+ | **0.7804** | [0.736, 0.819] | 378/378 |
+
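The 1000-item rows are subsamples, and the card describes the subsampling as deterministic with seed 42 and, for MMLU, stratified. One way such a draw could look is sketched below. The proportional allocation rule, the function name, and the `subject` field are assumptions for illustration; the harness may stratify differently.

```python
import random
from collections import defaultdict


def stratified_subsample(items, key, n, seed=42):
    """Pick about n items, giving each stratum a share proportional to its size."""
    rng = random.Random(seed)  # private generator, so the draw is reproducible
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    picked = []
    for name in sorted(groups):  # fixed stratum order keeps the result stable
        members = groups[name]
        quota = min(len(members), round(n * len(members) / len(items)))
        picked.extend(rng.sample(members, quota))
    return picked


# Toy pool: three subjects with 100 questions each (made-up data).
pool = [{"subject": s, "id": i} for s in ("law", "math", "physics") for i in range(100)]
sample = stratified_subsample(pool, key=lambda q: q["subject"], n=30)
```

Because of rounding, the result can differ from n by a few items when stratum sizes are uneven; a production harness would typically redistribute the remainder.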
+ #### Tier 3-A: Turkish
+
+ | Benchmark | Score | Wilson 95% CI | n (scored/total) |
+ |---|---|---|---|
+ | Belebele-TR | **0.8733** | [0.850, 0.893] | 900/900 |
+ | TQuAD (F1 ≥ 0.5) | **0.8240** | [0.788, 0.855] | 500/500 |
+ | TR-MMLU | **0.7080** | [0.667, 0.746] | 500/500 |
+ | XNLI-TR | **0.7340** | [0.694, 0.771] | 500/500 |
+ | TR Grammar (synthetic) | **0.7900** | [0.700, 0.858] | 100/100 |
+
+ > Frontier models do not consistently publish Turkish-specific scores, so few
+ > published points of comparison exist for these benchmarks. On that basis,
+ > eCloud positions AIGENCY V4 as the de-facto **Turkish reference** among
+ > published global evaluations.
+
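The TQuAD row reports the share of answers whose F1 reaches 0.5. A common definition of answer F1 in extractive QA is token overlap between prediction and reference, sketched below. The whitespace tokenisation and lowercasing are assumptions; the harness may normalise Turkish text differently (Python's `str.lower()` does not apply the Turkish dotted and dotless i rules).

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)  # both empty counts as a match
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def thresholded_accuracy(pairs, threshold=0.5):
    """Share of (prediction, reference) pairs whose F1 reaches the threshold."""
    return sum(token_f1(p, r) >= threshold for p, r in pairs) / len(pairs)
```

With this rule, a prediction of "başkent Ankara" against the reference "Ankara" scores an F1 of about 0.67 and therefore counts as correct at the 0.5 threshold.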
+ #### Tier 3-B: Multimodal (first production release)
+
+ | Benchmark | Score | Wilson 95% CI | n (scored/total) |
+ |---|---|---|---|
+ | MMMU (val) | 0.5333 | [0.361, 0.698] | 30/30 |
+ | ChartQA (relaxed) | 0.6768 | [0.634, 0.717] | 492/500 |
+ | DocVQA (ANLS ≥ 0.5) | 0.7917 | [0.595, 0.908] | 24 |
+ | MathVista (testmini) | 0.3413 | [0.280, 0.408] | 208 |
+
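Two of the scoring rules named in the table are sketched below under their usual conventions. For ChartQA, "relaxed" normally means a numeric answer counts if it is within 5% of the target, with case-insensitive exact match otherwise. For DocVQA, the normalised Levenshtein similarity is compared against 0.5. The exact normalisation used by the evaluation harness is not given in the card, so the function names and details here are assumptions.

```python
def relaxed_match(prediction, target, tolerance=0.05):
    """ChartQA-style relaxed match: 5% numeric tolerance, else exact string match."""
    try:
        pred_num = float(prediction)
        target_num = float(target)
    except ValueError:
        return prediction.strip().lower() == target.strip().lower()
    if target_num == 0:
        return pred_num == 0
    return abs(pred_num - target_num) / abs(target_num) <= tolerance


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def similarity(prediction, target):
    """Normalised Levenshtein similarity in [0, 1]; 1.0 means identical."""
    a, b = prediction.strip().lower(), target.strip().lower()
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))
```

Under these rules an item counts as correct for the DocVQA row when `similarity(prediction, target) >= 0.5`, and for the ChartQA row when `relaxed_match(prediction, target)` is true.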
+ ### Comparison with frontier (April 2026)
+
+ Scores are percentages; the AIGENCY V4 column restates the tier-table results
+ on a 0-100 scale.
+
+ | Benchmark | AIGENCY V4 | GPT-5 | Claude 4.6/4.7 | Gemini 3 Pro |
+ |---|---|---|---|---|
+ | GSM8K | **94.62** | 96.8 | ~96 | ~94 |
+ | ARC-Challenge | **94.88** | ~96 | ~96 | ~95 |
+ | HumanEval | 84.15 | 94.0 | 95.0 | 89.7 |
+ | MMLU | 80.10 | 94.2 | 88-93 | 92.4 |
+ | MMLU-Pro | 50.20 | ~85 | ~84 | ~81 |
+ | GPQA Diamond | 37.88 | 88-94 | 91.3-94.2 | 91.9 |
+ | MMMU | 53.33 | 79.1 | 84.1 | n/a |
+
+ V4 is **at frontier level on grade-school math and scientific reasoning**,
+ **upper-mid frontier on code generation**, **lower-mid frontier on general
+ academic knowledge and instruction following**, and **in active development
+ on graduate-level expert knowledge and multimodal tasks**. The V4.1 roadmap
+ (Q4 2026) targets MMLU-Pro 0.65, GPQA Diamond 0.55, and an average latency of
+ at most 4 s.
+
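The distance to the frontier columns can be read off the table as percentage-point gaps. The short calculation below does this for the two weakest rows. It is a sketch that treats the approximate ("~") and range values in the table as exact endpoints.

```python
# (V4 score, lowest frontier value, highest frontier value), copied from the table.
# Assumption: "~" values and range endpoints are treated as exact numbers.
rows = {
    "MMLU-Pro": (50.20, 81.0, 85.0),
    "GPQA Diamond": (37.88, 88.0, 94.2),
}


def gap_range(v4, frontier_low, frontier_high):
    """Smallest and largest percentage-point gap between V4 and the frontier."""
    return round(frontier_low - v4, 2), round(frontier_high - v4, 2)


gaps = {name: gap_range(*values) for name, values in rows.items()}
for name, (smallest, largest) in gaps.items():
    print(f"{name}: {smallest} to {largest} pp behind")
```

On these inputs the MMLU-Pro gap is about 31 to 35 pp and the GPQA Diamond gap about 50 to 56 pp.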
+ ### Operational performance (single-session, 27 April 2026)
+
+ - Total API calls: 13,344
+ - Persistent error rate: 0.3%
+ - Average latency: 9.55 s · p50 4.39 s · p95 32.77 s · p99 33.59 s
+ - V4.1 latency target: average ≤ 4 s · p95 ≤ 15 s
+
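Latency summaries like the one above are typically derived from the per-call timings. The sketch below computes the mean and nearest-rank percentiles from a list of latencies in seconds. The nearest-rank definition is an assumption; the harness may interpolate between samples instead, which gives slightly different values.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Nearest-rank percentile: smallest value with at least q% of samples at or below it."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def latency_summary(latencies):
    """Mean, p50, p95 and p99 of a list of latencies in seconds."""
    ordered = sorted(latencies)
    return {
        "mean": statistics.fmean(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```

A mean well above the median, as in the reported 9.55 s versus 4.39 s, indicates a long right tail, which is why the V4.1 target is stated for both the average and p95.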
+ ### Reproducibility
+
+ The full evaluation harness, raw responses, scored items, summary JSON, and
+ the deterministic subsample seed are available at:
+
+ - **Benchmark code**: https://github.com/ecloud-bh/aigency-benchmarks
+ - **Evaluation results dataset**: https://huggingface.co/datasets/aigencydev/aigency-v4-evaluation
+ - **Whitepaper (EN/TR)**: https://github.com/ecloud-bh/aigency-v4-whitepaper
+
+ ### Intended use
+
+ **Primary deployment domains:**
+
+ 1. Public-sector and government workloads requiring KVKK residency
+ 2. Legal and legal-tech (statute search, contract analysis, Tural model integration)
+ 3. Education and higher education (Turkish academic work, exam prep, course assistants)
+ 4. Banking, finance and insurance (Turkish-heavy KYC/AML)
+ 5. Healthcare administrative workloads (KVKK-compliant document handling)
+ 6. Media, publishing and editorial (Turkish grammar precision)
+ 7. Defence and critical infrastructure (sovereign architecture)
+ 8. Software, R&D and engineering (code generation, large-codebase analysis)
+
+ **Out-of-scope or non-recommended:**
+
+ - Clinical diagnosis or medical advice (administrative use only)
+ - Autonomous critical decisions without human review
+ - Graduate-level scientific research where GPQA-Diamond-class accuracy is required (use a hybrid of a frontier model and V4)
+ - High-fidelity multimodal reasoning where MMMU > 75 is required (await V4.1)
+
+ ### Safety and compliance
+
+ - KVKK §5 / §12 (Turkish PDPA) compliant: KVKK-resident hosting (TR DC)
+ - ISO/IEC 27001: IT-ISMS, risk and control matrix
+ - NIST SP 800-207 (Zero Trust): mTLS, least privilege, continuous monitoring
+ - EU AI Act (Regulation (EU) 2024/1689, in force since August 2024): high-risk classification with model card
+ - Memory encryption: AES-256-XTS (RAM), ChaCha20-Poly1305 (LTM disk)
+ - Image cache: AES-256-GCM, 30 MB limit, 24h TTL
+ - Pre-encoding visual safety filter + post-encoding output check
+
+ ### Known limitations
+
+ 1. **GPQA Diamond / MMLU-Pro gap**: per the comparison table, about 31-35 pp behind frontier on MMLU-Pro and 50-56 pp behind on GPQA Diamond; graduate-level expert knowledge is a V4.1 target.
+ 2. **First-generation multimodal**: the vision encoder is 8B; V4.1 plans to scale it to 16B.
+ 3. **Latency 2-3× frontier**: caused by vision-encoder overhead and the multimodal safety filter; V4.1 targets ≤ 4 s average.
+ 4. **Multimodal subsample size**: DocVQA n=24 and MMMU n=30 (Hugging Face cache constraints), so the CIs are wide.
+ 5. **Multilingual non-TR evaluation not published**: the global-scale claim is currently Turkish-anchored.
+
+ ### Citation
+
+ ```bibtex
+ @techreport{aigency-v4-2026,
+   title = {AIGENCY V4: Sovereign, Fully Independent and Multimodal 128B-Parameter AI Architecture},
+   author = {{eCloud Yaz{\i}l{\i}m Teknolojileri}},
+   year = {2026},
+   month = apr,
+   institution = {eCloud Yaz{\i}l{\i}m Teknolojileri},
+   url = {https://github.com/ecloud-bh/aigency-v4-whitepaper},
+   note = {Whitepaper v1.0, April 2026}
+ }
+ ```
+
+ ---
+
+ ## Türkçe
+
+ ### Model summary
+
+ **AIGENCY V4**, developed by eCloud Yazılım Teknolojileri, is a sovereign
+ 128-billion-parameter AI model and the multimodal successor to V3.
+ It entered production in Q2 2026. It keeps V3's four independence principles
+ (zero external parameters, local data sovereignty, transparent documentation,
+ Turkish context fidelity) and extends them with a sovereign 8B-parameter
+ vision encoder that adds image understanding, document question answering,
+ chart interpretation, and visual-math capabilities.
+
+ | Property | Value |
+ |---|---|
+ | **Total parameters** | 128B (120B core + 8B vision encoder) |
+ | **Architecture** | Sovereign decoder-only transformer + side vision encoder |
+ | **Optimisations** | Adaptive LoRA+, Selective Layer Collapse, L-MoE, 4-bit block quantization, chunked attention |
+ | **Context window** | 278K tokens (HBM 3-tier: STM 4k / ITM 64k / LTM 278k) |
+ | **Active inference memory** | ~6.5 GB GPU under 4-bit quantization |
+ | **Languages** | Turkish (primary), English |
+ | **Modalities** | Text, image (one image per request, 30 MB max, image/* MIME) |
+ | **Version** | 1.0 production |
+ | **Release date** | April 2026 |
+ | **Licence** | API-only commercial (https://aigency.dev/license) |
+
+ ### Distribution
+
+ **Weights are not shared on HuggingFace.** Access to AIGENCY V4 is provided
+ only through `https://aigency.dev/api/v2`. This page presents the
+ architectural specification, the evaluation methodology, and the full
+ benchmark results. To try the model interactively, use the
+ [demo Space](https://huggingface.co/spaces/aigencydev/AIGENCY-V4-Demo).
+ For production access: [aigency.dev](https://aigency.dev).
+
+ ### Positioning in one sentence
+
+ AIGENCY V4 is a fully independent, KVKK-resident sovereign AI model that is
+ the world leader in Turkish reading comprehension and natural-language
+ inference, at global frontier level in scientific reasoning and grade-school
+ math, in the upper-mid frontier segment for code generation, and in active
+ development for multimodal and graduate-level expert knowledge.
+
+ ### Target use cases
+
+ 1. Public sector and government institutions (KVKK requirement)
+ 2. Law and legal technology (legislation search, contract analysis)
+ 3. Education and higher education (Turkish academic work, exam preparation)
+ 4. Banking, finance and insurance (Turkish-heavy KYC/AML)
+ 5. Healthcare administrative workloads (KVKK-compliant document processing)
+ 6. Media, publishing and editorial (Turkish grammar precision)
+ 7. Defence and critical infrastructure (sovereign architecture)
+ 8. Software, R&D and engineering
+
+ ### Known limitations
+
+ 1. GPQA Diamond / MMLU-Pro are 35-50pp behind the frontier; this is a V4.1 target.
+ 2. First production multimodal release; a 16B vision encoder is planned for V4.1.
+ 3. Latency is 2-3 times that of the frontier; V4.1 targets an average of ≤ 4 s.
+ 4. The multimodal subsample sizes are small (DocVQA n=24, MMMU n=30); the CIs are wide.
+ 5. No non-Turkish multilingual profile has been published; the global claim is currently Turkish-centred.
+
+ ### Citation
+
+ ```bibtex
+ @techreport{aigency-v4-2026,
+   title = {AIGENCY V4: Yerli, Tam Ba{\u g}{\i}ms{\i}z ve Multimodal 128B Parametreli Yapay Zek\^a Mimarisi},
+   author = {{eCloud Yaz{\i}l{\i}m Teknolojileri}},
+   year = {2026},
+   month = apr,
+   institution = {eCloud Yaz{\i}l{\i}m Teknolojileri},
+   url = {https://github.com/ecloud-bh/aigency-v4-whitepaper}
+ }
+ ```
+
+ ---
+
+ ## License
+
+ AIGENCY V4 is offered under the **eCloud AIGENCY Commercial Licence** (API-only).
+ Model weights are not redistributed. The accompanying whitepaper is licensed
+ under **CC BY-ND 4.0**, and the benchmark code is licensed under **MIT**.
+
+ For commercial use, partnership, or research collaboration:
+ **info@e-cloud.web.tr · ai@aigency.dev** · https://aigency.dev
+
+ © 2026 eCloud Yazılım Teknolojileri.