Hugging Face
Donkey Small
DonkeySmall
5 followers · 3 following
AI & ML interests
None yet
Recent Activity
liked a Space about 1 month ago
avans06/Image_Face_Upscale_Restoration-GFPGAN-RestoreFormer-CodeFormer-GPEN
reacted to fdaudens's post with 🔥 about 2 months ago
Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet?

Open source olmOCR just dropped and the results are impressive.

Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU - that's 1/32 the cost of GPT-4o ($190/million pages). Game-changer for content extraction and digital archives.

To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images.

Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up.

Try the demo: https://olmocr.allenai.org

Going right into the AI toolkit: https://huggingface.co/spaces/JournalistsonHF/ai-toolkit
reacted to schuler's post with ❤️ 2 months ago
GPT-3 implemented in pure Free Pascal!

https://github.com/joaopauloschuler/gpt-3-for-pascal

This implementation follows the GPT-3 Small architecture from the landmark paper "Language Models are Few-Shot Learners":

```
┌─────────────────────────┐
│      Input Layer        │
├─────────────────────────┤
│   Token & Positional    │
│       Embedding         │
├─────────────────────────┤
│    12x Transformer      │
│        Blocks           │
│  - 12 heads             │
│  - 768 hidden dims      │
│  - 3072 intermediate    │
├─────────────────────────┤
│      Output Layer       │
└─────────────────────────┘
```

Clean Pascal Implementation

```
for CntLayer := 1 to {Layers=}12 do
begin
  Result.AddTransformerBlockCAI(
    {Heads=}12,
    {intermediate dimensions=}4*768,
    {NoForward=}true,
    {HasNorm=}true,
    false
  );
end;
```
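To make the figures in the post concrete, here is a rough Python sketch of the arithmetic they imply. It assumes a standard transformer block (four square attention projections for query, key, value and output, plus a two-layer feed-forward network) and ignores biases, normalization parameters, and the embedding and output layers, none of which the post specifies. The names `BlockConfig`, `head_dim`, `block_weights`, and `stack_weights` are hypothetical and are not part of the Pascal library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockConfig:
    """Figures quoted in the post for the GPT-3 Small layout."""
    layers: int = 12
    heads: int = 12
    hidden: int = 768
    intermediate: int = 4 * 768  # 3072, as in the diagram


def head_dim(cfg: BlockConfig) -> int:
    """Width of one attention head; hidden size must split evenly across heads."""
    if cfg.hidden % cfg.heads != 0:
        raise ValueError("hidden size must be divisible by the number of heads")
    return cfg.hidden // cfg.heads


def block_weights(cfg: BlockConfig) -> int:
    """Weight-matrix entries in one block (assumption: no biases or norm parameters)."""
    attention = 4 * cfg.hidden * cfg.hidden          # Q, K, V and output projections
    feed_forward = 2 * cfg.hidden * cfg.intermediate  # up- and down-projection
    return attention + feed_forward


def stack_weights(cfg: BlockConfig) -> int:
    """Weight-matrix entries across all transformer blocks, excluding embeddings."""
    return cfg.layers * block_weights(cfg)


if __name__ == "__main__":
    cfg = BlockConfig()
    print("head dimension:", head_dim(cfg))
    print("weights per block:", block_weights(cfg))
    print("weights in all blocks:", stack_weights(cfg))
```

Under these assumptions each head is 64 wide and the 12 blocks hold roughly 85 million weights; the embedding and output layers would add to that, depending on vocabulary and context sizes that the post does not state.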
Organizations
models
None public yet
datasets
10
DonkeySmall/OCR-English-Printed-12 • Preview • Updated Aug 3, 2024 • 19 • 2
DonkeySmall/OCR-Numbers-Printed-5 • Preview • Updated Aug 3, 2024 • 18
DonkeySmall/OCR-Cyrillic-Printed-6 • Preview • Updated Jul 27, 2024 • 29 • 2
DonkeySmall/OCR-Cyrillic-Printed-8 • Preview • Updated Jul 27, 2024 • 39
DonkeySmall/OCR-English-Printed-14 • Updated Jul 27, 2024 • 16
DonkeySmall/OCR-Cyrillic-Printed-10 • Preview • Updated Jul 25, 2024 • 27
DonkeySmall/OCR-Cyrillic-Printed-9 • Preview • Updated Jul 19, 2024 • 22
DonkeySmall/OCR-Numbers-Printed-2 • Preview • Updated Jul 19, 2024 • 15
DonkeySmall/OCR-Cyrillic-Printed-1 • Preview • Updated May 6, 2024 • 49
DonkeySmall/OCR-Numbers-Printed-0 • Preview • Updated May 6, 2024 • 22 • 1