whitebill

AI & ML interests

None yet

whitebill's activity

reacted to singhsidhukuldeep's post with 🚀 5 days ago

Post

3555

Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
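The entropy-driven grouping described above can be sketched in a few lines. This is a minimal illustration, not Meta's implementation: `bigram_entropies` is a hypothetical toy stand-in for the learned entropy model (it only counts byte bigrams in the input itself), and the global-threshold rule in `split_into_patches` is one simple boundary criterion assumed here for clarity.

```python
import math
from collections import Counter, defaultdict


def bigram_entropies(data: bytes) -> list[float]:
    """Toy stand-in for a learned entropy model (an assumption, not BLT's model).

    entropies[i] is the Shannon entropy in bits of the empirical distribution
    of bytes that follow data[i - 1], estimated from `data` itself.
    Position 0 has no context, so it gets the unigram entropy.
    """
    if not data:
        return []

    followers: dict[int, Counter] = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        followers[prev][nxt] += 1

    def entropy(counts: Counter) -> float:
        total = sum(counts.values())
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    result = [entropy(Counter(data))]
    for prev in data[:-1]:
        result.append(entropy(followers[prev]))
    return result


def split_into_patches(data: bytes, entropies: list[float], threshold: float) -> list[bytes]:
    """Start a new patch at every position whose entropy exceeds `threshold`."""
    patches = []
    start = 0
    for i in range(1, len(data)):
        if entropies[i] > threshold:
            patches.append(data[start:i])
            start = i
    if data:
        patches.append(data[start:])
    return patches


text = b"the cat sat on the mat"
patches = split_into_patches(text, bigram_entropies(text), threshold=1.0)
print(patches)
```

Predictable stretches end up in long patches and uncertain positions open new ones, so the expensive per-patch model runs fewer steps over easy text. Raising `threshold` yields fewer, longer patches.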

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes
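To make the data flow between the three components concrete, here is a shape-level sketch. Every function body is a hypothetical stand-in: mean pooling replaces the encoder's learned layers, a causal running mean replaces the large transformer, and the decoder simply broadcasts each patch state back to its byte positions. Only the bytes → patches → bytes structure is taken from the post.

```python
Vector = list[float]


def embed_byte(b: int, dim: int = 4) -> Vector:
    """Hypothetical fixed byte embedding: deterministic, not learned."""
    return [((b * (k + 1)) % 256) / 255.0 for k in range(dim)]


def local_encoder(patches: list[bytes], dim: int = 4) -> list[Vector]:
    """Bytes -> one vector per (non-empty) patch, via mean pooling (stand-in)."""
    reps = []
    for patch in patches:
        vecs = [embed_byte(b, dim) for b in patch]
        reps.append([sum(col) / len(vecs) for col in zip(*vecs)])
    return reps


def global_latent_model(patch_reps: list[Vector]) -> list[Vector]:
    """Patch vectors -> contextual patch states.

    A causal running mean stands in for the large transformer: state n
    depends only on patches 0..n.
    """
    out = []
    running = [0.0] * len(patch_reps[0]) if patch_reps else []
    for n, rep in enumerate(patch_reps, start=1):
        running = [r + x for r, x in zip(running, rep)]
        out.append([r / n for r in running])
    return out


def local_decoder(patch_states: list[Vector], patches: list[bytes]) -> list[Vector]:
    """Patch states -> one output vector per byte position (broadcast stand-in)."""
    return [state for state, patch in zip(patch_states, patches) for _ in patch]


patches = [b"ab", b"c"]
reps = local_encoder(patches)            # 2 patch vectors
states = global_latent_model(reps)       # 2 contextual patch states
per_byte = local_decoder(states, patches)  # 3 per-byte vectors
print(len(reps), len(states), len(per_byte))
```

The point of the split is that the costly middle stage runs once per patch rather than once per byte, while the two lightweight local stages handle the byte-level detail.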

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
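The hash n-gram idea can be illustrated as follows. This is a sketch under stated assumptions: the table sizes, the n-gram lengths (3 to 5), the use of `zlib.crc32` as the hash, and the randomly initialised tables are all choices made here for illustration, not details confirmed by the post. In a trained model the table rows would be learned parameters.

```python
import random
import zlib


def make_table(num_buckets: int, dim: int, seed: int) -> list[list[float]]:
    """Hypothetical embedding table with fixed pseudo-random rows."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_buckets)]


def hash_ngram_features(
    data: bytes, tables: dict[int, list[list[float]]], dim: int
) -> list[list[float]]:
    """For each byte position, sum the embeddings of the n-grams ending there.

    Each n-gram is hashed into a fixed number of buckets, so the tables stay
    small even though the space of possible n-grams is huge. Distinct n-grams
    may collide in the same bucket.
    """
    features = []
    for i in range(len(data)):
        vec = [0.0] * dim
        for n, table in tables.items():
            if i + 1 >= n:  # a full n-gram ends at position i
                gram = data[i + 1 - n : i + 1]
                row = table[zlib.crc32(gram) % len(table)]
                vec = [v + r for v, r in zip(vec, row)]
        features.append(vec)
    return features


DIM = 4
tables = {n: make_table(num_buckets=64, dim=DIM, seed=n) for n in (3, 4, 5)}
feats = hash_ngram_features(b"hello hello", tables, DIM)
print(feats[4] == feats[10])  # both positions end the same 3-, 4- and 5-grams
```

These vectors would be added to the plain byte embeddings, giving each position a cheap summary of its recent context before any attention is applied.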

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!

2 replies

updated a collection 8 days ago

my1

Collection

20 items • Updated 8 days ago

reacted to clem's post with 🚀 8 days ago

Post

1554

Coming back to Paris Friday to open our new Hugging Face office!

We're at capacity for the party, but add your name to the waiting list as we're trying to privatize the passage du Caire for extra space for robots 🤖🦾🦿

https://t.co/enkFXjWndJ

1 reply

liked a Space 9 days ago

Running

👁

Mi50

MI50 inference

updated a collection 17 days ago

my1

Collection

20 items • Updated 8 days ago

upvoted an article 23 days ago

Article

Use Models from the Hugging Face Hub in LM Studio

29 days ago • 127

updated a collection about 2 months ago

my1

Collection

20 items • Updated 8 days ago

liked a model about 2 months ago

ruslandev/llama-3-8b-gpt-4o

Text Generation • Updated Jun 12 • 42 • 2

updated a collection about 2 months ago

my1

Collection

20 items • Updated 8 days ago

reacted to averoo's post with 👍 about 2 months ago

Post

3790

Hello, researchers! I've tried to make reading HF Daily Papers easier by building a tool that reviews them with LLMs like Claude 3.5 and GPT-4o, and sometimes FLUX.

📚 Classification by topics
📅 Sorting by publication date and HF addition date
🔄 Syncing every 2 hours
💻 Hosted on GitHub
🌏 English, Russian, and Chinese
📈 Top by week/month (in progress)

👉 https://hfday.ru

Let me know what you think of it.

updated a collection 2 months ago

my1

Collection

20 items • Updated 8 days ago

updated a collection 3 months ago

my1

Collection

20 items • Updated 8 days ago