Tien Dung
tiendung
13 followers · 113 following
tiendung
AI & ML interests
None yet
Recent Activity
liked a model about 17 hours ago
turboderp/ERNIE-4.5-300B-A47B-PT-exl3
reacted to Jaward's post with 😎 5 days ago
I played around with the new RXTX paper (XX^T) and was able to train nanogpt with 4x4 RXTX matmuls in both attention layer and optimizer🤕 It just works (well I had to add some guardrails) but still saves 5% of memory usage:
The Patch:
- Computes attention scores with 4x4 blockwise RXTX matmuls (no pytorch dot prod)
- Handles arbitrary sequence lengths by padding to the nearest multiple of 4.
- An RXTX variant of shampoo with params reshaped into 4x4 blocks during each optimizer step.
- Uses 5% fewer ops
Code: https://github.com/Jaykef/ai-algorithms/blob/main/nanogpt-rxtx.ipynb
Paper: https://arxiv.org/pdf/2505.09814
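The padding and blockwise steps mentioned in the post can be illustrated with a minimal pure-Python sketch. It is not the code from the linked notebook and it is not the RXTX algorithm, which saves multiplications through a specific recursive scheme. It only shows zero-padding the row count to a multiple of 4 and computing XX^T block by block while reusing symmetry. The function names `pad_rows` and `blockwise_gram`, and the choice to split rows into four groups, are assumptions made for illustration.

```python
def pad_rows(x, multiple=4):
    """Zero-pad the rows of x so the row count is a multiple of `multiple`."""
    n = len(x)
    width = len(x[0]) if x else 0
    padded_n = -(-n // multiple) * multiple  # round up to the next multiple
    return [row[:] for row in x] + [[0.0] * width for _ in range(padded_n - n)]


def blockwise_gram(x, grid=4):
    """Compute X @ X^T one block at a time on a grid-by-grid block layout.

    Only blocks on or above the diagonal are computed; the others are
    mirrored because X X^T is symmetric. This is a plain blockwise product
    (hypothetical helper), not the reduced-multiplication RXTX recursion.
    """
    n = len(x)
    xp = pad_rows(x, grid)
    size = len(xp) // grid  # rows per block after padding
    out = [[0.0] * len(xp) for _ in xp]
    for bi in range(grid):
        for bj in range(bi, grid):  # upper-triangular blocks only
            for i in range(bi * size, (bi + 1) * size):
                for j in range(bj * size, (bj + 1) * size):
                    value = sum(a * b for a, b in zip(xp[i], xp[j]))
                    out[i][j] = value
                    out[j][i] = value  # mirror into the lower triangle
    # Drop the rows and columns that exist only because of padding.
    return [row[:n] for row in out[:n]]
```

Because the padded rows are all zeros, they contribute nothing to the product, so trimming the result back to the original size gives the same values as an unpadded XX^T.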
reacted to Jaward's post with 👍 5 days ago
Organizations
tiendung's activity
liked a model about 17 hours ago
turboderp/ERNIE-4.5-300B-A47B-PT-exl3 • Updated about 10 hours ago • 2
liked 2 models 23 days ago
mistralai/Magistral-Small-2506 • Text Generation • 24B • Updated 22 days ago • 78k • 551
KhangHatto/alpha • Feature Extraction • 0.7B • Updated Jun 6 • 36 • 1
liked 10 models about 1 month ago
HPLT/hplt_bert_base_2_0_vie-Latn • Fill-Mask • Updated 20 days ago • 22 • 1
fluxions/vui • Text-to-Speech • Updated 21 days ago • 5.29k • 139
inclusionAI/Ling-lite-1.5 • Text Generation • 17B • Updated Jun 4 • 233 • 11
moonshotai/Moonlight-16B-A3B-Instruct • Text Generation • 16B • Updated Mar 3 • 15.9k • 166
nvidia/Nemotron-Research-Reasoning-Qwen-1.5B • Text Generation • 2B • Updated Jun 5 • 13.5k • 172
ACE-Step/ACE-Step-v1-3.5B • Text-to-Audio • Updated May 22 • 531
IndexTeam/Index-anisora • Updated 27 days ago • 34 • 163
OpenGVLab/ZeroGUI-AndroidLab-7B • Image-Text-to-Text • 8B • Updated May 30 • 51 • 4
Alibaba-NLP/gte-reranker-modernbert-base • Text Ranking • 0.1B • Updated 4 days ago • 79.8k • 68
ServiceNow-AI/Apriel-5B-Instruct • Text Generation • 5B • Updated May 28 • 3.94k • 47
liked a dataset 3 months ago
OpenGVLab/MMPR-v1.2 • Updated May 29 • 14.7k • 22
liked a model 3 months ago
inclusionAI/Ling-lite • Text Generation • 17B • Updated May 8 • 242 • 45
liked 3 models 4 months ago
ds4sd/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated May 16 • 240k • 1.47k
5CD-AI/Vintern-3B-R-beta • Image-Text-to-Text • 4B • Updated Mar 26 • 6.14k • 17
BAAI/bge-large-zh-v1.5 • Feature Extraction • Updated Apr 2, 2024 • 334k • 541
liked a dataset 8 months ago
microsoft/orca-agentinstruct-1M-v1 • Viewer • Updated Nov 1, 2024 • 1.05M • 1.92k • 446
liked a model 8 months ago
5CD-AI/ColVintern-1B-v1 • Feature Extraction • 0.9B • Updated Nov 14, 2024 • 89 • 6