yicui's Collections
Mechanistic
Coding
Benchmark
Training
ICL
Architecture
RL
TDD
Theory
Instructions
Mechanistic
Updated about 17 hours ago
Massive Activations in Large Language Models
Paper • 2402.17762 • Published Feb 27 • 1
What Matters in Transformers? Not All Attention is Needed
Paper • 2406.15786 • Published Jun 22 • 29
The Super Weight in Large Language Models
Paper • 2411.07191 • Published 11 days ago • 2
Top-nσ: Not All Logits Are You Need
Paper • 2411.07641 • Published 10 days ago • 15