RachidAR

AI & ML interests
1.58-bit LLMs

Collections (2)

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 567
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 175
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 50
- ResLoRA: Identity Residual Mapping in Low-Rank Adaption
  Paper • 2402.18039 • Published • 10

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 567
- Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding
  Paper • 2404.16710 • Published • 55
- Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
  Paper • 2405.08707 • Published • 25
- Token-Scaled Logit Distillation for Ternary Weight Generative Language Models
  Paper • 2308.06744 • Published • 1
Spaces (1)

Models (21)
- RachidAR/Llama-3-8B-Instruct-DPO-v0.3-Q6_K-GGUF
  Text Generation • Updated • 44
- RachidAR/Waktaverse-Llama-3-KO-8B-Instruct-Q6_K-GGUF
  Updated • 48
- RachidAR/llama-3-indotuned-v0-Q6_K-GGUF
  Updated • 45
- RachidAR/saiga_llama3_8b-Q6_K-GGUF
  Updated • 52
- RachidAR/Llama-3-8B-saiga-suzume-ties-Q6_K-GGUF
  Text Generation • Updated • 83 • 2
- RachidAR/wiz-llama3-8B-Q6_K-GGUF
  Updated • 44
- RachidAR/ablation-model-fineweb-v1-Q6_K-GGUF
  Updated • 35
- RachidAR/Llama-3-8B-Instruct-Physics-5k-Scar-Q6_K-GGUF
  Updated • 74
- RachidAR/NorskGPT-Llama3-8b-Q6_K-GGUF
  Updated • 44
- RachidAR/llama3-Mirage-Walker-8b-v0.2-slerp-Q6_K-GGUF
  Updated • 48
Datasets
None public yet