LLaMAX3-8B-Alpaca 4bit

Model Details

Model Description

This is the model card for a 🤗 transformers model that has been pushed to the Hub. The card was generated automatically.

Developed by: LLaMAX/LLaMAX3-8B-Alpaca
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Finetuned from model [optional]: [More Information Needed]

Model Architecture

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(128256, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear4bit(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=128256, bias=False)
)
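The printout above gives enough shape information to estimate the parameter count and a rough lower bound on the weight footprint. The following is a minimal sketch, not a measurement. It assumes each LlamaRMSNorm holds a single weight vector of length 4096, that every Linear4bit weight costs 4 bits, that the remaining weights are stored in 16 bits, and it ignores the quantization metadata (scales, block statistics) that 4-bit storage adds in practice.

```python
# Shapes copied from the module printout above.
VOCAB, HIDDEN, KV_DIM, FFN, LAYERS = 128256, 4096, 1024, 14336, 32

# Linear4bit weights in one decoder layer (all have bias=False).
attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q_proj, o_proj + k_proj, v_proj
mlp = 3 * HIDDEN * FFN                            # gate_proj, up_proj, down_proj
quantized = LAYERS * (attn + mlp)

# Modules that are not Linear4bit: embed_tokens, lm_head, and the RMSNorm weights.
# Assumption: each LlamaRMSNorm has one weight vector of length HIDDEN.
norms = LAYERS * 2 * HIDDEN + HIDDEN
unquantized = 2 * VOCAB * HIDDEN + norms

total = quantized + unquantized

# Assumptions: 4 bits per quantized weight, 16 bits per unquantized weight,
# and no quantization metadata.
approx_bytes = quantized // 2 + unquantized * 2

print(f"parameters in Linear4bit layers: {quantized:,}")
print(f"parameters elsewhere:            {unquantized:,}")
print(f"total parameters:                {total:,}")
print(f"approx. weight footprint:        {approx_bytes / 2**30:.2f} GiB")
```

Under these assumptions the embedding table and `lm_head`, which the printout shows as regular `Embedding` and `Linear` modules, account for a large share of the remaining memory.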

🔥 Excellent Translation Performance

LLaMAX3-8B-Alpaca achieves an average spBLEU score improvement of over 5 points compared to the LLaMA3-8B-Alpaca model on the Flores-101 dataset.

System	Size	en-X (COMET)	en-X (BLEU)	zh-X (COMET)	zh-X (BLEU)	de-X (COMET)	de-X (BLEU)	ne-X (COMET)	ne-X (BLEU)	ar-X (COMET)	ar-X (BLEU)	az-X (COMET)	az-X (BLEU)	ceb-X (COMET)	ceb-X (BLEU)
LLaMA3-8B-Alpaca	8B	67.97	17.23	64.65	10.14	64.67	13.62	62.95	7.96	63.45	11.27	60.61	6.98	55.26	8.52
LLaMAX3-8B-Alpaca	8B	75.52	22.77	73.16	14.43	73.47	18.95	75.13	15.32	72.29	16.42	72.06	12.41	68.88	15.85

System	Size	X-en (COMET)	X-en (BLEU)	X-zh (COMET)	X-zh (BLEU)	X-de (COMET)	X-de (BLEU)	X-ne (COMET)	X-ne (BLEU)	X-ar (COMET)	X-ar (BLEU)	X-az (COMET)	X-az (BLEU)	X-ceb (COMET)	X-ceb (BLEU)
LLaMA3-8B-Alpaca	8B	77.43	26.55	73.56	13.17	71.59	16.82	46.56	3.83	66.49	10.20	58.30	4.81	52.68	4.18
LLaMAX3-8B-Alpaca	8B	81.28	31.85	78.34	16.46	76.23	20.64	65.83	14.16	75.84	15.45	70.61	9.32	63.35	12.66
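The average improvement quoted above can be checked against the two tables. This is a small sketch that copies the BLEU columns verbatim and averages the per-direction gains. It assumes the BLEU columns are the spBLEU scores the sentence refers to, and that the average is an unweighted mean over the 14 listed directions.

```python
# (direction, LLaMA3-8B-Alpaca BLEU, LLaMAX3-8B-Alpaca BLEU), copied from the tables above.
BLEU = [
    ("en-X", 17.23, 22.77), ("zh-X", 10.14, 14.43), ("de-X", 13.62, 18.95),
    ("ne-X", 7.96, 15.32), ("ar-X", 11.27, 16.42), ("az-X", 6.98, 12.41),
    ("ceb-X", 8.52, 15.85),
    ("X-en", 26.55, 31.85), ("X-zh", 13.17, 16.46), ("X-de", 16.82, 20.64),
    ("X-ne", 3.83, 14.16), ("X-ar", 10.20, 15.45), ("X-az", 4.81, 9.32),
    ("X-ceb", 4.18, 12.66),
]

# Per-direction gain of LLaMAX3-8B-Alpaca over LLaMA3-8B-Alpaca.
gains = {direction: round(new - old, 2) for direction, old, new in BLEU}

# Unweighted mean over all listed directions (an assumption about how
# the "average" in the text is defined).
mean_gain = sum(gains.values()) / len(gains)

for direction, gain in gains.items():
    print(f"{direction:>6}: +{gain:.2f}")
print(f"mean gain: +{mean_gain:.2f} BLEU")
```

With these numbers the mean gain comes out at roughly 5.8 points, with the smallest gain on X-zh and the largest on X-ne.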

Supported Languages

Afrikaans (af), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Asturian (ast), Azerbaijani (az), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Burmese (my), Catalan (ca), Cebuano (ceb), Chinese Simpl (zho), Chinese Trad (zho), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Filipino (tl), Finnish (fi), French (fr), Fulah (ff), Galician (gl), Ganda (lg), Georgian (ka), German (de), Greek (el), Gujarati (gu), Hausa (ha), Hebrew (he), Hindi (hi), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kabuverdianu (kea), Kamba (kam), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kyrgyz (ky), Lao (lo), Latvian (lv), Lingala (ln), Lithuanian (lt), Luo (luo), Luxembourgish (lb), Macedonian (mk), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Nepali (ne), Northern Sotho (ns), Norwegian (no), Nyanja (ny), Occitan (oc), Oriya (or), Oromo (om), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Serbian (sr), Shona (sn), Sindhi (sd), Slovak (sk), Slovenian (sl), Somali (so), Sorani Kurdish (ku), Spanish (es), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Umbundu (umb), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu)
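A translation request between any two of the languages listed above could be sketched as follows. This card does not document a prompt format, so the sketch assumes the standard Alpaca instruction template (suggested by the model name) and an illustrative instruction wording. The repository id `vutuka/LLaMAX3-8B-Alpaca-4bit`, the helper names `build_prompt` and `translate`, and the generation settings are likewise assumptions, and loading the 4-bit checkpoint is assumed to need `bitsandbytes` and `accelerate` alongside `transformers`.

```python
# Assumption: repository id of this checkpoint on the Hub.
MODEL_ID = "vutuka/LLaMAX3-8B-Alpaca-4bit"


def build_prompt(text: str, src_language: str, trg_language: str) -> str:
    """Wrap a translation request in the standard Alpaca template (assumed format)."""
    instruction = (
        f"Translate the following sentences from {src_language} to {trg_language}."
    )
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. "
        "Write a response that appropriately completes the request.\n"
        f"### Instruction:\n{instruction}\n"
        f"### Input:\n{text}\n### Response:"
    )


def translate(text: str, src_language: str, trg_language: str,
              max_new_tokens: int = 128) -> str:
    """Hypothetical helper: load the model and generate one translation."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(build_prompt(text, src_language, trg_language),
                       return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


if __name__ == "__main__":
    print(build_prompt("Good morning.", "English", "German"))
```

Calling `translate("Good morning.", "English", "German")` would download the checkpoint on first use. Passing full language names rather than the codes in parentheses is a choice made for this sketch, not something the card specifies.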
