Cover image
Hamzah Language Model: our tiny implementation of an Eastern-culture language model.

We're not a company, just a small group of students. Training these models until something high-quality comes out takes huge compute; you can help us pay for our studies here.

We introduce HamzahLMV1, the first version of the Hamzah Language Model, a series of upcoming models designed to have a bit of personality, be smart for their size, and be promptable beyond what they've been trained on (i.e., strong instruction following).
Quick model metadata:

  • Model Series: HamzahLM
  • Model Version: V1
  • Model Parameters: 1.2B
  • Context Length: 128k tokens
  • Recommended Max Generation Length: 2k-8k tokens
  • Other Notes: Large context length; this model stays coherent while processing long contexts, which you can exploit with RAG or similar.
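
A minimal usage sketch of the long-context/RAG idea above, assuming the model loads through the standard Hugging Face transformers API and ships a chat template (unverified); the retrieved context string is a placeholder:

```python
# Minimal sketch: load HamzahLMV1-1B and feed retrieved passages (RAG-style)
# ahead of the question. Assumes a chat template is bundled with the tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XeTute/HamzahLMV1-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Long-context use: prepend retrieved documents before the actual question.
context = "...retrieved passages from your RAG pipeline go here..."
messages = [{"role": "user", "content": context + "\n\nAnswer using only the passages above."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# Stay within the recommended 2k-8k generation window.
output = model.generate(input_ids, max_new_tokens=2048)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```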

A 3B version is on its way! You can already access the V0 3B for free through our endpoint, which serves the full 128k context with acceptable prompt processing and excellent generation performance.
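
A hedged client sketch for that endpoint, assuming it speaks the common OpenAI-compatible chat API; both the base URL and the model name below are placeholders, not confirmed values:

```python
# Hypothetical sketch: query the free V0 3B endpoint via an OpenAI-compatible client.
# base_url and model name are placeholders; replace them with the real values.
from openai import OpenAI

client = OpenAI(base_url="https://your-endpoint.example/v1", api_key="none")

response = client.chat.completions.create(
    model="HamzahLMV0-3B",  # placeholder model name, not confirmed
    messages=[{"role": "user", "content": "Assalamu alaikum! Who are you?"}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```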
Changes made in comparison to V0:

  • Also trained on XeTute/iloveuser-1k and a thought-tag-modified version of open-thoughts/OpenThoughts-114k
  • This time, trained on XeTute/Eastern-Alpaca-14k for one full epoch, regardless of whether the loss was already low halfway through; a higher dropout rate was used to prevent low-quality results
  • Used a higher context length (numbers in tokens; see the sketch after this list):
    • Last time: 2048 + 512 (512 padding; held in unified VRAM)
    • This time: 2048 + 6144 (6144 padding; held in unified VRAM)
      • This resulted in ~3h of additional training time; the GPU got a 5-minute break every half hour to cool down
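
A purely illustrative sketch of the sequence-length budget described above; the actual training script is unpublished, and everything beyond the listed token counts is an assumption:

```python
# Illustrative only: the content + padding token budget for V0 vs. V1.
from dataclasses import dataclass

@dataclass
class SeqBudget:
    content_len: int  # real content tokens per sample
    pad_len: int      # padding tokens, held in unified VRAM

    @property
    def total(self) -> int:
        return self.content_len + self.pad_len

v0 = SeqBudget(content_len=2048, pad_len=512)   # V0 budget
v1 = SeqBudget(content_len=2048, pad_len=6144)  # V1 budget
print(v0.total, v1.total)  # 2560 8192
```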

Using these settings, we feel the jump from V0 to V1 is roughly the same as the jump from Meta's LLaMA 2 to LLaMA 3 ;)


Our Apps & Socials

Chat with our Assistant | Support us Financially | Visit our GitHub

Long live the Islamic Republic of Pakistan; Glory to the Islamic Republic of Pakistan 🇵🇰
The Flag of the Islamic Republic of Pakistan
