---
license: odc-by
language:
- el

widget:
 - text: "Η Ιαπωνία έχει μια ιστορία που ξεκινά πριν από χιλιάδες χρόνια. Οι επιστήμονες πιστεύουν πως οι Ιάπωνες ως ενιαίο σύνολο προέρχονται από πολλές ομάδες, οι οποίες μετανάστευσαν στα νησιά από άλλα σημεία της Ασίας, στα οποία περιλαμβάνονται "

tags:
- text-generation-inference
---



# el-llama-smol


## Model:
`el-llama-smol` aims to be the first in a series of LLMs trained mostly on Greek corpora. The model is a small (roughly 1B parameters) version of LLaMA, with the following configuration:

```json
{
  "architectures": ["LLaMAForCausalLM"],
  "bos_token_id": 0,
  "eos_token_id": 1,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "intermediate_size": 5461,
  "initializer_range": 0.02,
  "max_sequence_length": 1024,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 24,
  "pad_token_id": -1,
  "rms_norm_eps": 1e-06,
  "transformers_version": "4.28.1",
  "use_cache": true,
  "vocab_size": 22000
}
```
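As a sanity check on the model size, the parameter count can be estimated from the configuration values. The sketch below is only an approximation under stated assumptions: it assumes the standard LLaMA layout (bias-free `q`/`k`/`v`/`o` attention projections, a gated MLP with three weight matrices, two RMSNorm vectors per layer) and an output head that is not tied to the input embeddings. The helper `count_llama_params` is a hypothetical name used for illustration, not part of any library.

```python
# Rough parameter count for a LLaMA-style decoder, using values from the config above.
config = {
    "hidden_size": 2048,
    "intermediate_size": 5461,
    "num_hidden_layers": 24,
    "vocab_size": 22000,
}


def count_llama_params(cfg, tied_embeddings=False):
    """Estimate the parameter count (assumes the standard bias-free LLaMA layout)."""
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    layers = cfg["num_hidden_layers"]
    v = cfg["vocab_size"]

    attention = 4 * h * h  # q, k, v and o projections
    mlp = 3 * h * i        # gate, up and down projections
    norms = 2 * h          # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms

    embeddings = v * h
    lm_head = 0 if tied_embeddings else v * h  # assumption: untied output head
    final_norm = h
    return layers * per_layer + embeddings + lm_head + final_norm


total = count_llama_params(config)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
# prints "1,298,122,752 parameters (~1.30B)"
```

Under these assumptions the estimate comes to about 1.3B parameters. With tied embeddings it would drop by `vocab_size * hidden_size` (about 45M).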


## Training details: 

The current snapshot was trained for 40 hours on an RTX A6000 GPU (48 GB), using the `galore_adamw8bit_per_layer` optimizer by Zhao et al. [1] and a context size of 1024 tokens.


## Dataset: 
The model is trained on the Greek subset of the [allenai/c4](https://huggingface.co/datasets/allenai/c4) dataset. Text is tokenized with a (heavily unoptimized) tokenizer with a vocabulary of 22000 tokens, trained with [SentencePiece](https://github.com/google/sentencepiece).
 

## Examples

#### Use a 🤗 pipeline 
```python
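from transformers import pipeline, set_seed

# Load the model from the Hugging Face Hub into a text-generation pipeline
pipe = pipeline("text-generation", model="Konstantinos/el_llama_smol")

set_seed(1)  # fix the random seed so that sampling is reproducible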
prompt = """Η Ιαπωνία έχει μια ιστορία που ξεκινά πριν από χιλιάδες χρόνια. 
Οι επιστήμονες πιστεύουν πως οι Ιάπωνες ως ενιαίο σύνολο προέρχονται από πολλές ομάδες,
οι οποίες μετανάστευσαν στα νησιά από άλλα σημεία της Ασίας, στα οποία περιλαμβάνονται """

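# Sample up to 110 new tokens, choosing among the 20 most likely tokens at each step
ret = pipe(prompt, do_sample=True, top_k=20, temperature=0.85, max_new_tokens=110)
print(ret[0]["generated_text"])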
```

#### Load model directly
```python

from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Konstantinos/el_llama_smol")
model = AutoModelForCausalLM.from_pretrained("Konstantinos/el_llama_smol")
```

## References
    
[1]  Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, & Yuandong Tian. (2024). GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection.


## Citation 

TBD