VishnuPJ committed on
Commit
b406fae
·
verified ·
1 Parent(s): 66c685e

Create README.md

Files changed (1)
  1. README.md +127 -0
README.md ADDED
@@ -0,0 +1,127 @@
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - VishnuPJ/Alpaca_Instruct_Malayalam
5
+ language:
6
+ - ml
7
+ - en
8
+ pipeline_tag: text-generation
9
+ ---
10
+
11
+ # MalayaLLM [മലയാളം/Malayalam]
12
+
13
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/64e65800e44b2668a56f9731/bipVMulaNJ9um46ecYpR4.png" alt="Baby MalayaLLM" width="300" height="200">
14
+
15
+ # MalayaLLM_7B_Instruct_v0.1
16
+
17
+ This is an attempt to build a Large Language Model (LLM) focused on **generative AI for the Malayalam language**. Several multilingual LLMs already support Malayalam, but dedicated training on a Malayalam dataset can improve their performance on tasks such as content generation and question answering in Malayalam. To that end, I have carried out **continued pre-training of the LLaMA-2 model on a comprehensive Malayalam dataset**.
18
+
19
+ The model is still at an early stage; further training and fine-tuning on a more comprehensive dataset are needed to improve its performance. I will regularly publish updated revisions of the model.
20
+ # GitHub Repo:
21
+ For details on model training, fine-tuning, and related techniques, see the MalayaLLM GitHub repository:
22
+ https://github.com/VishnuPJ/MalayaLLM
23
+ # Introducing the Developer:
24
+ Learn more about the developer of this model and follow their contributions to the field:
25
+ https://www.linkedin.com/in/vishnu-prasad-j/
26
+ # Model description
27
+ The MalayaLLM models build on the original LLaMA-2 and extend it with a Malayalam vocabulary of approximately 18,000 tokens.
28
+
29
+ - **Model type:** A 7B LLaMA-2 model fine-tuned on Malayalam instruction data (Alpaca_Instruct_Malayalam).
30
+ - **Language(s):** Malayalam and English
31
+ - **Datasets:** [Alpaca_Instruct_Malayalam](https://huggingface.co/datasets/VishnuPJ/Alpaca_Instruct_Malayalam)
32
+ - **Source Model:** [MalayaLLM_7B_Base](https://huggingface.co/VishnuPJ/MalayaLLM_7B_Base)
33
+ - **Training Precision:** `float16`
34
+ - **Code:** [GitHub](https://github.com/VishnuPJ/MalayaLLM)
35
+
36
+ **Prompt Template Without Input**
37
+
38
+ ```
39
+ {system_prompt}
40
+ ### Instruction:
41
+ {instruction or query}
42
+ ### Response:
43
+ {response}
44
+ ```
45
+
46
+ **Prompt Template With Input**
47
+
48
+ ```
49
+ {system_prompt}
50
+ ### Instruction:
51
+ {instruction or query}
52
+ ### Input:
53
+ {input}
54
+ ### Response:
55
+ {response}
56
+ ```
57
+
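The two templates above can be filled in programmatically. The sketch below is a minimal illustration, not the author's code: `build_prompt` is a hypothetical helper name, and the newline placement is an assumption read off the templates (the example script further down joins the same fields with spaces on a single line).

```python
def build_prompt(system_prompt: str, instruction: str, input_text: str = "") -> str:
    """Fill the MalayaLLM prompt template (hypothetical helper).

    Uses the "With Input" layout when input_text is non-empty and the
    "Without Input" layout otherwise. The prompt stops at "### Response:"
    so that the model generates the response itself.
    """
    parts = [system_prompt, "### Instruction:", instruction]
    if input_text:
        parts += ["### Input:", input_text]
    parts.append("### Response:")
    # Assumption: each template field sits on its own line.
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_prompt(
        "Write a response that appropriately completes the request.",
        "Where does the Sun rise?",
    ))
```

Leaving `input_text` empty selects the first template; passing a non-empty string adds the `### Input:` section of the second.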
58
+
59
+
60
+ ## Available Models
61
+ | Model | Type | Data | Base Model | # Params | Download Links |
62
+ |--------------------------|-----------------------------|-------------------|----------------------|------|------------------------------------------------------------------------|
63
+ | MalayaLLM 7B Base #v0.1 | Base model | 12GB | LLaMA 7B | 7B | [HF Hub](https://huggingface.co/VishnuPJ/MalayaLLM_7B_Base) |
64
+ | MalayaLLM 7B Instruct #v0.1| Instruction following model | 52k instructions | MalayaLLM 7B Base | 7B | [HF Hub](https://huggingface.co/VishnuPJ/MalayaLLM_7B_Instruct_v0.1) |
65
+ | ***MalayaLLM 7B Instruct #v0.2***| Instruction following model | 52k instructions | MalayaLLM 7B Base | 7B | [HF Hub](https://huggingface.co/VishnuPJ/MalayaLLM_7B_Instruct_v0.2) |
66
+ **Note:** MalayaLLM 7B Instruct v0.2 is the latest model.
67
+
68
+ ### Quantized Versions of Available Models
69
+ | Model | Format | Bits | Download Links |
70
+ |--------------------------|--------|----------------------|------------------------------------------------------------------------------|
71
+ | MalayaLLM 7B Instruct #v0.1 | GGUF | Q8_0 | [HF Hub](https://huggingface.co/VishnuPJ/MalayaLLM_7B_Instruct_v0.1_GGUF) |
72
+ | MalayaLLM 7B Instruct #v0.2 | GGUF | Q8_0 | [HF Hub](https://huggingface.co/VishnuPJ/MalayaLLM_7B_Instruct_v0.2_GGUF) |
73
+
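For a rough sense of what the Q8_0 files save, the arithmetic below compares weight storage in `float16` with GGUF Q8_0. It assumes a nominal 7 billion parameters (the exact count, including the added Malayalam vocabulary, is not stated here) and the Q8_0 layout used by llama.cpp, in which each block of 32 weights is stored as 32 int8 values plus one float16 scale. It counts weights only, not activations or the KV cache, so treat the figures as estimates.

```python
def weight_bytes_fp16(n_params: int) -> int:
    """Bytes needed to store n_params weights at 2 bytes each."""
    return n_params * 2


def weight_bytes_q8_0(n_params: int) -> int:
    """Bytes needed for n_params weights in Q8_0 blocks.

    Each block holds 32 int8 weights (32 bytes) plus one float16
    scale (2 bytes), i.e. 34 bytes per 32 weights.
    """
    block_size = 32
    bytes_per_block = 32 + 2
    n_blocks = -(-n_params // block_size)  # ceiling division
    return n_blocks * bytes_per_block


N_PARAMS = 7_000_000_000  # assumption: nominal "7B", not the exact count

print(f"float16: {weight_bytes_fp16(N_PARAMS) / 1e9:.1f} GB")  # float16: 14.0 GB
print(f"Q8_0:    {weight_bytes_q8_0(N_PARAMS) / 1e9:.1f} GB")  # Q8_0:    7.4 GB
```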
74
+ ## A simple code example
75
+
76
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+
+ model_name = "VishnuPJ/MalayaLLM_7B_Instruct_v0.1"
+
+ print("Loading model...")
+ # Load the instruction-tuned model in half precision.
+ # device_map="auto" requires the `accelerate` package.
+ base_model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     low_cpu_mem_usage=True,
+     return_dict=True,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ # Reuse the EOS token for padding and pad on the right.
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+ tokenizer.pad_token = tokenizer.eos_token
+ tokenizer.padding_side = "right"
+
+ pipe = pipeline(task="text-generation", model=base_model, tokenizer=tokenizer, max_length=200)
+ # Malayalam system prompt; roughly: "Below is an instruction that describes a task.
+ # Write a response that appropriately completes the request."
+ sys_prompt = "ഒരു ടാസ്ക് വിവരിക്കുന്ന ഒരു നിർദ്ദേശം ചുവടെയുണ്ട്. അഭ്യർത്ഥന ശരിയായി പൂർത്തിയാക്കുന്ന ഒരു പ്രതികരണം എഴുതുക."
+
+ while True:
+     inst = input("Enter instruction (or 'exit' to end): ")
+     if inst.lower() == "exit":
+         break
+     # Generate a response for the user-provided instruction
+     result = pipe(f"{sys_prompt} ### Instruction: {inst} ### Response:")
+     # Print the generated text (the prompt followed by the model's answer)
+     print(result[0]["generated_text"])
+ ```
118
+
119
+ ## Example Output
120
+ ```
121
+ Enter instruction (or 'exit' to end): സൂര്യൻ ഉദിക്കുന്ന ദിശ ഏതെന്നു പറയുക .
122
+ ഒരു ടാസ്ക് വിവരിക്കുന്ന ഒരു നിർദ്ദേശം ചുവടെയുണ്ട്. അഭ്യർത്ഥന ശരിയായി പൂർത്തിയാക്കുന്ന ഒരു പ്രതികരണം എഴുതുക. ### Instruction: സൂര്യൻ ഉദിക്കുന്ന ദിശ ഏതെന്നു പറയുക . ### Response: സൂര്യൻ ഉദിക്കുന്ന ദിശ കിഴക്കായിരിക്കും.
123
+ Enter instruction (or 'exit' to end): Where does the Sun rise?
124
+ ഒരു ടാസ്ക് വിവരിക്കുന്ന ഒരു നിർദ്ദേശം ചുവടെയുണ്ട്. അഭ്യർത്ഥന ശരിയായി പൂർത്തിയാക്കുന്ന ഒരു പ്രതികരണം എഴുതുക. ### Instruction: Where does the Sun rise? ### Response: The Sun rises in the east.
125
+ Enter instruction (or 'exit' to end):
126
+ ```
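As the output above shows, `generated_text` echoes the whole prompt before the model's answer. A small post-processing step can keep only the answer. This is a sketch, not part of the original script: `extract_response` is a hypothetical helper, and it assumes the `### Response:` marker appears in the generated text exactly as it does in the prompt.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str) -> str:
    """Return the text after the first "### Response:" marker.

    Falls back to the whole (stripped) text if the marker is missing.
    """
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if found else generated_text.strip()


sample = (
    "... ### Instruction: Where does the Sun rise? "
    "### Response: The Sun rises in the east."
)
print(extract_response(sample))  # The Sun rises in the east.
```

In the loop above, this would mean printing `extract_response(result[0]["generated_text"])` instead of the raw text. The `transformers` text-generation pipeline also accepts `return_full_text=False`, which drops the prompt from the returned text.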
127
+ # 🌟Happy coding💻🌟