amitagh committed
Commit c314f0c • 1 Parent(s): 5a7b088
Files changed (1)
  1. README.md +17 -15
README.md CHANGED
@@ -16,32 +16,33 @@ pipeline_tag: text-generation
 
 
 ## Model Details
- 
- ### Model Description
- 
- <!-- Shivneri Marathi LLM is being built with the wish to bring the benefits of Generative AI to non-English (especially Marathi) speaking population of India.
+ Shivneri Marathi LLM is being built with the aim of bringing the benefits of Generative AI to the non-English-speaking (especially Marathi-speaking) population of India.
 Marathi has the third largest number of native speakers in India, after Hindi and Bengali.
 Almost 83 million people speak the language.
 This is a preliminary version of our Marathi LLM (Large Language Model)!
- Built on the mighty Gemma 7B base model, Shivneri LLM can generate creative and informative text in both Marathi and English. This is just the beginning – we're constantly improving Shivneri, and even more exciting features are on the horizon! -->
+ Built on the mighty Gemma 7B base model, Shivneri LLM can generate creative and informative text in both Marathi and English. This is just the beginning – we're constantly improving Shivneri, and even more exciting features are on the horizon!
+ 
+ ### Model Description
+ 
+ <!-- -->
 
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
 - **Developed by:** Amit Ghadge
 - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
+ - **Shared by [optional]:** Amit Ghadge
+ - **Model type:** Decoder-only large language model (LLM) with a transformer architecture
+ - **Language(s) (NLP):** Marathi, English
 - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
+ - **Finetuned from model [optional]:** Gemma-7B
 
 ### Model Sources [optional]
 
 <!-- Provide the basic links for the model. -->
 
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
+ - **Repository:** https://github.com/amitagh/shivneri-llm
+ - **Paper [optional]:** https://medium.com/@amitagh/shivneri-marathi-llm-e823f0a045d8
+ - **Demo [optional]:** Coming soon
 
 ## Uses
 
@@ -89,11 +90,12 @@ Use the code below to get started with the model.
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
- [More Information Needed]
+ Continually pretrained with LoRA on the AI4Bharat/Sangraha dataset.
 
 ### Training Procedure
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ Continually pretrained with LoRA.
 
 #### Preprocessing [optional]
 
@@ -164,11 +166,11 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 ### Model Architecture and Objective
 
- [More Information Needed]
+ Decoder-only large language model (LLM) with a transformer architecture.
 
 ### Compute Infrastructure
 
- [More Information Needed]
+ NVIDIA A100 80 GB.
 
 #### Hardware
 
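
This commit leaves the card's "Use the code below to get started with the model" section as a template stub. Given the details filled in above (Gemma-7B base, Marathi and English), a minimal inference sketch with 🤗 transformers might look like the following; the Hub model id is a placeholder assumption, since the commit does not name the published repository id:

```python
# Minimal sketch, assuming a Hub id of the form "amitagh/shivneri-llm".
# That id is hypothetical: substitute the actual repository id from the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amitagh/shivneri-llm"  # hypothetical Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps a 7B model within a single GPU's memory
    device_map="auto",
)

# A Marathi prompt: "Give information about Shivneri fort."
prompt = "शिवनेरी किल्ल्याबद्दल माहिती सांगा."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```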
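
The training sections added here say only that the model was continually pretrained with LoRA on the AI4Bharat/Sangraha dataset. Below is a rough sketch of such a recipe with 🤗 transformers and peft; every concrete value (LoRA rank and target modules, the Sangraha subset and text column, sequence length, batch and optimizer settings) is an assumption for illustration, not something this commit records:

```python
# Sketch of continual pretraining with LoRA; all hyperparameters and dataset
# paths below are assumptions, not values taken from the Shivneri commit.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "google/gemma-7b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# LoRA trains only small low-rank adapter matrices on top of frozen weights,
# which is what makes continual pretraining of a 7B model fit on one A100 80 GB.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# Sangraha subset and column names are assumed; check the dataset card for the real layout.
ds = load_dataset("ai4bharat/sangraha", data_dir="verified/mar", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = ds.map(tokenize, batched=True, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="shivneri-cpt",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    # mlm=False gives the standard causal-LM objective (labels = input ids).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After a run like this, the LoRA adapters can be merged into the base weights with peft's `merge_and_unload()` to publish a standalone checkpoint.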