Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


gemma-ko-7b - GGUF
- Model creator: https://huggingface.co/beomi/
- Original model: https://huggingface.co/beomi/gemma-ko-7b/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [gemma-ko-7b.Q2_K.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q2_K.gguf) | Q2_K | 3.24GB |
| [gemma-ko-7b.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.IQ3_XS.gguf) | IQ3_XS | 3.54GB |
| [gemma-ko-7b.IQ3_S.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.IQ3_S.gguf) | IQ3_S | 3.71GB |
| [gemma-ko-7b.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q3_K_S.gguf) | Q3_K_S | 3.71GB |
| [gemma-ko-7b.IQ3_M.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.IQ3_M.gguf) | IQ3_M | 3.82GB |
| [gemma-ko-7b.Q3_K.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q3_K.gguf) | Q3_K | 4.07GB |
| [gemma-ko-7b.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q3_K_M.gguf) | Q3_K_M | 4.07GB |
| [gemma-ko-7b.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q3_K_L.gguf) | Q3_K_L | 4.39GB |
| [gemma-ko-7b.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.IQ4_XS.gguf) | IQ4_XS | 4.48GB |
| [gemma-ko-7b.Q4_0.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q4_0.gguf) | Q4_0 | 4.67GB |
| [gemma-ko-7b.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.IQ4_NL.gguf) | IQ4_NL | 4.69GB |
| [gemma-ko-7b.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q4_K_S.gguf) | Q4_K_S | 4.70GB |
| [gemma-ko-7b.Q4_K.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q4_K.gguf) | Q4_K | 4.96GB |
| [gemma-ko-7b.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q4_K_M.gguf) | Q4_K_M | 4.96GB |
| [gemma-ko-7b.Q4_1.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q4_1.gguf) | Q4_1 | 5.12GB |
| [gemma-ko-7b.Q5_0.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q5_0.gguf) | Q5_0 | 5.57GB |
| [gemma-ko-7b.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q5_K_S.gguf) | Q5_K_S | 5.57GB |
| [gemma-ko-7b.Q5_K.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q5_K.gguf) | Q5_K | 5.72GB |
| [gemma-ko-7b.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q5_K_M.gguf) | Q5_K_M | 5.72GB |
| [gemma-ko-7b.Q5_1.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q5_1.gguf) | Q5_1 | 6.02GB |
| [gemma-ko-7b.Q6_K.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q6_K.gguf) | Q6_K | 6.53GB |
| [gemma-ko-7b.Q8_0.gguf](https://huggingface.co/RichardErkhov/beomi_-_gemma-ko-7b-gguf/blob/main/gemma-ko-7b.Q8_0.gguf) | Q8_0 | 8.45GB |
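
Each row above links to a single GGUF file; smaller quantizations trade output quality for lower memory use. As a minimal, hedged sketch (not part of the original card), the snippet below runs one of these files locally with the `llama-cpp-python` bindings; the choice of the Q4_K_M file, the context size, and the token budget are illustrative assumptions:

```python
# Sketch: run a GGUF quantization of gemma-ko-7b with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and that the Q4_K_M file from this
# repository has been downloaded into the working directory (any quant works).
from llama_cpp import Llama

llm = Llama(model_path="gemma-ko-7b.Q4_K_M.gguf", n_ctx=2048)

# Gemma-Ko is a base (non-instruct) model, so plain text completion is used.
# Prompt: "The difference between machine learning and deep learning is"
output = llm("머신러닝과 딥러닝의 차이는", max_tokens=128)
print(output["choices"][0]["text"])
```
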
Original model description:
---
language:
- ko
- en
license: other
library_name: transformers
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
pipeline_tag: text-generation
tags:
- pytorch
---

# Gemma-Ko

> Update @ 2024.03.08: First release of Gemma-Ko 7B model

**Original Gemma Model Page**: [Gemma](https://ai.google.dev/gemma/docs)

This model card corresponds to the 7B base version of the **Gemma-Ko** model.

**Resources and Technical Documentation**:

* [Original Google's Gemma-7B](https://huggingface.co/google/gemma-7b)
* [Training Code @ Github: Gemma-EasyLM](https://github.com/Beomi/Gemma-EasyLM)

**Terms of Use**: [Terms](https://www.kaggle.com/models/google/gemma/license/consent)

**Citation**

```bibtex
@misc{gemma_ko_7b,
  author    = { {Junbum Lee, Taekyoon Choi} },
  title     = { gemma-ko-7b },
  year      = 2024,
  url       = { https://huggingface.co/beomi/gemma-ko-7b },
  doi       = { 10.57967/hf/1859 },
  publisher = { Hugging Face }
}
```

**Model Developers**: Junbum Lee (Beomi) & Taekyoon Choi (Taekyoon)

## Model Information

Summary description and brief definition of inputs and outputs.

### Description

Gemma is a family of lightweight, state-of-the-art open models from Google,
built from the same research and technology used to create the Gemini models.
They are text-to-text, decoder-only large language models, available in English,
with open weights, pre-trained variants, and instruction-tuned variants. Gemma
models are well-suited for a variety of text generation tasks, including
question answering, summarization, and reasoning. Their relatively small size
makes it possible to deploy them in environments with limited resources such as
a laptop, desktop, or your own cloud infrastructure, democratizing access to
state-of-the-art AI models and helping foster innovation for everyone.

### Usage

Below we share some code snippets on how to quickly get started with running the model. First, make sure to `pip install -U transformers`, then copy the snippet from the section that is relevant for your use case.

#### Running the model on a CPU

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("beomi/gemma-ko-7b")
model = AutoModelForCausalLM.from_pretrained("beomi/gemma-ko-7b")

# Prompt: "The difference between machine learning and deep learning is"
input_text = "머신러닝과 딥러닝의 차이는"
input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
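
Continuing from the snippet above: `generate()` without explicit length arguments falls back to a short default output length, so completions may be cut off. Passing `max_new_tokens` (the value below is an arbitrary example) is the usual adjustment:

```python
# Reuses `model`, `tokenizer`, and `input_ids` from the CPU snippet above.
outputs = model.generate(**input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
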
#### Running the model on a single / multi GPU

```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("beomi/gemma-ko-7b")
# device_map="auto" spreads the weights across the available GPU(s).
model = AutoModelForCausalLM.from_pretrained("beomi/gemma-ko-7b", device_map="auto")

input_text = "머신러닝과 딥러닝의 차이는"
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
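
The 7B weights in full precision may not fit on a single consumer GPU. A hedged variant of the call above loads the weights in half precision to roughly halve the memory footprint; the dtype choice is an assumption, not a recommendation from the original card:

```python
# Variant of the GPU snippet above: load weights in float16 to reduce memory.
# bfloat16 is an alternative on GPUs that support it.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "beomi/gemma-ko-7b",
    device_map="auto",
    torch_dtype=torch.float16,
)
```
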
#### Other optimizations

* _Flash Attention 2_

  First, make sure to install `flash-attn` in your environment: `pip install flash-attn`.

```diff
model = AutoModelForCausalLM.from_pretrained(
    "beomi/gemma-ko-7b",
    torch_dtype=torch.float16,
+   attn_implementation="flash_attention_2"
).to(0)
```
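
For reference, here is the diff above assembled into a complete, runnable call. Note that it relies on `import torch`, which the diff omits; `.to(0)` places the model on the first CUDA device, so a Flash Attention 2-compatible GPU is assumed:

```python
import torch
from transformers import AutoModelForCausalLM

# Flash Attention 2 requires half-precision weights and a supported CUDA GPU.
model = AutoModelForCausalLM.from_pretrained(
    "beomi/gemma-ko-7b",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to(0)
```
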
### Inputs and outputs

* **Input:** Text string, such as a question, a prompt, or a document to be
  summarized.
* **Output:** Generated Korean/English-language text in response to the input, such
  as an answer to a question, or a summary of a document.

## Implementation Information

Details about the model internals.

### Software

Training was done using [beomi/Gemma-EasyLM](https://github.com/Beomi/Gemma-EasyLM).

## Evaluation

Model evaluation metrics and results.

### Benchmark Results

TBD

## Usage and Limitations

These models have certain limitations that users should be aware of.

### Intended Usage

Open Large Language Models (LLMs) have a wide range of applications across
various industries and domains. The following list of potential uses is not
comprehensive. The purpose of this list is to provide contextual information
about the possible use-cases that the model creators considered as part of model
training and development.

* Content Creation and Communication
  * Text Generation: These models can be used to generate creative text formats
    such as poems, scripts, code, marketing copy, and email drafts.
* Research and Education
  * Natural Language Processing (NLP) Research: These models can serve as a
    foundation for researchers to experiment with NLP techniques, develop
    algorithms, and contribute to the advancement of the field.
  * Language Learning Tools: Support interactive language learning experiences,
    aiding in grammar correction or providing writing practice.
  * Knowledge Exploration: Assist researchers in exploring large bodies of text
    by generating summaries or answering questions about specific topics.

### Limitations

* Training Data
  * The quality and diversity of the training data significantly influence the
    model's capabilities. Biases or gaps in the training data can lead to
    limitations in the model's responses.
  * The scope of the training dataset determines the subject areas the model can
    handle effectively.
* Context and Task Complexity
  * LLMs are better at tasks that can be framed with clear prompts and
    instructions. Open-ended or highly complex tasks might be challenging.
  * A model's performance can be influenced by the amount of context provided
    (longer context generally leads to better outputs, up to a certain point).
* Language Ambiguity and Nuance
  * Natural language is inherently complex. LLMs might struggle to grasp subtle
    nuances, sarcasm, or figurative language.
* Factual Accuracy
  * LLMs generate responses based on information they learned from their
    training datasets, but they are not knowledge bases. They may generate
    incorrect or outdated factual statements.
* Common Sense
  * LLMs rely on statistical patterns in language. They might lack the ability
    to apply common sense reasoning in certain situations.

### Ethical Considerations and Risks

The development of large language models (LLMs) raises several ethical concerns.
In creating an open model, we have carefully considered the following:

* Bias and Fairness
  * LLMs trained on large-scale, real-world text data can reflect socio-cultural
    biases embedded in the training material. These models underwent careful
    scrutiny, with input data pre-processing described and posterior evaluations
    reported in this card.
* Misinformation and Misuse
  * LLMs can be misused to generate text that is false, misleading, or harmful.
  * Guidelines are provided for responsible use with the model; see the
    [Responsible Generative AI Toolkit](http://ai.google.dev/gemma/responsible).
* Transparency and Accountability
  * This model card summarizes details on the models' architecture,
    capabilities, limitations, and evaluation processes.
  * A responsibly developed open model offers the opportunity to share
    innovation by making LLM technology accessible to developers and researchers
    across the AI ecosystem.

Risks identified and mitigations:

* Perpetuation of biases: Continuous monitoring (using evaluation metrics and
  human review) and the exploration of de-biasing techniques during model
  training, fine-tuning, and other use cases are encouraged.
* Generation of harmful content: Mechanisms and guidelines for content safety
  are essential. Developers are encouraged to exercise caution and implement
  appropriate content safety safeguards based on their specific product policies
  and application use cases.
* Misuse for malicious purposes: Technical limitations and developer and
  end-user education can help mitigate malicious applications of LLMs.
  Educational resources and reporting mechanisms for users to flag misuse are
  provided. Prohibited uses of Gemma models are outlined in the
  [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy).
* Privacy violations: Models were trained on data filtered to remove PII
  (Personally Identifiable Information). Developers are encouraged to adhere to
  privacy regulations with privacy-preserving techniques.

## Acknowledgement

The training was supported by the [TPU Research Cloud](https://sites.research.google/trc/) program.