ahxt committed on
Commit
c3496d6
1 Parent(s): 8a9eb0c

add readme

Files changed (1)
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
# LLaMa Lite: Reduced-Scale, Experimental Versions of LLaMA and Llama 2

In this series of repos, we present an open-source reproduction of Meta AI's [LLaMA](https://ai.meta.com/blog/large-language-model-llama-meta-ai/) and [Llama 2](https://ai.meta.com/llama/) large language models at significantly reduced model sizes: the experimental [llama1_s](https://huggingface.co/ahxt/llama1_s_1.8B_experimental) has 1.8B parameters, and the experimental [llama2_xs](https://huggingface.co/ahxt/llama2_xs_460M_experimental) has 460M parameters ('s' stands for small, 'xs' for extra small).

## Dataset and Tokenization
We train our models on part of the [RedPajama](https://www.together.xyz/blog/redpajama) dataset and tokenize the text with the [GPT2Tokenizer](https://huggingface.co/docs/transformers/v4.31.0/en/model_doc/gpt2#transformers.GPT2Tokenizer).
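
As a minimal sketch of the tokenization step, the helpers below count the tokens a tokenizer produces for a piece of text. The names `load_tokenizer` and `count_tokens` are hypothetical and not part of this repo. Loading the tokenizer from the checkpoint repo mirrors the usage snippet in this README and assumes the checkpoint ships its tokenizer files.

```python
def load_tokenizer(model_path="ahxt/llama2_xs_460M_experimental"):
    """Load the tokenizer stored with a checkpoint (downloads on first use)."""
    # Imported lazily so count_tokens can be used without transformers installed.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_path)


def count_tokens(tokenizer, text):
    """Return the number of token ids the tokenizer produces for `text`."""
    # Calling a Transformers tokenizer on a single string yields a mapping
    # whose "input_ids" entry is a flat list of token ids.
    return len(tokenizer(text)["input_ids"])
```

For instance, `count_tokens(load_tokenizer(), "Q: What is the largest bird?\nA:")` would report how many tokens the prompt occupies before generation starts.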
## Using with Hugging Face Transformers
The experimental checkpoints can be loaded directly with the [Transformers](https://huggingface.co/transformers/) library. The following code snippet shows how to load one of our experimental models and generate text with it.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Pick one of the two experimental checkpoints.
# model_path = 'ahxt/llama2_xs_460M_experimental'
model_path = 'ahxt/llama1_s_1.8B_experimental'

model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()  # inference mode (disables dropout)

prompt = 'Q: What is the largest bird?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
# max_length counts the prompt tokens plus the generated tokens.
with torch.no_grad():
    tokens = model.generate(input_ids, max_length=20)
print(tokenizer.decode(tokens[0].tolist(), skip_special_tokens=True))
# Q: What is the largest bird?\nA: The largest bird is the bald eagle.
```
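
In the snippet above, `max_length=20` bounds the prompt tokens and the generated tokens together, so a longer prompt leaves less room for the answer. A minimal sketch of that arithmetic follows; `new_token_budget` is a hypothetical helper, not a Transformers function, and the token counts in the example are illustrative.

```python
def new_token_budget(prompt_len, max_length):
    """Tokens left for generation when max_length covers prompt + output."""
    # Never report a negative budget if the prompt already exceeds max_length.
    return max(max_length - prompt_len, 0)


# Illustrative numbers: an 11-token prompt with max_length=20.
print(new_token_budget(11, 20))  # prints 9
```

Transformers' `generate` also accepts `max_new_tokens`, which limits only the generated tokens and so does not depend on the prompt length.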

## Contact
These experimental versions were developed by [Xiaotian Han](https://ahxt.github.io/) from Texas A&M University and are for research only.