yoonniverse committed
Commit 9e919d9 · 1 Parent(s): cc35fb5

First model version

README.md ADDED
@@ -0,0 +1,114 @@
---
language:
- en
tags:
- upstage
- llama-2
- instruct
- instruction
pipeline_tag: text-generation
---
# SOLAR-0-70b-8bit model card

This is an 8-bit quantized version of [upstage/Llama-2-70b-instruct-v2](https://huggingface.co/upstage/Llama-2-70b-instruct-v2).

## Model Details

* **Developed by**: [Upstage](https://en.upstage.ai)
* **Backbone Model**: [LLaMA-2](https://github.com/facebookresearch/llama/tree/main)
* **Language(s)**: English
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
* **License**: Fine-tuned checkpoints are licensed under the Non-Commercial Creative Commons license ([CC BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/))
* **Where to send comments**: Instructions on how to provide feedback or comments on the model can be found by opening an issue in the [Hugging Face community's model repository](https://huggingface.co/upstage/SOLAR-0-70b-8bit/discussions)
* **Contact**: For questions and comments about the model, please email [contact@upstage.ai](mailto:contact@upstage.ai)

## Dataset Details

### Used Datasets
- Orca-style dataset
- Alpaca-style dataset
- No other datasets were used besides those mentioned above
- No benchmark test sets or training sets were used

### Prompt Template
```
### System:
{System}

### User:
{User}

### Assistant:
{Assistant}
```

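For illustration only, a prompt following this template can be assembled with plain string formatting. The `build_prompt` helper below is a hypothetical sketch, not part of the released code; the system block is simply omitted when no system message is given.

```python
# Hypothetical helper that fills in the prompt template above; not part of the released code.
def build_prompt(user_message: str, system_message: str = "") -> str:
    prompt = ""
    if system_message:
        prompt += f"### System:\n{system_message}\n\n"
    prompt += f"### User:\n{user_message}\n\n### Assistant:\n"
    return prompt

# Produces the same prompt string used in the Usage example below.
prompt = build_prompt("Thomas is healthy, but he has to go to the hospital. What could be the reasons?")
```
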
## Usage

- The following has been tested on an A100 80GB GPU
- Our model can handle more than 10k input tokens, thanks to the `rope_scaling` option

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("upstage/SOLAR-0-70b-8bit")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/SOLAR-0-70b-8bit",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,
    rope_scaling={"type": "dynamic", "factor": 2}  # allows handling of longer inputs
)

prompt = "### User:\nThomas is healthy, but he has to go to the hospital. What could be the reasons?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)  # Llama models do not accept token_type_ids
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=float('inf'))
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
```

## Hardware and Software

* **Hardware**: We trained our model on four nodes of eight A100 GPUs each (A100 × 8 × 4)
* **Training Factors**: We fine-tuned this model using a combination of the [DeepSpeed library](https://github.com/microsoft/DeepSpeed) and the [HuggingFace Trainer](https://huggingface.co/docs/transformers/main_classes/trainer) / [HuggingFace Accelerate](https://huggingface.co/docs/accelerate/index); a minimal configuration sketch follows below

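The exact training recipe is not published. As a loose illustration of the Trainer + DeepSpeed combination mentioned above, the sketch below shows the typical wiring; the DeepSpeed config file name, hyperparameters, and the `model`/`train_dataset` objects are placeholder assumptions, not the settings used for this model.

```python
from transformers import Trainer, TrainingArguments

# Minimal sketch: all values below are illustrative placeholders.
args = TrainingArguments(
    output_dir="./checkpoints",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=True,
    deepspeed="ds_config.json",  # ZeRO stage / offload settings live in this JSON file
)

# `model` and `train_dataset` are assumed to be defined elsewhere (a causal LM and a tokenized dataset).
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```
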
## Evaluation Results

### Overview
- We conducted a performance evaluation following the tasks evaluated on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
  We evaluated our model on four benchmark datasets: `ARC-Challenge`, `HellaSwag`, `MMLU`, and `TruthfulQA`.
  We used the [lm-evaluation-harness repository](https://github.com/EleutherAI/lm-evaluation-harness), specifically commit [b281b0921b636bc36ad05c0b0b0763bd6dd43463](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463).
- We used [MT-Bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge), a set of challenging multi-turn open-ended questions, to evaluate the models.

### Main Results
| Model | H4(Avg) | ARC | HellaSwag | MMLU | TruthfulQA | MT-Bench |
|-------|---------|-----|-----------|------|------------|----------|
| **[Llama-2-70b-instruct-v2](https://huggingface.co/upstage/Llama-2-70b-instruct-v2)** (***Ours***, ***Open LLM Leaderboard***) | **73** | **71.1** | **87.9** | **70.6** | **62.2** | **7.44063** |
| [Llama-2-70b-instruct](https://huggingface.co/upstage/Llama-2-70b-instruct) (Ours, Open LLM Leaderboard) | 72.3 | 70.9 | 87.5 | 69.8 | 61.0 | 7.24375 |
| [llama-65b-instruct](https://huggingface.co/upstage/llama-65b-instruct) (Ours, Open LLM Leaderboard) | 69.4 | 67.6 | 86.5 | 64.9 | 58.8 | |
| Llama-2-70b-hf | 67.3 | 67.3 | 87.3 | 69.8 | 44.9 | |
| [llama-30b-instruct-2048](https://huggingface.co/upstage/llama-30b-instruct-2048) (Ours, Open LLM Leaderboard) | 67.0 | 64.9 | 84.9 | 61.9 | 56.3 | |
| [llama-30b-instruct](https://huggingface.co/upstage/llama-30b-instruct) (Ours, Open LLM Leaderboard) | 65.2 | 62.5 | 86.2 | 59.4 | 52.8 | |
| llama-65b | 64.2 | 63.5 | 86.1 | 63.9 | 43.4 | |
| falcon-40b-instruct | 63.4 | 61.6 | 84.3 | 55.4 | 52.5 | |

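For reference, H4(Avg) appears to be the simple mean of the four benchmark scores: for Llama-2-70b-instruct-v2, (71.1 + 87.9 + 70.6 + 62.2) / 4 ≈ 72.95, which rounds to the reported 73.
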
### Scripts for H4 Score Reproduction
- Prepare the evaluation environment:
```
# clone the repository
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
# change to the repository directory
cd lm-evaluation-harness
# check out the specific commit
git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463
```

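With the environment prepared, each task can then be scored with the harness's `main.py`. The command below is only a hedged example: the task name and few-shot count follow the Open LLM Leaderboard convention for ARC-Challenge (25-shot), while the batch size, output path, and the `pip install -e .` step are assumptions rather than a published recipe.

```
# install the harness and its dependencies (assumed step)
pip install -e .
# example: score ARC-Challenge with 25-shot prompting, as on the Open LLM Leaderboard
python main.py \
    --model hf-causal-experimental \
    --model_args pretrained=upstage/SOLAR-0-70b-8bit \
    --tasks arc_challenge \
    --num_fewshot 25 \
    --batch_size 1 \
    --output_path results/arc_challenge.json
```
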
## Contact Us

### About Upstage
- [Upstage](https://en.upstage.ai) is a company specializing in Large Language Models (LLMs) and AI. We will help you build private LLMs and related applications.
  If you have a dataset for building domain-specific LLMs or LLM applications, please contact us: ► [click here to contact](https://www.upstage.ai/private-llm?utm_source=huggingface&utm_medium=link&utm_campaign=privatellm)
- As of August 1st, our 70B model has reached the top spot on the Open LLM Leaderboard, making it the current leading performer globally.
config.json ADDED
@@ -0,0 +1,37 @@
{
  "_name_or_path": "/data/project/public/checkpoints/Llama-2-70b-instruct-v2",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 8192,
  "initializer_range": 0.02,
  "intermediate_size": 28672,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 64,
  "num_hidden_layers": 80,
  "num_key_value_heads": 8,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "quantization_config": {
    "bnb_4bit_compute_dtype": "float32",
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": false,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": false,
    "load_in_8bit": true
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.31.0",
  "use_cache": false,
  "vocab_size": 32000
}
generation_config.json ADDED
@@ -0,0 +1,8 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "transformers_version": "4.31.0",
  "use_cache": false
}
pytorch_model-00001-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:17a163853b236855e25b296a3bb2fd07e7c89514a7937cebfd7ec051179967dc
size 9940429183
pytorch_model-00002-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28829d4f35b35225531f0278b273e7a7498baf3a92f3cbfd6167343b6f16ebb7
size 9802209345
pytorch_model-00003-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4727d3addcb6c8e0c1dcf95c40ba5c7964d3f74df9df313e3f9fef721759fd7b
size 9970014616
pytorch_model-00004-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c2109f2c8be2bb1782e4aaec0b7ade8030e3c8dc86aa748b8e0f0220d293832
size 9953276573
pytorch_model-00005-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:525bf7db9e293b57d789990d464bcb7e5b812cd1be5f8b56f4d79c60c3bc16f9
size 9802161039
pytorch_model-00006-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd0727e3bc05dc7fa859b06a846ff4fff696fe2436b256808dbd26691e1affed
size 9886133756
pytorch_model-00007-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:00e04e63b2ddc64fb2b1caebef7399b5da9e040ff3f8f5c9c646e20a99ab14b4
size 9651105687
pytorch_model-00008-of-00008.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2fcb41ca27a3c9169f86a60636958f4199f1907dc43ab3c21a15dd8cd6b5470a
size 524288938
pytorch_model.bin.index.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,34 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}