ai-agi committed on
Commit 9bf96ad
1 Parent(s): 984bb34

Use in Transformers

Files changed (1)
  1. README.md +18 -3
README.md CHANGED
@@ -13,11 +13,11 @@ tags:
 
 Intel and Hugging Face developed two of the most prominent Mistral-type models released: Neural-Chat and Zephyr.
 
-Neural-Zephyr is a hybrid Transfer Learning version joining Neural-Chat weights and Zephyr Mistral type models
+Neural-Zephyr is a hybrid transfer-learning model that joins the Neural-Chat and Zephyr (Mistral-type) weights. The weights are aggregated in the same layers, for a total of 14B parameters.
 
 Zephyr is a series of language models that are trained to act as helpful assistants.
 Zephyr-7B-β is the second model in the series, and is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
-that was trained on on a mix of publicly available, synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
+that was trained on a mix of publicly available, synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290),
 which made the model more helpful. However, this means the model is likely to generate problematic text when prompted to do so.
 You can find more details in the [technical report](https://arxiv.org/abs/2310.16944).
 
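The hunk above says Neural-Zephyr aggregates the Neural-Chat and Zephyr weight sets layer by layer (14B parameters in total), but the card does not spell out the aggregation procedure. As a rough, non-authoritative sketch of combining two Mistral-7B-family checkpoints tensor by tensor, the snippet below assumes the `Intel/neural-chat-7b-v3-1` and `HuggingFaceH4/zephyr-7b-beta` checkpoints and simple per-tensor averaging; the actual Neural-Zephyr aggregation (which keeps both weight sets) is not reproduced here.

```python
# Hypothetical illustration only: the exact Neural-Zephyr aggregation is not documented.
# This averages the two 7B checkpoints key by key, which works because both share the
# Mistral-7B architecture and therefore have identical state-dict keys and shapes.
import torch
from transformers import MistralForCausalLM

neural_chat = MistralForCausalLM.from_pretrained("Intel/neural-chat-7b-v3-1", torch_dtype=torch.bfloat16)
zephyr = MistralForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta", torch_dtype=torch.bfloat16)

zephyr_state = zephyr.state_dict()
merged = {}
for name, tensor in neural_chat.state_dict().items():
    # Average in float32 for numerical stability, then cast back to bfloat16.
    merged[name] = ((tensor.float() + zephyr_state[name].float()) / 2.0).to(torch.bfloat16)

# Save under the file name that the loading snippet in the README expects.
torch.save(merged, "model_weights.pth")
```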
@@ -27,4 +27,19 @@ You can find more details in the [technical report](https://arxiv.org/abs/2310.16944).
 - **Model type:** A 14B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
 - **Language(s) (NLP):** Primarily English
 - **License:** MIT
-- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+
+
+## Use in Transformers
+# Load the model and the merged weights
+import torch
+from transformers import AutoTokenizer, MistralForCausalLM
+
+model = MistralForCausalLM.from_pretrained("ai-agi/neural-zephyr", use_cache=False, torch_dtype=torch.bfloat16, device_map="auto")
+state_dict = torch.load('model_weights.pth')  # merged Neural-Zephyr weights
+model.load_state_dict(state_dict)
+
+tokenizer = AutoTokenizer.from_pretrained("ai-agi/neural-zephyr", use_fast=True)
+if tokenizer.pad_token is None:
+    tokenizer.pad_token = tokenizer.eos_token
+
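Once the model and tokenizer are loaded as in the snippet above, generation follows the usual `transformers` causal-LM pattern; a minimal sketch (the prompt and generation settings below are illustrative, not from the model card):

```python
# Minimal generation sketch, reusing `model` and `tokenizer` from the loading snippet above.
prompt = "Explain in two sentences what Direct Preference Optimization does."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```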