---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- serialization
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63fe1a380c1bbe8e29d3c401/lLSAHJVQKuEqKCgFIMEsY.png)

# Model Card for Neural-Zephyr Mistral 14B

Intel and Hugging Face developed two of the most prominent Mistral-type models released: Neural-Chat and Zephyr. Neural-Zephyr is a hybrid transfer-learning model that joins the Neural-Chat and Zephyr weights. The weights of the two models are aggregated layer by layer, totaling 14B parameters.

Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-β, the second model in the series, is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) trained on a mix of publicly available, synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290), which made the model more helpful. However, this also means the model is likely to generate problematic text when prompted to do so. You can find more details in the [technical report](https://arxiv.org/abs/2310.16944).

## Model description

- **Model type:** A 14B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- **Language(s) (NLP):** Primarily English
- **License:** MIT
- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

## Use in Transformers

```python
# Load model directly
import torch
from transformers import AutoTokenizer, MistralForCausalLM

model = MistralForCausalLM.from_pretrained(
    "ai-agi/neural-zephyr",
    use_cache=False,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the hybrid Neural-Chat + Zephyr weights serialized in this repository
state_dict = torch.load("model_weights.pth")
model.load_state_dict(state_dict)

tokenizer = AutoTokenizer.from_pretrained("ai-agi/neural-zephyr", use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```
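Once loaded, the model can be prompted like any other `transformers` causal language model. The snippet below is a minimal generation sketch that reuses the `model` and `tokenizer` objects from the block above; the prompt and sampling settings are illustrative choices, not recommendations from the model authors.

```python
# Minimal generation sketch, assuming `model` and `tokenizer` from above.
prompt = "Explain what Direct Preference Optimization (DPO) is in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        pad_token_id=tokenizer.pad_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```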
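For readers curious how two 7B checkpoints can yield a single 14B model, the sketch below shows one possible layer-wise aggregation: interleaving the decoder layers of the two parent models. It is purely an illustration under assumed parent checkpoints (`Intel/neural-chat-7b-v3-1` and `HuggingFaceH4/zephyr-7b-beta`) and is not necessarily the procedure used to produce the weights in this repository.

```python
import torch
from transformers import MistralConfig, MistralForCausalLM

# Illustration only: build a ~14B hybrid by interleaving the decoder layers of
# two Mistral-7B checkpoints. The parent repos here are assumptions; the
# actual Neural-Zephyr aggregation procedure may differ.
neural = MistralForCausalLM.from_pretrained("Intel/neural-chat-7b-v3-1", torch_dtype=torch.bfloat16)
zephyr = MistralForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta", torch_dtype=torch.bfloat16)

config = MistralConfig.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
config.num_hidden_layers = 2 * len(zephyr.model.layers)  # 64 layers instead of 32
hybrid = MistralForCausalLM(config).to(torch.bfloat16)

for i in range(len(zephyr.model.layers)):
    # Even positions take Neural-Chat layers, odd positions take Zephyr layers
    hybrid.model.layers[2 * i].load_state_dict(neural.model.layers[i].state_dict())
    hybrid.model.layers[2 * i + 1].load_state_dict(zephyr.model.layers[i].state_dict())

# Embeddings, final norm, and LM head are taken from one parent
hybrid.model.embed_tokens.load_state_dict(zephyr.model.embed_tokens.state_dict())
hybrid.model.norm.load_state_dict(zephyr.model.norm.state_dict())
hybrid.lm_head.load_state_dict(zephyr.lm_head.state_dict())

print(sum(p.numel() for p in hybrid.parameters()) / 1e9, "B parameters")  # ~14B
torch.save(hybrid.state_dict(), "model_weights.pth")
```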