---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- serialization
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63fe1a380c1bbe8e29d3c401/lLSAHJVQKuEqKCgFIMEsY.png)

# Model Card for Neural-Zephyr Mistral 14B

Intel and Hugging Face developed two of the most prominent Mistral-type models released: Neural-Chat and Zephyr. Neural-Zephyr is a hybrid transfer-learning model that joins the Neural-Chat and Zephyr weights: the weights are aggregated within the same layers, totaling 14B parameters.

Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-β is the second model in the series, a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) trained on a mix of publicly available, synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290), which made the model more helpful. However, this also means the model is likely to generate problematic text when prompted to do so. You can find more details in the [technical report](https://arxiv.org/abs/2310.16944).

## Model description

- **Model type:** A 14B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- **Language(s) (NLP):** Primarily English
- **License:** Apache 2.0
- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

## Use in Transformers

**Load model directly**

```python
import torch
from transformers import AutoTokenizer, MistralForCausalLM
from huggingface_hub import hf_hub_download

# Load the base model, sharded across the available devices
model = MistralForCausalLM.from_pretrained(
    "ai-agi/neural-zephyr",
    use_cache=False,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Download the aggregated Neural-Zephyr weights and load them into the model
model_weights = hf_hub_download(repo_id="ai-agi/neural-zephyr", filename="model_weights.pth")
state_dict = torch.load(model_weights, map_location="cpu")  # keep the checkpoint on CPU
model.load_state_dict(state_dict)

tokenizer = AutoTokenizer.from_pretrained("ai-agi/neural-zephyr", use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```

**Manage your GPU/CPU memory for model and weights**
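Once `model.load_state_dict(state_dict)` has run, the tensors held by `state_dict` are a second full copy of the 14B parameters sitting in CPU RAM. A minimal sketch of one way to release that copy, assuming the loading snippet above and, optionally, a CUDA-capable GPU:

```python
import gc

import torch

# Drop the checkpoint reference and force a collection so the
# duplicate CPU copy of the weights is actually freed.
del state_dict
gc.collect()

# On GPU, also release cached blocks held by PyTorch's allocator
# (this does not affect tensors that are still in use).
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```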
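With the checkpoint copy released, inference works through the standard `generate` API. A short illustrative sketch using the `model` and `tokenizer` loaded above; the prompt and sampling parameters are assumptions for demonstration, not values recommended by the model authors:

```python
# Illustrative inference sketch; sampling settings are assumptions, not tuned values.
prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    pad_token_id=tokenizer.pad_token_id,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```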