
LeroyDyer/Mixtral_AI_128k

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the SLERP merge method.
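
For reference, SLERP (spherical linear interpolation) blends two weight tensors along the great-circle arc between them rather than along a straight line, which preserves the norm of the interpolated weights. The interpolation factor t is set per tensor group in the parameters section of the configuration below. A standard formulation, where p and q are the corresponding tensors from the two models:

$$\mathrm{slerp}(p, q; t) = \frac{\sin\bigl((1-t)\,\Omega\bigr)}{\sin \Omega}\, p + \frac{\sin(t\,\Omega)}{\sin \Omega}\, q, \qquad \cos \Omega = \frac{p \cdot q}{\lVert p \rVert\, \lVert q \rVert}$$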

Models Merged

The following models were included in the merge:

Previous Merges

rvv-karma/BASH-Coder-Mistral-7B

Locutusque/Hercules-3.1-Mistral-7B - Unhinging

KoboldAI/Mistral-7B-Erebus-v3 - NSFW

Locutusque/Hyperion-2.1-Mistral-7B - CHAT

Severian/Nexus-IKM-Mistral-7B-Pytorch - Thinking

NousResearch/Hermes-2-Pro-Mistral-7B - Generalizing

mistralai/Mistral-7B-Instruct-v0.2 - BASE

Nitral-AI/ProdigyXBioMistral_7B

Nitral-AI/Infinite-Mika-7b

NousResearch/Yarn-Mistral-7b-128k

yanismiraoui/Yarn-Mistral-7b-128k-sharded



Nous-Yarn-Mistral-7b-128k is a state-of-the-art language model for long context, further pretrained on long context data for 1500 steps using the YaRN extension method. It is an extension of Mistral-7B-v0.1 and supports a 128k token context window.
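
If you want that long-context base on its own, it can be loaded directly with transformers. A minimal sketch, assuming the standard transformers API; trust_remote_code is required because the YaRN rope scaling ships as custom modeling code on that repository:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Yarn-Mistral-7b-128k")
model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Yarn-Mistral-7b-128k",
    torch_dtype="auto",      # load in the checkpoint's native precision
    device_map="auto",       # requires accelerate; spreads layers across devices
    trust_remote_code=True,  # YaRN rope scaling is implemented in repo code
)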

Configuration

The following YAML configuration was used to produce this model:


slices:
  - sources:
      - model: yanismiraoui/Yarn-Mistral-7b-128k-sharded
        layer_range: [0, 32]
      - model: LeroyDyer/Mixtral_AI
        layer_range: [0, 32]
# Or, equivalently, with the models: syntax:
# models:
#   - model: mistralai/Mistral-7B-Instruct-v0.2
#   # NOTE: the larger-context model must be the base
#   - model: yanismiraoui/Yarn-Mistral-7b-128k-sharded
merge_method: slerp
base_model: yanismiraoui/Yarn-Mistral-7b-128k-sharded
parameters:
  t:
    - filter: self_attn
      value: [0.3, 0.6, 0.4, 0.6, 0.7]
    - filter: mlp
      value: [0.7, 0.4, 0.6, 0.4, 0.3]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
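
To reproduce the merge, save the YAML above to a file and run it through mergekit's command-line entry point. A minimal sketch, assuming a notebook environment; the config.yml filename and output directory are illustrative:

%pip install mergekit
!mergekit-yaml config.yml ./Mixtral_AI_128k --cuda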

Load Model



%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-llama-cpp
%pip install llama-index

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.llms.llama_cpp.llama_utils import (
    messages_to_prompt,
    completion_to_prompt,
)

model_url = "https://huggingface.co/LeroyDyer/Mixtral_AI_128k_7b/resolve/main/Mixtral_AI_128k_7b_q8_0.gguf"

llm = LlamaCPP(
    # You can pass in the URL to a GGUF model to download it automatically
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    # this model supports a 128k-token context window; 3900 is a conservative
    # default that fits in modest RAM/VRAM, so raise it if you have the memory
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": 1},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)

prompt = input("Enter your prompt: ")
response = llm.complete(prompt)
print(response.text)
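
The embeddings package installed above is only needed if you index documents, at which point the imports of SimpleDirectoryReader and VectorStoreIndex can be put to work. A minimal sketch, assuming llama-index v0.10+; the ./data directory and the BAAI/bge-small-en-v1.5 embedding model are illustrative choices:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Embed documents locally rather than via an API.
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Read every file under ./data and build an in-memory vector index.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)

# Answer questions over the indexed documents with the GGUF model above.
query_engine = index.as_query_engine(llm=llm)
print(query_engine.query("Summarize the documents."))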
