---
language:
- bg
- ca
- cs
- da
- de
- en
- es
- fr
- hr
- hu
- it
- nl
- pl
- pt
- ro
- ru
- sl
- sr
- sv
- uk
license: apache-2.0
library_name: transformers
datasets:
- Open-Orca/OpenOrca
- OpenAssistant/oasst_top1_2023-08-25
model-index:
- name: Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 60.49
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 82.07
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.34
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 46.38
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 40.18
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2
      name: Open LLM Leaderboard
---


![image/png](https://cdn-uploads.huggingface.co/production/uploads/641b435ba5f876fe30c5ae0a/rJ1RxzuE-3gzgCppx-T8f.png)

```
reference-data-model:

  datasets:
    - OpenAssistant/oasst_top1_2023-08-25:
      lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk"
      link: https://huggingface.co/datasets/OpenAssistant/oasst_top1_2023-08-25

  model:
    - Open-Orca/Mistral-7B-OpenOrca
      Link:
        https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca

  100 examples of generating:
    - Link:
      https://huggingface.co/NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2/blob/main/output.xlsx

  Activated training with:
    - Link:
        https://huggingface.co/blog/tomaarsen/attention-sinks
        https://github.com/tomaarsen/attention_sinks
        https://arxiv.org/abs/2309.17453

  Version:
    - Link:
        https://huggingface.co/NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v1
        https://huggingface.co/NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3

  Eval model:
    - link:
        https://huggingface.co/datasets/open-llm-leaderboard/details_NickyNicky__Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2

```
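The attention-sink approach linked above keeps the key/value entries of the first few tokens (the "sinks") plus a sliding window of the most recent tokens, and drops everything in between. The snippet below is a minimal sketch of that cache-retention rule only, not the `attention_sinks` implementation; the helper name `kept_positions` is hypothetical, and the defaults of 4 and 1024 mirror the `attention_sink_size` and `attention_sink_window_size` values used in the loading code in this card.

```python
def kept_positions(seq_len, sink_size=4, window_size=1024):
    """Return the token positions whose key/value entries stay in the cache.

    Illustrative only: the first `sink_size` tokens are always kept, together
    with the `window_size` most recent tokens.
    """
    if seq_len <= sink_size + window_size:
        # Everything still fits, so nothing is evicted yet.
        return list(range(seq_len))
    sinks = list(range(sink_size))
    recent = list(range(seq_len - window_size, seq_len))
    return sinks + recent


kept = kept_positions(seq_len=5000)
print(len(kept))   # 1028 entries: 4 sinks + 1024 recent tokens
print(kept[:6])    # [0, 1, 2, 3, 3976, 3977]
```

Under this rule the cache size stays constant once the sequence grows past `sink_size + window_size`, which is why a smaller window trades context for faster generation.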


## Installation


```py
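# Run these in a notebook cell ("!" executes a shell command, "%env" sets an environment variable).
# attention-sinks
!pip install attention_sinks

# flash-attn
# "%env" is used because a separate "!export" runs in its own subshell and does not persist.
%env CUDA_HOME=/usr/local/cuda-11.8
!MAX_JOBS=4 pip install flash-attn --no-build-isolation -qqq
!pip install git+"https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary" -qqq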
```


## Version
```py
import torch, transformers, torchvision
torch.__version__, transformers.__version__, torchvision.__version__
# Output: ('2.0.1+cu118', '4.34.0.dev0', '0.15.2+cu118')
```

## How to use
```py

from transformers import AutoTokenizer, GenerationConfig

# Drop-in replacement for transformers.AutoModelForCausalLM that adds attention sinks
from attention_sinks import AutoModelForCausalLM

import torch

# model_id = 'Open-Orca/Mistral-7B-OpenOrca'
model_id='NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2'

model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map="auto",
                                             trust_remote_code=True,
                                             torch_dtype=torch.bfloat16,
                                             load_in_4bit=True,
                                             low_cpu_mem_usage= True,

                                             attention_sink_size=4,
                                             attention_sink_window_size=1024, #512, # <- Low for the sake of faster generation
                                             )

max_length=2048
print("max_length",max_length)


tokenizer = AutoTokenizer.from_pretrained(model_id,
                                          # use_fast = False,
                                          max_length=max_length,)

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = 'right'

#EXAMPLE #1
txt="""<|im_start|>user
I'm looking for an efficient Python script to output prime numbers. Can you help me out? I'm interested in a script that can handle large numbers and output them quickly. Also, it would be great if the script could take a range of numbers as input and output all the prime numbers within that range. Can you generate a script that fits these requirements? Thanks!<|im_end|>
<|im_start|>assistant
"""

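# EXAMPLE #2 (Spanish prompt asking for help securing a Node.js REST API with tokens).
# This reassigns txt, so only this prompt is generated; comment it out to run example #1.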
txt="""<|im_start|>user
Estoy desarrollando una REST API con Nodejs, y estoy tratando de aplicar algún sistema de seguridad, ya sea con tokens o algo similar, me puedes ayudar?<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer.encode(txt, return_tensors="pt").to(model.device)

# Placeholder values (assumptions, not prescribed by this model card); tune for your use case.
max_new_tokens = 512
len_tokens = 40  # passed as top_k below

generation_config = GenerationConfig(
              max_new_tokens=max_new_tokens,
              temperature=0.7,
              top_p=0.9,
              top_k=len_tokens,
              repetition_penalty=1.11,
              do_sample=True,
              #  pad_token_id=tokenizer.eos_token_id,
              #  eos_token_id=tokenizer.eos_token_id,
              #  use_cache=True,
              # stopping_criteria= StoppingCriteriaList([stopping_criteria]),
          )
outputs = model.generate(generation_config=generation_config,
                                input_ids=inputs,)
# Set skip_special_tokens=True to hide the <|im_start|>/<|im_end|> markers.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
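The example prompts above follow the ChatML layout (`<|im_start|>role`, the message, `<|im_end|>`, then an open `<|im_start|>assistant` turn). As a convenience, a prompt in that layout could be assembled from a list of messages with a small helper like the one below. This is a sketch: `build_chatml_prompt` is a hypothetical name, not part of `transformers` or `attention_sinks`.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Format a list of {"role", "content"} dicts in the ChatML layout used above."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


txt = build_chatml_prompt([
    {"role": "user", "content": "Can you write a Python script that prints prime numbers in a range?"},
])
print(txt)
```

The resulting `txt` string can be passed to `tokenizer.encode` exactly like the hand-written prompts.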

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_NickyNicky__Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |61.65|
|AI2 Reasoning Challenge (25-Shot)|60.49|
|HellaSwag (10-Shot)              |82.07|
|MMLU (5-Shot)                    |62.34|
|TruthfulQA (0-shot)              |46.38|
|Winogrande (5-shot)              |78.45|
|GSM8k (5-shot)                   |40.18|
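
The `Avg.` row is consistent with the arithmetic mean of the six benchmark scores in the table. A quick check, with the values copied from the table above:

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 60.49,
    "HellaSwag (10-Shot)": 82.07,
    "MMLU (5-Shot)": 62.34,
    "TruthfulQA (0-shot)": 46.38,
    "Winogrande (5-shot)": 78.45,
    "GSM8k (5-shot)": 40.18,
}

# Unweighted mean over the six benchmarks
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 61.65
```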