---
language:
- en
license: apache-2.0
library_name: transformers
model-index:
- name: neuronovo-7B-v0.2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 73.04
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Neuronovo/neuronovo-7B-v0.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.32
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Neuronovo/neuronovo-7B-v0.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.15
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Neuronovo/neuronovo-7B-v0.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 71.02
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Neuronovo/neuronovo-7B-v0.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 80.66
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Neuronovo/neuronovo-7B-v0.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.47
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Neuronovo/neuronovo-7B-v0.2
      name: Open LLM Leaderboard
---
Currently the 2nd best model in the ~7B category (actually closer to ~9B parameters) on the Hugging Face Open LLM Leaderboard!
More information about making the model is available here: Don't stop DPOptimizing!
Author: Jan Kocoń (LinkedIn, Google Scholar, ResearchGate)
The "Neuronovo/neuronovo-9B-v0.2" model is a fine-tuned large language model based on "CultriX/MistralTrix-v1". Its key characteristics and features:
Training Dataset: The model is trained on the "Intel/orca_dpo_pairs" dataset. Each example is formatted to separate the system message, the user query, and the chosen and rejected answers, so the model learns from pairs of preferred and dispreferred responses in conversational contexts.
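The formatting step described above can be sketched as follows. This is a minimal illustration, not the author's script: the field names `system`, `question`, `chosen`, and `rejected` are assumed to match the dataset columns, and the `[SYSTEM]`/`[USER]`/`[ASSISTANT]` markers are a hypothetical plain-text template standing in for whatever chat template the real run used.

```python
def format_example(example: dict) -> dict:
    """Split one preference record into prompt / chosen / rejected strings."""
    system = example.get("system", "")

    # Hypothetical template: the markers below are placeholders, not the
    # chat template used to train the model.
    prompt = ""
    if system:
        prompt += f"[SYSTEM]\n{system}\n"
    prompt += f"[USER]\n{example['question']}\n[ASSISTANT]\n"

    return {
        "prompt": prompt,
        "chosen": example["chosen"].strip() + "\n",
        "rejected": example["rejected"].strip() + "\n",
    }


# Made-up record, for illustration only.
record = {
    "system": "You are a helpful assistant.",
    "question": "What is the capital of Poland?",
    "chosen": "The capital of Poland is Warsaw.",
    "rejected": "Poland is a country in Europe.",
}
formatted = format_example(record)
```

Keeping the prompt separate from the two answers lets a preference trainer score both completions against the same context.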
Tokenizer and Formatting: It uses the tokenizer from the "CultriX/MistralTrix-v1" model, configured to pad on the left and to use the end-of-sequence token as the padding token. Left padding keeps the end of every prompt aligned in a batch, which suits language generation tasks, particularly in dialogue systems.
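To make the padding behavior concrete, here is a small pure-Python sketch of left padding with the end-of-sequence id reused as the pad id. The token ids and `EOS_ID = 2` are assumptions chosen for illustration; they are not read from the actual tokenizer.

```python
def pad_left(batch, pad_id):
    """Left-pad token id lists to a common length and build attention masks."""
    width = max(len(ids) for ids in batch)
    input_ids = [[pad_id] * (width - len(ids)) + ids for ids in batch]
    # 0 marks padding positions, 1 marks real tokens.
    attention_mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return input_ids, attention_mask


EOS_ID = 2  # assumption: illustrative end-of-sequence id, reused as the pad id
batch = [[5, 6, 7], [8]]
input_ids, attention_mask = pad_left(batch, EOS_ID)
```

With the `transformers` library, the equivalent settings are typically `tokenizer.pad_token = tokenizer.eos_token` and `tokenizer.padding_side = "left"`.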
Low-Rank Adaptation (LoRA) Configuration: The model is fine-tuned with a LoRA configuration using r=16, lora_alpha=16, and lora_dropout=0.05. LoRA adapts the model efficiently by keeping the pretrained weights frozen and training only small low-rank matrices added to selected layers.
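A rough sense of what r=16 and lora_alpha=16 imply can be had from a quick parameter count. The sketch below assumes a single hypothetical 4096 x 4096 projection matrix; the actual target modules and their shapes are not stated in this card.

```python
def lora_stats(d_out, d_in, r=16, lora_alpha=16):
    """Compare a full weight matrix with its LoRA adapter (B: d_out x r, A: r x d_in)."""
    full_params = d_out * d_in
    lora_params = r * (d_in + d_out)
    return {
        "full_params": full_params,
        "lora_params": lora_params,
        "fraction": lora_params / full_params,
        # The adapter update is scaled by lora_alpha / r before being added.
        "scaling": lora_alpha / r,
    }


# Assumption: a 4096 x 4096 layer, used only to illustrate the ratio.
stats = lora_stats(4096, 4096)
```

For this assumed shape the adapter holds under 1% of the layer's parameters, and with r equal to lora_alpha the scaling factor is 1.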
Model Specifications for Fine-Tuning: The model is fine-tuned using a custom setup built around a DPO (Direct Preference Optimization) trainer. DPO trains the model directly on pairs of chosen and rejected answers, raising the likelihood of the chosen answer relative to a frozen reference model without requiring a separate reward model.
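For readers unfamiliar with DPO (Direct Preference Optimization), the per-example loss can be sketched in a few lines. This is a generic illustration of the published objective, not the trainer used for this model; `beta=0.1` and the log-probabilities in the usage lines are assumed values.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))."""
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals softplus(-margin); computed in a stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Assumed sequence log-probabilities, for illustration only.
neutral = dpo_loss(-12.0, -12.0, -12.0, -12.0)      # policy equals reference
prefers_chosen = dpo_loss(-10.0, -14.0, -12.0, -12.0)
prefers_rejected = dpo_loss(-14.0, -10.0, -12.0, -12.0)
```

When the policy matches the reference the loss is ln 2; it falls as the policy shifts probability toward the chosen answer and rises when it favors the rejected one.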
Training Arguments and Strategies: The training process uses gradient checkpointing, gradient accumulation, and a cosine learning rate scheduler. Gradient checkpointing trades extra compute for lower memory use, gradient accumulation simulates a larger batch on limited hardware, and the cosine schedule decays the learning rate smoothly over training.
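Two of these strategies are easy to illustrate numerically. The sketch below shows a cosine learning rate schedule and the effective batch size under gradient accumulation; the learning rate, step count, and batch sizes are assumptions, since the card does not list the actual values.

```python
import math


def cosine_lr(step, total_steps, base_lr, warmup_steps=0):
    """Cosine decay from base_lr to 0, with optional linear warmup."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


# Assumed hyperparameters, for illustration only.
schedule = [cosine_lr(s, total_steps=100, base_lr=5e-5) for s in (0, 50, 100)]
batch = effective_batch_size(per_device_batch=4, grad_accum_steps=4)
```

The schedule starts at the base rate, reaches half of it at the midpoint, and ends at zero, while accumulation multiplies the per-device batch without increasing peak memory.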
Performance and Output Capabilities: Configured for causal language modeling, the model handles tasks that involve generating text or continuing dialogues, with a maximum prompt length of 1024 tokens and a maximum generation length of 1536 tokens. These limits leave room for extended dialogues and longer outputs.
Special Features and Efficiency: Combining LoRA with DPO training means that "Neuronovo/neuronovo-9B-v0.2" is tuned for language generation while keeping the compute and memory cost of fine-tuning low.
In summary, "Neuronovo/neuronovo-9B-v0.2" is a large language model fine-tuned for complex language generation tasks in conversational AI, using parameter-efficient adaptation (LoRA) and preference-based training (DPO).
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 73.44 |
| AI2 Reasoning Challenge (25-Shot) | 73.04 |
| HellaSwag (10-Shot) | 88.32 |
| MMLU (5-Shot) | 65.15 |
| TruthfulQA (0-shot) | 71.02 |
| Winogrande (5-shot) | 80.66 |
| GSM8k (5-shot) | 62.47 |