---
language:
- en
license: mit
library_name: transformers
tags:
- pretrained
- 7B
- English
- text-generation
- base-model
- bittensor
- decentralized AI
datasets:
- tiiuae/falcon-refinedweb
---


# Sumo-T9-7B-v0.1


![image/png](https://cdn-uploads.huggingface.co/production/uploads/65a8a4c5539e211436ef5485/GUbZGoQs9FKUjXzfHifZ6.png)


### Tensorplex Labs Unveils Sumo-T9-7B: Beating Notable 7B Pretrained Models

[Tensorplex Labs](https://tensorplex.ai) is proud to announce that its latest top-performing model on Bittensor Subnet 9, Sumo-T9-7B, 
has outperformed notable models such as TII Falcon 7B and Meta's Llama-2-7b-hf. This achievement highlights the potential of decentralized networks
like Bittensor and underscores Tensorplex Labs' commitment to advancing open-source AI technologies.

"Sumo" represents the family of models developed by Tensorplex, and "T9" designates the top-performing model specifically trained for Bittensor Subnet 9.

Bittensor Subnet 9 serves a unique role within the Bittensor ecosystem by rewarding miners who produce pretrained foundational models on the Falcon Refined Web dataset. This subnet functions as a continuous benchmark, where miners are incentivized to achieve the best performance metrics using a model under the parameter limit. The competitive nature of Subnet 9 drives rapid advancements and refinements in large language model training.

Since the parameter limit was raised to 7 billion on April 19, 2024, Tensorplex Labs has published the top-performing model on the subnet, surpassing notable models such as Falcon 7B and Llama 2 7B in less than a month.

## Model Details

### Model Description

- **Developed by:** [Tensorplex Labs](https://tensorplex.ai)
- **Model type:** Pretrained Foundational Language Model
- **Language(s) (NLP):** Primarily English
- **License:** MIT
- **Architecture:** Llama-style architecture with 6.9 billion parameters
- **Training Data:** Trained on the [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset
- **Training Objective:** Causal language modeling (next-token prediction)
- **Original Model Repo:** [tensorplex-labs/pretraining-sn9-7B-1](https://huggingface.co/tensorplex-labs/pretraining-sn9-7B-1)

Sumo-T9-7B-v0.1 uses a larger vocabulary of roughly 100k tokens, compatible with the GPT-4 tokenizer, making it versatile across a wide range of natural language processing tasks.
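
As a quick sanity check of the tokenizer, the minimal sketch below loads it from the Hub, prints its vocabulary size, and round-trips a short sentence; the example text is illustrative only.

```python
from transformers import AutoTokenizer

# Load the tokenizer shipped with the model repository.
tokenizer = AutoTokenizer.from_pretrained("tensorplex-labs/Sumo-T9-7B-v0.1")

# The vocabulary should be on the order of 100k entries.
print(f"Vocabulary size: {len(tokenizer)}")

# Round-trip a sample sentence to inspect the tokenization.
text = "Sumo-T9-7B is a pretrained base model."
ids = tokenizer.encode(text)
print(ids)
print(tokenizer.decode(ids))
```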

⛔ **This is a pretrained base model that has not been aligned. Use it with caution, or fine-tune it on downstream tasks before deployment.**
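
As one possible starting point for such fine-tuning, here is a minimal supervised fine-tuning sketch using the Hugging Face `Trainer`. The dataset file, hyperparameters, and the choice to reuse the EOS token for padding are illustrative assumptions, not a prescribed recipe; full fine-tuning of a 7B model also requires substantial GPU memory, so parameter-efficient methods (e.g., LoRA) are a common alternative.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "tensorplex-labs/Sumo-T9-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Assumption: no dedicated pad token, so reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Illustrative downstream corpus; replace with your own task data.
dataset = load_dataset("text", data_files={"train": "my_task_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sumo-t9-7b-finetuned",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # Causal LM collator: labels are the input ids, shifted inside the model.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```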

### Model Sources

- **Bittensor Subnet9 Leaderboard:** [https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard](https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard)
- **Bittensor Subnet9 Repository:** [https://github.com/RaoFoundation/pretraining/tree/main](https://github.com/RaoFoundation/pretraining/tree/main)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer
import transformers
import torch

model = "tensorplex-labs/Sumo-T9-7B-v0.1"

# Load the tokenizer and build a bfloat16 text-generation pipeline.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
)

# Sample one completion with nucleus sampling.
sequences = pipeline(
    "What is Yokozuna?",
    max_length=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```

## Training Details

### Training Data

This model has been trained on the [tiiuae/falcon-refinedweb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) dataset, and training is still ongoing.
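
To inspect the training corpus without downloading it in full, the dataset can be streamed from the Hub. The sketch below assumes the RefinedWeb text is stored in the `content` column.

```python
from itertools import islice

from datasets import load_dataset

# Stream the corpus rather than downloading it in full.
dataset = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)

# Print the beginning of the first few documents (assumed "content" field).
for example in islice(dataset, 3):
    print(example["content"][:200])
```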

## Evaluation

Sumo-T9-7B-v0.1 outperforms notable models such as TII Falcon 7B, Meta's Llama-2-7b, and Llama-1-7b in zero-shot evaluation, 
leading that group on the aggregate score across evaluation tasks 
including ARC Challenge, GSM8K, HellaSwag, MMLU, TruthfulQA, and Winogrande.


|                                       |    avg     |   arc_challenge |   gsm8k |   hellaswag |   mmlu |   truthfulqa_mc2 |   winogrande |
|:--------------------------------------|-----------:|----------------:|--------:|------------:|-------:|-----------------:|-------------:|
| meta-llama/Meta-Llama-3-8B            | 0.6009     |          0.5333 |  0.4913 |      0.7906 | 0.621  |           0.4392 |       0.7301 |
| **tensorplex-labs/Sumo-T9-7B-v0.1** | **0.4769** |          0.4753 |  0.1031 |      0.7666 | 0.4426 |           0.3723 |       0.7017 |
| meta-llama/Llama-2-7b-hf              | 0.473      |          0.4625 |  0.1213 |      0.7597 | 0.4123 |           0.3896 |       0.693  |
| huggyllama/llama-7b                   | 0.4386     |          0.4471 |  0.0849 |      0.7621 | 0.2973 |           0.3408 |       0.6993 |
| tiiuae/falcon-7b                      | 0.4189     |          0.4343 |  0.0432 |      0.7636 | 0.2582 |           0.3428 |       0.6717 |
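
These scores can in principle be reproduced with EleutherAI's lm-evaluation-harness; the sketch below is a rough outline, and the harness version, few-shot settings, and batch size all affect the exact numbers, so treat it as a starting point rather than the exact configuration used for the table above.

```python
import lm_eval  # pip install lm-eval

# Run the benchmark suite against the model hosted on the Hub.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=tensorplex-labs/Sumo-T9-7B-v0.1,dtype=bfloat16",
    tasks=[
        "arc_challenge",
        "gsm8k",
        "hellaswag",
        "mmlu",
        "truthfulqa_mc2",
        "winogrande",
    ],
    batch_size=8,
)

# Print per-task metrics.
for task, metrics in results["results"].items():
    print(task, metrics)
```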


## Future Plans

Tensorplex Labs will continue pushing the limits of what is possible on Subnet 9, and will also work on fine-tuning state-of-the-art models for Web3 domain-specific use-cases.

One of the most ambitious projects is the development of a new data collection subnet. This will enable open and incentivized contributions of intelligence from a diverse pool of participants. The subnet will function as a collaborative platform where individuals can provide human preference or training data, which will be used to train, fine-tune, and evaluate AI models and miners across various subnets on Bittensor.

## About Tensorplex Labs

Tensorplex Labs is an AI and Web3 startup that is building the decentralized AI of the future. The company’s mission is to decentralize AI, democratize access to data and intelligence, and build a more open, transparent, and equitable future for AI. Tensorplex Labs develops open-source capital and intelligence infrastructure and applications designed to grow decentralized AI, Web3, and crypto ecosystems by making them more capital efficient, intelligent, and trustworthy. The company is currently developing a novel way to better incentivize human input to train AI models, opening up more access to new pools of human contributors with new income opportunities. Founded in 2023 with headquarters in Singapore, Tensorplex Labs’ investors include Canonical Crypto, Collab+Currency, and Digital Currency Group among several others. For more information, visit [Tensorplex](https://tensorplex.ai).

## Model Card Authors

- syncdoth@tensorplex.ai

## Model Card Contact

- syncdoth@tensorplex.ai