---
license: mit
library_name: transformers
tags:
- bittensor
- decentralization
- subnet 9
datasets:
- tiiuae/falcon-refinedweb
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/655a0bdf3ff5ba1b1b1c01b7/y1dKBZh8UhII6wtbs5boj.png" alt="drawing" width="512"/>

# 🚀 **BTLM-7B v0.1**
BTLM (Bittensor Language Model) is a collection of pretrained generative text models. This is the repository for the 7B pretrained base model, converted to the Hugging Face Transformers format.

### Model Details

Bittensor's decentralized subnet 9 facilitated the development and release of the first version of the BTLM-7B model. This initial release is a general-purpose large language model intended for a variety of applications. In creating this model, significant effort was made to ensure its effectiveness and safety, setting a new standard for the decentralized open-source AI community.

⛔ **This is a pretrained base model, which should be fine-tuned for most use cases.**

**Training subnetwork:** 9

**Checkpoint:** 03-05-2024

[**Subnet 9 Network Leaderboard**](https://huggingface.co/spaces/macrocosm-os/pretraining-leaderboard)

[**Top Bittensor Model Checkpoint**](https://huggingface.co/tensorplex-labs/pretraining-sn9-7B-1)
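
Because this checkpoint is a base model, most applications will want to fine-tune it first. Below is a minimal supervised fine-tuning sketch using the `transformers` `Trainer`; the dataset, sequence length, and hyperparameters are illustrative placeholders, not the recipe used to train BTLM:

```python
# Minimal causal-LM fine-tuning sketch. The dataset and hyperparameters
# below are placeholders for illustration only.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset
import torch

model_id = "CortexLM/btlm-7b-base-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:  # common for causal-LM tokenizers
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Any text dataset works here; wikitext is only a stand-in.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="btlm-7b-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```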

### Inference

```python
from transformers import AutoTokenizer, pipeline
import torch

model_id = "CortexLM/btlm-7b-base-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # load weights in bfloat16 to halve memory use
)
sequences = generator(
    "Tell me about decentralization.",
    max_length=200,          # total length, prompt included
    do_sample=True,          # sample instead of greedy decoding
    top_k=10,                # restrict sampling to the 10 most likely tokens
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
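
Equivalently, the weights can be loaded directly and sampled with `generate`. A minimal sketch; the `device_map="auto"` placement is an assumption (it requires the `accelerate` package) and is not part of this card:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "CortexLM/btlm-7b-base-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB of weights in bf16
    device_map="auto",           # assumption: requires `accelerate`
)

inputs = tokenizer("Tell me about decentralization.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_k=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```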

### Benchmark

| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
|  43.32  | 45.65 | 58.29  | 44.26 | 30.45 | 70.88 | 10.39 |

[LM Evaluation Harness Repository](https://github.com/EleutherAI/lm-evaluation-harness)
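
Scores of this kind are typically reproduced with the harness linked above. A sketch for a single task, assuming a recent (v0.4+) harness where `lm_eval.simple_evaluate` is available; the task name and few-shot count follow Open LLM Leaderboard conventions and are assumptions here:

```python
# Sketch: reproducing one benchmark number with lm-evaluation-harness.
# Assumes lm-eval v0.4+; task name and few-shot count are assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=CortexLM/btlm-7b-base-v0.1,dtype=bfloat16",
    tasks=["arc_challenge"],
    num_fewshot=25,  # leaderboard-style 25-shot ARC
    batch_size=8,
)
print(results["results"]["arc_challenge"])
```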

## License
BTLM-7B is licensed under the [MIT License](https://opensource.org/license/mit), a permissive license that allows reuse with minimal restrictions (attribution and inclusion of the license notice).