---
library_name: transformers
license: mit
datasets:
- arxiv_dataset
language:
- en
pipeline_tag: text-generation
---

# Model Card for SciMistral-V1

The SciMistral-V1 Large Language Model (LLM) is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2). It was fine-tuned on the [arxiv-dataset](https://www.kaggle.com/datasets/Cornell-University/arxiv), specifically on abstracts from a wide variety of scientific papers.

For an article explaining how we did this in more detail, please check out our [website](https://www.tromero.ai/articles)!

## How to Get Started with the Model

To run this model yourself:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("TromeroResearch/SciMistral-V1")
# SciMistral shares its tokenizer with the base Mistral-7B-Instruct-v0.2 model
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Build the prompt with Mistral's chat template and move everything to the GPU
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

## Sample results

Each model was given the opening line of an abstract and asked to continue it; a minimal sketch of this prompting style appears at the end of this card.

Regular [Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-v0.1):

Abstract start:
```markdown
This paper seeks to disprove that 1+1=2
```

Generated completion:
```markdown
The first thing we need to do is define what a number is, and how it relates to the world around us. A number can be defined as an abstract concept used for counting or measuring things in our environment. It’s important to note here that numbers are not physical objects; they exist only within our minds…
```

And so on… A weak, unscientific and generic response, the kind you’d expect from someone pretending to know what they’re talking about.

But with our model, [SciMistral](https://huggingface.co/Tromero/SciMistral-V1):

Abstract start:
```markdown
This paper seeks to disprove that 1+1=2
```

Generated completion:
```markdown
We do this by showing that the equation is not true in all models of arithmetic, and hence it cannot be proved from Peano Arithmetic (PA). The proof uses a technique called "forcing" which was introduced into mathematical logic by Cohen in his famous work on forcing for set theory. We show how to use forcing to construct a model of PA where 1+1 does not equal 2…
```

And it continues. A much more useful and relevant response for someone who needs help summarising their paper!

## Hardware

4 x Nvidia A6000 GPUs
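## Completing an abstract

The sample results above come from plain text completion: the model is given the opening line of an abstract and continues it, without the chat template. The sketch below shows that usage; the abstract opening is taken from the samples above, while the sampling parameters are illustrative assumptions, not the exact settings used to produce those samples.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"

model = AutoModelForCausalLM.from_pretrained("TromeroResearch/SciMistral-V1").to(device)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Opening line of an abstract for the model to continue
abstract_start = "This paper seeks to disprove that 1+1=2"
inputs = tokenizer(abstract_start, return_tensors="pt").to(device)

# Sampling parameters here are illustrative assumptions, not the
# settings used to generate the sample results above
output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```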