mistral-environment-all

Model Description

This is a fine-tuned, quantized Mistral 7B model trained on a self-organised dataset of environmental knowledge. The model is still under development.

  • Developed by: Fiona Zhang
  • Funded by: CSIRO, Pawsey Supercomputing Research Centre
  • Finetuned from model: Mistral 7B

Uses

This repository contains the weights learned during training. Load them together with the pre-trained Mistral 7B model and tokenizer.

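import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Placeholder: replace with this repository's ID or a local path to the weights
model_name = "path/to/mistral-environment-all"

# Load the tokenizer and model; adjust the configuration if needed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # load the weights in bfloat16
    device_map="auto",           # place the model on the available device(s)
)

# Text generation
def generate_text_sequences(pipe, prompt):
    sequences = pipe(
        prompt,  # pass the prompt itself, not the literal string "prompt"
        do_sample=True,
        max_new_tokens=100,
        temperature=0.8,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1,
    )
    return sequences[0]["generated_text"]

# Build the pipeline and use the model for inference
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    pad_token_id=2,  # token ID 2 is the end-of-sequence token in the Mistral tokenizer
)
print(generate_text_sequences(pipe, "your prompt"))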

Training Data

The fine-tuning data were parsed from public Wikipedia pages.

The text corpus was preprocessed to clean up its formatting.
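
This card does not list the actual preprocessing steps. As a purely illustrative sketch, cleaning text parsed from Wikipedia often involves removing citation markers and normalising whitespace. The function name `clean_wikipedia_text` and the specific rules below are assumptions made for this example, not the pipeline used for this model.

```python
import re

def clean_wikipedia_text(text: str) -> str:
    """Hypothetical cleanup: drop citation markers and normalise whitespace."""
    text = re.sub(r"\[\d+\]", "", text)     # remove numeric citation markers such as [12]
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r" *\n *", "\n", text)    # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)  # keep at most one blank line between paragraphs
    return text.strip()

sample = "Climate change[1]   affects  ecosystems.[23]\n\n\n\nSea levels are rising. "
print(clean_wikipedia_text(sample))
```

Rules like these would need adapting to whatever markup survives the parsing step, such as tables, templates, or section headings.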

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 1
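
To show what the scheduler settings above imply, here is a minimal sketch of a cosine learning-rate schedule with linear warmup, written in plain Python. It uses a common formulation of this schedule and is not the training code for this model. The function name `cosine_lr` and the value of `total_steps` are assumptions, since the card does not state the number of optimizer steps.

```python
import math

def cosine_lr(step, total_steps, base_lr=2e-4, warmup_ratio=0.03):
    """Learning rate at `step`: linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 1000  # hypothetical; the real value depends on dataset size and batch size
for step in (0, 15, 30, 515, 1000):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

With a warmup ratio of 0.03, the first 3% of steps ramp the rate up to its peak of 0.0002, after which it decays smoothly towards zero by the end of the single epoch.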

Training results

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.0a0+git7bcf7da
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Environmental Impact

  • Hardware Type: Setonix (Pawsey Supercomputing Research Centre)
  • Hours used: <1
  • Cloud Provider: Google Cloud
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]