
Milenium AI

This is a custom transformer-based model designed to answer questions from a given context. It was trained on the SQuAD dataset and reaches 85% accuracy on the validation set (see Evaluation below).

Model Architecture

The model consists of an encoder and a decoder. The encoder takes the context and question as input and produces an encoded representation; the decoder generates the answer from that representation.
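
As a rough illustration only, the sketch below shows one way such an encoder-decoder could be wired up with torch.nn.Transformer. The class name MileniumSketch and all dimensions (d_model, heads, layers, vocabulary size) are assumptions for illustration; the card does not publish the real architecture details.

```python
import torch
import torch.nn as nn

class MileniumSketch(nn.Module):
    """Hypothetical sketch of the encoder-decoder layout described above."""

    def __init__(self, vocab_size: int, d_model: int = 512, nhead: int = 8,
                 num_layers: int = 6, max_len: int = 100):
        super().__init__()
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        self.pos_embed = nn.Embedding(max_len, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.lm_head = nn.Linear(d_model, vocab_size)

    def _embed(self, ids: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(ids.size(1), device=ids.device)
        return self.tok_embed(ids) + self.pos_embed(positions)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        # Encoder sees the context + question tokens; the decoder
        # attends to the encoded representation to produce the answer.
        hidden = self.transformer(self._embed(src_ids), self._embed(tgt_ids))
        return self.lm_head(hidden)  # per-position vocabulary logits
```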

Training

The model was trained on the SQuAD dataset with a batch size of 32 and a maximum sequence length of 100, for a single epoch with the Adam optimizer and sparse categorical cross-entropy loss.
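
The actual training code is not published; the hedged sketch below simply wires those stated hyperparameters together in PyTorch. CrossEntropyLoss over integer token labels is the PyTorch counterpart of sparse categorical cross-entropy, and the random tensors stand in for tokenized SQuAD examples.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: random token ids in place of tokenized SQuAD examples.
vocab_size, max_len, n_examples = 30_000, 100, 128
src = torch.randint(0, vocab_size, (n_examples, max_len))     # context + question ids
tgt = torch.randint(0, vocab_size, (n_examples, max_len))     # decoder input ids
labels = torch.randint(0, vocab_size, (n_examples, max_len))  # target answer ids

# Batch size 32, as stated in the card.
loader = DataLoader(TensorDataset(src, tgt, labels), batch_size=32, shuffle=True)

model = MileniumSketch(vocab_size)  # sketch model from the section above
optimizer = torch.optim.Adam(model.parameters())
loss_fn = torch.nn.CrossEntropyLoss()  # cross-entropy over integer labels

model.train()
for src_ids, tgt_ids, label_ids in loader:  # a single epoch
    logits = model(src_ids, tgt_ids)
    # Flatten (batch, seq_len, vocab) logits against the integer labels.
    loss = loss_fn(logits.view(-1, vocab_size), label_ids.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```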

Evaluation

The model achieves an accuracy of 85% on the validation set.
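
The card does not state which accuracy metric the 85% refers to (token-level vs. exact match). The sketch below shows one plausible token-level computation; the function name token_accuracy and the pad_id parameter are illustrative assumptions.

```python
import torch

def token_accuracy(logits: torch.Tensor, labels: torch.Tensor, pad_id: int = 0) -> float:
    """Hypothetical token-level accuracy over a batch of validation logits."""
    preds = logits.argmax(dim=-1)  # most likely token at each position
    mask = labels != pad_id        # exclude padding positions from the count
    correct = ((preds == labels) & mask).sum().item()
    return correct / mask.sum().item()
```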

Usage

You can use this model to answer questions based on a given context: tokenize the context and question, then pass them as input to the model, as shown in the example under "How to use" below.

Limitations

This model was trained only on the SQuAD dataset and may not generalize well to other datasets or tasks.

Authors

Caeden Rajoo

How to use

You can load this model with the transformers library and pass the tokenized context and question as input. For example:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("milenium_model")
tokenizer = AutoTokenizer.from_pretrained("milenium_model")

context = "This is some context."
question = "What is the meaning of life?"

# Encode the question and context together, matching the maximum
# sequence length used during training.
inputs = tokenizer(question, context, return_tensors="pt",
                   max_length=100, padding="max_length", truncation=True)

# Generate the answer tokens and decode them back to text.
output_ids = model.generate(**inputs)
answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(answer)
```
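
Note that the answer is produced with model.generate rather than a direct forward pass: calling the model with labels returns the training loss, not the predicted answer tokens.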