---
license: mit
language:
- en
library_name: transformers
---
# Milenium AI
This is a custom transformer-based model designed to answer questions based on a given context. It was trained on the SQuAD dataset and achieves 85% accuracy on the validation set.
## Model Architecture
The model consists of an encoder and a decoder. The encoder takes the context and question as input and generates an encoded representation of the input. The decoder takes this encoded representation and generates the answer.
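As a rough illustration, the sketch below maps this encoder/decoder split onto the generic Hugging Face seq2seq interface; the placeholder context and question strings are assumptions for demonstration only, not part of this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("milenium_model")
tokenizer = AutoTokenizer.from_pretrained("milenium_model")

# Placeholder inputs, for illustration only.
inputs = tokenizer("What is X?", "X is a placeholder.", return_tensors="pt")

# Encoder: question + context -> encoded representation.
encoded = model.get_encoder()(**inputs).last_hidden_state
print(encoded.shape)  # (batch, sequence_length, hidden_size)

# Decoder: the encoded representation is turned into answer tokens;
# generate() runs the encoder and the decoder end to end.
answer_ids = model.generate(**inputs)
print(tokenizer.decode(answer_ids[0], skip_special_tokens=True))
```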
## Training
The model was trained on the SQuAD dataset with a batch size of 32 and a maximum sequence length of 100. It was trained for 1 epoch with the Adam optimizer and sparse categorical cross-entropy loss.
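For concreteness, here is a hedged sketch of a single training step under those hyperparameters. The context, question, and answer strings are placeholders, not data from the card; in PyTorch, the sparse categorical cross-entropy objective corresponds to the cross-entropy loss that `transformers` seq2seq models return when `labels` are supplied.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("milenium_model")
tokenizer = AutoTokenizer.from_pretrained("milenium_model")
optimizer = torch.optim.Adam(model.parameters())

# Placeholder batch; the actual training used SQuAD with batch size 32.
inputs = tokenizer(["What is X?"], ["X is a placeholder."],
                   return_tensors="pt", max_length=100,
                   padding="max_length", truncation=True)
labels = tokenizer(["a placeholder"], return_tensors="pt").input_ids

# With labels provided, the model computes the cross-entropy loss
# over the answer tokens internally.
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```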
## Evaluation
The model achieves an accuracy of 85% on the validation set.
## Usage
You can use this model to answer questions based on a given context. Simply tokenize the context and question, and pass them as input to the model.
## Limitations
The model was trained only on SQuAD-style extractive question answering, so it may not generalize well to other datasets or tasks.
## Authors
Caeden Rajoo
## How to use
You can use this model by loading it with the `transformers` library and passing the context and question as input. For example:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("milenium_model")
tokenizer = AutoTokenizer.from_pretrained("milenium_model")

context = "This is some context."
question = "What is the meaning of life?"

# Encode the question and context together; the tokenizer returns both
# input_ids and the attention mask, so the mask does not need to be
# built separately.
inputs = tokenizer(question, context, return_tensors="pt",
                   max_length=100, padding="max_length", truncation=True)

# Generate the answer tokens and decode them back to text.
output_ids = model.generate(**inputs)
answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(answer)
```