Question Answering Generative Model

This model differs from DeepMount00/Mamba-QA-ITA in scale and performance. It has approximately 790 million parameters, up from the 370 million of DeepMount00/Mamba-QA-ITA, and it answers with greater accuracy and precision.

Overview

The model is a generative question-answering system built on the 790-million-parameter Mamba model. Given a context passage and a question, it generates an answer, handles complex questions, and recognizes when the answer is not present in the provided context.

Model Architecture

The model is based on the Mamba architecture and is loaded through the MambaLMHeadModel class of the mamba_ssm package. It is designed to answer accurately even when the context is limited or the question is intricate.

Unique Features

Advanced Parameterization: With approximately 790 million parameters, the model balances efficiency and capability.
Contextual Understanding: The model can recognize when the answer to a question is not available in the provided context.

Capabilities

Complex Question Handling: Capable of understanding and responding to a wide range of complex questions.
Parameter Efficiency: Although it has fewer parameters than many larger language models, it remains efficient and accurate.
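To put the parameter counts in concrete terms, the following is a rough sketch of how much memory the weights alone would occupy when loaded in float16, as the usage example in this card does. The helper name estimate_weight_memory_bytes is hypothetical, the parameter counts are the approximate figures quoted in this card, and the estimate ignores activations, recurrent state, and framework overhead.

```python
def estimate_weight_memory_bytes(num_parameters: int, bytes_per_parameter: int = 2) -> int:
    """Rough size of the weights only (hypothetical helper, not part of any library).

    float16 uses 2 bytes per parameter; activations and framework overhead
    are not included in this estimate.
    """
    return num_parameters * bytes_per_parameter


# Approximate parameter counts as stated in this card.
MODELS = {
    "DeepMount00/Mamba-QA-ITA-790m": 790_000_000,
    "DeepMount00/Mamba-QA-ITA": 370_000_000,
}

for name, num_parameters in MODELS.items():
    size_gb = estimate_weight_memory_bytes(num_parameters) / 1e9
    print(f"{name}: ~{size_gb:.2f} GB of weights in float16")
# prints ~1.58 GB for the 790m model and ~0.74 GB for the 370m model
```

Actual GPU usage during generation will be higher than these figures, so they are best read as a lower bound.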

How to Use

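To use this model for question answering, load the tokenizer and the model, then pass a question and a context passage to the helper function. The example requires the torch, transformers, and mamba_ssm packages and a CUDA-capable GPU: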

import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

model_name = "DeepMount00/Mamba-QA-ITA-790m"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MambaLMHeadModel.from_pretrained(model_name, device="cuda", dtype=torch.float16)

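def run_qa_mamba(model, question, context):
    # Prompt format: context, blank line, then "Q: ...\nA:". Uses the module-level tokenizer.
    prompt = f"{context}\n\nQ: {question}\nA:"
    input_ids = torch.LongTensor([tokenizer.encode(prompt)]).cuda()
    output = model.generate(input_ids=input_ids, max_length=2048, eos_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens, so the prompt does not have to be removed by string matching.
    generated = tokenizer.decode(output[0, input_ids.shape[1]:])
    # Keep the text before the first blank line and drop the end-of-text marker.
    answer = generated.split("\n\n")[0].replace("<|endoftext|>", "").strip()
    return answer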

question = """Quante torri ha bologna? """
context = """La torre degli Asinelli è una delle cosiddette due torri di Bologna, simbolo della città, situate in piazza di porta Ravegnana, all'incrocio tra le antiche strade San Donato (ora via Zamboni), San Vitale, Maggiore e Castiglione. Eretta, secondo la tradizione, fra il 1109 e il 1119 dal nobile Gherardo Asinelli, la torre è alta 97,20 metri, pende verso ovest per 2,23 metri e presenta all'interno una scalinata composta da 498 gradini. Ancora non si può dire con certezza quando e da chi fu costruita la torre degli Asinelli. Si presume che la torre debba il proprio nome a Gherardo Asinelli, il nobile cavaliere di fazione ghibellina al quale se ne attribuisce la costruzione, iniziata secondo una consolidata tradizione l'11 ottobre 1109 e terminata dieci anni dopo, nel 1119."""

answer = run_qa_mamba(model, question, context)
print(answer)
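The prompt layout and the answer clean-up used above can also be tried without a GPU or the model weights. The sketch below isolates those two steps as plain string functions. The names build_prompt and extract_answer are hypothetical, and the decoded string in the usage lines is a made-up stand-in for model output, not a real generation.

```python
def build_prompt(context: str, question: str) -> str:
    """Assemble the prompt in the layout used by the example above."""
    return f"{context}\n\nQ: {question}\nA:"


def extract_answer(decoded: str, prompt: str, eos_marker: str = "<|endoftext|>") -> str:
    """Recover the answer from decoded text that may still start with the prompt."""
    # Drop the echoed prompt if the decoded text begins with it.
    if decoded.startswith(prompt):
        decoded = decoded[len(prompt):]
    # Keep only the text before the first blank line, then remove the
    # end-of-text marker and surrounding whitespace.
    return decoded.split("\n\n")[0].replace(eos_marker, "").strip()


# Illustrative only: `decoded` imitates what a model might return.
prompt = build_prompt("La torre è alta 97,20 metri.", "Quanto è alta la torre?")
decoded = prompt + " 97,20 metri<|endoftext|>"
print(extract_answer(decoded, prompt))  # prints: 97,20 metri
```

Keeping these steps separate from generation makes it easier to check the prompt format and the post-processing on their own before running the full model.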

Developer

Michele Montebovi