ExtractQueNumberMini Model

This model is fine-tuned to quickly extract question numbers from OCRed handwritten text. Its compact size lets it run efficiently on CPU.

Model Usage

To use this model, set the system prompt to the following:

Extract the question number from the given text. Your response should be just an integer representing the question number. Do not provide any explanation or context. Just the number.

Inference Code Example

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "rahulvk007/ExtractQueNumberMini"
device = "cpu"  # change to "cuda" for GPU

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Extract the question number from the given text. Your response should be just an integer which is the question number. Do not provide any explanation or context. Just the number.",
            "<Give OCR Text here>",
            "",
        )
    ],
    return_tensors="pt"
).to(device)

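# Generate the answer; the expected output is a short integer.
outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
# The decoded string contains the full prompt followed by the model's answer
# after the "### Response:" marker.
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))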
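
Because the decoded output repeats the whole prompt, the question number has to be read from the text that follows the final "### Response:" marker. The snippet below is a minimal sketch of one way to do that, assuming the model answers with plain digits. parse_question_number is a hypothetical helper introduced here for illustration; it is not part of the model repository.

```python
import re
from typing import Optional


def parse_question_number(decoded: str) -> Optional[int]:
    """Return the first integer after the last '### Response:' marker.

    Returns None when no digits follow the marker. If the marker is
    missing, the whole string is searched instead.
    """
    # Keep only the text after the last response marker.
    response = decoded.rsplit("### Response:", 1)[-1]
    # Take the first run of digits in the response, if any.
    match = re.search(r"\d+", response)
    return int(match.group()) if match else None
```

With the inference code above, it could be applied to each decoded string, for example parse_question_number(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]). Returning None rather than raising leaves the caller free to decide how to handle OCR text for which the model produced no number.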

Datasets

The model was fine-tuned on rahulvk007/quenumber_extraction_v2, specifically curated for this task.
