Model Card for ponniyinselvan_1.4b_alpha

This model was trained on the Ponniyin Selvan Tamil text corpus.

Model Details

The base model is EleutherAI's Pythia 1.4B.

Model Description

  • Finetuned from model: Pythia 1.4B

Uses

For educational and research purposes only. Not fit for any kind of practical use.

Bias, Risks, and Limitations

The bias, risks, and limitations of the base model apply to this model as well.

How to Get Started with the Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "RajuKandasamy/ponniyinselvan_1.4b_alpha"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned model and its tokenizer, then switch to inference mode.
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()

prompt = "வந்தியத்தேவன்"
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
# Single unpadded prompt, so every position is attended to.
attention_mask = torch.ones_like(input_ids)

print("Thinking ...\n")
with torch.no_grad():
    output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_length=256,
        do_sample=True,
        temperature=0.9,
        top_p=0.9,
        top_k=500,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
output_str = tokenizer.decode(output[0], skip_special_tokens=False)
print(output_str)
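
The generate call above samples with temperature, top_k, and top_p. As a rough illustration of how those three settings narrow the candidates for a single step, here is a minimal pure-Python sketch. It is a simplification and not the transformers implementation: the function names filtered_distribution and sample_next_token are hypothetical, the logits are made up, and repetition_penalty is left out.

```python
import math
import random


def filtered_distribution(logits, temperature=0.9, top_k=500, top_p=0.9):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # 1. Temperature: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # 2. Top-k: keep only the k highest-scoring token ids.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]
    # 3. Softmax over the surviving tokens (shifted by the max for stability).
    peak = scaled[kept[0]]
    exps = [math.exp(scaled[i] - peak) for i in kept]
    total = sum(exps)
    probs = [e / total for e in exps]
    # 4. Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for token_id, p in zip(kept, probs):
        nucleus.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # 5. Renormalise so the remaining probabilities sum to 1.
    mass = sum(p for _, p in nucleus)
    return {token_id: p / mass for token_id, p in nucleus}


def sample_next_token(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filtered_distribution(logits, **kwargs)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up scores for a 4-token vocabulary
    print(filtered_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.9))
    print(sample_next_token(toy_logits, rng=random.Random(0)))
```

With a vocabulary smaller than top_k, the top-k step keeps everything and only temperature and top_p have an effect.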

Training Details

Trained for 10 epochs.

Training Data

Ponniyin Selvan text corpus

Training Procedure

Causal language modelling with a custom BPE tokenizer.
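
For readers unfamiliar with BPE (byte-pair encoding), the idea can be sketched in a few lines of pure Python. This is a toy illustration only and not the tokenizer shipped with this model: the function names are hypothetical, the training words are made up, and it merges Unicode characters rather than bytes, whereas production tokenizers are typically byte-level.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in a tuple of symbols."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of training words."""
    # Each distinct word starts as a tuple of single characters with a frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Apply the new rule to the whole vocabulary before the next round.
        merged_vocab = Counter()
        for symbols, freq in vocab.items():
            merged_vocab[merge_pair(symbols, best)] += freq
        vocab = merged_vocab
    return merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    rules = learn_bpe_merges(["abab", "abab", "abc"], num_merges=2)
    print(rules)
    print(encode("abab", rules), encode("abc", rules))
```

Training a tokenizer on the target corpus lets frequent Tamil character sequences become single tokens, which usually shortens the encoded text compared with a tokenizer learned mostly from English.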
