
Model Card

Language Model for Next Token Prediction in English and Estonian

Model Overview

This is a neural network-based language model developed for next-token prediction in English and Estonian. The model uses RNN and LSTM architectures for sequence modeling rather than a transformer-based approach. It predicts the next word or token from the context provided by the preceding sequence.


Model Details

Architecture: The model uses a combination of Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks to handle sequential data efficiently. Key features include:

Multiple hidden layers for enhanced feature representation.
Custom embedding layers for bilingual token representations.
Optimized hyperparameters for balanced training performance.

Training Data: The model is trained on a custom bilingual corpus that includes diverse text from English and Estonian language resources.

Input and Output:

Input: A sequence of tokens in either English or Estonian.
Output: The predicted next token in the sequence.
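The architecture and input/output contract described above can be sketched in PyTorch. This is a minimal illustration, not the model's actual implementation: the class name `NextTokenLSTM`, the helper `predict_next_token`, and all layer sizes are assumptions chosen for the example, and the random token ids stand in for real tokenized text.

```python
import torch
import torch.nn as nn


class NextTokenLSTM(nn.Module):
    """Embedding -> stacked LSTM -> linear projection over the vocabulary."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        # One shared embedding table covers tokens from both languages.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids):
        # token_ids: integer tensor of shape (batch, seq_len)
        embedded = self.embedding(token_ids)
        hidden_states, _ = self.lstm(embedded)
        return self.proj(hidden_states)  # (batch, seq_len, vocab_size)


def predict_next_token(model, token_ids):
    """Greedy prediction: the most likely token id following each sequence."""
    model.eval()
    with torch.no_grad():
        logits = model(token_ids)
    # Only the logits at the last position predict the next token.
    return logits[:, -1, :].argmax(dim=-1)


# Hypothetical sizes, chosen only for illustration.
model = NextTokenLSTM(vocab_size=1000)
context = torch.randint(0, 1000, (2, 12))  # two sequences of 12 token ids
next_ids = predict_next_token(model, context)
print(next_ids.shape)  # torch.Size([2])
```

Mapping the predicted ids back to words requires the tokenizer's vocabulary, which this sketch leaves out.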

Step 1: Install Required Libraries

!pip install transformers datasets evaluate torch

!pip show transformers

!pip install --upgrade transformers

Step 2: Import Libraries

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments, Trainer
from datasets import Dataset
import pandas as pd

Step 3: Load Pretrained Model and Tokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# GPT-2 ships without a pad token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

Step 5: Tokenize Dataset

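def tokenize_function(examples):
    # Coerce every cell to a string before tokenizing
    inputs = [str(text) for text in examples["input"]]
    outputs = [str(text) for text in examples["output"]]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding="max_length")
    target = tokenizer(outputs, max_length=512, truncation=True, padding="max_length")
    # Set padded label positions to -100 so the loss ignores them
    model_inputs["labels"] = [
        [tok if keep else -100 for tok, keep in zip(ids, mask)]
        for ids, mask in zip(target["input_ids"], target["attention_mask"])
    ]
    return model_inputs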

train_tokenized = train_dataset.map(tokenize_function, batched=True)
val_tokenized = val_dataset.map(tokenize_function, batched=True)

Step 6: Define Training Arguments

training_args = TrainingArguments(
    output_dir="./fine_tuned_gpt2",
    eval_strategy="epoch",  # older transformers releases name this evaluation_strategy
    learning_rate=5e-5,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    save_total_limit=2,
    save_strategy="epoch",  # must match eval_strategy when load_best_model_at_end=True
    logging_dir="./logs",
    logging_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)
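The card imports `Trainer` but stops before using it. The following is a sketch of how the pieces defined so far could be passed to a `Trainer`, under the assumption that `model`, `training_args`, `train_tokenized`, and `val_tokenized` exist as in the earlier steps. The helper names `build_trainer` and `perplexity_from_loss` are hypothetical and introduced only for this example.

```python
import math

from transformers import Trainer


def build_trainer(model, training_args, train_tokenized, val_tokenized):
    """Wire the model, training arguments, and tokenized splits into a Trainer."""
    return Trainer(
        model=model,
        args=training_args,
        train_dataset=train_tokenized,
        eval_dataset=val_tokenized,
    )


def perplexity_from_loss(eval_loss):
    """Perplexity is the exponential of the mean cross-entropy loss."""
    return math.exp(eval_loss)


# Typical usage once the earlier steps have run (not executed here):
# trainer = build_trainer(model, training_args, train_tokenized, val_tokenized)
# trainer.train()
# metrics = trainer.evaluate()
# print(perplexity_from_loss(metrics["eval_loss"]))
```

Since `metric_for_best_model` is `eval_loss`, the checkpoint reloaded at the end of training is the one with the lowest validation loss, which is also the one with the lowest perplexity.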
