This model has been pushed to the Hub using the PyTorchModelHubMixin integration.
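
For reference, the sketch below shows roughly how the PyTorchModelHubMixin integration works: a plain PyTorch nn.Module that also inherits the mixin gains save_pretrained, push_to_hub, and from_pretrained methods. The TinyModel class and repository name here are hypothetical placeholders, not the actual GPTModel from this repository.

    import torch.nn as nn
    from huggingface_hub import PyTorchModelHubMixin

    # Hypothetical toy module; the real model class is GPTModel in model_code.py.
    class TinyModel(nn.Module, PyTorchModelHubMixin):
        def __init__(self, hidden_size: int = 16):
            super().__init__()
            self.linear = nn.Linear(hidden_size, hidden_size)

        def forward(self, x):
            return self.linear(x)

    model = TinyModel()
    model.save_pretrained("tiny-model")                 # writes weights + config locally
    # model.push_to_hub("your-username/tiny-model")     # uploads to the Hub (requires Hub login)
    reloaded = TinyModel.from_pretrained("tiny-model")  # restores the model from the saved files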


GPT-2-Nepali-512 Model

  • 512 refers to the context length (the maximum sequence length, in tokens)

  • This repository contains a custom GPT-2 model trained on Nepali text. Follow the instructions below to use this model for text generation.


How to Use the Model

  1. Download the Required Code
    Download the model_code.py file from this repository and save it in the directory where you'll run the script.

  2. Install Required Libraries
    Ensure you have the necessary libraries installed:

    pip install transformers torch
    
  3. Run the Following Code
    Here's an example that loads the model and generates text:

    import torch
    from model_code import GPTModel, generate_and_print_sample
    from transformers import PreTrainedTokenizerFast
    
    # Load the tokenizer
    tokenizer = PreTrainedTokenizerFast.from_pretrained("Aananda-giri/NepaliBPE")
    
    # Define the starting text
    start_context = "रामले भात"
    
    # Load the pre-trained model
    loaded_model = GPTModel.from_pretrained("Aananda-giri/GPT2-Nepali")
    
    # Move the model to the appropriate device (CPU or GPU)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    loaded_model.to(device)
    loaded_model.eval()  # disable dropout for inference
    
    # Generate text
    generate_and_print_sample(
        loaded_model, tokenizer, device, start_context
    )
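
generate_and_print_sample is defined in the repository's model_code.py, which is not reproduced here. As a rough idea of what such a helper does, here is a minimal greedy-decoding sketch; the function name generate_greedy, its signature, and the assumption that the model maps token ids of shape (batch, seq_len) to logits of shape (batch, seq_len, vocab_size) are illustrative, not the repository's actual implementation.

    import torch

    def generate_greedy(model, tokenizer, device, start_context, max_new_tokens=50):
        # Encode the prompt and move it to the model's device.
        ids = torch.tensor([tokenizer.encode(start_context)], device=device)
        for _ in range(max_new_tokens):
            with torch.no_grad():
                # Keep at most the last 512 tokens, matching the model's context length.
                logits = model(ids[:, -512:])  # assumed shape: (1, seq_len, vocab_size)
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # most likely next token
            ids = torch.cat([ids, next_id], dim=1)
        print(tokenizer.decode(ids[0].tolist()))

The repository's helper likely applies temperature and/or top-k sampling rather than pure greedy decoding, so output from this sketch may differ from the example output shown later.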
    

Additional Notes

  • Tokenizer: The model uses the pre-trained tokenizer available at Aananda-giri/NepaliBPE. Ensure it is downloaded and accessible at runtime (a short round-trip check is shown after these notes).
  • Dependencies: This code requires transformers (by Hugging Face) and torch (PyTorch). Install them if they are not already available.
  • Device Compatibility: The script automatically detects if a CUDA-enabled GPU is available and utilizes it for faster inference. If not, it defaults to the CPU.
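
If you want to sanity-check the tokenizer on its own, an encode/decode round trip looks like this (the example text is the same prompt used above):

    from transformers import PreTrainedTokenizerFast

    # Downloads (and caches) the tokenizer files from the Hub on first use.
    tokenizer = PreTrainedTokenizerFast.from_pretrained("Aananda-giri/NepaliBPE")

    ids = tokenizer.encode("रामले भात")   # text -> token ids
    print(ids)
    print(tokenizer.decode(ids))          # token ids -> text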

Example Output

Input:

रामले भात

Generated Text:

रामले भात खाएर सन्तोष माने। ऊ आफ्ना साथीहरूसँग रमाइलो गरिरहेको थियो।
(English: "Ram ate rice and felt content. He was having fun with his friends.")


Model Details

  • Model size: 165M parameters
  • Tensor type: F32 (Safetensors)