Edit model card

Model Card for Model ID

This model is pretrained and fine-tuned with Vietnamese language, based on GPT-NeoX which is a large language model developed by EleutherAI.

Model Details

Training Data

  • Pre-train: Culturax Vietnamese Dataset(450GB) + AI-Hub Vietnamese Dataset(1.3GB) + Crawled Vietnamese Wikipedia Dataset(630MB) + viwik18 Dataset(1.27GB)
  • Fine-tuning: 12MB Vietnamese Question & Answer dataset
    Vietnamese Alpaca(16412 rows) + Vietnamese QA Dataset based on viwik18(14293 rows)

Training Hardware

Trained on A100 40GB GPU and 48 core CPU. Took 18 hours to reach 10 epochs.

Hyperparameters

Hyperparameter Value
num_train_epochs 2670182400
train_batch_size 2
learning_rate 0.0001
warmup_steps 1000
weight_decay 0

How to use

The model can be loaded using the AutoModelForCausalLM functionality:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
Downloads last month
12
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.