
codellama2-finetuned-nl2bash-fin

Fine-tuned on the AnishJoshi/nl2bash-custom dataset for generating Bash code from natural language descriptions.

Model Details

  • Model Name: CodeLlama2-Finetuned-NL2Bash
  • Base Model: CodeLlama2
  • Task: Natural Language to Bash Script Conversion
  • Framework: PyTorch
  • Fine-tuning Dataset: AnishJoshi/nl2bash-custom, a custom dataset of natural language commands and their corresponding Bash scripts (a loading sketch follows this list)
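
For reference, the dataset can be pulled with the datasets library. This is a minimal sketch; the dataset's column names are not documented in this card, so inspect the splits before building prompts.

from datasets import load_dataset

# Load the fine-tuning dataset from the Hugging Face Hub
dataset = load_dataset("AnishJoshi/nl2bash-custom")

# Inspect splits and columns before use (the schema is not documented here)
print(dataset)
print(dataset["train"][0])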

Model Description

  • Developed by: Anish Joshi
  • Model type: CausalLM
  • Finetuned from model: CodeLlama2

Files Included

  • adapter_config.json: Configuration file for the adapter layers.
  • adapter_model.safetensors: Weights of the adapter layers in the Safetensors format (see the loading sketch after this list).
  • optimizer.pt: State of the optimizer used during training.
  • rng_state.pth: State of the random number generator.
  • scheduler.pt: State of the learning rate scheduler.
  • special_tokens_map.json: Mapping for special tokens used by the tokenizer.
  • tokenizer.json: Tokenizer model including the vocabulary.
  • tokenizer_config.json: Configuration settings for the tokenizer.
  • trainer_state.json: State of the trainer including training metrics.
  • training_args.bin: Training arguments used for fine-tuning.
  • README.md: This model card.
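
Because the repository ships adapter weights rather than full model weights, they can also be applied on top of a base checkpoint with the peft library. This is a minimal sketch; codellama/CodeLlama-7b-hf is an assumed base checkpoint, since the card only names CodeLlama2.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "codellama/CodeLlama-7b-hf"  # assumed base checkpoint; the card only says CodeLlama2
adapter_id = "AnishJoshi/codellama2-finetuned-nl2bash-fin"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter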

Usage

Load model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AnishJoshi/codellama2-finetuned-nl2bash-fin")
model = AutoModelForCausalLM.from_pretrained("AnishJoshi/codellama2-finetuned-nl2bash-fin")
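
Continuing from the snippet above, a short generation sketch is shown below. The prompt format is an assumption (this card does not document the exact template used during fine-tuning), so adjust it to match the finetuning notebook.

# Example natural-language request (illustrative; not from the dataset)
prompt = "List all files in the current directory sorted by size"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))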

Training Details

Training details are available in the Finetuning Notebook.

Training Hyperparameters

Training arguments and configuration are set using TrainingArguments and LoraConfig. The model is fine-tuned with the following parameters (a code sketch follows this list):

  • output_dir: codellama2-finetuned-nl2bash - Directory to save the fine-tuned model.
  • per_device_train_batch_size: 2 - Batch size per device.
  • gradient_accumulation_steps: 16 - Number of gradient accumulation steps.
  • optim: paged_adamw_32bit - Optimizer type.
  • learning_rate: 2e-4 - Learning rate.
  • lr_scheduler_type: cosine - Learning rate scheduler type.
  • save_strategy: epoch - Strategy to save checkpoints.
  • logging_steps: 10 - Number of steps between logging.
  • num_train_epochs: 1 - Number of training epochs.
  • max_steps: 100 - Maximum number of training steps.
  • fp16: True - Use 16-bit floating-point precision.
  • push_to_hub: False - Whether to push the model to Hugging Face Hub.
  • report_to: none - Reporting destination.
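
For reference, a minimal sketch of how these values map onto TrainingArguments and LoraConfig is shown below. The TrainingArguments values come from the list above; the LoRA hyperparameters (r, lora_alpha, lora_dropout) are illustrative assumptions, since the card does not list them.

from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="codellama2-finetuned-nl2bash",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    save_strategy="epoch",
    logging_steps=10,
    num_train_epochs=1,
    max_steps=100,
    fp16=True,
    push_to_hub=False,
    report_to="none",
)

# Illustrative LoRA settings; the actual values are in the finetuning notebook
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)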

Evaluation

Evaluation metrics and calculations are available in the Evaluation Notebook.

Model size: 6.74B params · Tensor type: FP16 (Safetensors)