
codellama2-finetuned-nl2bash-fin

Fine-tuned on the AnishJoshi/nl2bash-custom dataset for generating Bash commands from natural language descriptions.

Model Details

  • Model Name: CodeLlama2-Finetuned-NL2Bash
  • Base Model: CodeLlama2
  • Task: Natural Language to Bash Script Conversion
  • Framework: PyTorch
  • Model Size: 6.74B parameters (FP16, Safetensors)
  • Fine-tuning Dataset: Custom dataset of natural language commands and corresponding Bash scripts (AnishJoshi/nl2bash-custom)

Model Description

  • Developed by: Anish Joshi
  • Model type: CausalLM
  • Finetuned from model: CodeLlama2

Files Included

  • adapter_config.json: Configuration file for the adapter layers.
  • adapter_model.safetensors: Weights of the adapter layers in the Safetensors format (see the loading sketch after this list).
  • optimizer.pt: State of the optimizer used during training.
  • rng_state.pth: State of the random number generator.
  • scheduler.pt: State of the learning rate scheduler.
  • special_tokens_map.json: Mapping for special tokens used by the tokenizer.
  • tokenizer.json: Tokenizer model including the vocabulary.
  • tokenizer_config.json: Configuration settings for the tokenizer.
  • trainer_state.json: State of the trainer including training metrics.
  • training_args.bin: Training arguments used for fine-tuning.
  • README.md: This model card.
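
Because the repository ships adapter weights (adapter_config.json, adapter_model.safetensors) rather than a merged full model, one way to use them is to attach the adapter to the base model with the peft library. The sketch below is illustrative only: the base checkpoint name codellama/CodeLlama-7b-hf is an assumption, as this card does not state the exact base repository.

# Minimal sketch, assuming peft is installed and the base checkpoint is
# codellama/CodeLlama-7b-hf (an assumption, not stated in this card).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("AnishJoshi/codellama2-finetuned-nl2bash-fin")

# Attach the fine-tuned adapter weights on top of the base model
model = PeftModel.from_pretrained(base_model, "AnishJoshi/codellama2-finetuned-nl2bash-fin")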

Usage

Load model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AnishJoshi/codellama2-finetuned-nl2bash-fin")
model = AutoModelForCausalLM.from_pretrained("AnishJoshi/codellama2-finetuned-nl2bash-fin")
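
Once loaded, generation uses the standard transformers API. The prompt wording below is only an illustration; the exact prompt template used during fine-tuning is defined by the AnishJoshi/nl2bash-custom dataset and the finetuning notebook.

# Minimal generation sketch; the prompt format is an assumption,
# not the documented fine-tuning template.
prompt = "Generate a bash command for the following task:\nList all files larger than 10 MB in the current directory.\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))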

Training Details

Full training details are available in the accompanying Finetuning Notebook.

Training Hyperparameters

Training arguments and configuration are set using TrainingArguments and LoraConfig. The model is fine-tuned with the following parameters (a configuration sketch follows the list):

  • output_dir: codellama2-finetuned-nl2bash - Directory to save the fine-tuned model.
  • per_device_train_batch_size: 2 - Batch size per device.
  • gradient_accumulation_steps: 16 - Number of gradient accumulation steps.
  • optim: paged_adamw_32bit - Optimizer type.
  • learning_rate: 2e-4 - Learning rate.
  • lr_scheduler_type: cosine - Learning rate scheduler type.
  • save_strategy: epoch - Strategy to save checkpoints.
  • logging_steps: 10 - Number of steps between logging.
  • num_train_epochs: 1 - Number of training epochs.
  • max_steps: 100 - Maximum number of training steps.
  • fp16: True - Use 16-bit floating-point precision.
  • push_to_hub: False - Whether to push the model to Hugging Face Hub.
  • report_to: none - Reporting destination.
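
The sketch below reproduces this configuration in code. The TrainingArguments values mirror the list above; the LoraConfig values (r, lora_alpha, lora_dropout) are illustrative assumptions, since this card does not list them and the actual settings are in the finetuning notebook.

from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="codellama2-finetuned-nl2bash",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    save_strategy="epoch",
    logging_steps=10,
    num_train_epochs=1,
    max_steps=100,
    fp16=True,
    push_to_hub=False,
    report_to="none",
)

# Illustrative LoRA settings only; see the finetuning notebook for the actual values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)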

Evaluation

Evaluation metrics and calculations are available in the Evaluation Notebook.
