Instructions to use sarthakd57/bengali_narrative_to_comic_model with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use sarthakd57/bengali_narrative_to_comic_model with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("togethercomputer/Meta-Llama-3.2-3B-Instruct-Reference__TOG__FT") model = PeftModel.from_pretrained(base_model, "sarthakd57/bengali_narrative_to_comic_model") - Notebooks
- Google Colab
- Kaggle
Llama-3.2-3B-Instruct Bengali Narrative-to-Comic Scribe
This model is a fine-tuned LoRA adapter based on meta-llama/Llama-3.2-3B-Instruct, trained autonomously using the Adaption AI AutoScientist framework. It specializes in translating complex, historical Bengali literary prose into visually vivid, structured English comic book page layouts formatted cleanly in JSON.
Performance Metrics
Model shows a measurable percentage improvement over the baseline model on Adaption's held out testset for chosen category (Language). Judge is Gemini 3.1 pro.
- Dataset Win Rate: 100% vs. Baseline Model (0%)
- Target Task Convergence: Fully achieved schema adherence (JSON validation) without structural collapse or formatting corruption.
Training & Evaluation Metrics Analysis
The model was monitored dynamically over its optimization trajectory across 150 global steps. Below is the technical evaluation of the performance curves:

- The training loss (teal curve) showed healthy, consistent stochastic noise while maintaining a steady downward trajectory. The Validation Loss(black points) tracked the training line perfectly without any upward deviation, divergence, or widening gap. This confirms excellent generalization and proves the adapter did not overfit to the training split.
- Learning rate showed linear scaling over the first 10% of global steps (up to step 15) to prevent early gradient destabilization.
- Beautiful Cosine Annealing decay down to a stable baseline approaching 0 (ending at a tiny fraction of peak value), ensuring smooth weight stabilization in the final steps of the training run.
- Gradient Norm peaked during early, highly volatile training steps.Plunged rapidly within the first 15 steps and remained completely flat and stable at 2.62 ร 106 for the remainder of the run. This confirms that gradient clipping parameters successfully mitigated exploding/vanishing gradients, keeping the weight update magnitudes consistent and preserving model structural integrity across dense token generation.
Method & Hyperparameters
- Training Method: Supervised Fine-Tuning (SFT)
- Target Layers: All-linear components (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
- LoRA Rank (r): 32
- LoRA Alpha (alpha): 64
- Epochs: 1
- Learning Rate Scheduler: Cosine Decay with Linear Warmup
Augmented Ingestion Dataset: https://huggingface.co/datasets/sarthakd57/bengali_narrative_to_comic
- Downloads last month
- 5
Model tree for sarthakd57/bengali_narrative_to_comic_model
Base model
meta-llama/Llama-3.2-3B-Instruct