---
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
library_name: peft
pipeline_tag: text-generation
---

# woym

This model is a fine-tuned version of TinyLlama-1.1B-Chat-v1.0 specialized for educational interactions with young children. It aims to provide helpful, age-appropriate responses to questions and prompts from primary school students.

## Model Details

### Model Description

This model was created by fine-tuning the TinyLlama-1.1B-Chat-v1.0 base model using the PEFT (Parameter-Efficient Fine-Tuning) library with QLoRA techniques. The fine-tuning focused on optimizing the model for educational content tailored to young children, enhancing its ability to provide clear, simple, and instructional responses suitable for primary education.

- **Developed by:** Mohammad Ali
- **Funded by:** Self-funded research project
- **Model type:** Instruction-tuned causal language model with QLoRA fine-tuning
- **Language(s):** English
- **License:** Apache 2.0 (same as the base model, TinyLlama-1.1B-Chat-v1.0)
- **Finetuned from model:** TinyLlama/TinyLlama-1.1B-Chat-v1.0

### Model Sources

- **Repository:** https://github.com/mohammad17ali/woym.ai

## Uses

### Direct Use

This model is designed for direct interaction with primary school children or for educational applications targeting young learners. It can be used to:

- Answer basic educational questions
- Explain simple concepts
- Assist with homework in age-appropriate ways
- Generate educational content for young children
- Support teachers in creating learning materials

### Downstream Use

The model can be integrated into:

- Educational applications and platforms
- Classroom assistant tools
- Interactive learning environments
- Child-friendly chatbots
- Educational content creation systems

### Out-of-Scope Use

This model is not designed for:

- Providing medical, legal, or professional advice
- Generating content for adult audiences
- Addressing complex academic topics beyond primary education level
- Sensitive topics requiring nuanced understanding
- Decision-making in high-stakes scenarios

## Bias, Risks, and Limitations

- **Limited knowledge base:** As a fine-tuned version of a 1.1B-parameter model, it has significantly less knowledge than larger models.
- **Simplified responses:** May oversimplify complex topics in ways that could create misconceptions.
- **Language limitations:** Primarily trained on English data and educational contexts.
- **Potential biases:** May reflect biases present in the educational dataset used for fine-tuning.
- **Hallucination risk:** Like all language models, it may generate plausible-sounding but incorrect information.
- **Limited context window:** The model has a maximum context length of 512 tokens, limiting its ability to process lengthy conversations.
### Recommendations

- Always review the model's outputs before sharing them with children
- Provide clear instructions when prompting the model
- Use the model as a supplementary tool rather than a primary educational resource
- Be aware of the model's tendency to occasionally generate incorrect information
- Consider deploying with human-in-the-loop oversight when used in educational settings

## How to Get Started with the Model

Use the code below to get started with the model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the tokenizer and base model
base_model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

# Attach the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(model, "path/to/your/adapter")

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# Generate text
def generate_text(prompt):
    formatted_prompt = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)
    output = model.generate(
        **inputs,
        max_length=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.2
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=False)
    # Keep only the assistant's turn from the decoded output
    assistant_response = generated_text.split("<|im_start|>assistant\n")[-1].split("<|im_end|>")[0]
    return assistant_response

# Example usage
prompt = "Can you explain what photosynthesis is in simple terms?"
response = generate_text(prompt)
print(response)
```

## Training Details

### Training Data

This model was fine-tuned on the "ajibawa-2023/Education-Young-Children" dataset, which contains educational interactions between teachers and primary school students. The dataset covers a variety of educational topics appropriate for young learners.

### Training Procedure

The model was fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with the QLoRA technique to reduce memory usage while maintaining quality.

#### Preprocessing

- Input data was formatted with special tokens to denote user and assistant turns
- Prompts and responses were concatenated with appropriate markers
- Tokenization was performed with a maximum sequence length of 512 tokens

#### Training Hyperparameters

- **Training regime:** FP16 mixed precision
- **Number of epochs:** 2
- **Learning rate:** 2e-5
- **Batch size:** 1 (with gradient accumulation)
- **LoRA rank (r):** 8
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj
- **Warmup steps:** 100
- **Optimizer:** AdamW

#### Speeds, Sizes, Times

- **Training time:** Approximately [X] hours on a P100 GPU
- **Model size:** Base model (1.1B parameters) + 2-3 MB for the LoRA adapters
- **Hardware used:** NVIDIA P100 GPU on Kaggle

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out subset of the "ajibawa-2023/Education-Young-Children" dataset.

#### Factors

Evaluation considered:

- Response relevance to educational queries
- Age-appropriateness of language and content
- Accuracy of educational information
- Safety and appropriateness of content

#### Metrics

- Perplexity
- Manual evaluation of response quality
- Response coherence and helpfulness

### Results

[You can add specific evaluation results here when available]
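While numeric results are not yet reported, the sketch below shows one way the perplexity metric listed above could be computed on a held-out slice of the dataset. The adapter path, split size, and the `text` column name are assumptions for illustration, not details from the original evaluation setup.

```python
import math

import torch
from datasets import load_dataset
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and attach the fine-tuned adapters (adapter path is a placeholder)
base_model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(model, "path/to/your/adapter")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# Assumed held-out slice; the actual evaluation split is not published
eval_data = load_dataset("ajibawa-2023/Education-Young-Children", split="train[-200:]")

losses = []
with torch.no_grad():
    for example in eval_data:
        text = example["text"]  # assumed column name
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(device)
        # Causal LM loss with the inputs as labels (next-token prediction)
        outputs = model(**inputs, labels=inputs["input_ids"])
        losses.append(outputs.loss.item())

perplexity = math.exp(sum(losses) / len(losses))
print(f"Perplexity on held-out examples: {perplexity:.2f}")
```

Note that averaging per-example losses gives a sequence-averaged perplexity rather than a token-weighted one, which is a reasonable approximation for short educational exchanges.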
## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA P100 GPU
- **Hours used:** Approximately [X] hours
- **Cloud Provider:** Kaggle
- **Compute Region:** [Your region]
- **Carbon Emitted:** [Add estimation if available]

## Technical Specifications

### Model Architecture and Objective

The model uses the TinyLlama architecture (1.1B parameters) with additional LoRA adapters applied to the attention layers. The objective was next-token prediction using a causal language modeling approach, specialized for educational content. A short sketch of how these adapters are attached appears at the end of this card.

### Compute Infrastructure

#### Hardware

- NVIDIA P100 GPU on Kaggle
- 16 GB GPU memory
- 4 vCPUs

#### Software

- Python 3.10
- PyTorch 2.0+
- Transformers 4.30+
- PEFT 0.14.0
- Accelerate 0.20+

## Model Card Authors

Mohammad Ali

## Model Card Contact

- GitHub: https://github.com/mohammad17ali
- Email: mohammad.ali.goba@gmail.com
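For reference, the sketch below illustrates how LoRA adapters with the configuration reported in this card can be attached to the base model, and how small the trainable footprint is. It is a minimal illustration rather than the original training script; loading in fp16 is a simplification of the 4-bit QLoRA setup used during fine-tuning.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model in fp16 (the actual fine-tuning used 4-bit QLoRA;
# plain fp16 keeps this sketch minimal)
base = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.float16,
)

# LoRA configuration mirroring the hyperparameters reported above
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
# Only the adapter weights are trainable -- a tiny fraction of the 1.1B parameters,
# which is why the saved adapters are only a few megabytes
model.print_trainable_parameters()
```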