metadata
license: apache-2.0
base_model:
  - meta-llama/Llama-3.2-3B-Instruct
library_name: transformers
datasets:
  - ericflo/Llama-3.2-3B-COT

Thought-Ranked Llama 3.2 3B v2.1

There's a newer version (v2.2) of this model: https://huggingface.co/ericflo/Llama-3.2-3B-COTv2.2

What's New in v2?

The biggest improvement in v2 is how the model thinks through problems. Instead of just one level of thoughts, it can now explore up to 6 levels deep, building on its best ideas at each step. Think of it like having a conversation with yourself, where each new thought builds on your previous best insight.

How It Works

Let's look at an example. When asked "What would happen if the moon disappeared?", the model might think:

<thoughts>
<thought>First, I should consider the moon's main effects on Earth</thought>
<thought>The moon controls our tides, so ocean patterns would change dramatically</thought>
<thought>Without the moon's gravitational pull, Earth's rotation would become unstable</thought>
<thought>This would lead to extreme climate changes and disrupted ecosystems</thought>
<thought>The loss of moonlight would affect nocturnal animals and human culture</thought>
<thought>Combining all these effects, we'd see a cascade of environmental changes</thought>
</thoughts>

The disappearance of the moon would have far-reaching consequences for Earth...
[detailed answer follows]
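If you want to show only the final answer, or inspect the thoughts separately, the response can be split on the tags shown above. The following is a minimal sketch, assuming the model always emits a single `<thoughts>` block before the answer; the helper name `split_thoughts` is hypothetical and not part of the model or of transformers.

```python
import re

# Patterns for the tag format shown in the example above.
THOUGHTS_RE = re.compile(r"<thoughts>(.*?)</thoughts>", re.DOTALL)
THOUGHT_RE = re.compile(r"<thought>(.*?)</thought>", re.DOTALL)


def split_thoughts(text):
    """Return (thoughts, answer) for a response in the <thoughts> format.

    Hypothetical helper: assumes at most one <thoughts> block, placed
    before the final answer.
    """
    match = THOUGHTS_RE.search(text)
    if match is None:
        # No thought block: treat the whole response as the answer.
        return [], text.strip()
    thoughts = [t.strip() for t in THOUGHT_RE.findall(match.group(1))]
    answer = text[match.end():].strip()
    return thoughts, answer


sample = (
    "<thoughts>\n"
    "<thought>First, I should consider the moon's main effects on Earth</thought>\n"
    "<thought>The moon controls our tides, so ocean patterns would change dramatically</thought>\n"
    "</thoughts>\n\n"
    "The disappearance of the moon would have far-reaching consequences for Earth..."
)
thoughts, answer = split_thoughts(sample)
print(len(thoughts))
print(answer)
```

Responses generated without a thinking instruction may contain no tags at all, which is why the helper falls back to returning the full text as the answer.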

System Messages

The model responds to different types of system prompts. Here are some examples:

  1. Basic prompt:
{"role": "system", "content": "You are a helpful assistant. Think before responding."}
  2. Specific thought count:
{"role": "system", "content": "You are a helpful assistant. Think 3 thoughts before responding."}
  3. Standard helper:
{"role": "system", "content": "You are a helpful assistant."}

About 40% of training examples include a system message, and 75% of those mention thinking explicitly, so roughly 30% of all training examples (0.40 × 0.75) carry a thinking instruction. This mix helps the model learn when to think and how much.
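The three system prompt styles above can be produced by one small helper. This is only a sketch: `build_messages` and its parameters are hypothetical names chosen for illustration, and the prompt wording is copied from the examples listed here rather than from the training code.

```python
def build_messages(user_prompt, think=True, num_thoughts=None):
    """Build a chat message list using the system prompt styles listed above.

    Hypothetical helper for illustration only:
      - num_thoughts=N   -> "Think N thoughts before responding."
      - think=True       -> "Think before responding."
      - think=False      -> plain helper prompt with no thinking instruction
    """
    system = "You are a helpful assistant."
    if num_thoughts is not None:
        system += f" Think {num_thoughts} thoughts before responding."
    elif think:
        system += " Think before responding."
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("How would you teach a child to ride a bike?", num_thoughts=3)
print(messages[0]["content"])
```

The resulting list can be passed directly to `tokenizer.apply_chat_template`.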

Technical Details

  • Base Model: Llama 3.2 3B Instruct (meta-llama/Llama-3.2-3B-Instruct)
  • Training Data: 2,500 carefully selected examples, each with up to 6 levels of thought chains
  • Thought Selection: At each level, the model generates multiple possible thoughts and picks the best one using an external ranking system
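The level-by-level selection described above can be pictured as a greedy loop. The sketch below is an assumption about the general shape of the procedure, not the actual data pipeline: this card does not specify the ranking system, the number of candidates per level, or the stopping rule, so `propose`, `score`, and the toy functions are all hypothetical stand-ins.

```python
from typing import Callable, List


def build_thought_chain(
    question: str,
    propose: Callable[[str, List[str]], List[str]],
    score: Callable[[str, List[str], str], float],
    max_depth: int = 6,
) -> List[str]:
    """Greedily build a chain of thoughts, keeping the top-ranked one per level.

    propose(question, chain) returns candidate next thoughts (hypothetical).
    score(question, chain, candidate) stands in for the external ranker.
    """
    chain: List[str] = []
    for _ in range(max_depth):
        candidates = propose(question, chain)
        if not candidates:
            break  # nothing left to explore at this level
        best = max(candidates, key=lambda c: score(question, chain, c))
        chain.append(best)  # the next level builds on this thought
    return chain


# Toy stand-ins so the sketch runs without a model or a ranker.
def toy_propose(question, chain):
    level = len(chain) + 1
    return [f"level {level}: brief idea", f"level {level}: a more detailed idea"]


def toy_score(question, chain, candidate):
    return float(len(candidate))  # pretend longer thoughts rank higher


chain = build_thought_chain(
    "What would happen if the moon disappeared?", toy_propose, toy_score
)
print(len(chain))
print(chain[0])
```

In a real setup, `propose` would sample several continuations from the language model and `score` would call whatever ranker is used; the greedy structure is the only part this sketch is meant to convey.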

What's It Good For?

This model excels at tasks that benefit from careful thinking:

✅ Breaking down complex problems
✅ Step-by-step math solutions
✅ Detailed analysis of situations
✅ Explaining complicated concepts
✅ Making well-reasoned decisions

Limitations

  • Can sometimes overthink simple problems
  • Limited by the capabilities of the base Llama 3.2 3B model
  • Not suitable for critical decisions without human oversight
  • May occasionally generate irrelevant thought chains

Example Usage

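from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ericflo/Llama-3.2-3B-COT-v2.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant. Think 3 thoughts before responding."},
    {"role": "user", "content": "How would you teach a child to ride a bike?"}
]

# add_generation_prompt=True appends the assistant header so the model starts a new reply
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
)
# temperature only takes effect when sampling is enabled; 512 is an arbitrary length cap
output = model.generate(**inputs, do_sample=True, temperature=1.0, max_new_tokens=512)
# Decode only the newly generated tokens, not the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
print(response)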

Example output:

<thoughts>
<thought>Safety should be the first priority - helmet and protective gear</thought>
<thought>Starting with balance using training wheels can build confidence</thought>
<thought>Breaking the process into small, manageable steps will help avoid overwhelm</thought>
</thoughts>

Here's how I would teach a child to ride a bike...
[detailed answer follows]

Citation

@misc{thought-ranked-llama-v2,
  title={Thought-Ranked Llama 3.2 v2: Hierarchical Chain-of-Thought Generation},
  author={Eric Florenzano},
  year={2024},
  howpublished={\url{https://huggingface.co/ericflo/Llama-3.2-3B-COT-v2}}
}

Acknowledgments

This model builds on the Llama 3.2 3B base model from Meta. Special thanks to the open-source AI community for their contributions to chain-of-thought prompting techniques.