metadata
license: apache-2.0
base_model: PY007/TinyLlama-1.1B-intermediate-step-240k-503b
tags:
  - bees
  - beekeeping
  - honey
metrics:
  - accuracy
inference:
  parameters:
    max_new_tokens: 64
    do_sample: true
    repetition_penalty: 1.1
    no_repeat_ngram_size: 5
    eta_cutoff: 0.0008
widget:
  - text: In beekeeping, the term "queen excluder" refers to
    example_title: Queen Excluder
  - text: One way to encourage a honey bee colony to produce more honey is by
    example_title: Increasing Honey Production
  - text: The lifecycle of a worker bee consists of several stages, starting with
    example_title: Lifecycle of a Worker Bee
  - text: Varroa destructor is a type of mite that
    example_title: Varroa Destructor
  - text: In the world of beekeeping, the acronym PPE stands for
    example_title: Beekeeping PPE
  - text: The term "robbing" in beekeeping refers to the act of
    example_title: Robbing in Beekeeping
  - text: |-
      Question: What's the primary function of drone bees in a hive?
      Answer:
    example_title: Role of Drone Bees
  - text: To harvest honey from a hive, beekeepers often use a device known as a
    example_title: Honey Harvesting Device
  - text: >-
      Problem: You have a hive that produces 60 pounds of honey per year. You
      decide to split the hive into two. Assuming each hive now produces at a
      70% rate compared to before, how much honey will you get from both hives
      next year?

      To calculate
    example_title: Beekeeping Math Problem
  - text: In beekeeping, "swarming" is the process where
    example_title: Swarming
pipeline_tag: text-generation
datasets:
  - BEE-spoke-data/bees-internal
language:
  - en

TinyLlama-1.1bee 🐝


As we feverishly hit the refresh button on hf.co's homepage, on the hunt for the newest waifu chatbot to grace the AI stage, an epiphany struck us like a bee sting. What could we offer to the hive-mind of the community? The answer was as clear as honey: beekeeping, naturally. And thus, this un-bee-lievable model was born.

Details

This model is a fine-tuned version of PY007/TinyLlama-1.1B-intermediate-step-240k-503b on the BEE-spoke-data/bees-internal dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4285
  • Accuracy: 0.4969
***** eval metrics *****
  eval_accuracy           =     0.4972                                              
  eval_loss               =     2.4283
  eval_runtime            = 0:00:53.12
  eval_samples            =        239
  eval_samples_per_second =      4.499
  eval_steps_per_second   =      1.129
  perplexity              =    11.3391
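The reported perplexity follows from the evaluation loss: for a causal language model, perplexity is the exponential of the mean cross-entropy loss. A minimal check in Python, using only the numbers printed above (the small difference from the logged value comes from the loss being rounded to four decimals):

```python
import math

# Values copied from the eval metrics above.
eval_loss = 2.4283
reported_perplexity = 11.3391

# Perplexity is exp(mean cross-entropy loss in nats).
perplexity = math.exp(eval_loss)

print(f"perplexity from loss: {perplexity:.2f}")
print(f"difference from reported: {abs(perplexity - reported_perplexity):.4f}")
```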

📜 Intended Uses & Limitations 📜

Intended Uses:

  1. Educational Engagement: Whether you're a novice beekeeper, an enthusiast, or someone just looking to understand the buzz around bees, this model aims to serve as an informative and entertaining resource.
  2. General Queries: Have questions about hive management, bee species, or honey extraction? Feel free to consult the model for general insights.
  3. Academic & Research Inspiration: If you're diving into the world of apiculture studies or environmental science, our model could offer some preliminary insights and ideas.
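For readers who want to try such queries locally, here is a minimal sketch assuming the `transformers` library is installed. The sampling settings are copied from the `inference.parameters` block of this card's metadata. The repository id `BEE-spoke-data/TinyLlama-1.1bee` and the helper name `ask_the_hive` are assumptions for illustration and are not stated in this card; adjust them to wherever the model actually lives.

```python
# Sampling settings copied from the `inference.parameters` metadata of this card.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "repetition_penalty": 1.1,
    "no_repeat_ngram_size": 5,
    "eta_cutoff": 0.0008,
}

# Assumption: this Hub repository id is a guess, not taken from the card.
MODEL_ID = "BEE-spoke-data/TinyLlama-1.1bee"


def ask_the_hive(prompt: str, model_id: str = MODEL_ID) -> str:
    """Generate a continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without
    # having transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    # The text-generation pipeline returns a list of dicts.
    return outputs[0]["generated_text"]
```

Calling `ask_the_hive('In beekeeping, the term "queen excluder" refers to')` would download the weights on first use and return the prompt plus up to 64 sampled tokens. Because sampling is enabled, the output differs from run to run.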

Limitations:

  1. Not a Beekeeping Expert: As much as we admire bees and their hard work, this model is not a certified apiculturist. Please consult professional beekeeping resources or experts for serious decisions related to hive management, bee health, and honey production.
  2. Licensing: Apache-2.0, following TinyLlama
  3. Infallibility: Our model can err, just like any other piece of technology (or bee). Always double-check the information before applying it to your own hive or research.
  4. Ethical Constraints: This model may not be used for any illegal or unethical activities, including but not limited to: bioterrorism & standard terrorism, harassment, or spreading disinformation.

Training and evaluation data

While the full dataset is not yet complete and therefore not yet released for "safety reasons", you can check out a preliminary sample at: bees-v0

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 80085
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2.0
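Two of these values are derived rather than set directly, and a short sketch may help. The total batch size of 32 is the per-device batch size times the gradient accumulation steps, which implies a single device. The schedule function below is an illustrative reimplementation of a cosine schedule with linear warmup, not the training code itself; the total step count of 1000 is a hypothetical number, since the card does not report it.

```python
import math

# Values from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_RATIO = 0.03

# Effective batch size per optimizer step (single device implied by 4 * 8 = 32).
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def cosine_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length, for illustration only.
TOTAL_STEPS = 1000
for step in (0, 15, 30, 500, 1000):
    print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")
```

The rate climbs linearly over the first 3% of steps, peaks at 1e-4, and then decays along a half cosine to zero at the final step.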

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 29.15 |
| ARC (25-shot)       | 30.55 |
| HellaSwag (10-shot) | 51.8  |
| MMLU (5-shot)       | 24.25 |
| TruthfulQA (0-shot) | 39.01 |
| Winogrande (5-shot) | 54.46 |
| GSM8K (5-shot)      | 0.23  |
| DROP (3-shot)       | 3.74  |
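The "Avg." row is the unweighted mean of the seven benchmark scores. A quick check in Python, using only the values listed above:

```python
# Benchmark scores copied from the leaderboard results above.
scores = {
    "ARC (25-shot)": 30.55,
    "HellaSwag (10-shot)": 51.8,
    "MMLU (5-shot)": 24.25,
    "TruthfulQA (0-shot)": 39.01,
    "Winogrande (5-shot)": 54.46,
    "GSM8K (5-shot)": 0.23,
    "DROP (3-shot)": 3.74,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))
```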