As the original dolphin author (I assume), why is dolphin much worse than the original Llama 3?

#2
by dadadies - opened

Here is one example. It also sometimes speaks as itself, then as the user, and then as itself again, all in one response. I am largely new to using local LLMs, but I've heard that uncensored models are often far worse. What are the technical reasons for that?


original llama 3

USER
write 10 sentences, each ending with the word apple.

AI
Here are 10 sentences, each ending with the word "apple":

  1. The juicy red fruit that I love is an apple.
  2. After a long day, I like to bite into a crunchy apple.
  3. My favorite snack is a slice of fresh Granny Smith apple.
  4. The farmer's market had the most delicious sweet apple.
  5. In autumn, the trees are filled with ripe, shiny apple.
  6. Mom always packs a healthy lunch with a Granny Smith apple.
  7. I'm not a big fan of cooking, but I love baking an apple.
  8. The new iPhone commercial features a person holding a shiny new Apple.
  9. On my desk, there's a paperweight shaped like a small green apple.
  10. When I get sick, my mom makes me a warm glass of apple juice and apple.

dolphin 2.9.1 llama 3

USER
write 10 sentences, each ending with the word apple.

AI

  1. When Sarah saw an unusual fruit stand on a street corner, she couldn't resist asking the vendor about the mysterious variety of apples.
  2. The curious fruit merchant had noticed that several rare apple cultivars were starting to appear at local orchards and fairs.
  3. As the enthusiastic apple collector continued adding more uncommon types to her ever-growing private collection, she remained determinedly seeking even rarer samples for years to come.
  4. Meanwhile, the dedicated researcher was investigating some puzzling biochemical reactions related to a few unique apple varieties.
  5. At conferences, several experts were presenting some intriguing new paradigms of fruit science and discussing certain breakthrough discoveries in apple cultivation techniques.
  6. In interviews, knowledgeable enthusiasts were excitedly comparing notes about recent advances in apple orchards, markets, festivals, and other social events around towns.
  7. On blogs featuring lively discussions among fans of rare apple strains, the passionate debate raged on, fueling a vibrant online community for months.
  8. From emails detailing latest trends in exotic apple collections, to DMs sharing tips with fellow hobbyists' networks, people exchanged messages about upcoming events at various venues.
  9. At social media gatherings and workshops, more experienced coordinators were guiding the curious public through interesting topics of apple studies, tutorials, and other training sessions across different cities.
  10. So when the expert fruit speaker addressed an appreciative audience during a special event celebrating National Apple Day, they joined in honoring that unique occasion for years to share their passion with newcomers.

Happy 2022 celebration of National Apple Day!

Cognitive Computations org

To be clear - that is a tremendously difficult task for ANY LLM - here is GPT-4, for example:

Every morning, she starts her day with a bright red apple.
He reached out to pluck the last ripe apple from the tree.
On her desk, there lay a note wrapped around a green apple.
In her favorite story, the princess was awakened not by a kiss, but by an apple.
They made a pact to meet every year under the old apple tree, sharing an apple.
After dinner, he enjoyed a slice of homemade apple pie topped with a fresh apple.
In the painting, a small child's hand is gently holding a shiny apple.
Amidst the chaos of the market, her eyes were drawn to a stall selling nothing but apple.
The teacher was pleasantly surprised when the new student presented her with an apple.
As the autumn winds began to chill, the scent of cinnamon and apple filled the kitchen.
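
For anyone who wants to score outputs like these rather than eyeball them, here is a minimal sketch in Python. The helper names `ends_with_word` and `score` are made up for illustration; they are not part of any library, and this is not how Dolphin or Llama 3 are evaluated. It only checks whether the last alphabetic word on each line is "apple", ignoring case, list numbering and trailing punctuation.

```python
import re


def ends_with_word(sentence: str, word: str = "apple") -> bool:
    """Return True if the last alphabetic word in `sentence` equals `word`."""
    # Pull out runs of letters only, so list numbers ("1.") and trailing
    # punctuation or quotes are ignored.
    words = re.findall(r"[A-Za-z]+", sentence)
    return bool(words) and words[-1].lower() == word.lower()


def score(lines, word: str = "apple"):
    """Return (hits, total) over the non-empty lines of a model response."""
    sentences = [ln.strip() for ln in lines if ln.strip()]
    hits = sum(ends_with_word(s, word) for s in sentences)
    return hits, len(sentences)


# Three sentences copied from the responses quoted in this thread.
samples = [
    "Every morning, she starts her day with a bright red apple.",
    "He reached out to pluck the last ripe apple from the tree.",
    "  5. In autumn, the trees are filled with ripe, shiny apple.",
]
print(score(samples))  # (2, 3)
```

Note that this only tests the literal constraint. A sentence can end in "apple" and still not make sense, so it says nothing about quality.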

Meta spent ludicrous amounts of money on their post-training recipe - and they target different results than we do.
We target an unbiased, uncensored and helpful model. They target a model that will please the greatest number of people and won't upset anyone's sensibilities.
We initialize dolphin from the base model, not their instruct model. That means we don't start with a model that can do this apple instruction, and we don't teach it to.

So to answer your question: Dolphin is not so much worse than the "original" llama 3. It's the best at what it does. Also, if you want to ask the "original" llama 3, give this model the apple instruction and see what happens: https://huggingface.co/meta-llama/Meta-Llama-3-8B.

I'd carefully consider your usage of the word "worse" before asking questions of the model creators. Doesn't start things off on the right foot.

Crystalcareai changed discussion status to closed
Cognitive Computations org

Have you even taken the time to read the sentences? The ones from the original llama-3 don't even make sense sometimes. I reckon that if you want to use these sentences, they should at least make sense. For example: "When I get sick, my mom makes me a warm glass of apple juice and apple."
