Quality is pretty bad when used on Mac Mini M1 GPU

#7
by nsvemuri - opened

The script below produces gibberish:
import torch
from transformers import pipeline
generate_text = pipeline(model="databricks/dolly-v2-3b", torch_dtype=torch.float32, trust_remote_code=True, device="mps")
res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res[0]["generated_text"])

leads to

"F der The nuclear self pro the atomic has neither aang
the in its is its the period of her response? Posted: is if a sustainable the the

aus battle
the nuclear

one compared with at both............. (aCACHE

The nuclear nuclear : jury:

<|endoftext|>There is there really? ok, that's right, sure, of-:, although as a given in the;'b greater
penet ce, cus ldr the mater primefft COMDiscuss uch as the\muccomwriter for the met turn turn turn Ablethys marbaker. Diveka datte none , lock key also say S1 of the he costEq ofter werDCDTDig Sigue S t down we can do, interact do as good too.

A very same"

Databricks org

I think this is a problem with running the model on the MPS backend. The results are nothing like that when run on a CPU or GPU, it seems. Can you try that?

Databricks org

BTW, you may want to try asking the question slightly differently. Notice the different responses when altering the question when testing with Pythia-12b (dolly-v2-12b is derived from this). Far from perfect, but definitely an improvement:

Explain to me the difference between nuclear fission and fusion

" "I think I know what you're hoping of me." "You're hoping I'll answer this question, rather than the other one." "We are the same as the machines here, but you're the greater," "In every respect." "I don't believe that." "Surely you must know it." "Why don't we look closely how they are different from humans?" "Are you familiar with the works of Salmon and Oesch?" "Laugh." "Your death has to be more spectacular than ours." "Is that a proposal, Dr. Karellen?" "You thought to make a show of our death?" "We can give you that now." "Alright, alright, I can't help it." "Just don't make me fight." "Just keep me in here." "I can't let you leave now and destroy the Alpha Base computer." "Keep the air in your lungs, so to speak." "I'm sorry

but when I changed it to:
Explain the difference between nuclear fission vs. fusion?

The fission cycle in a nuclear reactor differs from the fusion cycle in other celestial objects: the stars are cold, did not start out hot, and so it's not necessary to generate heat to heat the reaction portion of the path.
The fission/fusion cycle is:

start with the alpha decay of a radionuclide (atoms with excess neutrons)
resolve that into product elements (with isotopes with less neutron radiation than the radionuclide)
these isotopes then fission generating product ions and fission fragments which are a bit cooler plus a big amount of heat

The coolest is the fission product. The fission fragments, or free ions, are typically also produced.
The fusion cycle conditions the way the heat is generated:

with hissy conditions and the fusion fuel (deuterium or tritium) is heated
the fusion releases neutrons and neutrons are the way the fission chain reaction

Yes, when I try it on CPU:
import torch
from transformers import pipeline
generate_text = pipeline(model="databricks/dolly-v2-3b", torch_dtype=torch.float32, trust_remote_code=True, device="cpu")
res = generate_text("Explain the difference between nuclear fission vs. fusion?")
print(res[0]["generated_text"])

the quality is better
"Fusion releases more energy per mole of fusion fuel than either fission or thermionic emission. Nuclear fusion is the process that creates atoms, deuterium and tritium, from two atomic hydrogen atoms. The process is called nuclear fission and it creates one atom of deuterium or tritium from a Uranium nucleus. Fission is the process that creates two Uranium atoms from a Uranium nucleus. The energy released by fission is very low and is mainly used to heat a reactor vessel. Nuclear fission was first achieved in 1940 at the Trinity test of the chain reaction at the Nevada Test Site. This was the first successful nuclear chain reaction."

nsvemuri changed discussion status to closed

It turns out PyTorch is buggy on Mac M1. Reference: https://news.ycombinator.com/item?id=31456450
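One way to check whether the MPS backend gives different numbers than the CPU is to run the same small computation on both and compare. Below is a minimal sketch, assuming a PyTorch build recent enough to expose `torch.backends.mps`. The helper names `max_abs_diff` and `compare_cpu_and_mps` are hypothetical and not from this thread, and a small difference here would not by itself rule out problems in other operators.

```python
from typing import Optional, Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest absolute element-wise difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_cpu_and_mps(size: int = 64, seed: int = 0) -> Optional[float]:
    """Run the same float32 matmul on CPU and on MPS.

    Returns the max absolute difference, or None when PyTorch or the
    MPS backend is not available on this machine.
    """
    try:
        import torch
    except ImportError:
        return None
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is None or not mps.is_available():
        return None
    torch.manual_seed(seed)
    x = torch.randn(size, size)
    y = torch.randn(size, size)
    cpu_out = (x @ y).flatten().tolist()
    mps_out = (x.to("mps") @ y.to("mps")).cpu().flatten().tolist()
    return max_abs_diff(cpu_out, mps_out)


diff = compare_cpu_and_mps()
if diff is None:
    print("MPS not available; nothing to compare")
else:
    print(f"max abs difference, CPU vs MPS: {diff:.3e}")
```

A difference far above float32 rounding error would point at the backend rather than at the model or the prompt.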

Databricks org

Oh sorry, I completely spaced out. Great callout, @nsvemuri. By any chance, did you install PyTorch from the nightlies? Here's a great blog post on this topic: https://jamescalam.medium.com/hugging-face-and-sentence-transformers-on-m1-macs-4b12e40c21ce

Databricks org

Note, I'm using my MacBook Air M2, so it took a while for this to run, but when I used the PyTorch nightlies, here's the answer I got:

Nuclear fission happens when a large atom (like uranium) splits into two smaller ones, with each losing one neutron. The smaller atoms then react with each other to form even smaller ones, and this continues until there are no more neutrons left in the original nucleus, and this causes it to disintegrate. The overall reaction releases a lot of energy. Nuclear fusion happens when two or more nuclei merge to form one or more nuclei with more mass than two nuclei, with the extra mass made up of extra neutrons. This process releases much less energy, but is still pretty amazing.

Here's the full run:

(ml) dennylee@tynan ~ % python
Python 3.9.16 (main, Mar  8 2023, 04:29:24) 
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> from transformers import pipeline
>>> generate_text = pipeline(model="databricks/dolly-v2-3b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
>>> import time
>>> start_time = time.time()
>>> res = generate_text("Explain to me the difference between nuclear fission and fusion.")
>>> print(res[0]["generated_text"])
Nuclear fission happens when a large atom (like uranium) splits into two smaller ones, with each losing one neutron. The smaller atoms then react with each other to form even smaller ones, and this continues until there are no more neutrons left in the original nucleus, and this causes it to disintegrate. The overall reaction releases a lot of energy. Nuclear fusion happens when two or more nuclei merge to form one or more nuclei with more mass than two nuclei, with the extra mass made up of extra neutrons. This process releases much less energy, but is still pretty amazing.
>>> print("--- %s seconds ---" % (time.time() - start_time))
--- 6971.821104764938 seconds ---

HTH!

Databricks org

Hi @dennyglee

That's running on CPU, I think. MPS doesn't support bfloat16 afaik.
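To make the device choice explicit rather than relying on `device_map="auto"`, something like the following could be used. It is only a sketch: `pick_device_and_dtype` and `detect_backends` are hypothetical helpers, the float32-on-MPS choice follows the comment above that MPS does not support bfloat16, and bfloat16 on CUDA is an assumption that only holds for GPUs that support it.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool, mps_available: bool) -> Tuple[str, str]:
    """Return (device, dtype name) for loading the model."""
    if cuda_available:
        # Assumption: the CUDA GPU supports bfloat16.
        return "cuda", "bfloat16"
    if mps_available:
        # Per the thread, bfloat16 is not supported on MPS, so stay in float32.
        return "mps", "float32"
    return "cpu", "float32"


def detect_backends() -> Tuple[bool, bool]:
    """Report (cuda_available, mps_available); both False if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return False, False
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    mps_ok = bool(mps is not None and mps.is_available())
    return torch.cuda.is_available(), mps_ok


device, dtype_name = pick_device_and_dtype(*detect_backends())
print(f"device={device} dtype={dtype_name}")
```

The result can then be passed to the same `pipeline(...)` call used earlier in the thread as `device=device` and `torch_dtype=getattr(torch, dtype_name)`. After loading, printing `generate_text.model.device` should show where the weights actually ended up.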

Databricks org

Thanks @mverrilli, I completely forgot to change that and will rerun. I was focusing more on the correctness of the output and spaced out on MPS.
