How to test the model

#4
by Soroor - opened

Hi there,
First of all, thanks for sharing this cool model.
I tried to test it but I couldn't get a result, so I think the way I tested it might be wrong. Could you please guide me on how to simply test it, or check my code?
Here is the code that I used, but it just re-writes the query and nothing more!

from transformers import AutoTokenizer
import transformers
import torch

model = "beomi/llama-2-ko-7b"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)
query="whatever"
sequences = pipeline(
    query,
    do_sample=False,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Thanks for your attention!

It seems to be working as intended; I've tested your code on Google Colab and it works fine.
Here's the demo Colab link: https://colab.research.google.com/drive/1yw2wnge6iHfj7PO5VVDA3jkmliiOqQvd?usp=sharing

What I changed is one line of code: since this checkpoint is stored in BF16, you'll need to use torch_dtype=torch.bfloat16, or remove that line entirely (the model's config already specifies torch_dtype). But it is actually not a critical issue for running the model.
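For reference, a minimal sketch of the adjusted pipeline setup, identical to your snippet except for that one dtype line:

from transformers import AutoTokenizer
import transformers
import torch

model = "beomi/llama-2-ko-7b"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # checkpoint is stored in bfloat16; you can also drop this line and let the model config decide
    device_map="auto",
)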

Could you explain your environment in more detail (Python version, PyTorch version, GPU, NVIDIA driver version, CUDA version, transformers/tokenizers/accelerate versions)?
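For example, a quick way to collect most of those versions from within Python (just a convenience snippet using the standard version attributes):

import sys
import torch, transformers, tokenizers, accelerate

print("python:", sys.version)
print("pytorch:", torch.__version__, "| cuda:", torch.version.cuda)
print("GPU:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
print("transformers:", transformers.__version__)
print("tokenizers:", tokenizers.__version__)
print("accelerate:", accelerate.__version__)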

Thank you for the prompt response, and thanks for your guidance.
It seems that the issue has been solved.
I also obtained the same result that you shared:
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 15/15 [00:26<00:00, 1.77s/it]
Result: How's the weather today? (였늘 날씨가 μ–΄λ–»μŠ΅λ‹ˆκΉŒ?) 10. 였늘 저녁에 뭐 ν•  κ²λ‹ˆκΉŒ? What are you doing tonight? 11. λͺ‡ μ‹œμ— ν‡΄κ·Όν•©λ‹ˆκΉŒ? What time do you get off? 12. μ˜€λŠ˜μ€ λͺ‡ μ‹œμ— μΆœκ·Όν•©λ‹ˆκΉŒ? What time are you coming to work today? 13. μ–΄λ””λ₯Ό κ°€μ‹­λ‹ˆκΉŒ? Where are you headed? 14. 당신은 무슨 일둜 μ „ν™”ν•˜μ…¨μŠ΅λ‹ˆκΉŒ? May I help you, sir? 15. 이 μ˜·μ€ μ–΄λ•Œμš”? How does this look on me? 16. μ°¨ ν•œ μž” μ–΄λ–»μŠ΅λ‹ˆκΉŒ? How about a cup of coffee? 17. μ™œ μ €μ—κ²Œ κ·Έλ ‡κ²Œ ν™”λ₯Ό λ‚΄κ³  μžˆμŠ΅λ‹ˆκΉŒ? Why are you so angry with me? 18. λ‚˜λŠ” 당신을 μ‚¬λž‘ν•©λ‹ˆλ‹€. I love you. 19. λ‚˜λŠ” 당신을 μ’‹μ•„ν•©λ‹ˆλ‹€. I like you. 20. 당신은 μ°Έ μ •μ—΄μ μž…λ‹ˆλ‹€. You...

It's working, but it seems to be generating questions and translations rather than general text.
Did you train the model to generate similar questions with their translations?
I've also noticed that it doesn't work well with larger texts; it just re-writes the whole given context again.

Here are my current environment details:
python: 3.8.16
pytorch: 2.0.1+cu117
GPU: A100 80G
Nvidia-Driver Version: 495.29.05
CUDA Version: 11.5
transformers: 4.32.0.dev0
tokenizers: 0.13.3
accelerate: 0.20.3

thanks again!

It could be a sampling issue.
How about adding some temperature and top-p sampling?
The phenomenon shown is NOT intended, since I trained the model on shuffled texts.
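For example, a minimal tweak to the generation call that enables sampling (the temperature/top_p values below are only illustrative):

sequences = pipeline(
    query,
    do_sample=True,           # enable sampling instead of greedy decoding
    temperature=0.7,          # illustrative value
    top_p=0.9,                # nucleus sampling
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")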

I saw the same issue as reported by Soroor. I had already used temperature 0.7 and top-p 0.9.

Is there a good prompt template to use for chat?
