Arthur Zucker

#22 opened 5 days ago by

bradhutchings

New activity in meta-llama/Meta-Llama-3-8B 4 days ago

LlamaTokenizerFast.from_pretrained gives incorrect number of tokens for Llama3

#156 opened 8 days ago by

farzadab

New activity in mistralai/Mistral-7B-Instruct-v0.3 9 days ago

Add minor reference to transformers

#7 opened 9 days ago by

osanseviero

Upload tokenizer

#6 opened 9 days ago by

Upload tokenizer

#5 opened 9 days ago by

New activity in mistralai/Mistral-7B-v0.3 9 days ago

Update README.md

#4 opened 9 days ago by

Update README.md

#3 opened 9 days ago by

New activity in mistralai/Mistral-7B-Instruct-v0.3 9 days ago

Update README.md

#4 opened 9 days ago by

Update config.json

#3 opened 9 days ago by

New activity in mistralai/Mistral-7B-v0.3 9 days ago

Upload MistralForCausalLM

#2 opened 9 days ago by

New activity in mistralai/Mistral-7B-Instruct-v0.3 9 days ago

Upload MistralForCausalLM

#2 opened 9 days ago by

New activity in mistralai/Mistral-7B-v0.3 9 days ago

Upload tokenizer

#1 opened 9 days ago by

New activity in mistralai/Mistral-7B-Instruct-v0.3 9 days ago

Upload tokenizer

#1 opened 9 days ago by

New activity in 01-ai/Yi-9B 17 days ago

Tokenizer inconsistencies related to HTML tags

#11 opened about 2 months ago by

sanderland

New activity in meta-llama/Meta-Llama-3-8B-Instruct 18 days ago

Update config.json

#105 opened 18 days ago by

New activity in meta-llama/Meta-Llama-3-70B-Instruct 18 days ago

Update config.json

#49 opened 22 days ago by

New activity in meta-llama/Meta-Llama-3-70B-Instruct 22 days ago

The sample code for usage with Transformers is incorrect.

#45 opened 26 days ago by

endNone

New activity in meta-llama/Meta-Llama-3-8B-Instruct 22 days ago

How to use EOT_ID

#54 opened about 1 month ago by

saksham-lamini

New activity in meta-llama/Meta-Llama-3-8B 22 days ago

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.

#72 opened about 1 month ago by

tianke0711

Unable to load the model for Torch versions starting from 2.0.1

#34 opened about 1 month ago by

benhachem

New activity in meta-llama/Meta-Llama-3-70B-Instruct 22 days ago

Update config.json

#33 opened about 1 month ago by

Update README.md

#31 opened about 1 month ago by

shokim

New activity in meta-llama/Meta-Llama-3-8B-Instruct 22 days ago

Update tokenizer_config.json

#60 opened about 1 month ago by

Navanit-shorthills

New activity in meta-llama/Meta-Llama-3-8B-Instruct about 1 month ago

Update config.json

#71 opened about 1 month ago by

New activity in meta-llama/Meta-Llama-3-70B about 1 month ago

Update generation_config.json

#10 opened about 1 month ago by

New activity in meta-llama/Meta-Llama-3-70B-Instruct about 1 month ago

Update generation_config.json

#30 opened about 1 month ago by

New activity in meta-llama/Meta-Llama-3-8B about 1 month ago

Update generation_config.json

#68 opened about 1 month ago by

New activity in meta-llama/Meta-Llama-3-8B-Instruct about 1 month ago

Update generation_config.json

#62 opened about 1 month ago by

Update generation_config.json

#61 opened about 1 month ago by

New activity in meta-llama/Meta-Llama-3-8B about 1 month ago

Update generation_config.json

#67 opened about 1 month ago by

Generated text is garbled?

#53 opened about 1 month ago by

gbhall

is there a chat model? or i need to use specific instruction

#63 opened about 1 month ago by

Barianc

Llama-3-8B not giving the entire outcome in Google Colab

#55 opened about 1 month ago by

sayanroy07

how to download llama3

#58 opened about 1 month ago by

pacopascal

The model repeats the question/answer multiple times in the output

#60 opened about 1 month ago by

ameljelidi

Issues with tokenizer causing bad performance of model.

#66 opened about 1 month ago by

Takuonline

Hi, I try to load with LlamaForCausalLM, LlamaTokenizer, but it show me the error that "not a string"

#64 opened about 1 month ago by

hjewr

New activity in TRI-ML/mamba-7b-rw about 1 month ago

Adding `safetensors` variant of this model

#4 opened about 1 month ago by

lucataco

New activity in google/recurrentgemma-2b-it about 1 month ago

Fix tokenizer

#11 opened about 2 months ago by

pcuenq

New activity in google/recurrentgemma-2b about 1 month ago

Fix tokenizer

#6 opened about 2 months ago by

pcuenq

New activity in google/recurrentgemma-2b-it about 1 month ago

ValueError: The device_map provided does not give any device for the following parameters: model.normalizer

#8 opened about 2 months ago by

LaferriereJC

New activity in meta-llama/Meta-Llama-3-8B-Instruct about 1 month ago

Tokenizer mismatch all the time

#47 opened about 1 month ago by

tian9

New activity in meta-llama/Meta-Llama-3-8B about 1 month ago

Update tokenizer_config.json to prepend the bos token

#35 opened about 1 month ago by

eduagarcia

Rotary position embeddings not loaded

#39 opened about 1 month ago by

cwbc

Rename original/tokenizer.model to tokenizer.model

#6 opened about 1 month ago by

winglian

New activity in google/recurrentgemma-2b-it about 1 month ago

ValueError when use multiple GPUs for inference

#10 opened about 2 months ago by

aladinggit

New activity in google/gemma-1.1-7b-it about 2 months ago

Fix slow tokenizer

#14 opened about 2 months ago by

pcuenq

New activity in google/recurrentgemma-2b-it about 2 months ago

I can't load this model on L4 GPU

#5 opened about 2 months ago by

albusdd

New activity in google/gemma-1.1-7b-it-GGUF about 2 months ago

Add quantized GGUFs?

#2 opened about 2 months ago by

MoonRide

New activity in hf-internal-testing/tiny-random-gpt2 about 2 months ago

Adding `safetensors` variant of this model

#2 opened 4 months ago by

SFconvertbot

New activity in ai21labs/Jamba-v0.1 2 months ago

Fix bias logic to enable QLoRA finetuning

#5 opened 2 months ago by

winglian

New activity in llava-hf/llava-v1.6-mistral-7b-hf 2 months ago

wrong padding token

#2 opened 3 months ago by

aliencaocao

New activity in hpcai-tech/grok-1 2 months ago

Upload tokenizer

#4 opened 2 months ago by

New activity in CohereForAI/c4ai-command-r-v01 2 months ago

Update README.md

#34 opened 2 months ago by

New activity in google/gemma-7b-it 2 months ago

Model "gg-hf/gemma-7b-it" doesn't exist.

#76 opened 2 months ago by

OfirHaim

New activity in google/gemma-7b 3 months ago

Very high loss compared to keras

#46 opened 3 months ago by

tanimazsin130

New activity in google/gemma-7b-it 3 months ago

Bug of modeling_gemma.py in transformers 4.38.0

#45 opened 3 months ago by

zlk

Fix chat template does not compatible with ConversationalPipeline

#42 opened 3 months ago by

hiyouga

Bug about number generation?

#30 opened 3 months ago by

myownskyW7

New activity in google/gemma-7b 3 months ago

RuntimeError: FlashAttention backward for head dim > 192 requires A100/A800 or H100/H800