Finetuning llama2

#47 opened 6 months ago by zuhashaik

Any example of batch inference?

#46 opened 7 months ago by PrintScr

How to set max_split_size_mb?

1
#30 opened 10 months ago by neo-benjamin

max_position_embeddings = 2048?

1
#29 opened 10 months ago by zzzac

Load into 2 GPUs

3
#28 opened 10 months ago by sauravm8

Load model into TGI

#27 opened 10 months ago by schauppi

Perplexity

#22 opened 10 months ago by gsaivinay

70TB with multiple A5000

6
#21 opened 10 months ago by nashid

Inference time with TGI

1
#15 opened 10 months ago by jacktenyx

Can't launch with TGI

6
#14 opened 10 months ago by yekta

text-generation-inference error

7
#5 opened 10 months ago by msteele

Output always 0 tokens

11
#4 opened 10 months ago by sterogn