A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.

#51 opened about 1 year ago by Ayush8120

Unrecognized configuration class <class 'transformers.models.mistral.configuration_mistral.MistralConfig'>

#50 opened about 1 year ago by zeio

requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

#49 opened about 1 year ago by Jenad1kr

Problems with tokenizer

#48 opened about 1 year ago by abdurnawaz

QLoRA fine-tuning with a longer sequence length (max_length=2048, padding=True) causes RuntimeError: CUDA error: device-side assert triggered; shortening the length to 512 works!

#46 opened about 1 year ago by nps798

MCQ Question Answering

#45 opened about 1 year ago by Ayush8120

Is `added_tokens.json` intended to be here?

#43 opened about 1 year ago by xzuyn

Adding `safetensors` variant of this model

#42 opened about 1 year ago by nth-attempt

Adding `safetensors` variant of this model

#41 opened about 1 year ago by nth-attempt

Mistral in French?

#40 opened about 1 year ago by Giroud

Question answering

#39 opened about 1 year ago by codegood

TensorFlow variant coming?

#37 opened about 1 year ago by areinh

Default template and configuration for local run with GPU

#33 opened about 1 year ago by brunoedcf

Still throws refusals

#31 opened about 1 year ago by Phoenixalight

Has a massive repetition problem

#29 opened about 1 year ago by Delcos

Which Mistral datacenter was used for training?

#25 opened about 1 year ago by niko32

ValueError: Please specify `target_modules` in `peft_config`

#23 opened about 1 year ago by Tapendra

13b in the future?

#21 opened about 1 year ago by deleted

Architectural difference with Llama

#20 opened about 1 year ago by imone

How to deploy the model locally?

#19 opened about 1 year ago by chao0524

Quantized version of Mistral 7B (4-bit or 8-bit)

#18 opened about 1 year ago by ianuvrat

FlashAttention support for Mistral HF Implementation

#17 opened about 1 year ago by mxxtsai

What datasets were used to train the model?

#10 opened about 1 year ago by rv2307

Training data?

#8 opened about 1 year ago by dkgaraujo

Safetensor weights

#6 opened about 1 year ago by ghvandoorn

Dataset contamination tests

#1 opened about 1 year ago by imone