language dependency
#60 opened 5 days ago
by
Jay369
[AUTOMATED] Model Memory Requirements
#59 opened 3 months ago
by
model-sizer-bot
Fine-tune dbrx via Hugging Face Trainer vs. LLM-Foundry
#58 opened 3 months ago
by
HaloHaloHottie
Deployments to Azure and Inference Endpoints
#55 opened 3 months ago
by
mo2024
Very sensitive to any repetition penalty!
#52 opened 3 months ago
by
jukofyork
Text2SQL2Output
#51 opened 3 months ago
by
Sudipta179002
The generated response cannot stop.
1
#50 opened 3 months ago
by
shaohuay
Saving dbrx model and tokenizer in dbfs
5
#49 opened 3 months ago
by
twony
OSError: Unable to load vocabulary from file
7
#47 opened 3 months ago
by
khurramnaseem
TypeError: __init__() got an unexpected keyword argument 'bias'
2
#46 opened 3 months ago
by
dainesn1
[DO NOT REVIEW] Mixtral like config
#45 opened 4 months ago
by
Pernekhan
Why clamp qkv_states, is it common?
#44 opened 4 months ago
by
jay68
Chat template
9
#43 opened 4 months ago
by
ehartford
GGUF quants?
1
#41 opened 4 months ago
by
Iommed
Does this model's tokenizer need a network connection to load successfully?
3
#40 opened 4 months ago
by
Rnake
VRAM Requirements?
8
#39 opened 4 months ago
by
dounykim
How to get hands-on experience as a newbie
1
#38 opened 4 months ago
by
kimsia
Text2sql template and examples
3
#34 opened 4 months ago
by
daxiongshu
Continuation of the Discussion: More than 10 minutes the status is in Setting `pad_token_id` to `eos_token_id`:100257 for open-end generation. #28
7
#31 opened 4 months ago
by
Madhugraj
Errors During Training for the Original Implementation and the Fixes for the Errors
2
#24 opened 4 months ago
by
v2ray
Instruct dataset
#23 opened 4 months ago
by
Andriy
How to Fine Tune DBRX-Instruct?
7
#18 opened 4 months ago
by
elysiia
Bug on AMD MI 250 with flash-attention
3
#13 opened 4 months ago
by
PierreColombo
The fused expert parameters mean load_in_4bit doesn't work properly, nor does LoRA
31
#10 opened 4 months ago
by
tdrussell