Multi-needle In A Haystack
#25 opened 2 days ago
by
ElliottDyson
Rope Theta Value Difference?
#24 opened 3 days ago
by
fahadh4ilyas
Memory requirements to take advantage of full context window
#23 opened 9 days ago
by
andrewrreed
Fine-tuning
#22 opened 9 days ago
by
EkmekE
Adding Evaluation Results
#21 opened 18 days ago
by
leaderboard-pr-bot
Adding Evaluation Results
#20 opened 19 days ago
by
leaderboard-pr-bot
Adding Evaluation Results
#19 opened 20 days ago
by
leaderboard-pr-bot
Performance Degradation After Weight Update
5
#18 opened 21 days ago
by
evilperson068
Error: cannot load
#17 opened 22 days ago
by
yeyeyeyeye2
You should rename your weights every time you update them
#16 opened 24 days ago
by
AiCreatornator
IT'S NOT REAL
8
#11 opened 27 days ago
by
rombodawg
GPU requirement for hosting this model?
3
#9 opened 28 days ago
by
csgxy2022
From your experience, what would be a good methodology for using a 1048k model to filter pre-training data?
#8 opened 29 days ago
by
TylerRoost
Can you please build an extended version of Mistral Instruct v0.2 too?
1
#6 opened 29 days ago
by
AiModelsMarket
Better context utilization
1
#5 opened 29 days ago
by
DataPhreak