Guilherme Penedo
guipenedo
guipenedo's activity
Has PII processing also been completed for this published dataset?
2
#20 opened 8 days ago
by
kimcando
Sample dataset?
7
#23 opened 6 days ago
by
dweb
FineWeb and RedPajama-V2 deduplication
2
#24 opened 3 days ago
by
PereLluis13
Reproducing evaluation results
1
#3 opened 15 days ago
by
hjlee1371
Disk size of each dump
1
#15 opened 14 days ago
by
qbin
Details on the evaluation with lighteval
1
#22 opened 7 days ago
by
amaracani
Download filtered dataset
1
#11 opened 16 days ago
by
Basma-b
Training configs for data ablation study
1
#14 opened 14 days ago
by
jimmyhbx
Intermediate checkpoints
2
#1 opened 14 days ago
by
przvl
FWPR
#13 opened 15 days ago
by
infinite85
Reprocessing for a new language
7
#12 opened 15 days ago
by
pere
Add license tag so it can be submitted to the Open LLM Leaderboard
2
#2 opened 15 days ago
by
mrfakename
License
2
#1 opened 18 days ago
by
mrfakename
Compatibility with released datatrove version
2
#6 opened 18 days ago
by
stefan-it
Is it currently unsafe to download this dataset?
2
#10 opened 18 days ago
by
george-adams1
Split by languages?
3
#7 opened 18 days ago
by
mhenrichsen
Scoring documents with LLM and making scores available as a quality filter (Ask-LLM)
1
#3 opened 19 days ago
by
Lauler
ARC-Challenge Benchmark
2
#2 opened 19 days ago
by
Taylor658
[bot] Conversion to Parquet
#1 opened 19 days ago
by
parquet-converter
fix typos
1
#2 opened 3 months ago
by
guipenedo
🚩 Report: Not working
15
#64 opened 4 months ago
by
Gertie01
🚩 Report: Not working
5
#62 opened 4 months ago
by
Gertie01
🚩 Report: Not working
5
#60 opened 5 months ago
by
Dinkum
🚩 Report: Not working
4
#57 opened 5 months ago
by
Spoon300
🚩 Report: Not working
7
#54 opened 6 months ago
by
Sksis
Update gradio to fix queue issue
#56 opened 5 months ago
by
guipenedo
🚩 Report: Not working
6
#46 opened 6 months ago
by
Dinkum
Nonsensical and indecipherable answers after a couple of questions
1
#51 opened 6 months ago
by
carlos-santos
Update tokenizer_config.json
#15 opened 6 months ago
by
Rocketknight1
🚩 Report: Not working
1
#48 opened 6 months ago
by
Greb4
Add chat template
#14 opened 6 months ago
by
Rocketknight1
🚩 Report: Not working
7
#41 opened 7 months ago
by
someone2024
A long queue of over 1500 jobs
1
#39 opened 7 months ago
by
yikuan8
🚩 Report: Not working
4
#38 opened 7 months ago
by
heywhatsmyname
🚩 Report: Not working
2
#33 opened 8 months ago
by
monology
🚩 Report: Not working
13
#36 opened 7 months ago
by
someone2024