Peter Szemraj

pszemraj

AI & ML interests

metallic intuition

pszemraj's activity

flan-t5 evals failing

1
#898 opened 7 days ago by pszemraj
New activity in argilla/magpie-ultra-v0.1 26 days ago

dataset topic diversity

3
#2 opened 29 days ago by pszemraj
New activity in BEE-spoke-data/Meta-Llama-3-8Bee about 1 month ago

Adding Evaluation Results

#1 opened about 1 month ago by leaderboard-pr-bot
New activity in pszemraj/Llama-3-6.3b-v0.1 about 1 month ago

Adding Evaluation Results

#2 opened about 1 month ago by leaderboard-pr-bot
New activity in pszemraj/Mistral-v0.3-6B about 1 month ago

Adding Evaluation Results

#2 opened about 1 month ago by leaderboard-pr-bot
New activity in BEE-spoke-data/smol_llama-220M-GQA-fineweb_edu about 1 month ago

Adding Evaluation Results

#1 opened about 1 month ago by leaderboard-pr-bot
New activity in OpenCo7/UpVoteWeb about 2 months ago

does this contain posts?

3
#4 opened about 2 months ago by pszemraj
New activity in pszemraj/long-t5-tglobal-xl-16384-booksci-summary-v1 about 2 months ago

new dataset

1
#1 opened about 2 months ago by Recognizeme
New activity in open-llm-leaderboard/open_llm_leaderboard about 2 months ago

WizardLM-8x22B Evaluation failed

25
#823 opened about 2 months ago by llama-anon
New activity in pszemraj/SHP-2-dpo-100k_sample 2 months ago
New activity in BEE-spoke-data/smol_llama-220M-GQA 2 months ago
New activity in HuggingFaceFW/fineweb-edu-classifier 3 months ago

fix example code

1
#2 opened 3 months ago by pszemraj
New activity in pszemraj/Llama-3-6.3b-v0.1 3 months ago

Excellent Approach

2
#1 opened 3 months ago by 1littlecoder
New activity in pszemraj/Mistral-v0.3-6B 3 months ago
New activity in open-llm-leaderboard/open_llm_leaderboard 4 months ago

ALL Jamba models failing

17
#690 opened 4 months ago by devingulliver
New activity in EleutherAI/pile-t5-large 5 months ago

why UMT5

6
#1 opened 5 months ago by pszemraj
New activity in BEE-spoke-data/smol_llama-101M-GQA 5 months ago

Link to code repository

1
#3 opened 5 months ago by ewre324
New activity in BEE-spoke-data/smol_llama-220M-openhermes 5 months ago

What is the model architecture?

1
#2 opened 5 months ago by ewre324
New activity in amazingvince/Not-WizardLM-2-7B 5 months ago

add link to colab example

#1 opened 5 months ago by pszemraj
New activity in pszemraj/qmsum-cleaned 5 months ago

Empty test cases

2
#2 opened 5 months ago by StDestiny
New activity in pszemraj/led-large-book-summary 5 months ago

Hardware requirements

1
#19 opened 5 months ago by Arunmass
New activity in ai21labs/Jamba-v0.1 5 months ago

Jambaleo

#10 opened 5 months ago by pszemraj
New activity in BEE-spoke-data/gutenberg-en-v1-clean 5 months ago
New activity in postbot/gpt-neo-1.3B-emailgen 6 months ago
New activity in pszemraj/distilgpt2-HC3 6 months ago
New activity in BEE-spoke-data/TinyLlama-3T-1.1bee 6 months ago
New activity in BEE-spoke-data/smol_llama-220M-GQA 6 months ago
New activity in pszemraj/pegasus-x-large-book-summary 7 months ago

Billsum Evaluation

1
#6 opened almost 2 years ago by mlkorra
New activity in TuringsSolutions/NYTWritingStyleGuide 7 months ago

Rename Main to data.json

3
#1 opened 8 months ago by davanstrien
New activity in upstage/SOLAR-10.7B-v1.0 8 months ago

Data for Continued Pre-Training

6
#8 opened 8 months ago by pszemraj

Not able to test it.

2
#21 opened 8 months ago by JESUSCOLIN
New activity in pszemraj/flan-t5-large-grammar-synthesis 9 months ago

Grammar explanation

2
#14 opened 9 months ago by Ejentos
New activity in jbochi/coedit-base 9 months ago

cc-by-nc vs dataset license

2
#1 opened 9 months ago by pszemraj
New activity in BEE-spoke-data/smol_llama-101M-GQA 9 months ago

GPU used for training

1
#2 opened 9 months ago by Locutusque
New activity in jinaai/jina-embeddings-v2-base-en 9 months ago

Is there an ETA for large version?

6
#31 opened 9 months ago by kk3dmax
New activity in abidlabs/GPT-Baker 9 months ago