Derek Thomas

Paper • 2403.12968 • Published Mar 19 • 20 •

#9 opened 3 days ago by

mhaseeb1604

commented a paper 1 day ago

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

New activity in OALL/Open-Arabic-LLM-Leaderboard 5 days ago

FAQ

#1 opened 18 days ago by

Ali-C137

New activity in reddit-tools-HF/processing-bestofredditorupdates 11 days ago

Clean readme update script

#2 opened 11 days ago by

Wauplin

New activity in Jofthomas/Chat_template_viewer 22 days ago

Ideas to improve

#3 opened 22 days ago by

Updating to the latest gradio version

#2 opened 22 days ago by

Adding torch and pinning a version

#1 opened 22 days ago by

Paper • 2404.16811 • Published Apr 25 • 52 •

commented a paper 26 days ago

Make Your LLM Fully Utilize the Context

Paper • 2404.03626 • Published Apr 4 • 21 •

commented 3 papers about 1 month ago

Training LLMs over Neurally Compressed Text

Paper • 2404.03592 • Published Apr 4 • 74 •

ReFT: Representation Finetuning for Language Models

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 102 •

commented 2 papers about 2 months ago

Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity

Paper • 2403.14403 • Published Mar 21 • 6 •

Paper • 2403.20327 • Published Mar 29 • 44 •

Gecko: Versatile Text Embeddings Distilled from Large Language Models

New activity in Ali-C137/jais-13b-chat-GPTQ about 2 months ago

Not working with inference !

Paper • 2403.03507 • Published Mar 6 • 176 •

#1 opened about 2 months ago by

Ali-C137

commented 2 papers 2 months ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Learning to Route Among Specialized Experts for Zero-Shot Generalization

Paper • 2402.05859 • Published Feb 8 • 4 •

New activity in reddit-tools-HF/processing-bestofredditorupdates 2 months ago

Process webhooks in the background

#1 opened 2 months ago by

Wauplin

New activity in demo-leaderboard-backend/backend 2 months ago

pr_3_add_button_readme_and_better_logs

#3 opened 2 months ago by

adding_log_visualizer

#1 opened 2 months ago by

Args is being accessed incorrectly

#2 opened 2 months ago by

Paper • 2305.05176 • Published May 9, 2023 • 3 •

commented 2 papers 2 months ago

FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 61 •

commented 3 papers 3 months ago

New activity in derek-thomas/arabic-RAG 3 months ago

Build error after ZeroGPU migration

Paper • 2402.12354 • Published Feb 19 • 5 •

#2 opened 3 months ago by

cbensimon

commented 7 papers 3 months ago

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.16459 • Published Feb 26 • 2 •

Defending LLMs against Jailbreaking Attacks via Backtranslation

Paper • 2402.10200 • Published Feb 15 • 91 •

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.09668 • Published Feb 15 • 34 •

How to Train Data-Efficient LLMs

Paper • 2008.09470 • Published Aug 19, 2020 •

Top2Vec: Distributed Representations of Topics

Paper • 2401.09555 • Published Jan 17 • 6 •

Improving Classification Performance With Human Feedback: Label a few, we label the rest

Paper • 2402.09906 • Published Feb 15 • 50 •

Generative Representational Instruction Tuning

New activity in derek-thomas/fawkes 4 months ago

Renaming Space to Fawkes

Paper • 2402.04177 • Published Feb 6 • 16 •

#1 opened 4 months ago by

meg

commented 3 papers 4 months ago

Scaling Laws for Downstream Task Performance of Large Language Models

Paper • 2401.06532 • Published Jan 12 • 10 •

INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning

Paper • 2308.16149 • Published Aug 30, 2023 • 24 •

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

New activity in Ezi/AudioWatermarking_test 4 months ago

Updating to gradio

#2 opened 4 months ago by

Updating to gradio

#1 opened 4 months ago by

Paper • 2401.01943 • Published Jan 3 • 6 •

commented 3 papers 4 months ago

Generalist embedding models are better at short-context clinical semantic search than specialized embedding models

Paper • 2401.10020 • Published Jan 18 • 135 •

Self-Rewarding Language Models

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16 • 27 •

Paper • 2312.01552 • Published Dec 4, 2023 • 26 •

commented 3 papers 5 months ago

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.07910 • Published Dec 13, 2023 • 14 •

PromptBench: A Unified Library for Evaluation of Large Language Models

Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 73 •

New activity in derek-thomas/Hubert_emotion-finetuned-gtzan-efficient 5 months ago

Librarian Bot: Add base_model information to model

#1 opened 5 months ago by

librarian-bot

New activity in core42/jais-13b-chat 6 months ago

Adding `safetensors` variant of this model

#18 opened 6 months ago by

New activity in derek-thomas/jais-13b-chat-hf 6 months ago

Adding `safetensors` variant of this model

#1 opened 6 months ago by

Paper • 2310.12150 • Published Oct 18, 2023 • 1 •

New activity in derek-thomas/dataset-creator-reddit-amitheasshole 6 months ago

Is there a way to get the flair / classification ?

#2 opened 6 months ago by

CoreyMorris

commented 3 papers 6 months ago

Understanding Retrieval Augmentation for Long-Form Question Answering

Paper • 2311.03099 • Published Nov 6, 2023 • 27 •

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2309.16039 • Published Sep 27, 2023 • 28 •

Effective Long-Context Scaling of Foundation Models

New activity in ysharma/Zephyr-Playground 6 months ago

Space is down

#1 opened 6 months ago by

Paper • 2311.08401 • Published Nov 14, 2023 • 26 •

commented a paper 6 months ago

Fine-tuning Language Models for Factuality

New activity in derek-thomas/arabic-RAG 7 months ago

SentenceTransformer GPU device

#1 opened 7 months ago by

cbensimon

New activity in sentence-transformers/paraphrase-multilingual-mpnet-base-v2 7 months ago

Adding `safetensors` variant of this model

#7 opened 7 months ago by

New activity in bigcode/search 7 months ago

🚩 Report : Not working

#2 opened about 1 year ago by

yjernite

New activity in derek-thomas/disc-golf-simulator 7 months ago

Creating more disc models

#1 opened 7 months ago by

toads

New activity in derek-thomas/ScienceQA 8 months ago

The image format is dict not PIL.

Paper • 2309.08872 • Published Sep 16, 2023 • 51 •

#1 opened 8 months ago by

tshen58

commented a paper 8 months ago

PDFTriage: Question Answering over Long, Structured Documents