SivilTaram (Qian Liu)

upvoted a paper 20 days ago

Make Your LLM Fully Utilize the Context

Paper • 2404.16811 • Published 21 days ago • 50

upvoted a collection 27 days ago

Deita

Collection

12 items • Updated Dec 20, 2023 • 8

upvoted an article 29 days ago

Article

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

about 1 month ago

• 11

upvoted a paper about 1 month ago

Compression Represents Intelligence Linearly

Paper • 2404.09937 • Published Apr 15 • 27

upvoted a collection about 1 month ago

Pile-T5

Collection

T5 trained on the Pile with Llama Tokenizer • 4 items • Updated Apr 15 • 16

upvoted an article about 1 month ago

Article

StarCoder2 and The Stack v2

Feb 28

• 3

upvoted a collection about 1 month ago

Chinese Tiny LLM

Collection

9 items • Updated Apr 5 • 6

upvoted an article about 1 month ago

Article

Efficient Table Pre-training without Real Data: An Introduction to TAPEX

May 23, 2022

• 1

upvoted a paper about 1 month ago

Sailor: Open Language Models for South-East Asia

Paper • 2404.03608 • Published Apr 4 • 17

upvoted a collection about 2 months ago

DBRX

Collection

DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 88

upvoted a paper about 2 months ago

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20 • 53

upvoted 2 papers 2 months ago

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14 • 52

MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

Paper • 2212.13492 • Published Dec 27, 2022 • 2

upvoted a paper 3 months ago

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26 • 26

upvoted a collection 3 months ago

⚓️ Sailor Language Models

Collection

Sailor: Open Language Models tailored for South-East Asia (SEA) released by Sea AI Lab. • 18 items • Updated about 8 hours ago • 14

upvoted 2 papers 3 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 123

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 566

upvoted a collection 3 months ago

💫 StarCoder2

Collection

StarCoder2 models and datasets! • 8 items • Updated Mar 1 • 72

upvoted 2 papers 4 months ago

Weak-to-Strong Jailbreaking on Large Language Models

Paper • 2401.17256 • Published Jan 30 • 14

Zero Bubble Pipeline Parallelism

Paper • 2401.10241 • Published Nov 30, 2023 • 19

upvoted a collection 4 months ago

TAPEX

Collection

TAPEX is a state-of-the-art table pre-training model that can be used for table-based question answering and table-based fact verification. • 10 items • Updated 8 days ago • 4

upvoted 2 papers 4 months ago

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4 • 80

LLaMA Pro: Progressive LLaMA with Block Expansion

Paper • 2401.02415 • Published Jan 4 • 50

upvoted 4 papers 5 months ago

Active Retrieval Augmented Generation

Paper • 2305.06983 • Published May 11, 2023 • 3

From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning

Paper • 2304.07995 • Published Apr 17, 2023 • 3

S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models

Paper • 2310.15147 • Published Oct 23, 2023 • 2

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

Paper • 2401.00788 • Published Jan 1 • 21

upvoted a paper 6 months ago

StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 26

upvoted a paper 7 months ago

TAPEX: Table Pre-training via Learning a Neural SQL Executor

Paper • 2107.07653 • Published Jul 16, 2021 • 2

upvoted a collection 7 months ago

FLAN-T5-Large LoRA Modules

Collection

83 items • Updated Oct 19, 2023 • 2

upvoted 2 papers 7 months ago

OpenAgents: An Open Platform for Language Agents in the Wild

Paper • 2310.10634 • Published Oct 16, 2023 • 8

Lemur: Harmonizing Natural Language and Code for Language Agents

Paper • 2310.06830 • Published Oct 10, 2023 • 29

upvoted a collection 8 months ago

Model Merging

Collection

Model merging is currently a very popular technique for LLMs. Here is a chronological list of papers in the space that will help you get started with it! • 28 items • Updated Mar 23 • 178

upvoted 2 papers 8 months ago

Small-scale proxies for large-scale Transformer training instabilities

Paper • 2309.14322 • Published Sep 25, 2023 • 17

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 77

upvoted a paper 9 months ago

OctoPack: Instruction Tuning Code Large Language Models

Paper • 2308.07124 • Published Aug 14, 2023 • 27

upvoted 6 papers 10 months ago

WebArena: A Realistic Web Environment for Building Autonomous Agents

Paper • 2307.13854 • Published Jul 25, 2023 • 20

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

Paper • 2307.13269 • Published Jul 25, 2023 • 29

Bag of Tricks for Training Data Extraction from Language Models

Paper • 2302.04460 • Published Feb 9, 2023 • 2

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 235

Copy Is All You Need

Paper • 2307.06962 • Published Jul 13, 2023 • 31

PolyLM: An Open Source Polyglot Large Language Model

Paper • 2307.06018 • Published Jul 12, 2023 • 24

Articles

Efficient Table Pre-training without Real Data: An Introduction to TAPEX
