VoladorLuYu's Collections
Research on LLM
When can transformers reason with abstract symbols?
Paper
•
2310.09753
•
Published
•
2
In-Context Pretraining: Language Modeling Beyond Document Boundaries
Paper
•
2310.10638
•
Published
•
28
Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
Paper
•
2310.09520
•
Published
•
10
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Paper
•
2309.08532
•
Published
•
52
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Paper
•
2310.11441
•
Published
•
26
ControlLLM: Augment Language Models with Tools by Searching on Graphs
Paper
•
2310.17796
•
Published
•
16
Ultra-Long Sequence Distributed Transformer
Paper
•
2311.02382
•
Published
•
2
Can LLMs Follow Simple Rules?
Paper
•
2311.04235
•
Published
•
10
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
Paper
•
2311.04254
•
Published
•
13
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Paper
•
2311.04257
•
Published
•
20
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper
•
2311.01282
•
Published
•
35
Language Models can be Logical Solvers
Paper
•
2311.06158
•
Published
•
18
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper
•
2310.11511
•
Published
•
74
Adapting Large Language Models via Reading Comprehension
Paper
•
2309.09530
•
Published
•
77
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster
Paper
•
2311.08263
•
Published
•
15
Contrastive Chain-of-Thought Prompting
Paper
•
2311.09277
•
Published
•
34
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Paper
•
2309.03883
•
Published
•
33
FLM-101B: An Open LLM and How to Train It with $100K Budget
Paper
•
2309.03852
•
Published
•
44
Effective Long-Context Scaling of Foundation Models
Paper
•
2309.16039
•
Published
•
30
Zephyr: Direct Distillation of LM Alignment
Paper
•
2310.16944
•
Published
•
122
Textbooks Are All You Need II: phi-1.5 technical report
Paper
•
2309.05463
•
Published
•
87
The ART of LLM Refinement: Ask, Refine, and Trust
Paper
•
2311.07961
•
Published
•
10
Interpreting Pretrained Language Models via Concept Bottlenecks
Paper
•
2311.05014
•
Published
•
1
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper
•
2307.09288
•
Published
•
242
From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models
Paper
•
2311.06754
•
Published
•
1
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper
•
2312.00752
•
Published
•
138
Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses
Paper
•
2312.00763
•
Published
•
19
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper
•
2312.11514
•
Published
•
258
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Paper
•
2304.13712
•
Published
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper
•
2401.01335
•
Published
•
64
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper
•
2401.02954
•
Published
•
41
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper
•
2401.12954
•
Published
•
29
Improving Text Embeddings with Large Language Models
Paper
•
2401.00368
•
Published
•
79
H2O-Danube-1.8B Technical Report
Paper
•
2401.16818
•
Published
•
17
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
Paper
•
2402.10524
•
Published
•
22
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper
•
2404.05961
•
Published
•
64