Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published 9 days ago • 30
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models Paper • 2404.07738 • Published Apr 11 • 2
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30 • 65
Article Fine Tuning a LLM Using Kubernetes with Intel® Xeon® Scalable Processors By dmsuehir • Apr 24 • 3
MultiBooth: Towards Generating All Your Concepts in an Image from Text Paper • 2404.14239 • Published Apr 22 • 8
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 37
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding Paper • 2404.05726 • Published Apr 8 • 18
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies Paper • 2404.08197 • Published Apr 12 • 26
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence Paper • 2404.05892 • Published Apr 8 • 28
Larimar: Large Language Models with Episodic Memory Control Paper • 2403.11901 • Published Mar 18 • 30
Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Paper • 2403.07750 • Published Mar 12 • 19
Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks Paper • 2403.05185 • Published Mar 8 • 19
DeepSeek-VL: Towards Real-World Vision-Language Understanding Paper • 2403.05525 • Published Mar 8 • 38
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error Paper • 2403.04746 • Published Mar 7 • 21
MAGNeT Collection Masked Audio Generation using a Single Non-Autoregressive Transformer • 9 items • Updated Apr 4 • 30
Think before you speak: Training Language Models With Pause Tokens Paper • 2310.02226 • Published Oct 3, 2023 • 2
A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications Paper • 2402.07927 • Published Feb 5 • 1
Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping Paper • 2402.07610 • Published Feb 12 • 7
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains Paper • 2402.05140 • Published Feb 6 • 18
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay Paper • 2402.04858 • Published Feb 7 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding Paper • 2402.04615 • Published Feb 7 • 31
Specialized Language Models with Cheap Inference from Limited Domain Data Paper • 2402.01093 • Published Feb 2 • 45
Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation Paper • 2401.14257 • Published Jan 25 • 9
PathFinder: Guided Search over Multi-Step Reasoning Paths Paper • 2312.05180 • Published Dec 8, 2023 • 9