Nifty - a bfuzzy1 Collection

bfuzzy1 's Collections

RL

acheron

Gunny

Agents

Agentic-ly agentic

Don't hate - evaluate

Generation Nation

Nifty

Nifty

updated 19 days ago

LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18, 2024 • 32
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published Sep 18, 2024 • 37
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Paper • 2402.12875 • Published Feb 20, 2024 • 13
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices

Paper • 2410.00531 • Published Oct 1, 2024 • 30
DiaSynth -- Synthetic Dialogue Generation Framework

Paper • 2409.19020 • Published Sep 25, 2024 • 20
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Paper • 2410.12405 • Published Oct 16, 2024 • 13
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published Oct 31, 2024 • 59
SelfCodeAlign: Self-Alignment for Code Generation

Paper • 2410.24198 • Published Oct 31, 2024 • 23
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments

Paper • 2410.23918 • Published Oct 31, 2024 • 19
ATM: Improving Model Merging by Alternating Tuning and Merging

Paper • 2411.03055 • Published Nov 5, 2024 • 1
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 113
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

Paper • 2411.04496 • Published Nov 7, 2024 • 22
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 32
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens

Paper • 2411.17691 • Published Nov 26, 2024 • 11
Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 140
MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published Dec 2, 2024 • 40
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

Paper • 2412.11834 • Published 28 days ago • 6
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published 24 days ago • 38
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

Paper • 2412.14711 • Published 25 days ago • 15
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published 25 days ago • 85