Foundation AI Papers (II)
Iterative Reasoning Preference Optimization
Paper • 2404.19733 • Published • 43
Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published • 64
Note well ...
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 58
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 97
Note "Less scalable version" of AGI backend model
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
Paper • 2303.02536 • Published • 1
Suppressing Pink Elephants with Direct Principle Feedback
Paper • 2402.07896 • Published • 7
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
Paper • 2310.01801 • Published • 3
Aligning LLM Agents by Learning Latent Preference from User Edits
Paper • 2404.15269 • Published • 1
Language-Image Models with 3D Understanding
Paper • 2405.03685 • Published • 1
Chain of Thoughtlessness: An Analysis of CoT in Planning
Paper • 2405.04776 • Published • 1
Memory Mosaics
Paper • 2405.06394 • Published • 2
The Consensus Game: Language Model Generation via Equilibrium Search
Paper • 2310.09139 • Published • 12
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 61
PHUDGE: Phi-3 as Scalable Judge
Paper • 2405.08029 • Published • 1
Note LoRA fine-tunes a judge LM on Prometheus's 10K feedback dataset. It turns the LLM into a classifier to increase 'overfitting' and gets a slightly better-performing model based on Phi-3 (which arguably already has stronger performance than Mistral). Not that surprising, and fine-tuning on a large human-preference dataset is boring. They did release code for the experiment, which is nice to have. The real gem is efficient alignment; see the sketch below.
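A minimal sketch of the recipe as I read it: LoRA-adapt a Phi-3 backbone with a sequence-classification head so the judge predicts a discrete score instead of generating text. The 5-way label space and the LoRA hyperparameters are assumptions, not the paper's exact setup.

```python
# Hedged sketch: judge LM as a classifier via LoRA (assumed setup, not the
# paper's exact configuration).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

model_name = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Frame judging as 5-way classification over a 1-5 quality scale (assumed).
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=5)
model.config.pad_token_id = tokenizer.pad_token_id or tokenizer.eos_token_id

# Attach LoRA adapters; Phi-3 fuses Q/K/V into a single qkv_proj module.
lora = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj"],
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only adapter + head weights train

# Toy scoring call on one (instruction, response) pair.
inputs = tokenizer("Instruction: ...\nResponse: ...", return_tensors="pt")
with torch.no_grad():
    score_logits = model(**inputs).logits  # shape: (1, 5)
```

Framing judging as classification trades generative flexibility for a sharper decision boundary, which matches the note's 'overfitting' remark.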
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
Paper • 2405.09220 • Published • 23
Understanding the performance gap between online and offline alignment algorithms
Paper • 2405.08448 • Published • 11
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Paper • 2405.05904 • Published • 5
Note A good way for a model to avoid a penalty while being lazy is just to be generic, or to provide fake information
Robust agents learn causal world models
Paper • 2402.10877 • Published • 2
How Far Are We From AGI
Paper • 2405.10313 • Published • 2
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 42
Note What is the difference again?
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models
Paper • 2405.12939 • Published • 1
LoRA Learns Less and Forgets Less
Paper • 2405.09673 • Published • 73
Note Duh
The Platonic Representation Hypothesis
Paper • 2405.07987 • Published • 1
Note Intelligence has at least 2 levels. Level 1 is associative intelligence: the key to achieving it is a representation of concepts such that the 'distance' between representation vectors accurately reflects the closeness of those concepts; such intelligence can be achieved with Supervised Learning (see the sketch below). Level 2 is deductive intelligence: the key to achieving it is searching for the right connections and reaching the correct conclusion robustly under noisy input. This should be achieved with Reinforcement Learning.
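A minimal sketch of the 'level 1' claim: embed concepts so that vector distance tracks semantic closeness. The embedding model choice here is an assumption for illustration.

```python
# Sketch: concept closeness as distance between representation vectors.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
concepts = ["dog", "puppy", "quantum field theory"]
emb = model.encode(concepts, normalize_embeddings=True)  # unit-norm vectors

# On unit vectors, cosine similarity is the dot product; a good representation
# should place "dog" nearer to "puppy" than to the unrelated third concept.
sim = emb @ emb.T
print(np.round(sim, 2))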
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct
Paper • 2405.14906 • Published • 18
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
Paper • 2405.17258 • Published • 11
Executable Code Actions Elicit Better LLM Agents
Paper • 2402.01030 • Published • 21
Contextual Position Encoding: Learning to Count What's Important
Paper • 2405.18719 • Published • 3
Note HUGE
Understanding Transformer Reasoning Capabilities via Graph Algorithms
Paper • 2405.18512 • Published • 1