- Eliminating Position Bias of Language Models: A Mechanistic Approach — arXiv:2407.01100 (published Jul 1, 2024)
- TabReD: A Benchmark of Tabular Machine Learning in-the-Wild — arXiv:2406.19380 (published Jun 27, 2024)
- Learning to (Learn at Test Time): RNNs with Expressive Hidden States — arXiv:2407.04620 (published Jul 5, 2024)
- Aya 23: Open Weight Releases to Further Multilingual Progress — arXiv:2405.15032 (published May 23, 2024)
- Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization — arXiv:2405.15071 (published May 23, 2024)
- Trans-LoRA: Towards Data-Free Transferable Parameter-Efficient Finetuning — arXiv:2405.17258 (published May 27, 2024)
- NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models — arXiv:2405.17428 (published May 27, 2024)
- DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data — arXiv:2405.14333 (published May 23, 2024)
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework — arXiv:2404.14619 (published Apr 22, 2024)
- Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models — arXiv:2404.12387 (published Apr 18, 2024)
- From Words to Numbers: Your Large Language Model Is Secretly a Capable Regressor When Given In-Context Examples — arXiv:2404.07544 (published Apr 11, 2024)
- Best Practices and Lessons Learned on Synthetic Data for Language Models — arXiv:2404.07503 (published Apr 11, 2024)
- RecurrentGemma: Moving Past Transformers for Efficient Open Language Models — arXiv:2404.07839 (published Apr 11, 2024)