TransformerFAM: Feedback attention is working memory Paper • 2404.09173 • Published about 1 month ago • 42
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Article • Published 22 days ago • 69
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks Paper • 2402.04248 • Published Feb 6 • 25
Large Language Models as Generalizable Policies for Embodied Tasks Paper • 2310.17722 • Published Oct 26, 2023 • 6
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization Paper • 2308.02151 • Published Aug 4, 2023 • 18