- A Controlled Study on Long Context Extension and Generalization in LLMs — arXiv:2409.12181, published Sep 18, 2024
- Parallelizing Linear Transformers with the Delta Rule over Sequence Length — arXiv:2406.06484, published Jun 10, 2024
- Gated Slot Attention for Efficient Linear-Time Sequence Modeling — arXiv:2409.07146, published Sep 11, 2024
- DiffiT: Diffusion Vision Transformers for Image Generation — arXiv:2312.02139, published Dec 4, 2023
- FasterViT: Fast Vision Transformers with Hierarchical Attention — arXiv:2306.06189, published Jun 9, 2023