GuoLiangTang

Tommy930

https://github.com/TommyTang930

AI & ML interests

LLM, NLP, ML

Organizations

None yet

Tommy930's activity

upvoted a paper about 9 hours ago

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Paper • 2405.20541 • Published 4 days ago • 4

upvoted 5 papers about 10 hours ago

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published 3 days ago • 16

4-bit Shampoo for Memory-Efficient Network Training

Paper • 2405.18144 • Published 6 days ago • 3

4Diffusion: Multi-view Video Diffusion Model for 4D Generation

Paper • 2405.20674 • Published 3 days ago • 6

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Paper • 2405.21048 • Published 3 days ago • 7

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Paper • 2405.21075 • Published 3 days ago • 9

upvoted 6 papers 1 day ago

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models

Paper • 2405.09062 • Published 19 days ago • 7

Dynamic data sampler for cross-language transfer learning in large language models

Paper • 2405.10626 • Published 17 days ago • 4

Grounded 3D-LLM with Referent Tokens

Paper • 2405.10370 • Published 18 days ago • 8

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation

Paper • 2405.14598 • Published 11 days ago • 11

Thermodynamic Natural Gradient Descent

Paper • 2405.13817 • Published 12 days ago • 13

ReVideo: Remake a Video with Motion and Content Control

Paper • 2405.13865 • Published 12 days ago • 21

upvoted 4 papers 2 days ago

DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

Paper • 2405.19707 • Published 4 days ago • 1

DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories

Paper • 2405.19856 • Published 4 days ago • 4

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published 4 days ago • 2

MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Paper • 2405.20340 • Published 4 days ago • 14

upvoted 7 papers 3 days ago

PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting

Paper • 2405.19957 • Published 4 days ago • 4

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Paper • 2405.20335 • Published 4 days ago • 13

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model

Paper • 2405.20222 • Published 4 days ago • 9

GECO: Generative Image-to-3D within a SECOnd

Paper • 2405.20327 • Published 4 days ago • 5

DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation

Paper • 2405.20289 • Published 4 days ago • 6

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published 4 days ago • 17

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

Paper • 2405.20204 • Published 4 days ago • 21

upvoted 13 papers 4 days ago

EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published 5 days ago • 11

Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

Paper • 2405.18515 • Published 6 days ago • 3

SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation

Paper • 2405.18503 • Published 6 days ago • 5

Offline Regularised Reinforcement Learning for Large Language Models Alignment

Paper • 2405.19107 • Published 5 days ago • 8

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

Paper • 2405.19320 • Published 5 days ago • 6

Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Paper • 2405.18669 • Published 6 days ago • 9

NPGA: Neural Parametric Gaussian Avatars

Paper • 2405.19331 • Published 5 days ago • 6

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Paper • 2405.19332 • Published 5 days ago • 10

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

Paper • 2405.19325 • Published 5 days ago • 10

LLMs achieve adult human performance on higher-order theory of mind tasks

Paper • 2405.18870 • Published 5 days ago • 13

T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback

Paper • 2405.18750 • Published 5 days ago • 16

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Paper • 2405.19327 • Published 5 days ago • 40

3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

Paper • 2405.18424 • Published 6 days ago • 7

upvoted 7 papers 5 days ago

Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

Paper • 2405.18386 • Published 6 days ago • 13

GFlow: Recovering 4D World from Monocular Video

Paper • 2405.18426 • Published 6 days ago • 13

Yuan 2.0-M32: Mixture of Experts with Attention Router

Paper • 2405.17976 • Published 6 days ago • 16

VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections

Paper • 2405.17991 • Published 6 days ago • 9

2BP: 2-Stage Backpropagation

Paper • 2405.18047 • Published 6 days ago • 20

LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models

Paper • 2405.18377 • Published 6 days ago • 12

Phased Consistency Model

Paper • 2405.18407 • Published 6 days ago • 36

upvoted 15 papers 6 days ago

LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters

Paper • 2405.16287 • Published 9 days ago • 9

I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

Paper • 2405.16537 • Published 8 days ago • 15

Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control

Paper • 2405.17414 • Published 7 days ago • 7

Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

Paper • 2405.17405 • Published 7 days ago • 13

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

Paper • 2405.16759 • Published 7 days ago • 7

EM Distillation for One-step Diffusion Models

Paper • 2405.16852 • Published 7 days ago • 10

NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

Paper • 2405.17428 • Published 7 days ago • 13

Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published 7 days ago • 47

Part123: Part-aware 3D Reconstruction from a Single-view Image

Paper • 2405.16888 • Published 7 days ago • 10

Zamba: A Compact 7B SSM Hybrid Model

Paper • 2405.16712 • Published 8 days ago • 17

Looking Backward: Streaming Video-to-Video Translation with Feature Banks

Paper • 2405.15757 • Published 10 days ago • 12

Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels

Paper • 2405.16822 • Published 7 days ago • 11

Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning

Paper • 2405.17258 • Published 7 days ago • 11

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published 7 days ago • 67

Matryoshka Multimodal Models

Paper • 2405.17430 • Published 7 days ago • 29

upvoted 2 papers 7 days ago

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

Paper • 2405.15738 • Published 10 days ago • 41

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

Paper • 2405.15319 • Published 10 days ago • 20