You Do Not Fully Utilize Transformer's Representation Capacity Paper • 2502.09245 • Published Feb 13 • 35
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published Feb 5 • 60
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3 • 115
elephantmipt/sae_Qwen_Qwen2.5-7B_resid_pre_layer_24_size_16384_batchtopk_reg_coeff_0.0018 Updated Dec 29, 2024
elephantmipt/sae_Qwen_Qwen2.5-7B_resid_pre_layer_18_size_16384_batchtopk_reg_coeff_0.0018 Updated Dec 29, 2024
elephantmipt/sae_Qwen_Qwen2.5-7B_resid_pre_layer_12_size_16384_batchtopk_reg_coeff_0.0018 Updated Dec 29, 2024