Chenhui Zhang's picture

Chenhui Zhang PRO

danielz01

·

https://www.danielz.ch/

AI & ML interests

MIT IDSS 28' | Illinois CS 23' | ML for Remote Sensing & Climate Change | Trustworthy ML

Recent Activity

updated a collection 27 days ago

Image Generation

View all activity

Articles

An Introduction to AI Secure LLM Safety Leaderboard

Organizations

danielz01's activity

upvoted an article about 1 month ago

Article

The Beginners Guide to Cleaning a Dataset

By

•

Nov 18, 2024

• 24

upvoted a collection 8 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 28 days ago • 698

upvoted a paper 10 months ago

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Paper • 2403.04132 • Published Mar 7, 2024 • 38

upvoted a paper 11 months ago

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16, 2024 • 30

upvoted 2 collections 11 months ago

🔍 Daily Picks in Interpretability & Analysis of LMs

Outstanding research in interpretability and evaluation of language models, summarized • 92 items • Updated about 13 hours ago • 94

Model Merging

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 221

upvoted a paper 12 months ago

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 26

upvoted 6 papers about 1 year ago

Interfacing Foundation Models' Embeddings

Paper • 2312.07532 • Published Dec 12, 2023 • 10

LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models

Paper • 2308.16137 • Published Aug 30, 2023 • 39

SparQ Attention: Bandwidth-Efficient LLM Inference

Paper • 2312.04985 • Published Dec 8, 2023 • 38

GIVT: Generative Infinite-Vocabulary Transformers

Paper • 2312.02116 • Published Dec 4, 2023 • 10

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

Paper • 2305.14045 • Published May 23, 2023 • 5

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 78

upvoted a collection about 1 year ago

Switch-Transformers release

This release included various MoE (Mixture of expert) models, based on the T5 architecture . The base models use from 8 to 256 experts. • 9 items • Updated 21 days ago • 17

upvoted 6 papers about 1 year ago

I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

Paper • 2311.10126 • Published Nov 16, 2023 • 7

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Paper • 2311.05908 • Published Nov 10, 2023 • 12

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Paper • 2311.05698 • Published Nov 9, 2023 • 9

PolyMaX: General Dense Prediction with Mask Transformer

Paper • 2311.05770 • Published Nov 9, 2023 • 6

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 87

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Paper • 2311.06243 • Published Nov 10, 2023 • 17