AgentGym: Evolving Large Language Model-based Agents across Diverse Environments Paper • 2406.04151 • Published Jun 6, 2024 • 22
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Paper • 2403.12881 • Published Mar 19, 2024 • 18
AgentTuning: Enabling Generalized Agent Abilities for LLMs Paper • 2310.12823 • Published Oct 19, 2023 • 36
Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach Paper • 2410.03160 • Published Oct 4, 2024 • 5
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 157
ReLU^2 Wins: Discovering Efficient Activation Functions for Sparse LLMs Paper • 2402.03804 • Published Feb 6, 2024 • 3
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU Paper • 2312.12456 • Published Dec 16, 2023 • 44
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models Paper • 2402.13516 • Published Feb 21, 2024 • 1
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 21
Neural Machine Translation by Jointly Learning to Align and Translate Paper • 1409.0473 • Published Sep 1, 2014 • 6
Effective Approaches to Attention-based Neural Machine Translation Paper • 1508.04025 • Published Aug 17, 2015 • 3
XAttention: Block Sparse Attention with Antidiagonal Scoring Paper • 2503.16428 • Published Mar 20, 2025 • 14
Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers? Paper • 2503.10632 • Published Mar 13, 2025 • 14
Slim attention: cut your context memory in half without loss of accuracy -- K-cache is all you need for MHA Paper • 2503.05840 • Published Mar 7, 2025 • 3
How to Reduce Memory Use in Reasoning Models Article • By Kseniase and 1 other • Mar 13, 2025 • 14
Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers? Article • By Kseniase and 1 other • Apr 4, 2025 • 14
MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing Paper • 2502.21291 • Published Feb 28, 2025 • 5