matlok's Collections
Papers - Training - Differential Transformer
matlok - Python Copilot Image Datasets
matlok - Python Code Instruction Datasets
matlok - Python Copilot Audio Datasets
matlok - Python Src Code Datasets (base)
How to build a Python Coding Model with Alpaca Instructions
Dataset - Python Coding Alpaca Instructions
Image Papers
Audio Papers
Text Instruction Papers
Multimodal Papers
Mixture of Experts Papers
Coding Papers
Coding Models
Embedding Papers
Transformer Arch
LMM
LoRA
Non-English Embeddings and Models
Fine-Tuning
More Alpaca Instruction Datasets
Model Benchmarking
Actor Critic Papers
Gaming Reinforcement Learning
Search papers from a url
Chat datasets
Audio models
Datasets - DPO
Datasets - Geospatial
Models - Geospatial
Papers - Geospatial
Models - Biotech
Datasets - Financial
Models - Video Editing
Models - Testing
Papers - Attention
Papers - Context
Papers - Synthetic Data
Tuning - Dora
Models - Fintech
Models - Multimodal
Models - MultiAgent
Models - n-gram and Kneser-Ney
Papers - NLP Research
Papers - Multi-turn Conversations
Datasets - Synthetic - Instruct
Models - Watermarking
Models - Captions
Papers - Fintech - Benchmarks
Models - Touch and Image
Models - Video
Datasets - Image - Text
Models - NeRFs - Image Radiance Fields
Models - Parameter Testing
Models - Predicting Models
Models - Robotics
Datasets - Coding
Models - ReAct - Reasoning and Action
Models - Text
Models - Custom-Training
Papers - Decoders
Papers - Testing a Coding Model
Datasets - Text
Datasets - Multimodal - Text and Images
Models - Large Scale
Papers - Coding
Papers - Transfer Learning
Datasets - Text - Multiple Choice
Datasets - Binarized
Models - Math
Papers - Pipeline - Multimodal
Papers - Reasoning
Models - Gaming
Papers - IoT
Papers - Learning and Compression
Papers - Conversations
Models - Quants
Models - Image - Geometric Algebra
Models - Image
Papers - Video
Spaces - Math
Datasets - Math
Models - Base - 7B
Models - Base - 1B
Spaces - Vision
Datasets - Image and Bounding Box
Models - Science
Papers - Sampling
Models - Byte Transformer
Models - Cooking
Papers - RoPE
Papers - Math - GSM8K
Papers - Model Scaling
Papers - Training Research
Models - UI - Front-End
Papers - Reasoning - Vision
Models - Text - Explanation
Papers - Multi-Agent
Papers - QLoRA
Papers - Ring Attention
Papers - Sequence Parallelism
Models - Legal
Helpful - VRAM Calculator
Models - Audio - Translation
Models - Video Generation
Models - Image - Long Context
Papers - Masked Sequence Packing
Papers - Speculative Decoding
Papers - Fine-tuning - Multimodal
Datasets - Math - Word Problems
Spaces - Coding
Models - Audio - Music Generation
Datasets - Audio
Datasets - Audio - Fine-tuning
Models - Audio - Sheet Music Gen
Papers - Striped Attention
Datasets - Text - Instruction (non-Alpaca)
Models - Images - Instruct
Papers - Benchmarks - Image and Text
Papers - Image - Not-using CLIP
Models - Suggest - Audiobooks from Playlist
Models - MoE
Papers - MoE
Models - MoE - IoT
Models - IoT
Models - Mamba
Models - MoE - Multimodal
Papers - MoE - Research
Papers - Image - Knowledge Graphs
Papers - MoE - Training
Papers - Image - MoE
Models - Image - MoE
Papers - Lora - LCM
Models - Image - Drone Photography
Models - Image - Lora
Models - MoE - Principles
Models - MoE - Constitutional Experts
Models - MoE - Visual Relationship Detection
Models - MoE - Training using Lora
Papers - Training with Lora
Papers - MoE - Prompt Immunity
Papers - MoE - Router
Models - MoE - Audio - Underwater Acoustics
Models - MoE - Audio
Papers - MoE - Malicious Queries
Papers - MoE - Image
Models - MoE - Image
Papers - MoE - Training - Blocks
Papers - MoE - Scaling
Papers - MoE - Adversary Queries
Papers - MoE - Deny an Expert
Papers - MoE - Custom Layers
Papers - MoE - Frankenmerge
Papers - Multimodal
Papers - Image - Bounding Box
Papers - Multimodal - Documents
Papers - Exploit - Model Layer Retrieval
Papers - Image
Papers - Image - Dataset Generator
Datasets - Text and Video
Papers - Video - Mamba
Papers - Performance Trends in AI
Papers - Fine-tuning - Home Lab
Papers - MoE - Audio
Papers - MoE - Attention
Papers - Quants
Papers - Image - MoE - IoT
Papers - MoE - Speech Recognition
Papers - MoE - Router - Task
Papers - MoE - Multilingual
Papers - MoE - Federated Learning
Papers - MoE - Training - Weight Sharing
Papers - MoE - Router - Research
Papers - Image - Handwriting Recognition
Papers - MoE - IoT
Papers - MoE - Handwriting Recognition
Papers - Image - OCR Handwriting
Papers - Image - Adversarial
Papers - Image - Segment - Handwriting
Papers - Image - Handwriting and Online Gestures
Papers - Image - Handwritten Characters
Papers - Image - Fine-tuning
Papers - Image - HTR - Math Gestures and Symbols
Papers - Image - Handwritten Generation
Models - Text - Multilingual
Models - Image - Diffusion Probabilistic Models
Papers - Benchmark - Handwriting Recognition
Papers - Image - Handwriting Recognition - Lexical Features
Datasets - Image - Handwritten Recognition
Papers - Image - Custom Layers
Papers - Image - Handwriting Recognition - Tetrolets
Papers - Image - Handwriting Recognition - Near-Realtime
Papers - Text - Encoders
Papers - Text - Decoders
Papers - Text - Bidirectional - Bio
Papers - Text - Bidirectional Encoders
Papers - Text - Pre-training
Papers - Text - Pre-training - Research
Papers - Text - Pre-training - Decoder Multi-Steps
Papers - Text - Benchmarks - Quality Diversity
Papers - Image - Multimodal - Handwriting Recognition
Papers - Text - Research
Papers - Text - Multilingual
Papers - Multimodal - Speech and Text
Papers - Multimodal - Speech and Text - Multilingual
Papers - Multimodal - Training and Tuning
Papers - Multimodal - Document Analysis
Papers - Video - Motion Control
Papers - Video - Entity Recognition
Papers - Video - Pre-training
Papers - Image - Pre-training
Papers - Image - Caption Generation
Papers - Image - Synthetic Data Generator
Papers - Transformer Research - Custom Layers
Papers - ResNet
Papers - SuperNets
Papers - Federated Learning
Papers - Mamba - Structured State Space Model
Papers - Image - Human Motion Generator
Papers - MoE - Multimodal
Papers - Autonomous Drones
Papers - Multimodal - Drone
Papers - Multimodal - Drone - Object Manipulation
Papers - Training Research - Time series
Papers - Pre-training - Time Series
Papers - Neural Architecture Search
Papers - Training - Hardware Detection
Papers - Image - Split Computing
Papers - Image - IoT - Split Computing
Papers - U-Net
Papers - Image - Segmentation
Papers - Image - Segmentation - Cancer
Papers - Video - Synthetic Data Generator
Papers - Image - Segmentation - Drone
Papers - Image - Segmentation - Report
Papers - Image - Segmentation - Adversarial
Papers - Image - Segmentation - MRI
Papers - Image - Segmentation - Stroke Brain Lesions
Papers - Image - SkipNet
Papers - Image - IoT
Papers - Image - Hybrid
Papers - Image - Hybrid - ResNet - U-Net
Papers - Image - Hybrid - Swin - U-Net
Papers - Image - Segmentation - Bio Cell
Papers - Image - Segmentation - Quantum
Papers - Image - Hybrid - Graph Net - U-Net
Papers - Image - Hybrid - Patient Meta Data - U-Net
Papers - Image - CSWin - Cross-Shaped Windows
Papers - Image - Encoders
Papers - Image - Encoders - LePE - Local-Enhanced Pos Enc
Papers - Image - Attention - BOAT - Bilateral Local Attn
Papers - Image - Attention - Multi-Scale
Papers - Image - Swin
Papers - Text - Fine-tuning - Math
Papers - BYOL
Papers - Robot - Tasks - Boss
Papers - Text - Model Guided Training
Papers - Robot - Research
Papers - Image - Hybrid - Hybrid Task Cascade (HTC) - Swin
Papers - Image - GasHis
Papers - Image - Dino
Papers - Text - Architecture - Scaling to 1000 Layers
Papers - DenseNet
Papers - Adversarial Testing
Papers - Image - EfficientNet
Papers - Image - Compound Scaling Method
Papers - Base Models - Text - Coding
Papers - Image - Visualization - Splatting
Papers - AI - Social Risks
Papers - AI - Safety
Papers - Testing - Single Layer Model
Papers - Custom Layers
Papers - Pre-training
Papers - Motion Control
Papers - Fine-tuning
Papers - Fine-tuning - LoRA
Papers - Reinforcement Learning
Papers - AI - Self-refinement - Training and Tuning
Papers - Training
Papers - Audio
Papers - Text - Math
Papers - Observability and Interpretability
Papers - Multimodal - Healthcare
Papers - Interpretability - DAS
Papers - Named Entity Extraction - Healthcare
Papers - Multilingual
Papers - Healthcare
Papers - Watermark
Papers - Image - Clip
Papers - Proof of Learning
Papers - Disaster Recovery
Papers - Named Entity Extraction and Disambiguation
Papers - Neural Architecture Search - Report
Papers - Neural Architecture Search - One-shot
Papers - Neural Architecture Search - Tabular Data
Papers - Hyperparameter Architecture Search
Papers - Image - Neural Architecture Search
Papers - Neural Architecture Search - RNN
Papers - Neural Architecture Search - Reinforcement Learning
Papers - Neural Architecture Search - Quantization - FLIQS
Papers - AutoML
Papers - Neural Architecture Search - AutoML
Papers - Testing - Speech and Text
Papers - AI - Are models similar to a human brain?
Papers - Automated Training - Self Discover
Papers - Math - Automated Discovery
Papers - Math - Research
Papers - Alpaca
Papers - Critical Thinking - Step Back
Papers - Critical Thinking
Papers - Text - Length Generalization
Papers - Text - Encoders - Fire
Papers - Image - Multi-Image Reasoning
Paper - Image - Chain of Thought
Papers - Image - Text and Symbolic Image Generator
Models - Fine-tuning - Mixture of Loras
Papers - Multimodal - Text to 2D to 3D Mesh
Datasets - HTML
Datasets - Multimodal - Text and Image
Papers - Image - Mamba
Papers - Image - Selective Scan
Papers - Healthcare - Mental Health
Papers - Encoders
Papers - Encoders - Fire
Papers - Video - Understanding with Many Models
Papers - Video - Understanding
Papers - Image - Understanding
Papers - Multimodal - Encoders
Papers - Image - GiT
Papers - Text - Star
Papers - QFormer
Papers - Image - Near Real Time
Papers - Image - Attention - Window
Papers - Image - Editing
Papers - Image - Training - Noise
Papers - Image - LCM
Papers - Image - Training - Quantized Mask
Papers - Image - Editing - Glide
Papers - Image - Training - Seed Vector
Papers - Image - Semantic Palette
Papers - Blockwise Parallel
Papers - Training - Distributed
Papers - Training - Masked Sequence Packing
Datasets - Chess
Papers - Semantic Segmentation
Papers - Training - FixMatch
Papers - Training - Self-Training - Student and Teacher
Papers - Task Assistant - ExploreLLM
Papers - Training - Guided Task Flow
Papers - Training - Problem Solving
Papers - Structured Thoughts
Papers - GUI - Task Assistants
Papers - Chinchilla
Papers - Model Scaling - Effective Parameter Count
Papers - Custom Layers - Hash Layers
Papers - Scaling
Papers - Hallucination - Reduction
Papers - Chain of Verification
Papers - Reading Comprehension
Datasets - Text - Multilingual
Papers - Training - Chain of Thought
Papers - CoT - Chain of Thought
Papers - Ethics
Papers - Fine-tuning - QA-LoRA
Papers - Fine-tuning - Understanding Tables
Papers - Text - Perform Tasks on Tabular Data
Datasets - Text - Tabular
Papers - Text - Dataset - TabLib - Tabular
Papers - Qwen
Papers - Qwen - Report
Papers - Multimodal - Report
Papers - MoE - Quantization
Papers - Attention - Custom Encoder
Papers - Research - Replacing Attention
Papers - Research - Safety
Embeddings - C4 - Jina
Papers - Reduce Model Size - SliceGPT
Papers - Decoders - CoT Decoding
Papers - Rag
Papers - Rag - Multi-hop Queries
Papers - Encoders - Coding
Embeddings - Coding
Embeddings - Coding - CodeBert
Papers - Training - Synthetic Noise
Papers - Coding - Fill in the Middle - Infilling
Papers - Text - Pre-training - Synthetic Noise
Papers - Training - Knowledge Graphs
Papers - Image - Training - Knowledge Graphs
Papers - Image - Training - Adversarial
Papers - Multimodal - Fine-tuning - Report
Papers - Text - Tabular - Conditional Formatting
Papers - Text - Training - Code - Byte Pair Encoding
Papers - Coding - Out of Vocabulary
Papers - Coding - BPE vs Pointer Mixture Network
Papers - Automatic Speech Recognition
Papers - Automatic Speech Recognition - Beam Search
Papers - Beam Search
Papers - Explainability
Papers - Training - Synthetic Data - Sycophancy
Papers - Training - DoReMi
Papers - Training - Domain Reweighting
Papers - Training - AI training AI
Papers - Training - Proxy Model - Group DRO
Papers - Adafactor
Papers - Coding - Decoding with Static Analysis
Papers - MoE - Hashing instead of a Router
Papers - UDOP
Datasets - Multimodal - Image and Text
Papers - Multimodal - Document and Text
Datasets - Multimodal - Document and Image
Papers - Encoder - Byte-Pair Encoding
Papers - Text - SQL
Papers - Science - Research Analysis
Papers - Training - Speculative Decoding - Single Model
Papers - Attention - Tree Attention
Papers - Fine-tuning - Rag
Models - Table - Extraction
Papers - Video - Agent
Papers - Audio - GAN - Upsampling
Papers - Audio - GAN
Papers - Image - Illumination
Papers - Decoders - 3D Nerf
Papers - Image - Edit
Papers - ControlNet
Papers - Fine-tuning - Parameter Efficiency
Papers - Image - Lightning
Papers - Text - 3D Mesh - Volumetric
Papers - Text - Label Generator
Papers - Image - Limited-Training
Papers - Image - Chart to Table
Papers - Image - Plot - Understanding and Reasoning
Papers - Image - 3D Asset Enhancement
Papers - Text - Taxonomy Generator
Papers - Training - Reward Model
Papers - Fine-tuning - Language Model Policy with LoRA
Papers - Fine-tuning - Mixture of LoRA (MoL)
Papers - Robotic - Observational Learning
Papers - Attention - Cross
Papers - Training - Skill Learning
Papers - Fine-tuning - Multi-Agent
Papers - mPlug-Owl
Papers - Image - Document - mPlugOwl
Papers - Document - mPlugOwl
Papers - Structured Learning - Document
Papers - Prompt - Prompt Compression - Report
Papers - Prompt
Papers - Image - Gaussian Splatting and NeRF
Models - Reverse Engineering - Decompiler
Models - Reverse Engineering
Papers - Text - 3D
Models - Table - Structure - Recognition
Paper - Image - Table - Extraction
Paper - Image - Table
Papers - Tabular
Papers - Image - Object Detection
Models - Image - Object Detection
Papers - Benchmarks - Reward Models
Papers - 3D - Text
Papers - Science - Molecule
Papers - Frankenmerging
Papers - Image - Frankenmerging
Papers - Image - Model Merging
Papers - Attention - Grouped-Query Attention (GQA)
Papers - Image - Math
Papers - Benchmarks - Math
Papers - Image - Reward Model
Papers - Multimodal - Mamba
Papers - Video - Editing
Papers - Image - Personalization
Papers - Image - Personalization - Captions
Papers - Image - Blip
Papers - 3D - Reconstruction
Papers - Image - Video Generator
Papers - Video - Upsampler
Papers - Video - Time Reversal Fusion
Papers - Image - Adversarial (GAN)
Papers - Image - Video - Adversarial (GAN)
Papers - Toxicity
Papers - Fine-tuning - Toxicity
Papers - Video - Content Motion Latent Diffusion
Papers - Decoders - Chain of Thought
Papers - Image - Depth Estimation
Papers - Image - Flow Matching
Papers - Image - Training
Papers - Text - Classification - Social Media
Papers - Text - Classification
Papers - Text - Training - Classification
Papers - Audio - Training
Papers - Multimodal - Audio
Papers - Audio - Whisper vs Clap - Whisper wins with ASR
Papers - Encoders - Audio
Papers - ICL - In-Context Learning
Papers - Math - Derive New Math - Function Class
Papers - Agent - Architecture
Papers - Agent - Memory
Papers - Fine-tuning - DPO
Papers - Critic Models
Papers - Training - Critic Model
Papers - Security
Papers - Security - Fuzzing
Papers - Reasoning - Critic Pattern
Papers - Benchmarks - Reasoning
Papers - Sports
Papers - Music
Papers - Pop Culture
Papers - Coding - Chain of Thought
Papers - Coding - Training
Papers - Coding - Fine-tuning
Papers - Coding - Reasoning
Papers - Fine-tuning - Reasoning
Papers - Video - Streaming
Papers - Mamba - FFT - EinFFT
Papers - Encoders - Video
Papers - Multimodal - Video - Text - Audio
Papers - Multimodal - Captions - Audio
Papers - Multimodal - Captions - Speech
Papers - Multimodal - Captions - Video
Papers - Synthetic Data - Multimodal
Papers - 3D
Papers - 3D - Synthetic Data
Papers - Document - Understanding
Papers - Documents - Fine-tuning
Papers - Compiler
Papers - Coding - Compiler
Papers - LLVM
Papers - Training - Teacher Model
Papers - Tree of Thoughts
Papers - Searchformer
Papers - Coding - Stack Traces
Papers - Training Research - Stack Traces
Papers - Fine-tuning - Search Based
Papers - Fine-tuning - Procedure Cloning
Papers - Encoders - T5
Papers - Decoders - T5
Papers - T5
Papers - DenseFormer
Papers - Training - Weighted Average
Papers - Encoders - Image - Clip
Papers - Training - Fitness Score
Papers - Training Research - Exemplary Prompts
Papers - Fine-tuning - Prompts
Models - TTS
Models - T5
Models - Documents
Papers - Encoders - VAE
Papers - Agent - Operating Systems
Papers - Image - Synthetic Data - Human Faces
Papers - Multilingual - Japanese
Papers - Fine-tuning - Multilingual
Papers - Document - Understanding - Historical Images Text
Papers - SAM - Segment Anything Model
Papers - Image - Historical
Papers - Image - Explainability
Papers - Image - VGG
Papers - Image - Pattern Recognition
Papers - Image - Historical - Symbolic and Artistic
Papers - Training - Distribution-based
Papers - Research - Emergent Properties
Papers - Image - In-Context Learning
Papers - DeepMind - ICL vs RNN vs LSTM
Papers - DeepMind - ICL Rule-based Classification
Papers - DeepMind - ICL Small Models are More Exemplar-Based
Spaces - Decoders - Beam Search Visualizer
Spaces - Decoders - Beam
Spaces - Decoders
Papers - Video - NeRF
Papers - FAIR
Papers - Fine-tuning - Model Layer Pruning
Papers - Healthcare - Text - Antibodies
Papers - Intel - MLP
Papers - Performance - Intel
Papers - Image - Prompt
Papers - VQA
Papers - Fine-tuning - SFT
Papers - Fine-tuning - Report
Papers - Text - Video Generator
Papers - Video - Enhance
Spaces - LangChain
Papers - Image - Gaussian Splatting - 2D
Papers - Meta
Papers - Audio - Image
Papers - Image - Avatar Generator
Papers - Training Research - Audio
Papers - Healthcare - Synthetic Data Generator - 3D
Models - Image - Streaming
Datasets - Fine-tuning
Datasets - Meta
Papers - University - MIT
Papers - Google
Papers - Image - MultiDiffusion
Papers - Imagen
Papers - Convert - T2I to T2V
Papers - University - University of California Berkeley
Papers - OpenAI
Papers - Adobe
Papers - RWKV
Papers - 3DGS
Papers - Text - Fact Checking
Papers - Text - Factuality
Papers - Healthcare - Text
Papers - Healthcare - Training Research
Papers - University - Stanford University
Papers - DataBricks
Models - Healthcare
Papers - Image - Generator - Large Resolution
Papers - Encoders - Synthetic Noise
Papers - Apple
Papers - Video - Clothing
Papers - Encoders - Video - MetaCLIP
Papers - IoT - Assistant
Papers - Training Research - Mixture FOFE
Papers - Training Research - AD FOFE
Papers - Image - Editing - Object Removal
Papers - Image - Editing - Object Insertion
Papers - Image - Editing - Counterfactual Supervision
Papers - 3DGS - Feature Rendering
Papers - 3DGS - Open-world Segmentation
Papers - 3DGS - Security Camera Object Detection
Papers - Microsoft
Papers - University - Carnegie Mellon University
Papers - Healthcare - Image Analysis
Papers - Healthcare - Image - SynthRAD2023
Papers - Healthcare - Image - CT
Models - MoE - GQA
Papers - Image - Segmentation - Bounding Box Infilling
Models - MoE - Coding
Papers - Image - Translation
Papers - Text - Translation
Papers - Multilingual - German
Papers - Image - Synthetic Noise
Papers - Multilingual - Translation
Papers - Johns Hopkins
Papers - Multilingual - Synthetic Noise
Papers - Intel
Papers - Fine-tuning - Text - U-Net
Papers - Image - Encoders - Text
Papers - Image - Encoders - Clip
Papers - Video - Reasoning - Time of Events
Papers - Video - Encoders
Papers - Video - Training - Understanding Time
Papers - Nvidia
Papers - U-Net - 3D
Papers - 3DGS - 3D Mesh Generator
Models - Fine-tuning
Papers - Model - SFT - Alpaca and DPO - Solar
Papers - Fine-tuning - Preference-based RL (PbRL)
Papers - University - Cornell University
Papers - Robotics - Fine-tuning - PbRL
Papers - Fine-tuning - DPO - Reward Model Training
Papers - Reward Model
Papers - Reward Model - Bradley-Terry
Papers - Reward Model - Training
Papers - University of Chicago
Papers - Reward Models - KL Regularization - RL
Papers - KL Regularization - ADP - Con/Divergence Error Rate
Models - Fine-tuning - PPO
Papers - Fine-tuning - Factuality
Papers - Fine-tuning - Emulator
Datasets - RLHF
Datasets - Fine-tuning - RLHF
Papers - top-p - Nucleus Sampling
Papers - top-k - Flat (good) vs Peaked (bad) Dist Sampling
Papers - Distribution - Zipf Analysis
Papers - Institute - Allen Institute
Papers - University - University of Washington
Models - 1bit
Models - Bitnet - Text
Papers - Coding - Unit Tests
Papers - Tacotron 2
Papers - Audio - WaveNet
Papers - Audio - Time Domain Waveforms
Papers - Audio - TTS
Papers - Audio - Mel Spectrogram
Papers - Decoders - Audio
Papers - GAN
Papers - Image - GAN
Papers - GAN - Compression - Bitstream
Papers - GAN - Compression
Papers - Audio - STT - ASR
Papers - Audio - Speech Transcription
Papers - Audio - WhisperX
Papers - Audio - Voice Activity Detection
Papers - Audio - VoiceCraft
Models - Audio - TTS
Papers - Audio - Compression
Models - Audio
Models - Audio - Codec
Models - Audio - Encoders
Models - Audio - Decoders
Models - FAIR
Models - Meta - FAIR
Models - Audio - Music Generator
Models - Getting Started - Pre-training
Models - TinyLlama
Models - Reward Model
Models - Starling
Datasets - Chat - RLHF
Datasets - Starling
Papers - Audio - Masked Language Model
Papers - Audio - Residual Vector Quantization
Papers - Audio - Encoders
Models - Image - Object Detection - DETR
Models - ResNet
Papers - Audio - Inference - Rescore Models
Papers - Inference - Rescore Models
Inference - Autoregressive and Non-Autoregressive Models
Papers - Kyutai
Models - Text - Music Generator
Models - Audio - Hybrid - AR with NAR Models
Papers - Touch
Papers - MoE - Mamba
Papers - Flan-T5
Papers - IoT - Screen Usage Understanding and Context
Papers - Mobile - User Entity Context Understanding
Papers - Mamba - Limitations - In-Context Learning (ICL)
Models - MoE - Mamba
Papers - AI21 Labs
Papers - University of Tokyo
Papers - S-Lab
Papers - Duke
Papers - University of Wisconsin
Papers - Image - Report
Papers - Hallucinations
Papers - Trustworthiness
Papers - University of Bristol
Papers - Healthcare - Surgical Gestures
Papers - Vanderbilt
Papers - Fine-tuning - Dataset - Few-Shot Retrieval (FRet)
Papers - University - New York University
Papers - Embeddings
Papers - Embeddings - Text
Papers - Text - Memorization
Papers - Training a 2.8B Model in 38 days
Papers - Huawei
Papers - vLLM
Papers - Inference - vLLM
Papers - Attention - PagedAttention
Papers - Fine-tuning - Model Merge
Papers - Frankenmerge - Model Stock - Use Fine-tuned Models
Papers - Naver
Models - Model Stock
Models - Frankenmerge
Models - Frankenmerge - Model Stock
Papers - Benchmarks
Papers - Benchmarks - Financials
Papers - 1bit
Models - 2bit
Papers - Video - Fine-tuning
Papers - Video - Reward Model
Models - Spright
Papers - ASU
Papers - Hugging Face
Papers - University of Maryland
Papers - University - Tsinghua University
Papers - Chinese Academy of Sciences
Papers - Xidian University
Papers - 3D - FlexiCubes
Papers - ShengShu
Papers - Fine-tuning - Llava - DPO
Papers - Non-Autoregressive Transformers
Papers - Salesforce
Papers - Safety
Papers - Speech - Chain of Thought
Papers - Audio - Chain of Thought
Papers - Chinese University of Hong Kong
Papers - Audio - Fine-tuning
Papers - Audio - Fine-tuning - Lora
Papers - Image - Continual Training Framework
Papers - Documents - LayoutLM
Papers - Documents - FormNet
Papers - Document - OCR
Papers - Ohio State
Papers - Video - Captions
Papers - Video - Streaming - Captions
Papers - Decoders - Training Decoding Point Supervision
Papers - Healthcare - Cardiac MRI - CMRxRecon Challenge 2023
Papers - Image - Healthcare - Cardiac MRI
Papers - Image - Healthcare
Papers - Training Research - Optimizers
Papers - Coding - C/C++ - Memory
Papers - Coding - C/C++
Papers - Coding - Annotations, Decorators and Captions
Papers - Coding - Operating Systems - Memory
Papers - Image - Contrastive Graph Learning
Papers - Extended Transformer Construction
Papers - Documents - Tabular
Papers - Graph Convolutional Network
Papers - Documents - Graph Convolutional Network
Papers - Training Research - Contrastive Predictive Coding
Papers - Decoders - Bert
Papers - Optimizers - Adafactor
Papers - T5 - MoE
Papers - University of Georgia Tech
Papers - Image - Extract Style
Papers - Image - Contrastive Style Descriptors
Papers - Image - Use a Model to find a similar image
Papers - Ellis Institute
Papers - Shanghai AI Laboratory
Papers - Image - Security Cameras
Papers - Government - USA
Papers - University - University of Waterloo
Papers - Vector Institute
Papers - Benchmarks - Text
Papers - Benchmarks - In-Context Learning
Papers - Benchmarks - Text - Long Context
Models - Documents - OCR
Models - Text - Classifier - Zero-Shot
Models - Text - Classifier - Deberta
Papers - Network - Adaptive BitRate Algorithms
Papers - Network Traffic - 4G and 5G - OTA - Packet Shaping
Papers - Network Traffic - 4G and 5G - OTA
Papers - Network Traffic - 4G and 5G
Papers - Network Traffic - OTA
Papers - Network Traffic - Packet Shaping
Papers - Network Traffic - Transport Optimization
Papers - Network Traffic
Papers - University of Texas
Papers - University of Peking
Papers - Coding - Preference Trees
Papers - Coding - Understanding Tree Structures
Papers - Math - Reasoning
Papers - University - University of Illinois
Papers - University - Northeastern University
Papers - Multilingual - Finnish
Papers - Multilingual - Encoders - BPE
Papers - LLaVA
Papers - Gemma
Papers - Multimodal - Training
Papers - Encoders - DinoV2
Papers - Image - Encoders - DinoV2
Papers - Training Research - Scaling Properties - T2I
Papers - Training Research - Smaller vs Larger Models
Papers - Pre-training - In-filling - PSM and SPM ordering
Papers - Pre-training - Dynamic Context Length
Papers - Text - Supervised Fine-tuning
Papers - Text - Supervised Fine-tuning - Batch Grouping
Papers - Fine-tuning - PPO
Papers - Multilingual - Benchmarks
Papers - Amazon
Papers - Image - SDXL
Papers - ByteDance
Papers - Video - Autoregressive Model
Papers - Inference - Performance
Papers - Coding - Algorithmic Reasoning
Papers - Coding - Think and Execute vs CoT and PoTs
Papers - Coding - Program of Thoughts (PoT)
Papers - Coding - Think and Execute - 7B vs 13B vs GPT
Papers - Prompts - Detailed Examples
Papers - Infra - Cost - Automatic Compute Planning
Papers - Mixture of Depths - MLP, residuals, router, tokens
Papers - MoD - Router
Papers - Yonsei University
Papers - Image - NeRF
Papers - Alibaba
Papers - University - Fudan University
Papers - Image - Frequency Decomposition
Papers - Image - Demosaic
Papers - University - Hong Kong University of Science and Te
Papers - Image - Interior Design
Papers - 3D - Interior Design
Papers - ETH Zurich
Papers - 3D - Indoor Scene Synthesis
Datasets - Reasoning
Papers - Reasoning - Self-Reference Metalinguistic
Papers - University - University of California San Diego
Papers - PlayTest AI
Papers - Contextual AI
Papers - Reasoning - MRGSM8k - Meta Math Multi Step
Papers - Reasoning - GSM8k
Papers - Tencent
Papers - Benchmarks - GSM8k
Datasets - Reasoning - Meta Math Multi-Step - GSM8k
Datasets - Math - Meta Context Reasoning
Papers - University of Cambridge
Papers - Southern University of Science and Technology
Papers - Alan Turing Institute
Papers - Max Planck Institute
Datasets - Text - QA
Datasets - Text - System Chat
Models - Image - Handwriting Comprehension
Models - Table - Handwriting Comprehension
Papers - Arctic University of Norway
Papers - Document - Tabular - Manual Review
Papers - Documents - Tabular - Census
Papers - Image - Custom Annotation and Labeling Tools
Papers - Documents - Custom Annotation and Labeling Tools
Papers - Image - Tabular
Papers - CascadeTabNet
Papers - Image - OCR
Papers - Pune Institute
Papers - Image - Table Structure Recognition
Papers - Documents - Table Recognition - Fine-tuning
Papers - Image - Fine-tuning - Tables
Papers - Image - OCR - Tesseract for Text Location
Papers - Document AI
Papers - Harbin Institute
Papers - Coding - Benchmarks - Report
Papers - Coding - OpenCodeInterpreter
Papers - Benchmarks - Coding
Papers - Coding - Training - Equal-Info Windows
Papers - Coding - Multi-Model Inference
Papers - Coding - Distributed - Adaptive Computation Time
Papers - Anthropic
Papers - Training Research - Compression and Multi-Model Inf
Papers - Coding - Encoders
Papers - Encoders - Compression
Papers - Coding - Compression
Papers - Tokenizer - Neural Compression
Papers - Inference - Multi-Model
Papers - Fine-tuning - ReFT
Papers - Fine-tuning - Report - Llama 7B and 13B
Datasets - Reasoning - Commonsense
Papers - Tokenizers - Roberta
Papers - Reasoning - Commonsense
Papers - Reasoning - Social IQ
Papers - University of Houston
Papers - Image - Classifier - Label Quality Assessment
Datasets - Reasoning - Math
Papers - Benchmarks - Image - Labels
Papers - Benchmarks - Image
Papers - Reasoning - Math
Papers - Reasoning - Math - AQuA
Papers - University of Oxford
Papers - University of IAIR Xi’an Jiaotong
Papers - Training - Instruction-Following
Datasets - Text - Instruction-following
Papers - RLHF
Papers - Benchmarks - Text - General Language Understanding
Papers - Benchmarks - Text - Glue
Datasets - Benchmarks - Glue
Datasets - Benchmarks - Text
Papers - Encoders - Roberta
Papers - Reasoning - Program of Thoughts
Papers - University of California Santa Barbara
Papers - StructLM - Understanding Structured Data
Models - StructLM
Datasets - Text - StructLM
Papers - Prompts - System Chat
Papers - Prompts - Chain of Thought
Papers - Tokenizers - LLaMA Byte Pair Encoding (BPE)
Datasets - OCR - Image with Text from Textract
Datasets - Documents - OCR - Image with Text from Textract
Papers - Benchmarks - Web Browsing Tasks
Papers - University - Harvard University
Papers - Kaust
Papers - Image - Point Cloud
Papers - Video - MultiView Compressive Coding (MCC)
Papers - Image - Encoders - RGB-D
Papers - Image - Training - Low Res Predicts High Res
Papers - University - Beihang University
Papers - Tokenizers - Documents - TrOCR
Papers - Tokenizers - Image - TrOCR
Papers - Tokenizers - Image - Handwriting
Spaces - Image - Handwriting Recognition
Papers - University of Zhejiang
Papers - Audio - Text to Speech
Papers - Audio - TTS - VALL-E
Papers - Audio - TTS - RALL-E
Papers - Security - Jailbreak
Papers - Benchmark - Security
Papers - LMU Munich
Papers - Siemens
Papers - University of Wuhan
Papers - Munich Center for Machine Learning (MCML)
Papers - Benchmarks - Website Navigation
Papers - Web Navigation - Chrome Extension
Papers - Web - Recognition
Papers - Web - Training - Curriculum Learning
Papers - Fine-tuning - Rejection Sampling (RFT)
Papers - Zhipu AI
Models - General Purpose
Datasets - Benchmarks - CodeEditorBench - OCI
Models - Chat
Models - Text - Image
Models - Multimodal - Chat
Models - Audio - Understanding
Models - Synthetic Data - Audio
Models - Audio - Edit with Text
Models - Audio - Classification and Segmentation
Models - Image - Chat
Models - Image - Synthetic Data
Spaces - Image - Chat
Papers - Audio - Understanding
Papers - Audio - Captions
Spaces - Qwen - Image
Datasets - SQL
Models - Audio - STT - ASR
Papers - Redwood Research
Papers - Automated Interpretability
Models - Encoders - Bidirectional
Models - Encoders - Bert
Papers - Text - Encoders - Image - Clip
Papers - Training Research - Rank-One Model Editing
Papers - Training Research - Mamba
Papers - Training Research - Ablation - Mamba
Papers - Training Research - Ablation - Factuality
Papers - Training Research - Weights - Activation Patching
Papers - Training Research - Interpretability
Papers - Interpretability - Rome - Factuality Editing
Papers - Interpretability
Papers - University of Tel-Aviv
Papers - Interpretability - Attention
Papers - University of Brown
Papers - Training Research - Layer Understanding
Papers - Interpretability - Prompts
Papers - Image - Imagen
Papers - Training Research - Control Attention Reweighting
Papers - Attention - Weights - Re-Weighting
Papers - Training Research - Text - Token Visualization
Datasets - Image - ImageNet
Datasets - Image
Papers - Recommendation - Cloze Task
Papers - Recommendation - Encoders - Bert
Papers - Recommendation
Papers - Recommendation - Multi-Task Learning
Papers - Recommendation - Bert4rec - SASRec
Papers - Recommendation - RTG Balancing
Papers - University of Zurich
Papers - Healthcare - Radiology
Papers - University - Shanghai Jiao Tong University
Papers - Training Research - Pre-training - ALBEF
Papers - Training Research - Vision Language Pre-training
Papers - Pre-training - ALBEF - Multimodal Encoder
Papers - Multimodal - Encoders - ALBEF
Papers - Dataset - MultiModal - MultiLingual - Wiki
Papers - Fine-tuning - RLHF - Direct Nash Optimization (DNO)
Papers - RLHF - Iterative Contrastive Self-Improvement
Datasets - Text - Alpaca
Papers - RL - Consistency Model (RLCM)
Papers - Fine-tuning - Image - Prompt Image Alignment
Papers - Harvey Mudd
Papers - Fine-tuning - Stream of Search
Papers - Training Research - Search Based (BFS / DFS)
Models - Text - Science
Papers - University of Tubingen
Papers - HKUST
Papers - Kuaishou
Papers - Text - Dialog Inpainting
Papers - 3DGS - Motion Blur
Papers - 3DGS - Color Transformation
Papers - Image - Encoders - RGB-T (Thermal)
Papers - University of Dalian
Models - Image - Stock Market - Pattern Detection
Papers - Audio - Encoders - HuBert with EnCodec
Papers - Audio - Bark
Papers - Mobile - Multimodal - Screen Image with Captions
Papers - Training Research - DeiT
Papers - Healthcare - DeiT
Papers - Image - Object Detection - YoloV8
Papers - Healthcare - Image - Cancer - Brain
Papers - Image - Hybrid - DeiT and YoloV8
Papers - Image - Healthcare - DICOM
Papers - Image - Healthcare - PTP Metrics
Papers - Image - DeiT
Papers - Custom Layers - MLP
Papers - University of Melbourne
Papers - Multilingual - Image - Greek
Papers - Indian Institute of Technology
Papers - Indian Institute of Science
Papers - University of Sorbonne
Papers - Regularization - LayerScale
Papers - Regularization - Binary Cross Entropy
Models - Image - DeiT
Models - Image - Classification
Papers - Image - Report - VQA
Papers - Image - Training - Mistral
Papers - AIRI Institute
Papers - Sber AI
Papers - Skoltech
Papers - Image - LLaVA
Papers - Image - Coco Testing
Papers - Image - Clip - Coco Testing
Papers - Image - Frechet Inception Distance (FID)
Papers - Training - Long Context
Papers - Benchmark - Context
Papers - Benchmarks - Context - Ruler
Papers - Image - Decoders
Papers - Image - Decoders - ViT
Papers - Training - Image - Causal Self Attention
Papers - Image - Training - AS2D RoPE and SwiGLU
Papers - Training - Detailed Appendices
Papers - Image - Encoders - ViT
Papers - 3D - Panoramic View Generator
Papers - Image - Training - Self Refinement
Papers - Training - Noisy or Unseen Data Drops Accuracy 6%
Papers - Image - Object Detection - DETR
Spaces - Healthcare - Multimodal
Papers - Text - Social Skills
Papers - Fine-tuning - Orpo
Papers - KAIST AI
Papers - Image - Fourier Neural Operators (FNO) vs CNNs
Papers - Image - FNO - Low and High Frequency Data
Papers - Image - Training - Training with an Ensemble
Papers - Image - FNO - SpecBoost Ensemble
Papers - Image - Differential Equations - FNO - ReLu
Papers - Image - Spectral Analysis
Papers - Rag - Prompts
Papers - Rag - Multiple Documents in Parallel
Papers - Tokens - Path Equilibrium Positioning
Papers - Tokens - Real-Valued Positioning
Papers - Model - Griffin
Papers - Models - Griffin - RecurrentGemma
Models - Mistral - Orpo
Papers - Fine-tuning - ControlNet
Papers - University of Central Florida
Papers - Reward Model - Consistency Loss - ControlNet
Papers - Audio - Datasets - Dialog
Papers - Qwen - Audio
Papers - Advanced Micro Devices
Papers - Image - Auto - Lane Detection
Papers - Image - Auto - Lane - Training Segmentation
Papers - Operating Systems
Papers - Agents - Operating Systems
Papers - Benchmarks - Agent - Multimodal - Tasks
Papers - University of Aalto
Papers - University - Princeton University
Papers - Megatron
Papers - Attention - Mixture of Attention Heads (MoA)
Papers - DiffusionDet
Papers - Image - Generator - Gaussian Noise - Bounding Boxes
Papers - Image - Ordinary Differential Equations (ODE)
Papers - Image - Object Detection - Bounding Boxes
Papers - Image - Bounding Boxes - Loss - Timeseries
Datasets - Image - Coco - Obj Det, Segmentation, Captions
Models - Image - Image Segmentation - Coco
Models - Image - DPT - Dino
Papers - Image - ConsistencyDet
Papers - Image - TrOCR
Models - Rag
Models - Mistral
Models - Image - Clip
Models - Image - Dino
Models - Agent
Models - Agent - On-Device
Spaces - Comics
Papers - Chain of Thoughts - Visualization
Papers - Visualization of Thought (VoT) - Mind’s Eye
Papers - Benchmarks - Documentation
Papers - Benchmark - Multimodal - Image Documentation
Papers - AutoDesk
Papers - Investing - Stock Forecasting
Papers - University of Shenzhen
Papers - Investing - AceFormer - ACEEMD
Papers - Image - Knowledge Graph
Papers - Agent
Papers - Knowledge Graph - Tasks
Papers - Panasonic
Papers - University of Xiamen
Papers - Selective Language Modeling vs Causal
Papers - Fine-tuning - Math
Datasets - Chat
Papers - Image - VQA
Papers - Image - VQA - Captions High Res Alignment
Papers - University - University of Santa Barbara
Papers - University - Columbia University
Papers - Image - VQA - Ferret
Papers - Image - Encoders - Dual Vision MLP projectors
Papers - Image - Referring Object Classification (ROC)
Papers - Image - Dataset - LVIS
Papers - Image - Grounding
Papers - Image - Training - OCR - High-Res Dense Alignment
Papers - Image - Captioning
Papers - Documents - UDOP
Papers - Documents - Fine-tuning - LayoutLM and UDOP
Papers - Image - Scientific Charts
Papers - Documents - Scientific Charts
Papers - University of Ulm
Papers - Image - Fine-tuning - ICPR22 dataset
Papers - Image - Fine-tuning - CHIME-R and EconBiz datasets
Papers - Image - Fine-tuning - DeGruyter dataset
Papers - Embeddings - Text - RoBERTa and BPE
Papers - Embeddings - Image
Papers - Embeddings - Image - DiT and dVAE
Papers - LayoutLM - Fine-tuning - Word Patch Alignment
Papers - Tokenizers - Text - T5
Papers - Fine-tuning - Hyperparameter - FUNSD
Papers - Classification - F1 Macro and F1 Micro
Papers - Timeseries
Papers - University of Panjab
Papers - Image - Report - Training - CNN RNN LSTM MLP
Papers - Image - Connectionist Temporal Classification (CTC)
Papers - Image - Climate - SHAP
Papers - Courant Institute
Papers - Image - Climate - ERA5
Papers - Image - Coco - Annotation Pipeline
Papers - Image - Mask - box-kMaX over kMaX-DeepLab
Papers - Image - Coco - Annotation RLHF
Papers - Image - Coco - Panoptic
Papers - Video - NeRF - Real Estate Walkthroughs
Papers - NeRF - Training - Photometric Consistency Patches
Papers - Image - Datasets - ETH3D
Papers - Image - Datasets - TanksAndTemples
Papers - Image - NeRF - Mesh - TSDF fusion RGBD sequences
Papers - Image - Evaluation Metrics - PSNR SSIM LPIPS
Datasets - Research Papers - ARXIV QA
Papers - University of Alberta
Papers - University of Auburn
Papers - Explainability - Image - VQA
Papers - Explainability - Image - VQA - CHM-Corr++
Spaces - Chat - QA - Research Papers on Arxiv read by Claude
Audio Reading - 2404.08639 - COCONut
Audio Reading - 2403.07691 - ORPO Fine-tuning
Audio Reading - 2212.05525 - Extending TrOCR
Audio Reading - 2404.06209 - Elephants Never Forget
Audio Reading - 2404.07773 - ConsistencyDet
Models - Reasoning
Datasets - Audio - Large
Datasets - Audio - Multilingual
Datasets - Audio - Multilingual - Large
Spaces - Audio - TTS
Models - WizardLM
Datasets - Benchmark - Tasks
Models - Image - QA
Datasets - Chat - Persuasion
Papers - Training Research - Dataset Ordering
Papers - Training - Curriculum Learning
Papers - Training - Education Stage then Cognitive Hierarchy
Papers - Training - Curriculum Instruction Tuning
Papers - Llama 2
Papers - Training - AI2 Reasoning
Papers - Training - Out of Vocabulary
Papers - Training - Multilingual - Out of Vocabulary
Papers - University of Charles
Papers - Training - Report - LSTM vs LLM vs Ensemble
Papers - Training - Filter Low Quality with Contriever
Papers - University of Seoul National
Papers - University of Ewha Womans
Papers - University - National University of Singapore
Papers - University - University of Michigan
Papers - Audio - Fine-tuning - DPO
Papers - Audio - Fine-tuning - Alpaca
Papers - Audio - Clap
Papers - Audio - Encoder - Variational Auto-Encoder (VAE)
Papers - Audio - Frechet Audio Distance (FAD) like FID
Papers - University of North Carolina Chapel Hill
Papers - University of Southern California
Papers - Megalodon - Unlimited Context
Papers - Multimodal - Long Context - Megalodon
Papers - 3DGS - Compression
Papers - Multimodal - Speculative Decoding
Papers - Inference - Multimodal
Papers - Qualcomm
Papers - Inference - Speculative Decoding - Draft Model
Papers - Dataset Grooming - Report
Papers - Dataset Generation - Guide
Papers - Image - Hyperspectral Images (HSI)
Papers - Mamba - Bidirectional
Papers - Healthcare - Image - Cancer
Papers - Healthcare - Image - Cancer - Prostate
Papers - Agent - Research
Papers - Research - Automated Research
Papers - Fine-tuning - DPO - KL Divergence vs Learning Rates
Papers - Tinkoff AI
Papers - Embeddings - Scalable Positional Encodings
Papers - University of Pennsylvania
Papers - Image - Layer Pruning
Papers - Inference - Image
Papers - Inference - Image - Layer Pruning
Audio Reading - 2402.16827 - Survey on Data Selection ~3.5h
Audio Reading - 2404.08011 - Review Handwriting Recognition
Papers - Pre-training - Warm-Start - Encoder and Decoders
Papers - Pre-training - Pegasus
Papers - Imperial College of London
Papers - Pre-training - Text - Masked Language Models (MLM)
Papers - Pre-training - Self-Supervised for Downstream Tasks
Papers - Pre-training - Warm-Start - Encoders - BPE
Papers - Pre-training - Warm-Start - Encoders - Unigram
Papers - Pre-training - Summarization
Papers - Pre-training - Encoders - Bert
Papers - Pre-training - Encoders - Roberta
Papers - Pre-training - Warm-Start
Papers - Pre-training - Unsupervised
Papers - Pre-training - Checkpoints
Models - Fintech - Financial Summarization
Audio Reading - 2310.09518 - Instruct with Human Curriculum
Datasets - Image - Multilingual - VQA
Datasets - Image - VQA
Papers - Inference
Papers - Inference - KV Cache
Models - Encoders - Multimodal - Clip - SigLIP
Models - Image - Encoders
Spaces - Multimodal - Image and Chat
Papers - Stability AI
Papers - Audio - Activation - Snake
Papers - Audio - Decoders - DAC - No tanh activation
Papers - Audio - RoPE
Papers - Audio - Embedding - Time - Sinusoidal Cross Attention
Papers - Audio - Embedding - Text - Clap - Cross Attention
Papers - Audio - Embedding - Clap - Timestep - Prepended
Papers - Audio - Encoders - Clap - HTSAT audio RoBERTa text
Papers - Attention - Block-wise
Papers - Audio - Encoders - Clap - Training - Metadata
Papers - Audio - Musical Structure Analysis
Papers - Audio - Encoders - Laion-Clap
Papers - Agent - Sima
Papers - World Sim - Agent - Tasks
Papers - Training - Video Games
Papers - Video Games
Papers - Video Games - Survival
Papers - Video Games - Crafting
Papers - Video Games - Survival - Valheim
Papers - Video Games - Navigation
Papers - Video Games - Object Tools
Papers - Video Games - Farming
Papers - Video Games - Environment Resource Planning
Papers - World Sim - Encoder - Image - Sparc
Papers - World Sim - Encoder - Video - Phenaki
Papers - World Sim - OCR
Papers - World Sim - Training - Classifier-Free Guidance
Papers - World Sim - Cognitive Architectures
Papers - Video - Phenaki
Papers - Video - Encoders - C-ViViT
Papers - Video - Encoders - C-ViViT - MaskGiT
Papers - Embeddings - Text - T5X
Papers - World Sim - Embeddings - Text - T5X
Papers - JAX
Papers - GNN
Papers - Training - GNN
Papers - GNN - Dataset - LargeMix
Papers - GNN - Fine-tuning
Papers - GNN - Benchmark - TDC
Papers - GNN - Benchmark - Polaris
Papers - GNN - Benchmark - MoleculeNet
Papers - Hybrid Arch - Skip Connections
Papers - GNN - MPNN
Papers - GNN - Encoders
Papers - GNN - Encoders - Positional and Structural Encoding
Papers - GNN - Fine-tuning - Custom Layer - MLP
Papers - GNN - MoIE
Papers - Healthcare - Molecules - GNN
Papers - Healthcare - Molecules
Papers - Healthcare - GNN
Papers - GNN - Ensemble
Papers - Healthcare - Drug Discovery
Papers - Healthcare - Drug Discovery - GNN
Papers - Valence Labs
Papers - University of Montreal
Papers - University - University of Toronto
Papers - University of McGill
Papers - Healthcare - Image - X-ray
Papers - Healthcare - Image - Chest - X-ray
Papers - Healthcare - Image - Lung Disease
Papers - XAI
Papers - XAI - Gradient Weighted Class Activation Mapping
Papers - XAI - Loc Interpretable Model Agnostic Explanation
Papers - XAI - Fine-tuning
Papers - University of Ahsanullah
Papers - Healthcare - Image - Covid-19
Papers - Image - Visual Feature Extractor
Papers - Inference - Batch - Hierarchical Sharing Pattern
Papers - Optimizer - Lamb
Papers - Attention - Sliding Window
Papers - Training - 3D Parallelism - Back - Reduce-Scatter
Papers - Training - 3D Parallelism - Forward - All-Gather
Papers - Custom Layers - Feedforward Neural Network (FFN)
Papers - Training Research - Model FLOPs Utilization (MFU)
Papers - Training Research - Fault Tolerance
Papers - Custom Layers - Decoders - No FFN
Papers - Training - Parameter Reduction - FFN
Papers - Equall AI
Papers - Multilingual - Spanish
Datasets - Fine-tuning - Orpo
Papers - Emergent Properties
Papers - Emergent Properties - Multiple Choice Grade
Papers - Emergent Properties - Exact String Match
Papers - Emergent Properties - Image
Papers - Training - Epoch - 4 Epochs by Default
Papers - Attention - Mixture-of-Attention (MoA)
Papers - Surge Global
Papers - Benchmarks - Safety
Papers - Benchmarks - Toxicity
Papers - Reward Model - Fine-tuning
Papers - Fine-tuning - Reward Model
Papers - Reward Model - Cross-Lingual
Papers - Datasets - Multilingual - Documents - Seahorse
Papers - Datasets - Multilingual - OpenAssistant
Papers - Inference - Speculative Decoding - KV Cache
Papers - Speculative Decoding - KV Cache
Papers - KV Cache
Papers - Inference - Speculative Decoding - Draft - KV Cache
Papers - Speculative Decoding - Draft - Base Model - JF68M
Papers - Speculative Decoding - Long Context
Papers - Speculative Decoding - Draft - Model - SpecInfer
Models - Speculative Decoding - Draft - Base Model
Models - Speculative Decoding - Draft - SpecInfer
Papers - Speculative Decoding - Token Tree Verification
Papers - Speculative Decoding - Token Verification
Papers - TensorRT-LLM - FasterTransformer - deprecated
Papers - Multimodal - Reka - Image Video Text Audio
Papers - Tokenizers - tiktoken
Papers - Animation - Text
Papers - Animation - Text - Kinetic Typography
Papers - Video - Text Animation
Papers - Image - LPIPS
Papers - Video - Score Distillation Sampling
Models - Fine-tuning - Orpo
Papers - Samsung
Papers - Nota
Papers - 3D - Mesh Generator
Papers - Training - 3D - NeRF
Papers - Games - AlphaGo
Papers - Training - Self-Improvement
Papers - University of Turku
Datasets - Benchmarks - Image - QA - Real World Objects
Papers - Benchmarks - Image - QA - Abstract
Papers - Benchmarks - Image - Visual Commonsense
Datasets - Benchmarks - Image
Datasets - Benchmarks - Image - QA
Datasets - Benchmarks - Image - Blink
Papers - Context - NoPE
Papers - International Human Phenome Institute
Papers - University - East China Normal University
Papers - Datasets - Training - Context - LongBench
Papers - Context - Length Generalization
Papers - Attention - NoPE - Long Context with SoftMax Temp
Papers - Attention - Training - Context - Head-based Scaling
Papers - TinyLlama
Papers - Datasets - Training - Context - SlimPajama
Papers - Training - Eval - Sliding Window Perplexity
Papers - Datasets - Training - Context - Starcoderdata
Papers - Training - Eval - Sliding Window - PG19
Papers - Training - Eval - Sliding Window - Proof-pile
Papers - Context - NoPE vs RoPE - Passkey Retrieval Viz
Papers - Transformers Without Positional Encoding - NoPE
Papers - Mila
Papers - IBM
Papers - ServiceNow
Papers - Attention - Multi-Head Attention (MHA)
Papers - Training - Residual Connections
Papers - Text - Encoders - Bert
Papers - Positional Encodings
Papers - Embeddings - Absolute Position Embedding (APE)
Papers - Embeddings - ALiBi
Papers - Encodings - Rotary - RoPE
Papers - Encodings - No Positional Encodings - NoPE
Papers - Embeddings - T5 Relative Bias
Papers - Chain of Thought - Scratchpad
Papers - Text - Classification - FastFit
Datasets - Text - Argument Topics
Datasets - Text - FinTech
Papers - University - Hebrew University of Jerusalem
Papers - Text - Datasets - Classification and Labels
Papers - Benchmarks - Text - Classification - FewMany
Papers - Weather
Papers - Datasets - Weather
Papers - Datasets - Weather - ERA5
Papers - Historical - Weather
Papers - University of Aarhus
Papers - University - Berlin Technical University
Datasets - Coding - Code Reviews
Datasets - Benchmarks - Coding
Datasets - Text - Web
Datasets - Text - CommonCrawl
Datasets - Text - QA - Web
Datasets - Text - Research Papers - QA - QASPER
Papers - Image - Graph - Understanding
Papers - Knowledge Graphs
Papers - Image - Glip
Papers - University - UCLA
Papers - International Digital Economy Academy (IDEA)
Papers - Image - Phrase Grounding
Papers - Image - Bounding Box - Coco - Teacher and Student
Papers - Image - Grounded Captions
Models - Image - GLIGEN
Papers - Text - Instruct - Grounding and Captions
Papers - Image - UMAP
Papers - Text - Legal - Remove Redaction
Papers - University - University of Padua
Papers - Benchmarks - Text - Text Anonymization Benchmark
Papers - Text - Named Entity Recognition (NER)
Papers - Text - Encoders - Sentence Transformers (SBERT)
Papers - Text - Eval - SMOTE
Papers - ML - XGBoost
Papers - Text - Remove Redaction - Countermeasures
Papers - University - Delft University
Papers - FDM Business Services
Papers - Attention - Gated Self-Attention - Spatial Grounding
Papers - Inference - Scheduled Sampling
Papers - Image - Object Detection - YOLO
Papers - Image - Inpainting
Papers - Image - Keypoint
Papers - 3DGS - Structure from Motion
Papers - SQL - Database Migrations
Papers - SQL - Knowledge Graphs
Papers - SQL - Query Tree
Papers - SQL - Curriculum Learning
Papers - Web - Agent
Papers - University - Simon Fraser University
Papers - University - University of British Columbia
Papers - Coding - Git Commits
Papers - Coding - Defects
Papers - 3DGS - Material Point Method (MPM)
Papers - 3DGS - Motion
Papers - Video - Simulated Material Dynamics - MLS-MPM
Papers - 3DGS - K-Means Clustering
Papers - University - Huazhong University
Papers - Phi-3 - Technical Report
Papers - Text - Mobile
Papers - Audio - Classifier-Free Guidance (CFG)
Papers - Kunlun
Papers - Image - Fine-tuning - LoRA
Papers - Multimodal - XAI
Papers - XAI - Eval - Synthetic Vision Neuron
Papers - XAI - Research in Appendix
Papers - XAI - MAIA
Papers - Llama 3
Papers - Llama 3 - Fine-tuning - Quantization
Papers - Llama 3 - Fine-tuning
Papers - Llama 3 - GPTQ AWQ PB-LLM BiLLM - 1.1-8 bits LoRA
Papers - Image - NeRF - Structure from Motion (SfM)
Papers - Niantic
Papers - Benchmarks - Fintech
Papers - Coding - Automated Workflows
Papers - Fintech - Datasets - SEC - Edgar Filings DB - N-CEN
Papers - Investing - Document QA - SEC Filings
Papers - JP Morgan Chase
Papers - Image - Consistency Trajectory Model (CTM)
Papers - KL Regularization - Diffusion Matching Distillation
Papers - Security - Prompt Injection
Papers - Prompts - Security - Instruction Prioritization
Papers - Image - Multi-Concept Customization (MCC)
Papers - Image - Adaptive Concept Normalization (ACN)
Papers - Image - Encoder - Single-Concept Learning - QFormer
Papers - Image - Synthetic Generator - Canny
Papers - Image - Synthetic Generator - Depth
Datasets - Image - Classification
Papers - Image - Datasets - CIFAR
Papers - Image - Datasets - MNIST
Papers - Activation Functions
Papers - Pre-training - Layer Initialization
Papers - Pre-training - Layer Initialization - LSUV
Papers - Image - Datasets - ImageNet
Papers - University - Czech Technical University
Papers - Pre-training - Weight Initialization
Models - Instruct - Context - 128k
Models - Phi-3
Models - Text - Long Context
Papers - Audio - Attention - FlashSpeech
Papers - Command-R
Papers - Cohere
Papers - Pre-training - Text - Cross-lingual
Papers - Training - KL-divergence Upper bound (KLUB)
Papers - Twelve Labs
Papers - Audio - Latent Consistency Model (LCM)
Papers - Audio - Discriminator - Adversarial Loss
Papers - Audio - Prosody Generator
Papers - Audio - Voice Conversion
Papers - MSRA
Papers - University - Inner Mongolia University
Papers - University - Beijing University
Papers - Attention - Flash Attention
Papers - OLMo
Papers - MobiLlama
Papers - Fine-tuning - Dataset - Instruct - UltraFeedback
Papers - Fine-tuning - PEFT
Papers - Fine-tuning - DoRA
Papers - OpenELM
Papers - Fine-tuning - Text - Bottleneck - RMSNorm
Models - OpenELM
Papers - Training Research - Flash Memory - DRAM
Papers - Attention - Sparse Attention
Papers - Attention - Hard Attention
Papers - Image - Mask2Former
Papers - Training - Early Exit - Gating Network
Papers - Image - Detectron2
Papers - Image - Segmentation - Cost vs Quality - Gating Net
Papers - University - University of California Riverside
Papers - NEC Laboratories
Papers - Image - Cost Reduction - Early Exit
Papers - Custom Layers - No Dropout - Batch Normalization
Papers - Model - Inception
Papers - Pre-training - Batch Normalization
Papers - Image - Training - Per-class Regressor (PCR)