matlok's Collections
Papers - Encoders
Functional Interpolation for Relative Positions Improves Long Context Transformers
Paper • 2310.04418 • Published • 4
SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs
Paper • 2106.09997 • Published • 2
Neural Machine Translation of Rare Words with Subword Units
Paper • 1508.07909 • Published • 4
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models
Paper • 2403.14438 • Published • 2
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 25
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Paper • 2403.18814 • Published • 45
Training LLMs over Neurally Compressed Text
Paper • 2404.03626 • Published • 21
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Paper • 1907.11692 • Published • 7
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Paper • 2103.06874 • Published • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper • 2412.13663 • Published • 113
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 79