GVR (Gurumurthi V Ramanan)

upvoted an article 2 days ago

Article

Training and Finetuning Embedding Models with Sentence Transformers v3

6 days ago

• 62

upvoted a paper 5 days ago

AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct

Paper • 2405.14906 • Published 11 days ago • 18

upvoted a collection 5 days ago

Sparse Foundational Llama 2 Models

Collection

Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated 16 days ago • 7

upvoted an article 10 days ago

Article

Let's talk about LLM evaluation

10 days ago

• 82

upvoted a collection 10 days ago

C4AI Aya 23

Collection

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 10 days ago • 34

upvoted a collection 11 days ago

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 3 days ago • 301

upvoted a collection 16 days ago

IndicGenBench

Collection

Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated 19 days ago • 2

upvoted a paper 17 days ago

Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

Paper • 2401.02994 • Published Jan 4 • 44

upvoted a collection 20 days ago

Yi-1.5 (2024/05)

Collection

10 items • Updated 14 days ago • 76

upvoted a paper 29 days ago

WildChat: 1M ChatGPT Interaction Logs in the Wild

Paper • 2405.01470 • Published about 1 month ago • 53

upvoted an article about 1 month ago

Article

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

Apr 29

• 69

upvoted a collection about 1 month ago

LLaVA++ (LLaMA-3 and Phi-3-Mini)

Collection

Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated 38 minutes ago • 22

upvoted 2 articles about 1 month ago

Article

seemore: Implement a Vision Language Model from Scratch

21 days ago

• 44

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 134

upvoted a collection about 1 month ago

Quantized-FT-Orca-Math

Collection

Models trained during quantization aware fine-tuning experiments using PyTorch's FSDP. • 8 items • Updated Apr 16 • 7

upvoted a paper about 2 months ago

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8 • 57

upvoted a collection about 2 months ago

Eurus

Collection

Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 22

upvoted 2 collections 2 months ago

UDOP

Collection

UDOP is a general multimodal model for document AI • 4 items • Updated 11 days ago • 20

Aya Indic Suite

Collection

An Indic language filtered dataset from the Aya dataset collection. • 9 items • Updated Mar 31 • 1

upvoted a paper 2 months ago

Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27 • 23

upvoted a collection 3 months ago

StarChat2 15B

Collection

Model, datasets, and demo for StarChat2 15B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 10 items • Updated Apr 12 • 12

upvoted a paper 3 months ago

Unifying Vision, Text, and Layout for Universal Document Processing

Paper • 2212.02623 • Published Dec 5, 2022 • 10

upvoted a collection 3 months ago

Code LMs Evaluation

Collection

50 items • Updated 27 days ago • 4

upvoted a paper 3 months ago

Design2Code: How Far Are We From Automating Front-End Engineering?

Paper • 2403.03163 • Published Mar 5 • 92

upvoted 2 collections 3 months ago

💫 StarCoder2

Collection

StarCoder2 models and datasets! • 8 items • Updated Mar 1 • 74

Tower

Collection

Model weights and SFT data for Tower. • 9 items • Updated Apr 16 • 19

upvoted 2 collections 4 months ago

OpenMath

Collection

A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Feb 19 • 28

⛔️🔦 Provenance, Watermarking & Deepfake Detection

Collection

Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 36

upvoted 3 papers 5 months ago

upvoted a paper 6 months ago

SparQ Attention: Bandwidth-Efficient LLM Inference

Paper • 2312.04985 • Published Dec 8, 2023 • 35

upvoted a paper 7 months ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 116

upvoted 2 papers 9 months ago

Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 80

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 77

upvoted a paper 10 months ago

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

Paper • 2308.01825 • Published Aug 3, 2023 • 19

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

Understanding LLMs: A Comprehensive Overview from Training to Inference

ControlLLM: Augment Language Models with Tools by Searching on Graphs
