Abhranil Chandra

abhranil14

AI & ML interests

Reinforcement Learning, Deep Unsupervised Learning, NLP and Bayesian Deep Learning

Recent Activity

updated a model about 4 hours ago

abhranil14/llama_on_wrong_soln_wrt_human_1_soln_per_qs_6076_FF_batch64_lr10e-6_warmup100

updated a model about 5 hours ago

abhranil14/Gemma_FF_on_gemma_gold_6319_FF_batch64_lr10e-6_warmup100

Organizations

Collections 8

Find the current local time in any timezone

models 23

abhranil14/llama_on_wrong_soln_wrt_human_1_soln_per_qs_6076_FF_batch64_lr10e-6_warmup100

Updated about 4 hours ago

abhranil14/Gemma_FF_on_gemma_gold_6319_FF_batch64_lr10e-6_warmup100

Updated about 5 hours ago

abhranil14/Gemma_FF_on_gemma_gold_6319_FF_batch256_lr10e-6_warmup100

Updated about 6 hours ago

abhranil14/Qwen_wrong_soln_wrt_human_1_soln_per_qs_6076_PEFT_batch256_lr10e-6_warmup100

Updated 3 days ago

abhranil14/Qwen_wrong_soln_wrt_human_1_soln_per_qs_6076_PEFT_batch64_lr10e-6_warmup100

Updated 3 days ago

abhranil14/llama_on_wrong_soln_wrt_human_1_soln_per_qs_6076_FF_batch256_lr10e-6_warmup100

Updated 3 days ago

abhranil14/Math_gemma9b_ver_gen_75_25_full_finetune

Updated 3 days ago

abhranil14/llama3.1_8B_gemma_gold_batch_256

Updated 3 days ago

abhranil14/llama3.1_8B_human_gold_batch_256

Updated 3 days ago

abhranil14/llama3.1_8B_gemma_gold_batch_64

Updated 3 days ago

datasets 2

abhranil14/instruct-human-assistant-prompt-clean-105k

Viewer • Updated Sep 18, 2024 • 105k • 26

abhranil14/first-instruct-human-assistant-prompt-clean-33k

Viewer • Updated Sep 18, 2024 • 33.1k • 33

Collections 8

AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Offline Reinforcement Learning for LLM Multi-Step Reasoning

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Papers 3

spaces 1

First Agent Template
