Supreeth Rao

Supreeth
5 20
https://supreethrao.com
  • SupreethRao99
  • SupreethRao99
  • supreeth-rao

AI & ML interests

Reinforcement Learning, Large Language Models, Distributed Computing

Recent Activity

updated a collection 3 days ago
SearchLM

Organizations

None yet

Supreeth's collections (1)

SearchLM
NL2BM25: teaching Qwen2.5-3B to generate Tantivy boolean queries via SFT + GRPO. Covers reward hacking (GRPO v1) and the shaped-reward fix (GRPO v2).
  • Supreeth/searchlm-nl2bm25-sft

    Text Generation • 3B • Updated 3 days ago • 62
  • Supreeth/searchlm-nl2bm25-sft-v2

    Text Generation • 3B • Updated 3 days ago • 47
  • Supreeth/searchlm-nl2bm25-grpo

    Text Generation • 3B • Updated 3 days ago • 49
  • Supreeth/searchlm-nl2bm25-grpo-v2

    Text Generation • 3B • Updated 3 days ago • 48