llm-agents/tora-code-7b-v1.0
Text Generation
• Updated • 509
• 18
Text Generation
• Updated • 524
• 8
llm-agents/tora-code-13b-v1.0
Text Generation
• Updated • 494
• 15
Text Generation
• Updated • 493
• 6
llm-agents/tora-code-34b-v1.0
Text Generation
• Updated • 521
• 14
Text Generation
• Updated • 508
• 21
shenzj/LLM-Agent-Recommendation
Updated
rl-llm-agent/Llama-3.1-8B-Instruct-sft-alfworld-iter0
Text Generation
• 8B • Updated • 6
rl-llm-agent/Llama-3.2-3B-Instruct-sft-alfworld-iter0
Text Generation
• 3B • Updated • 16
rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter0
rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter1
Text Generation
• 3B • Updated • 5
rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iter2
rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iqlearn-iter0
rl-llm-agent/Llama-3.2-3B-Instruct-value-alfworld-8b-sft
rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-alfworld-iqlearn-iter0
rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-shaped-iter0
rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iqlearn-iter1
rl-llm-agent/Llama-3.2-3B-Instruct-reward-alfworld-iter2-70k
rl-llm-agent/Llama-3.2-3B-Instruct-online-dpo-exploration-aflworld-iter0-checkpoint-50
rl-llm-agent/Llama-3.2-3B-Instruct-sft-alfworld-leap-iter1
Text Generation
• 3B • Updated • 5
zeeshanp/llm_agents_final_proj
Updated
tensorblock/llm-agents_tora-code-7b-v1.0-GGUF
Text Generation
• 7B • Updated • 28
LLM4RAM/ppo-driving-agent
Updated
shivank21/diag_agent_deepseek-llm-7b-base
Text Generation
• 250k • Updated • 1
shivank21/diag_agent_deepseek-llm-7b-3500
7B • Updated • 3
rshn-krn/hotpotqa-agent-sft-llm
Text Generation
• 3B • Updated • 1