Falcon-1B-RW-RLLMv3
Collection
This collection presents the 10 RLLM steps (training runs) intended to improve Falcon-RW-1B's outputs toward coherence and politeness.
10 items
Companion post: Research Log, RLLMv3 (GPT2-XL, Phi-1.5 and Falcon-RW-1B)
Main post: BetterDAN, AI Machiavelli & Oppo Jailbreaks vs. SOTA models & GPT2XL_RLLMv3
Related post: Coherence (and Response Time) Test