Phi-1.5-RLLMv3
Collection
This collection presents the ten RLLM steps (training runs) intended to steer Phi-1.5's outputs towards coherence and politeness.
10 items
Companion post: Research Log, RLLMv3 (GPT2-XL, Phi-1.5 and Falcon-RW-1B)
Main post: BetterDAN, AI Machiavelli & Oppo Jailbreaks vs. SOTA models & GPT2XL_RLLMv3
Related post: Coherence (and Response Time) Test