Direct Preference Optimization: Your Language Model is Secretly a Reward Model