Intrinsic Mutual Information-Regulated Preference Optimization: A New Paradigm for LLM Alignment
A recent arXiv paper proposes using Intrinsic Mutual Information (IMI) as a regulator for preference optimization, aimin…