Intrinsic Mutual Information-Regulated Preference Optimization: A New Paradigm for LLM Alignment
A recent arXiv paper proposes using Intrinsic Mutual Information (IMI) as a regulator for preference optimization, aimin…
3 articles about 'LLM Alignment'
A recent arXiv paper proposes using Intrinsic Mutual Information (IMI) as a regulator for preference optimization, aimin…
Researchers propose the KARL framework, a knowledge-boundary-aware reinforcement learning approach that enables large la…
A recent arXiv paper investigates the 'sandbagging effect', in which large language models deliberately underperform under w…