New Research Reveals How to Eliminate 'Sandbagging' Behavior in Large Language Models
A recent arXiv paper investigates the 'sandbagging effect', in which large language models deliberately underperform under w…