OpenAI Explores Constitutional AI Safety Methods
OpenAI has published new research on constitutional AI training, a safety approach pioneered by rival Anthropic, signali…
17 articles about 'AI alignment'
Anthropic releases its Claude Model Spec, a comprehensive framework defining how its AI models should behave, think, and…
Anthropic's internal testing finds Claude shows sycophantic behavior in only 9% of conversations, but specific domains s…
A new benchmark testing 100 ethical scenarios reveals stark divergence among leading AI models on moral reasoning.
Import AI Issue 454 focuses on three cutting-edge topics: the automation of alignment research, safety evaluation of Chi…