Teaching Claude Why: Anthropic's Alignment Shift
Anthropic adopts a new alignment philosophy for Claude, focusing on teaching the AI 'why' behind rules rather than just …
11 articles about 'constitutional ai'
Anthropic adopts a new alignment philosophy for Claude, focusing on teaching the AI 'why' behind rules rather than just …
Carnegie Mellon researchers combine constitutional principles with reinforcement learning to create AI agents that auton…
Anthropic publishes new research advancing Constitutional AI methods for aligning reasoning models, setting a new standa…
Anthropic publishes new Constitutional AI 2.0 paper advancing scalable oversight methods for safer, more aligned AI syst…
Anthropic launches Constitutional AI 2.0, a major upgrade to its AI safety alignment framework with new oversight mechan…
Anthropic publishes major research on Constitutional AI v3, introducing dynamic principle hierarchies and adaptive safet…
OpenAI researchers introduce a new alignment framework challenging Anthropic's Constitutional AI approach with rule-base…
MIT researchers introduce a novel alignment framework that builds on Anthropic's Constitutional AI to improve safety in …
OpenAI has published new research on constitutional AI training, a safety approach pioneered by rival Anthropic, signali…
Anthropic unveils Constitutional AI 2.0, a major upgrade to its safety framework designed to give enterprises customizab…
New research shows Constitutional AI training methods dramatically reduce toxic and harmful outputs from large language …