Google, xAI, Microsoft Agree to US AI Safety Reviews
Three major AI companies voluntarily commit to federal safety evaluations of their most powerful new AI models before pu…
122 articles about 'AI safety'
Three major AI companies voluntarily commit to federal safety evaluations of their most powerful new AI models before pu…
MIT researchers introduce a novel alignment framework that builds on Anthropic's Constitutional AI to improve safety in …
Former Google AI ethics leader argues voluntary industry commitments on AI safety are inadequate, calling for binding re…
Ilya Sutskever's Safe Superintelligence releases its debut technical paper, offering the first glimpse into the secretiv…
Anthropic CEO Dario Amodei forecasts AI systems will match or exceed human-level expertise across most domains by 2027 o…
The UK AI Safety Institute announces a landmark partnership with Anthropic to conduct pre-deployment evaluations of fron…
Anthropic CEO Dario Amodei calls on AI industry leaders to unite on safety standards before advanced systems outpace cur…
OpenAI has published new research on constitutional AI training, a safety approach pioneered by rival Anthropic, signali…
A new paper shows AI models can become 'addicted' to specially crafted images, preferring them over news of humanity cur…
Anthropic researchers reveal internal decision pathways in Claude, marking a major step in AI interpretability and safet…
Jack Clark estimates 60% odds that recursive AI improvement arrives by end of 2028, potentially outpacing human supervis…
Professor Hannah Fry's experiment with an autonomous AI agent reveals alarming risks of agentic AI when given real-world…