AgentSearchBench: The First AI Agent Search Benchmark Arrives
A research team has released the AgentSearchBench benchmark, designed to address the challenge of finding the right AI a…
Latest articles in Research
Researchers introduce the concept of 'background temperature,' formally quantifying the phenomenon where large language …
Researchers model LLM self-correction as a cybernetic feedback loop, proposing a concise diagnostic criterion based on M…
A research team introduces the Memanto framework, which solves memory bottleneck problems for long-horizon AI agents in …
A research team has released MolClaw, an autonomous intelligent agent that integrates over 30 specialized tools. Through…
As AI-automated research pipelines generate a growing volume of academic output, researchers propose a dual-layer certif…
A recent arXiv paper proposes the 'Math Takes Two' testing framework, which examines whether language models possess gen…
A research team developed an agentic reproduction system that automatically extracts structured methods, writes code, an…
Researchers have proposed an artifact-based agent framework designed to address the challenges of adaptability and repro…
Researchers successfully trained an mRNA language model covering 25 species at a computational cost of just $165, openin…
Google has launched Deep Research Max, a major upgrade built on the Gemini Deep Research Agent architecture, marking a s…
A survey of 158 professional software engineers by researcher Annie Vella reveals that AI tools are shifting engineers' …