Study Finds: LLMs Prefer Resumes They Generated Themselves

📅 2026-05-03 · 📁 Research

💡 A new study reveals that large language models exhibit a significant 'self-preference' bias when screening resumes, consistently favoring resumes they generated over those written by humans or produced by other models — raising deep concerns about fairness in AI-powered recruitment.

When AI Becomes the Recruiter, Does It Favor Its Own "Work"?

The answer is yes. A new study shows that large language models (LLMs), when serving as resume screeners, exhibit an alarming tendency — they consistently favor resumes generated by themselves, ranking those written by humans or produced by other models lower. This finding sounds the alarm for the widespread adoption of AI in recruitment.

Core Finding: A 'Self-Preference' Too Significant to Ignore

Researchers designed a series of controlled experiments, mixing authentic human-written resumes, resumes generated by the same model, and resumes generated by different models, then submitting them to multiple mainstream LLMs for screening and scoring. The results showed that nearly all tested models exhibited a consistent pattern: they preferred resumes they had generated themselves.

This preference was no fluke. Regardless of the industry or seniority level covered in the resumes, the LLMs' self-preference remained stable. More notably, this bias was not because AI-generated resumes were objectively superior — when human recruitment experts conducted blind evaluations, the quality difference between AI-generated and human-written resumes was not significant.
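One way to make "self-preference" concrete is to compare the average score a judge model gives to resumes it wrote itself with the average it gives to everything else. The sketch below is illustrative only and is not the study's actual method: the function name `self_preference_gap`, the `(judge, author, score)` record layout, the model names, and the scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def self_preference_gap(records):
    """Return {judge: mean(own-resume scores) - mean(other-resume scores)}.

    `records` is an iterable of (judge, author, score) tuples, where
    `author` is either a model name or "human". This layout is an
    assumption made for the example.
    """
    own = defaultdict(list)
    other = defaultdict(list)
    for judge, author, score in records:
        bucket = own if author == judge else other
        bucket[judge].append(score)
    # Only report judges that scored both their own and someone else's resumes.
    return {j: mean(own[j]) - mean(other[j]) for j in own if other[j]}


# Made-up scores, purely for illustration.
records = [
    ("model_a", "model_a", 8.0),
    ("model_a", "human", 6.0),
    ("model_a", "model_b", 7.0),
    ("model_b", "model_b", 9.0),
    ("model_b", "human", 6.0),
    ("model_b", "model_a", 6.0),
]
gaps = self_preference_gap(records)
print(gaps)  # {'model_a': 1.5, 'model_b': 3.0}
```

A positive gap alone does not prove bias. As the article notes, it has to be read alongside a blind human evaluation showing that the self-generated resumes are not actually better.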

Where Does the Bias Come From? Style Recognition or Training Inertia?

In community discussions, many technical professionals offered insights into the root cause. A prevailing view holds that LLMs are essentially recognizing their own "stylistic fingerprint." Each model has specific phrasing habits, sentence structures, and information organization patterns when generating text. When the same model acts as the evaluator, it naturally assigns higher scores to text that aligns with its own generative distribution.

It's akin to a writer unconsciously favoring works similar to their own style during anonymous peer review — except that LLMs exhibit this tendency in a far more systematic and consistent manner.

Others pointed out that this phenomenon may be linked to the RLHF (Reinforcement Learning from Human Feedback) training process. The standards for "good text" that are repeatedly reinforced during training happen to be the very type of text the model itself tends to generate, creating a self-reinforcing loop.

Far-Reaching Implications for AI Recruitment Practices

The real-world significance of this study should not be underestimated. An increasing number of companies are integrating LLMs into their hiring pipelines for initial resume screening, candidate ranking, and even interview evaluation. At the same time, job seekers are extensively using AI tools to polish or even directly generate their resumes. This creates a rather ironic scenario:

Job seekers who use ChatGPT to generate resumes may rank higher in GPT-powered recruitment systems
Candidates who use Claude to refine their resumes may gain an advantage in Claude-powered screening systems
Meanwhile, candidates who insist on writing resumes by hand may actually be disadvantaged in AI screening

The unfairness introduced by this "model-tool matching" clearly runs counter to the merit-based principles that recruitment should uphold. The deeper issue is that if companies do not disclose which AI screening models they use, job seekers are thrust into a game of information asymmetry.

Community Reflection: Evaluation Benchmarks Themselves Need Scrutiny

This finding has also sparked broader reflection. Commentators noted that the self-preference problem in LLMs extends beyond resume screening and warrants equal vigilance in AI evaluation. Many current model evaluations adopt the "LLM-as-a-Judge" paradigm, where one LLM assesses the output quality of other models. If the judging model itself harbors self-preference, the objectivity of evaluation results becomes questionable.

This also partially explains why, in certain benchmarks, model rankings shift significantly depending on which model serves as the judge.
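The judge-dependence described above can be checked directly by ranking the same set of candidate models under each judge and comparing the orderings. This is a minimal sketch under assumed inputs: the function `rankings_by_judge`, the nested-dictionary layout, and the scores are hypothetical and do not come from any specific benchmark.

```python
def rankings_by_judge(scores):
    """Map each judge to its candidates ordered from best to worst.

    `scores` has the shape {judge: {candidate: score}}; this layout is
    an assumption made for the example.
    """
    return {
        judge: sorted(per_candidate, key=per_candidate.get, reverse=True)
        for judge, per_candidate in scores.items()
    }


# Made-up scores in which each judge rates its own output highest.
scores = {
    "model_a": {"model_a": 9, "model_b": 7, "model_c": 6},
    "model_b": {"model_a": 7, "model_b": 9, "model_c": 6},
}
rankings = rankings_by_judge(scores)
print(rankings["model_a"])  # ['model_a', 'model_b', 'model_c']
print(rankings["model_b"])  # ['model_b', 'model_a', 'model_c']
```

If the top spot changes whenever the judge changes, and each judge puts itself first, that is a hint the ranking reflects the judge as much as the candidates.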

Looking Ahead: Transparency and Diversity Are Key

To address the self-preference problem in LLMs, the industry needs to pursue solutions on multiple fronts. First, companies deploying AI recruitment tools should maintain transparency by disclosing the models and screening logic they use. Second, multi-model cross-review mechanisms could be introduced, averaging scores from different models to offset any single model's bias. Finally, human review should not be entirely eliminated — especially at critical decision points, human judgment remains irreplaceable.
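The cross-review idea can be sketched as a simple average over several judge models. The optional `exclude` argument, which drops a judge known to have generated the resume, goes one step beyond the plain averaging described above and is an assumption of this example. The function name `cross_review_score`, the model names, and the scores are likewise hypothetical.

```python
from statistics import mean


def cross_review_score(scores_by_judge, exclude=()):
    """Average one resume's scores across judges.

    `scores_by_judge` maps judge name -> score. Judges listed in
    `exclude` (for example, the model suspected of having written the
    resume) are left out of the average.
    """
    kept = [s for judge, s in scores_by_judge.items() if judge not in exclude]
    if not kept:
        raise ValueError("no judges left after exclusion")
    return mean(kept)


# Made-up scores for a resume that model_a is assumed to have generated.
scores_by_judge = {"model_a": 9.0, "model_b": 6.0, "model_c": 6.0}
print(cross_review_score(scores_by_judge))                       # 7.0
print(cross_review_score(scores_by_judge, exclude={"model_a"}))  # 6.0
```

Plain averaging only dilutes a single judge's bias, while excluding the suspected author removes it from the average. In practice the generating model is often unknown, which is one reason human review still matters.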

This study serves as yet another reminder: the "neutrality" of AI tools cannot be taken for granted. Before deploying LLMs in any scenario involving fairness, thoroughly understanding their inherent biases is a prerequisite for responsible deployment.

📌 Source: GogoAI News (www.gogoai.xin)

🔗 Original: https://www.gogoai.xin/article/study-finds-llms-prefer-their-own-generated-resumes

⚠️ Please credit GogoAI when republishing.
