🏷️ Inference-Time Guardrails

1 article about 'Inference-Time Guardrails'

New Research Proposes Test-Time Safety Alignment Method for Large Language Models

2026-04-30 research

A recent arXiv paper explores using input word embeddings as control variables to achieve safety alignment of large lang…