What Matters For Safety Alignment?
Xing Li, Hui-Ling Zhen, Lihao Yin, Xianzhi Yu, Zhenhua Dong, Mingxuan Yuan · Jan 7, 2026
Citations: 0
Tags: Red Team, Automatic Metrics, Tool Use, General
- This paper presents a comprehensive empirical study of safety alignment capabilities in large language models (LLMs) and large reasoning models (LRMs).
- We evaluate which factors matter for safety alignment in LLMs and LRMs, providing essential insights for developing more secure and reliable AI systems.