Anthropic's Responsible Scaling Policy
Last summer we published our first Responsible Scaling Policy (RSP), which focuses on addressing catastrophic safety failures and misuse. In adopting such a policy, our primary goal has been to help turn high-level safety concepts into practical policies for fast-moving technical organizations and demonstrate the viability of these measures as possible standards.
Our Responsible Scaling Policy has been a powerful rallying point: over the last six months, many teams' work has connected directly back to major RSP work streams. The progress we have made has required significant work from teams across Anthropic, and there is much more work to be done. Our new Responsible Scaling Team will:
- Help leadership align on a practical approach to scaling responsibly that will raise the safety waterline in industry, inform regulation, and mitigate catastrophic risks from models
- Rally teams internally to operationalize and implement this technical roadmap and set of high-level commitments, making object-level decisions as needed
- Iterate internally on different approaches to safety challenges, feeding these learnings back into the high-level policy, and sharing our learnings with industry and policymakers
As we continue to iterate on and improve the original policy, we are actively exploring ways to incorporate practices from existing risk management and operational safety domains. While none of these domains alone will be perfectly analogous, we expect to find valuable insights from nuclear security, biosecurity, systems safety, autonomous vehicles, aerospace, and cybersecurity. We intend to build an interdisciplinary team to help us integrate the most relevant and valuable practices from each.
Note: For this role, we are looking for candidates who can start within 3 months. We will consider all candidates who can meet the organization's hybrid policy and who have significant (60%+) overlap with Pacific Time.
About the Role
The Safety Systems Engineer will lead critical technical safety processes at Anthropic by conducting systematic threat modeling and risk assessments across our AI systems. Key deliverables include: performing technical safety reviews of new AI capabilities to identify potential risks before deployment, developing and maintaining a comprehensive AI safety risk register with clear ownership and mitigation strategies, conducting structured assessments to inform the prioritization and design of technical safeguards, and establishing robust validation frameworks to verify safeguard effectiveness.
The role will coordinate closely with Frontier Risk, Trust & Safety, and AI development teams to ensure safety considerations are integrated into technical roadmaps and drive concrete mitigation decisions. This position fills a crucial gap between risk discovery and practical implementation by providing rigorous technical analysis to inform both internal safety decisions and external policy discussions. The engineer will also support the RSP program team in establishing scalable technical processes for safety, including safety reviews, risk monitoring, and incident response as our AI capabilities advance, ensuring we maintain strong safety standards and understand operational risks while meeting development timelines.
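To make the risk-register deliverable concrete, here is a minimal sketch of what one entry and a simple prioritization pass might look like. Everything here is illustrative: the names (`RiskRegisterEntry`, `Severity`, `priority_score`), the example risks, and the severity-times-likelihood heuristic are assumptions for demonstration, not Anthropic's actual register schema or scoring method.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class RiskRegisterEntry:
    """One entry in a hypothetical AI safety risk register."""
    risk_id: str        # stable identifier for tracking the risk over time
    description: str    # what could go wrong
    severity: Severity  # assessed impact if the risk is realized
    likelihood: float   # estimated probability in [0, 1]
    owner: str          # team accountable for mitigation
    mitigations: list = field(default_factory=list)  # planned or active safeguards

    def priority_score(self) -> float:
        # Toy expected-impact heuristic: severity weight times likelihood.
        return self.severity.value * self.likelihood


# Example: register two hypothetical risks and rank them by priority.
register = [
    RiskRegisterEntry("R-001", "Model assists with a disallowed capability",
                      Severity.HIGH, 0.2, "Trust & Safety",
                      ["refusal training", "usage monitoring"]),
    RiskRegisterEntry("R-002", "Evaluation gap for a newly deployed capability",
                      Severity.CRITICAL, 0.1, "Frontier Risk"),
]
register.sort(key=RiskRegisterEntry.priority_score, reverse=True)
```

The point of the sketch is the "clear ownership and mitigation strategies" requirement from the role description: every risk carries an owner and a mitigation list, and a transparent scoring rule makes prioritization auditable rather than ad hoc.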
Responsibilities:
- Design a practical approach to scaling responsibly that will raise the safety waterline in industry, inform regulation, and mitigate catastrophic risks from models.
- Align leadership and other stakeholder groups on the overall policy and key safety decisions by making recommendations, synthesizing inputs, and pragmatically balancing competing considerations.
- Work with cross-functional teams to align their technical roadmaps with the RSP, and provide clarity by making or delegating object-level decisions as needed.
- Ensure that any safety cases are robust and have been appropriately stress-tested by the time they are presented to the board.
- Work closely with the TPM team to ensure that teams have the necessary resources and information to meet RSP objectives on commercially relevant timeframes.
- Feed the implications of technical challenges from safety testing and mitigations back into the high-level policy, and share our learnings with industry and policymakers.
You may be a good fit if you have:
- Deep expertise in applied system safety and system engineering practices, with a proven track record of applying these to complex systems with AI/ML components.
- Demonstrated ability to develop and implement scalable risk assurance (continuous validation) methods and robust testing frameworks for large, complex systems with AI/ML components
- Strong technical integrity and ethical leadership, with a history of making and defending difficult safety decisions and/or recommendations based on technical expertise
- Ability to balance idealism with practicality in decision-making; capacity to make sound decisions under time pressure or with incomplete information; talent for assessing the real-world feasibility of proposed solutions
- History of quickly mastering complex technical domains, even those outside your primary area of expertise
- Proven ability to communicate complex technical safety considerations to both expert and non-expert stakeholders
- Applied skills in Python and SQL, and the ability to conduct data analysis
Strong candidates may also have:
- Experience with resilience engineering practices and tools
- Extensive knowledge of AI/ML architectures, training processes, and failure modes specific to AI systems
- Familiarity with evolving AI regulations and standards
- Experience in risk management and safety best practices in complex technical environments
- Advanced statistical analysis skills