About the role:
As a Software Engineer on the Horizons team, you will collaborate with a diverse group of researchers and engineers to advance the capabilities and safety of large language models. The work spans fundamental research in reinforcement learning, creating 'agentic' models that use tools for open-ended tasks such as computer use, improving reasoning abilities in areas such as code generation and mathematics, and developing prototypes for internal use, productivity, and evaluation.
Representative projects:
- Design and maintain high-performance data pipelines for processing large-scale code datasets, and implement secure sandboxed execution environments.
- Architect and optimize core reinforcement learning infrastructure, from clean training abstractions to distributed experiment management across GPU clusters. Help scale our systems to handle increasingly complex research workflows.
- Build intuitive developer tools and dashboards for analyzing ML experiments, including real-time visualizations, interactive debugging interfaces, and efficient metrics collection systems that help researchers understand model behavior.
- Drive performance improvements across our stack through profiling, optimization, and benchmarking. Implement efficient caching solutions and debug distributed systems to accelerate both training and evaluation workflows.
- Collaborate across research and engineering teams to develop automated testing frameworks, design clean APIs, and build scalable infrastructure that accelerates AI research.
You may be a good fit if you:
- Have 5+ years of industry-related experience
- Are proficient in Python and async/concurrent programming with frameworks like Trio
- Have a strong software engineering background and are interested in working closely with researchers and other engineers
- Enjoy pair programming (we love to pair!)
- Care about code quality, testing, and performance
- Have strong systems design and communication skills
- Are passionate about the potential impact of AI and are committed to developing safe and beneficial systems
Strong candidates may also:
- Have experience with virtualization and sandboxed code execution environments
- Have experience with Kubernetes
- Have contributed to major open-source projects
- Have experience with distributed systems or high-performance computing
- Be familiar with a diverse set of technologies and programming language ecosystems
Candidates need not have:
- Formal certifications or education credentials
- Prior experience with LLMs, reinforcement learning, or machine learning research
Deadline to apply: None. Applications will be reviewed on a rolling basis.