Principal Engineer

CoreWeave • New York City, New York, United States • 1w ago

CoreWeave is the AI Hyperscaler™, delivering a cloud platform of cutting-edge services powering the next wave of AI. The company’s technology provides enterprises and leading AI labs with the most performant, efficient, and resilient solutions for accelerated computing. Since 2017, CoreWeave has operated a growing footprint of data centers covering every region of the US and across Europe. CoreWeave was ranked as one of the TIME100 most influential companies of 2024.

As the leader in the industry, we thrive in an environment where adaptability and resilience are key. Our culture offers career-defining opportunities for those who excel amid change and challenge. If you’re someone who thrives in a dynamic environment, enjoys solving complex problems, and is eager to make a significant impact, CoreWeave is the place for you. Join us, and be part of a team solving some of the most exciting challenges in the industry.

As a Principal Engineer at CoreWeave, you will be responsible for leading technical strategy, architectural decisions, and the development of advanced features that power our GPU-accelerated cloud services. You will work closely with senior leadership, engineering teams, and product teams to define and execute on the vision for CoreWeave's next-generation infrastructure. This is an opportunity to have a direct and significant impact on the architecture of a high-growth, cutting-edge cloud platform used by leading companies in machine learning, VFX, and other compute-intensive industries.

What You'll Do:

Architect and Design: Lead the design and architecture of scalable, GPU-accelerated cloud solutions, ensuring they are highly performant, secure, and cost-efficient for customers. Work with cross-functional teams to translate business requirements into technical solutions.
Innovate and Optimize: Drive technical innovation by identifying and implementing new technologies and methodologies that will enhance CoreWeave’s platform. Continuously optimize performance, cost, and scalability of the infrastructure to deliver the best possible customer experience.
Technical Leadership: Provide mentorship and guidance to engineering teams, setting high standards for code quality, system design, and operational excellence. Lead code reviews, foster collaboration, and encourage best practices across teams.
Cross-Team Collaboration: Work closely with Product, Operations, and Data Science teams to ensure seamless integration of new features and services into the CoreWeave platform. Contribute to strategic decisions that influence the company's technical direction.
Solve Complex Problems: Troubleshoot and solve complex technical challenges related to distributed systems, cloud infrastructure, GPU workloads, and Kubernetes. Build solutions to scale our platform in both performance and cost-efficiency.
Thought Leadership: Stay ahead of industry trends in cloud computing, GPU acceleration, Kubernetes, and other relevant technologies. Represent CoreWeave as a thought leader at conferences, industry events, and in technical communities.

Who You Are:

Experience: 10+ years of experience in software engineering, with at least 5 years in leadership roles. Extensive experience in cloud computing, distributed systems, and high-performance workloads.
Strong Technical Background: Expertise in designing and building large-scale distributed systems, with deep knowledge of Kubernetes, containerization, and orchestration for GPU-accelerated workloads.
Cloud Infrastructure: Solid experience with cloud infrastructure (AWS, GCP, Azure) and hands-on knowledge of GPU-based workloads, including machine learning, rendering, and batch processing.
Programming Skills: Strong proficiency in languages such as Go, Python, C++, or similar. Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) is a plus.
Architecture & Scalability: Deep understanding of cloud-native architecture, performance optimization, and the challenges of scaling compute-intensive applications. Experience in building infrastructure that is both performant and cost-effective at scale.
Leadership & Mentorship: Proven ability to mentor and lead technical teams, drive complex technical projects, and make high-level architectural decisions that balance short-term needs with long-term scalability.
Problem Solving & Innovation: A track record of solving complex, high-impact engineering problems and driving innovation in cloud infrastructure or other relevant technologies.
Communication: Strong communication skills with the ability to engage stakeholders at all levels of the organization. Experience presenting technical ideas to non-technical audiences.

Nice to Have:

Experience working with or contributing to the Kubernetes ecosystem, including custom controllers, Helm charts, and operator development.
Background in machine learning or similar GPU-intensive applications.
Familiarity with cloud cost optimization, performance tuning, and multi-cloud strategies.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $275,000 to $330,000. Pay is based on a number of factors, including market location, and may vary depending on job-related knowledge, skills, and experience.