Staff Software Engineer, Speculative Decoding
Mission: Drawing on a strong background in generative AI inference and expertise in speculative decoding, you will design, implement, and optimize cutting-edge algorithms to enhance our production AI infrastructure and capabilities in post-training, model evaluation, and operational performance.
Responsibilities & outcomes:
- Design, implement, and optimize speculative decoding algorithms and underlying models that enhance the speed and accuracy of Generative AI Inference.
- Collaborate with cross-functional teams to integrate your solutions into Groq’s production AI infrastructure.
- Work in a multi-data-center production Kubernetes environment with Groq’s customer hardware, inference, and compiler stack.
- Develop high-performance, scalable code primarily in C++ and Rust, ensuring efficient resource utilization and system stability, and model the performance of distributed, high-performance systems.
- Build production distributed systems that involve multi-process communication (with technologies such as MPI) and scheduling, working in a Kubernetes environment.
- Stay up-to-date with the latest developments in generative AI and speculative decoding, and translate cutting-edge research into practical, production-ready implementations.
- Work closely with teams across software engineering, research, and operations to drive improvements in post-training, model evaluation, and overall system performance.
- Provide technical leadership and mentorship to team members, fostering an environment of continuous learning and innovation.
- Champion code quality, maintainability, observability, monitoring and best practices, ensuring that all deliverables meet rigorous performance and security standards.
Ideal candidates have/are:
- Master’s degree in Computer Science, Electrical Engineering, or a related field (or equivalent industry experience).
- Extensive, hands-on experience in generative AI inference with a specific focus on speculative decoding.
- Proficiency in C++ is essential, with demonstrated experience in developing high-performance systems.
- Strong analytical and problem-solving skills, with a track record of delivering innovative technical solutions.
- Proven ability to work effectively in fast-paced, cross-functional environments, driving projects from conception to production.
- Understanding of generative AI model architectures and PyTorch, and familiarity with the data science needed to evaluate model layers, their performance, and their quality.
- Familiarity with AI infrastructure challenges and scalable system design.
Attributes of a Groqster:
- Humility - Egos are checked at the door
- Collaborative & Team Savvy - We make up the smartest person in the room, together
- Growth & Giver Mindset - Learn it all versus know it all, we share knowledge generously
- Curious & Innovative - Take a creative approach to projects, problems, and design
- Passion, Grit, & Boldness - No-limit thinking, fueling informed risk-taking
If this sounds like you, we’d love to hear from you!
Compensation: At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits. For this role, the base salary range is $175,900 to $307,800, determined by your skills, qualifications, experience and internal benchmarks.