About the role
We’re seeking exceptionally talented research engineers to build generative models that can accurately simulate physical and virtual worlds. You’ll play a large role on a small team pushing the boundaries of what is possible with cutting-edge multimodal language models, with a particular focus on audio and visual data.
- Develop and train multimodal transformers at massive scale.
- Build infrastructure for large-scale video data pipelines, curation, and annotation.
- Develop and implement quantitative evaluations for world simulation accuracy and intelligence.
- Optimize inference and distill models for real-time generation.
- Develop and implement strategies for ultra-long-context transformers.
- Maintain a deep focus on building a lovable user experience everywhere from data curation to inference.
Additionally, experience in any of the following may set you apart:
- A proven track record of training SOTA large foundation models end to end, alone or as part of a small team
- Designing and implementing novel, scalable architectures to solve challenging problems
- Shipping and scaling large-scale systems to large user bases in production environments
- A track record of releases, publications, and/or open-source projects related to video generation, world models, multimodal language models, or transformer architectures
- Strong systems and engineering skills in deep learning frameworks like JAX or PyTorch
Who You Are
- Technically Exceptional: Engineering is your craft and you operate at the very highest level regardless of the stack.
- Impatient: You’re excited about the mission of your work, and treat every project like you wish it had been completed yesterday.
- Perfectionist: You take pride in crafting the most delightful experiences for customers, and are determined to polish every last splinter from the product.
- High Agency: You don’t wait for direction, and are capable of completing large projects end to end on your own.
- Product Minded: You are excited to research and build models in order to deliver exceptional, newly enabled product experiences to customers at scale.
Location
We work out of xAI’s Palo Alto office and prefer that candidates are able to relocate to the Bay Area if necessary.
Interview Process
After you submit your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 15-minute interview (“phone interview”) during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of 2 technical interviews, 1 project deep-dive interview, and a meet and greet with the wider team:
- Practical coding assessment in a language of your choice.
- Systems design hands-on: Demonstrate practical skills in a live problem-solving session.
- Project deep-dive: Present and answer questions about exceptional work that you’ve done.
- Meet and greet with the wider team.
Our goal is to finish the main process within one week. Final interviews will be conducted in person.
Annual Salary Range
$180,000 - $440,000 USD