Post-Sales Machine Learning Engineer

Lambda • Full-time • Remote (San Francisco, CA) • 3w ago

*Note: This position requires presence in our San Francisco office 4 days per week; Lambda’s designated work-from-home day is currently Tuesday.

What You’ll Do

Guide new customers through the technical onboarding process by:
- Assisting ML researchers in migrating their existing workloads to Lambda’s AI Cloud Platform, ensuring that expected performance is achieved
- Providing initial troubleshooting for technical issues that arise during the first few days of customers’ time on Lambda infrastructure
Collaborate closely with customers to understand their needs and objectives, and offer tailored guidance and best practices for deploying models and managing GPU infrastructure
Demonstrate how to optimize and scale training and inference workloads within Lambda by:
- Building proof-of-concept demos
- Creating detailed architecture diagrams
Create and maintain detailed documentation including technical guides, best practices and troubleshooting resources
Conduct training sessions and workshops for customers, enabling them to effectively utilize Lambda’s products and services
Facilitate smooth workload transitions between Lambda’s various products
Drive customer growth by identifying opportunities to increase product adoption
Act as a trusted advisor to new customers, ensuring successful integration and optimization of Lambda products
Provide continuous customer feedback to influence product roadmap and enhancements
Serve as a link between customers and internal teams

You

Have experience in machine learning or data science with a deep understanding of model development and deployment
Have experience using deep learning frameworks and libraries such as PyTorch, TensorFlow, and DeepSpeed
Have experience with containerization technologies such as Docker and Kubernetes
Have experience building and optimizing LLM-based applications
Have experience building end-to-end ML pipelines on major cloud platforms
Have experience with Linux systems administration
Are an excellent communicator, capable of explaining complex, technical concepts to technical and non-technical audiences
Are customer-obsessed and strive to deliver exceptional experiences to current and future Lambda customers
Have experience as an ML educator and/or building and delivering customer training sessions, product demos, or workshops

Nice to Have

Experience using MLOps tools such as Run:ai, Weights & Biases, and ClearML
Experience in training large models using distributed systems, including:
- Selecting parallelism strategies
- Multi-GPU and Multi-Node training
- Troubleshooting and configuring NCCL/RDMA
- Quantization
Experience with HPC orchestration technologies such as SLURM
Experience with automation tools like Ansible, Puppet, Salt

Salary Range Information

Based on market data and other factors, the salary range for this position is $144,000 - $210,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.