About the Internship
As a Research Intern at Snorkel AI, you’ll contribute to internal research and academic collaborations—helping explore and validate new ideas that may shape future publications, open-source artifacts, and long-term product directions. This is a research-first role, designed for interns who want to do deep technical work with real-world relevance.
You’ll work closely with Snorkel researchers on open-ended projects and produce clear research outputs (experiments, prototypes, internal writeups, and potentially publications depending on project fit and timing).
What You’ll Do
- Develop and evaluate new methods for data development for foundation models and enterprise AI systems (e.g., dataset construction, augmentation, synthetic data, and evaluation).
- Investigate supervision and evaluation techniques, such as rubrics and verifiable rewards.
- Design experiments and run rigorous empirical studies (ablations, benchmarks, error analysis).
- Build lightweight research prototypes and tooling in Python to support internal studies.
- Collaborate with academic partners and internal research teams—reading papers, proposing hypotheses, and iterating quickly.
Example Project Areas
Projects vary by mentor and collaboration needs, but may include:
- Synthetic data generation and filtering for specialized tasks
- Evaluation datasets and benchmarks for LLM / RAG / agent behavior
- Data-centric methods for improving reliability, calibration, and failure-mode coverage
- Evaluation of human-in-the-loop (HITL) data annotation processes, including their gaps and potential improvements
What We’re Looking For
- Currently enrolled in a PhD program in ML, AI, or CS.
- Strong ML fundamentals and demonstrated research ability (papers, preprints, or substantial research artifacts).
- Excellent Python skills and experience with modern ML tooling (PyTorch, NumPy, etc.).
- Ability to operate in ambiguous, research-driven problem spaces with strong experimentation habits.
- Clear communication: ability to write concise technical notes and present findings.
Nice to have
- Prior work on evaluation, data curation, synthetic data, weak supervision, NLP, or multimodal ML.
- Experience collaborating with academic labs or participating in research programs.
Internship Details
- Duration: Summer (flexible start/end)
- Location: Hybrid (Redwood City/SF) or Remote (US)
- Compensation: Competitive, commensurate with experience
Why Snorkel Research
- Work with a research org advancing data-centric AI and foundation-model development, in close partnership with labs and enterprises.
- Join a company with deep research roots and a substantial body of peer-reviewed work in this space.
- Get mentorship and ownership on projects that can influence long-term research direction.