YOUR MISSION:
As an Engineering Manager for Site Reliability Engineering & Developer Experience (f/m/d) at Parloa, you will nurture and support a collaborative team that ensures the reliability, scalability, and performance of our products while empowering engineers with thoughtful tools and workflows. Your mission is to cultivate and grow SRE practices that enable our architectural transformation and ensure our systems meet their availability targets. You'll foster a caring culture where reliability is a shared responsibility and where automation reduces toil, freeing engineers to focus on meaningful work.
IN THIS ROLE YOU WILL:
- Build and nurture a supportive team that harmonizes SRE excellence with developer experience
- Collaborate to establish SRE practices: SLIs/SLOs, error budgets, and trust-based postmortems using Datadog metrics
- Create comprehensive observability strategies leveraging our monitoring stack
- Support sustainable incident response, on-call processes, and automation using GitHub Actions and Terraform to improve MTTR
- Partner with engineering teams to integrate reliability practices into CI/CD pipelines (ArgoCD, GitHub Actions) while supporting developer wellbeing
- Foster adoption of reliability best practices across our Kubernetes-based infrastructure through mentorship and collaboration
- Guide teams in leveraging our Azure cloud platform effectively while preparing for multi-cloud architectures
- Support the thoughtful adoption of AI tools (Cursor, GitHub Copilot) for operational efficiency
WHAT YOU BRING TO THE TABLE:
- Experience supporting SRE, DevOps, or platform teams with focus on reliability and collaboration
- Understanding of SRE principles: SLIs/SLOs, error budgets, and toil reduction
- Hands-on experience with our observability stack (Datadog for metrics/APM, ELK for sensitive logs) and production systems at scale
- Deep empathy for developer workflows and creating sustainable on-call processes that support work-life balance
- Familiarity with Infrastructure as Code using Terraform and container orchestration with Kubernetes
- Experience with CI/CD platforms (GitHub Actions, ArgoCD) and integrating reliability into deployment pipelines
- Understanding of Azure cloud services and preparing organizations for multi-cloud transformations
- Warm communication skills for incident support, postmortem facilitation, and cross-team collaboration
- Experience with authentication systems (e.g. Okta, Entra ID) and their role in system reliability
- Commitment to building inclusive teams that balance operational care with innovation and wellbeing
- Appreciation for AI-assisted tools in improving operational efficiency and reducing toil
- Background with databases (MySQL, Redis, MongoDB) and their reliability considerations is valued
- Experience with multi-cloud architectures and distributed systems is warmly welcomed
WHAT'S IN IT FOR YOU?
- Join a diverse team of 40+ nationalities with flat hierarchies and a collaborative company culture.
- Opportunity to build and scale your career at the intersection of customer-facing roles and engineering in a dynamic startup on its journey to become an international leader in SaaS platforms for Conversational AI.
- Deutschlandticket, Urban Sports Club, JobRad, Nilo Health, and weekly sponsored office lunches.
- Competitive compensation and equity package.
- Flexible working hours, 28 vacation days and workation opportunities.
- Access to a training and development budget for continuous professional growth.
- Regular team events, game nights, and other social activities.
- Hybrid work environment. We love to build real connections, so we welcome everyone to our beautiful Berlin office on certain days.
YOUR RECRUITING PROCESS AT PARLOA:
Recruiter video call → Meet your manager → Technical Interview + Technical Leadership Interview → Bar Raiser Interview