Dataiku is looking for a Data Engineer I to join our Enterprise Data and Analytics (EDA) team. As a member of the EDA team, you will play a central role in delivering the data that fuels analytics and data-driven insights for stakeholders and teams across the company. You will also be a key technical contributor to the data platform that powers centralized analytics, embedded analytics teams, Generative AI engineering, and self-service users across the organization.
This role is about 50% Data Operations, Support & Troubleshooting, and 50% new development. The data engineering day-to-day will primarily be within the data platform built using Snowflake, Dataiku, and GitHub. Primary development will focus on Python & SQL, DataOps processes built within GitHub Actions & Dataiku, and data platform processes built within Snowflake & Dataiku.
Non-technical skills and learning are also critical, as you will collaborate with engineers from various teams and help deliver solutions across a wide variety of technical domains. The ideal candidate is naturally curious, has excellent verbal and written communication skills, a sharp analytical mind, a positive attitude towards work, and thrives when collaborating towards a shared goal.
This is an internal and non-client facing role.
What you’ll do:
Dataiku is unique in that every Dataiker is encouraged to use our own product within our Enterprise Data Platform. That means this is a unique opportunity to deliver a scalable platform with governed data to fuel an entire company of current or potential Data Analysts & Data Consumers! Your responsibilities within the team include but are not limited to:
- Develop engineering expertise within the Dataiku Platform to help develop and maintain system integrations, platform automations, and platform configurations
- Develop engineering expertise within Snowflake for data engineering and security/governance features
- Build & maintain Python & SQL data replication and data pipelines on large and often complex data sets
- Build & maintain data quality metrics & observability to help drive data quality standards
- Learn about existing systems and processes across Data Platforms, Data Engineering, and Data Governance
- Troubleshoot data pipelines, platform automations, and data access systems
- Help field and troubleshoot various community questions and challenges
- Own, maintain, and enhance data operations processes, monitoring, & data quality systems
- Design data models for both short-term and long-term use cases to support data warehouse scalability
- Build & maintain administration systems and applications for monitoring, alerting, data observability, access management, platform metrics, and end-user transparency
- Identify opportunities for improvement & optimization for greater scalability & delivery velocity
- Collaborate closely with Analytics Engineers to provide data & data models for analytical deliverables
- Perform root cause analysis on often complex errors to help ensure data pipeline availability
- Help test new features in Dataiku and partner tools, both to provide feedback internally and to assess their value for internal analytics & data platform integration
- Work closely with key stakeholders across the organization, including Infra, embedded analytics teams, Product, and Engineering, to support both technical implementation & requirements gathering
- Proactively drive innovation internally by bringing ideas for platform and process improvements
- Help contribute to the ongoing documentation of internal systems and processes
Requirements:
- 2+ years of relevant experience in Data Engineering / Data Platform Engineering
- Strong technical skills in SQL & Python are a must. Experience with Dataiku DSS is a big plus.
- Prior experience with Snowflake is a plus
- Prior experience with DevOps technologies such as GitHub Actions, Azure DevOps, or Jenkins
- Experience building data models
- Prior experience building and maintaining replication & data pipelines in a cloud data warehouse or data lake environment
- Excellent analytical and creative problem-solving skills; the confidence to ask questions that bring clarity, share ideas, and challenge the norm
- Passion for continuous learning and teaching, helping to adopt and share new technologies & implementation strategies
- Experience working with complex stakeholders: dissecting vague asks and helping to define tangible requirements
- Ability to manage multiple projects and time constraints simultaneously in a high-trust remote environment
- Ability to wear multiple hats depending on the project, with a focus on accomplishing end goals while inspiring colleagues to do the same
- Excellent written and verbal communication skills (especially with senior-level stakeholders), with the ability to speak to the business value, data products, & technical capabilities of a platform, and to create clear, concise documentation with a high degree of precision