Team | ENGINEERING
Location | Remote
What You Will Be Doing:
- Design, develop, and optimize data pipelines and architectures to support our composable CDP product.
- Implement and maintain ETL processes using dbt.
- Work with large datasets and develop scalable data models on Snowflake.
- Utilize Databricks for big data processing and machine learning workflows.
- Manage and optimize cloud infrastructure on AWS and GCP to ensure high performance and availability.
- Collaborate with cross-functional teams to understand data requirements and deliver solutions.
- Ensure data quality, governance, and compliance across all data pipelines.
- Implement best practices for data management, including data security and privacy.
- Lead and mentor junior data engineers, fostering a culture of continuous learning and improvement.
- Actively participate in the product development process as part of the core team.
- Publish product updates to the Snowflake and Databricks marketplaces.
Experience & Requirements:
- Database and Data Warehousing: Advanced experience with Snowflake, Databricks, and other data warehousing solutions.
- Data Transformation and ETL: Proficiency in dbt and experience in building and maintaining ETL pipelines.
- Cloud Platforms: Extensive experience with AWS and GCP.
- Programming Languages:
  - Strong proficiency in Python.
  - Experience designing and implementing scalable, efficient data pipelines using Python and relevant libraries (such as Pandas, NumPy, and SQLAlchemy).
  - Solid understanding of ETL principles, data modeling, and schema design.
- Big Data Technologies: Experience with Apache Spark, Kafka, and other big data technologies.
- Data Integration and Automation: Familiarity with data integration and automation tools (such as Airbyte, Apache Airflow, and Luigi).