Senior Data Engineer
Prysmian
- Milano
- Permanent contract
- Full time
- Design, develop, and maintain scalable ETL/ELT pipelines using Databricks and Apache Spark.
- Ingest, transform, and process structured and unstructured data across batch and streaming workflows.
- Implement data processing workflows leveraging Delta Lake for ACID reliability, schema enforcement, and incremental ingestion.
- Optimize pipeline performance; troubleshoot and tune Spark jobs for scalability and reliability.
- Monitor, maintain, and evolve existing pipelines and data models, ensuring performance and operational health.
- Apply data quality techniques, validation, transformation logic, and lineage management to ensure trusted data delivery.
- Partner with data scientists, analysts, architects, and business stakeholders to translate business requirements into technical pipelines.
- Implement and maintain CI/CD pipelines for Databricks notebooks, jobs, and workflows.
- Work with cloud‑native services (AWS/Azure depending on environment) for orchestration, storage, ingestion and monitoring.
- Automate processes where possible to improve reliability and reduce manual operations.
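The incremental-ingestion responsibility above rests on the watermark pattern: each run processes only rows newer than the last committed high-water mark. A minimal, engine-agnostic sketch in plain Python (the function and field names are hypothetical; a real Databricks pipeline would persist the watermark in checkpointed state or rely on Delta Lake change data feed):

```python
def incremental_load(source_rows, last_watermark):
    """Select only rows newer than the last processed watermark.

    source_rows: list of dicts with an 'updated_at' field (illustrative schema).
    last_watermark: the highest 'updated_at' value already ingested.
    """
    # Keep only rows that arrived after the previous run's watermark.
    new_rows = [r for r in source_rows if r["updated_at"] > last_watermark]
    # Advance the watermark to the newest row seen; if nothing new, keep it.
    new_watermark = max((r["updated_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark
```

Re-running with the returned watermark yields no rows, which is what makes the load idempotent and safe to retry.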
- Extensive experience in data engineering, with at least 2 years working hands-on with Databricks or Spark-based platforms.
- Strong programming experience in Python/PySpark and solid SQL for transformations and data modeling.
- Experience with Delta Lake, data lake/lakehouse architectures, and modern cloud data engineering.
- Knowledge of CI/CD tools, Git, workflow orchestration, and DataOps best practices.
- Experience with ETL/ELT design and cloud pipelines (Azure Data Factory, AWS Glue, GCP Dataflow, etc.).
- Strong understanding of data warehousing, data modeling, and pipeline reliability patterns.
- Experience with Unity Catalog, lineage, and governance in Databricks environments.
- Exposure to streaming technologies (Kafka, Event Hubs, Kinesis).
- Familiarity with DevOps tools (Docker, Kubernetes) for data engineering workflows.
- Databricks certifications (Data Engineer Associate/Professional, Spark Developer).
- Experience integrating machine learning workloads or MLflow.
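The data-quality requirement above can be illustrated with a minimal validate-and-quarantine sketch in plain Python (names are illustrative; in a Databricks environment this logic would typically be expressed as PySpark filters or Delta Live Tables expectations):

```python
def validate_rows(rows, required_fields):
    """Split rows into valid records and a quarantine set.

    A row is valid only if every required field is present and non-null;
    quarantined rows are kept for inspection rather than silently dropped.
    """
    valid, quarantined = [], []
    for row in rows:
        if all(row.get(field) is not None for field in required_fields):
            valid.append(row)
        else:
            quarantined.append(row)
    return valid, quarantined
```

Routing failures to a quarantine table (instead of discarding them) preserves lineage and makes data-quality issues auditable downstream.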