ETL
- Data Pipelines (data-pipelines, etl, streaming)
Architectural principles for reliable batch and streaming data pipelines, focusing on strict time semantics, exactly-once processing, optimal partitioning, observability, and reproducible state.
- Spark Trial (data-pipelines, etl, monitoring, and 1 more)
An end-to-end ETL example that uses Apache Spark to process large-scale Parquet datasets, focusing on strict schema handling, optimal partitioning, and reproducible aggregations.