
ETL Pipeline: NYC Taxi Dataset
This project implements an automated ETL (Extract, Transform, Load) pipeline for the NYC Taxi Trip dataset. Apache Airflow schedules, runs, and monitors the workflow so that data is ingested reliably into a database or data warehouse. After the ETL process, a dedicated notebook analyzes the processed data to explore insights, trends, and metrics.
Stack used
Docker, Python, Apache Airflow, NumPy, Pandas, PostgreSQL
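
The three ETL stages described above can be illustrated with a minimal sketch. It is not the project's actual code: the sample rows, the cleaning rules, the `trips` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs with the standard library alone. The column names follow the public NYC TLC yellow-taxi schema, which this project is assumed to use. In an Airflow deployment, each function would typically become its own task in a DAG.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical sample data; column names assume the TLC yellow-taxi schema.
SAMPLE_CSV = """tpep_pickup_datetime,tpep_dropoff_datetime,trip_distance,fare_amount
2023-01-01 00:10:00,2023-01-01 00:25:00,3.2,14.5
2023-01-01 01:00:00,2023-01-01 01:05:00,0.0,3.0
2023-01-01 02:00:00,2023-01-01 02:30:00,7.5,-5.0
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract(source):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows):
    """Transform: drop implausible trips and derive the duration in minutes."""
    cleaned = []
    for row in rows:
        distance = float(row["trip_distance"])
        fare = float(row["fare_amount"])
        # Assumed cleaning rule: keep only trips with positive distance and fare.
        if distance <= 0 or fare <= 0:
            continue
        pickup = datetime.strptime(row["tpep_pickup_datetime"], TIME_FORMAT)
        dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIME_FORMAT)
        duration_min = (dropoff - pickup).total_seconds() / 60
        cleaned.append((row["tpep_pickup_datetime"], distance, fare, duration_min))
    return cleaned


def load(rows, conn):
    """Load: write cleaned rows to a table and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trips "
        "(pickup TEXT, distance REAL, fare REAL, duration_min REAL)"
    )
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]


def run_pipeline(source=SAMPLE_CSV):
    """Run extract, transform, and load in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
    try:
        return load(transform(extract(source)), conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage as a separate function with plain inputs and outputs makes it straightforward to test the stages in isolation and to retry a single failed stage, which is the kind of per-task control a scheduler such as Airflow provides.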