Despite being such a relatively recent issue most schools do not teach about it, which is why we consider young people ackowledge it.
As the spanish philosopher George Santayana said “Those who don’t know history are destined to repeat it.” which is what we are trying to prevent
We thought that the best and easiest way to really see the impact was through charts and maps, thus our results are mostly shown as maps and charts.
Pyspark for data manipulation, Matplotlib/Seaborn for data visualization. Plotly for generating interactive maps.
Hadoop cluster composed of four EC2 AWS machines (AWS EMR), Shared S3 Bucket to store the dataset, additional data and results generated in the process.
For version control.
This project was made in 2020 for the class Cloud and Big Data.