For my final project in the Computational Tools for Big Data course (Technical University of Denmark, AY 2016/17), I used Apache Spark, Neo4j, Pandas DataFrames, and SQL to mine the Amazon Product Reviews Dataset. The project explored the potential and challenges of storing and processing inconveniently large datasets with big data technologies.
Project report
Source code
lccambiaghi/Amazon-Products-and-Reviews-Analysis
Jupyter Notebook