Data Science Group – Politecnico di Milano


The Web and Data Science course studies the large-scale socio-technical systems associated with the World Wide Web. It considers the relationship between people and technology, how society and technology complement one another, and how they affect society at large. These analyses are inherently tied to Big Data management issues.

The course is taught at the Como Campus by Marco Brambilla and Emanuele Della Valle.

Up-to-date calendar of the course on Google Docs: calendar

Official course page on Polimi site: web page

This is a (possibly partial) list of course materials:

LESSON 1. Introduction, Scenarios (1) and Scenarios (2), Exam and project rules



LESSON . Introduction to Big Data and basics of Hadoop, HDFS, Pig, and Hive

LESSON . Hadoop, BDAS, Spark. Resources

LESSON . Spark in practice

LESSON . Clustering

LESSON . Classification

LESSON . Recommendations

LESSON . Web API, REST API and scraping for Web data collection, including source code of examples (ZIP).

LESSON . Data Wrangling and Data Cleansing

LESSON . Web Search foundations

LESSON . Human Computation and Crowdsourcing

LESSON . Semantic Web and RDF, with exercise


LESSON . RDF-S and OWL practical cases. Solutions

LESSON . SPARQL, examples, and putting it all together

Additional resources:

Guidelines for the exam