The Data Science Group at Politecnico di Milano studies all aspects of data science, seen as a sound scientific discipline, and develops methods, tools, technologies and applications for its effective deployment in real-world problems.

The group's main scientific interests are currently crowdsourcing, data extraction and scraping, streaming data management, social engagement, extraction of emerging knowledge from social content, and user-centered data integration and exploration.

From a didactic point of view, the group is promoting the data-shack project, an innovative initiative jointly managed by DEIB at the Politecnico di Milano and by the Institute for Applied Computational Science (IACS) at Harvard’s John A. Paulson School of Engineering and Applied Sciences (SEAS). Additional initiatives will include summer courses.

From a technological point of view, the group is developing tools covering most of the above topics, with special emphasis on crowdsourcing and streaming data management. In the context of genomic computing, the group studies cloud computing and parallel processing for big data, in particular by adopting and comparing the Pig, Flink and Spark platforms.

From an application point of view, the group addresses a variety of problems, ranging from smart cities to event management, social content monitoring and social engagement. The most important application is genomic computing, covered by a cooperating group of more than ten scientists.