Knowledge in the world continuously evolves, and ontologies are largely incomplete with respect to low-frequency data, which belongs to the so-called long tail.
Socially produced content is an excellent source for discovering emerging knowledge: it is huge, and it immediately reflects the relevant changes in which emerging entities are hidden.
We propose a method and a tool for discovering emerging entities by extracting them from social media.
After a very simple initialization by experts, the method finds emerging entities by combining syntactic and semantic analysis. It uses seeds, i.e. prototypes of emerging entities provided by experts, to generate candidates; it then associates each candidate with a feature vector built from the terms occurring in its social content, ranks the candidates by their distance from the centroid of the seeds, and returns the top-ranked candidates as the result.
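To make the ranking step concrete, here is a minimal sketch in Python. It is not the tool's actual implementation: the function names, the plain term-frequency features, and the choice of cosine distance are all assumptions made for illustration, and the candidate generation step is skipped entirely (candidates are simply given as input).

```python
import math
from collections import Counter, defaultdict

def term_vector(posts):
    """Build a term-frequency vector from a list of post texts (assumed feature choice)."""
    return Counter(word for post in posts for word in post.lower().split())

def centroid(vectors):
    """Average a list of sparse vectors term by term."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is empty (assumed metric)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def rank_candidates(seed_posts, candidate_posts, top_k=10):
    """Rank candidates by distance from the seed centroid, closest first."""
    seed_center = centroid([term_vector(posts) for posts in seed_posts.values()])
    scored = [
        (name, cosine_distance(term_vector(posts), seed_center))
        for name, posts in candidate_posts.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]

# Hypothetical toy data: two expert-provided seeds and two candidates.
seeds = {
    "designer_a": ["new fashion collection runway milan"],
    "designer_b": ["fashion show runway collection"],
}
candidates = {
    "designer_c": ["runway fashion collection debut"],
    "pizzeria": ["pizza pasta menu dinner"],
}
ranked = rank_candidates(seeds, candidates, top_k=2)
print(ranked)
```

In this toy run the candidate whose posts share vocabulary with the seeds ranks first, while the unrelated one ends up at the maximum distance. A real setup would likely need better tokenization and term weighting than whitespace splitting and raw counts.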
The method can be continuously or periodically iterated, using the results as new seeds.
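The iteration can be sketched as a simple loop. This is only an illustration under stated assumptions: `iterate_discovery`, `toy_discover`, and the `MENTIONS` data are hypothetical names, one discovery round is abstracted as a function passed in by the caller, and the sketch replaces the seeds with the new results at each round (they could also be added to the existing seeds).

```python
def iterate_discovery(initial_seeds, discover, rounds=3):
    """Run `discover` repeatedly, feeding each round's results back in as seeds.

    `discover` is any function mapping a collection of seeds to a list of
    entities; it stands in for one full run of the extraction method.
    """
    seeds = list(initial_seeds)
    seen = set(seeds)
    found = []
    for _ in range(rounds):
        # Keep only entities not already known, preserving order.
        new_entities = [e for e in discover(seeds) if e not in seen]
        if not new_entities:
            break  # nothing new emerged, so stop early
        found.extend(new_entities)
        seen.update(new_entities)
        seeds = new_entities  # results of this round become the next seeds
    return found

# Hypothetical stand-in for one discovery round: follow "mentions" links.
MENTIONS = {"a": ["b"], "b": ["c"], "c": ["a"]}

def toy_discover(seeds):
    return sorted({m for s in seeds for m in MENTIONS.get(s, [])})

discovered = iterate_discovery(["a"], toy_discover, rounds=5)
print(discovered)
```

The `seen` set keeps the loop from rediscovering the original seeds, so the iteration stops once a round produces nothing new.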
The full paper on this topic, presented at WWW 2017, is available online (open access under a Creative Commons license). You can also check out the slides of my presentation on Slideshare.
A demo version of the tool is available online for free use, thanks also to our partners Dandelion and Microsoft Azure.
TRY THE TOOL NOW!