Research Report, 4 pages max.
The goals of this assignment are to:
- Get some hands-on experience with instance matching
- Think about missing data, semantics and statistics
- Download, install and set up R version 2.15 (unfortunately, this version of R is not yet available on the VU lab computers) and RStudio. We have made a download package available for Windows and OS X at https://www.dropbox.com/s/6e71pzpkgtjmtdp/Setup.zip
- Download the tutorial and data from: https://github.com/wrvhage/LinkedScienceTutorial/archive/master.zip and do the Linked Open Piracy (LOP) tutorial. You will find additional instructions in the README.md file.
- Load the LOP data and extra GeoNames data about continents into a triple store, for example, Jena Fuseki.
- Create a SPARQL query that connects SEM event types described with WordNet 3.0 synsets to the continent containing the SEM place of the event, described in GeoNames.
- Create a visualization (e.g., pie chart) of all piracy events (WordNet synset wn30:synset-piracy-noun-1), aggregated per GeoNames continent (http://www.geonames.org/ontology#L.CONT).
- Think of a Linked Science experiment. The data set you just used is linked to WordNet and GeoNames (amongst others). Can you think of other data sets that you could link it to, and a corresponding research question that you could then answer? Go into detail about which URIs would have to be aligned with which other URIs and which alignment relations should be used. Explain what the SPARQL query that answers your research question would look like if this alignment were made.
- Not all piracy events have been reported to the International Chamber of Commerce, so unreported events do not appear in the LOP data set. The missing events could be very similar to or very different from the events currently in the data set. a) When and how would the schema of the LOP data set have to change if very different events were added? b) What kind of impact could such additions have on the conclusions drawn from the data set?
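As a starting point for the query and aggregation steps above, the following is a minimal sketch in Python, not a reference solution. The SEM properties sem:eventType and sem:hasPlace and the GeoNames properties gn:featureCode and gn:parentFeature are real vocabulary terms, but whether the LOP data links event types to WordNet with skos:closeMatch and places to continents through gn:parentFeature is an assumption, as is the wn30 namespace; check all of these against the tutorial data. The sample result and its example.org URIs are made up for illustration and follow the standard SPARQL JSON results layout.

```python
from collections import Counter

# Sketch of the query. Linking properties and the wn30 namespace are
# ASSUMPTIONS; verify them against the LOP and GeoNames data you loaded.
QUERY = """
PREFIX sem:  <http://semanticweb.cs.vu.nl/2009/11/sem/>
PREFIX gn:   <http://www.geonames.org/ontology#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wn30: <http://purl.org/vocabularies/princeton/wn30/>

SELECT ?event ?continent
WHERE {
  ?event sem:eventType ?type ;
         sem:hasPlace  ?place .
  ?type  skos:closeMatch wn30:synset-piracy-noun-1 .  # assumed mapping property
  ?place gn:parentFeature+ ?continent .               # assumed containment path
  ?continent gn:featureCode <http://www.geonames.org/ontology#L.CONT> .
}
"""


def events_per_continent(results):
    """Count distinct events per continent from a SPARQL JSON result."""
    pairs = {
        (row["event"]["value"], row["continent"]["value"])
        for row in results["results"]["bindings"]
    }
    return Counter(continent for _, continent in pairs)


def shares(counts):
    """Turn absolute counts into fractions of the total (pie chart slices)."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()} if total else {}


def _row(event, continent):
    # Helper that builds one made-up binding in SPARQL JSON result form.
    return {
        "event": {"type": "uri", "value": event},
        "continent": {"type": "uri", "value": continent},
    }


# Made-up sample result; the second e2 row is a deliberate duplicate.
sample = {
    "results": {
        "bindings": [
            _row("http://example.org/event/e1", "http://example.org/continent/A"),
            _row("http://example.org/event/e2", "http://example.org/continent/A"),
            _row("http://example.org/event/e2", "http://example.org/continent/A"),
            _row("http://example.org/event/e3", "http://example.org/continent/B"),
        ]
    }
}

counts = events_per_continent(sample)
shares_by_continent = shares(counts)
```

The resulting shares can be handed to any plotting function, for example pie() in R. Counting distinct (event, continent) pairs guards against duplicate rows, which can appear when a place reaches its continent along more than one path.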
Monday 3 December, 23:59 CET