ABSTAT

Short Description


ABSTAT (ABstraction and STATistics over Linked Data) is a web application conceived, designed and developed by the Department of Informatics, Systems and Communication (DISCO) of the University of Milano-Bicocca. The ABSTAT framework, accessible at http://abstat.disco.unimib.it, aims to provide a better understanding of big and complex datasets by extracting summaries of linked datasets based on an ontology-driven data abstraction model. It takes a dataset and an ontology as input and returns an ontology-driven data summary as output. A summary is meant to be a compact but complete representation of a dataset, where complete means that every relation between concepts that is not in the summary can be inferred from it.

The summary is composed of a set of Abstract Knowledge Patterns (AKPs) and statistics. AKPs are abstract representations of Knowledge Patterns. One distinguishing feature of ABSTAT is its minimalization mechanism based on minimal type patterns: instead of representing every AKP occurring in the dataset, ABSTAT summaries include only a base of minimal type patterns, i.e., a subset of the patterns such that every other pattern can be derived using a subtype graph.

Details

Suppose we have a subtype graph in which Artist, Lawyer and FootballPlayer are subtypes of Person, and consider two instances: George Clooney and Jim Brown.

Suppose the dataset contains a triple stating that George Clooney hasWife Amal Clooney, together with triples stating that George Clooney is of type Artist and that Amal Clooney is of type Lawyer. For the hasWife triple, ABSTAT considers the minimal type of the subject and the minimal type of the object. George Clooney has two types in the graph, Person and Artist. The minimal type of the two is Artist, because Artist is a subtype of Person. The same heuristic is applied to the object: Amal Clooney is of type Lawyer and of type Person, and Lawyer is minimal because it is a subtype of Person. In this case ABSTAT extracts the pattern <Artist, hasWife, Lawyer>.

Now consider a triple stating that Jim Brown has a birthDate whose value is a literal of datatype XMLSchema#Date. As in the previous example, we extract the minimal type of the subject and the minimal type of the object. Jim Brown has two types, FootballPlayer and Person. From the subtype graph we know that FootballPlayer is a subtype of Person, so FootballPlayer is the minimal type. Since birthDate is a datatype property, the object is represented by its datatype, and the ABSTAT summary contains the AKP <FootballPlayer, birthDate, XMLSchema#Date>.

By including only minimal types, ABSTAT excludes many redundant AKPs from the summary, all of which can be inferred from the subtype graph. For example, from the AKP <Artist, hasWife, Lawyer> we can infer the AKP <Person, hasWife, Person>, because Artist and Lawyer are both subtypes of Person. All other AKPs can be inferred in the same way.

ABSTAT has three user interfaces.
The browsing interface helps users browse the types, properties and AKPs in the summary of a given dataset, and returns the statistics for every element in the summary. ABSTATSearch implements full-text search over a set of summaries, in which types, properties and patterns are represented by their local names (e.g., Person, birthPlace or Person birthPlace Country). The SPARQL endpoint allows users to execute SPARQL queries over the summary.
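
The minimal type selection and AKP inference described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not ABSTAT's actual implementation: the names SUPERTYPES, ancestors, minimal_types, summarize and inferred_akps are hypothetical, the subtype graph is assumed to make Artist, Lawyer and FootballPlayer direct subtypes of Person, the literal is stood in for by the placeholder string "Literal", and a simple occurrence count stands in for the statistics.

```python
from collections import Counter

# Assumed subtype graph: each type maps to its direct supertypes.
SUPERTYPES = {
    "Artist": {"Person"},
    "Lawyer": {"Person"},
    "FootballPlayer": {"Person"},
    "Person": set(),
}


def ancestors(t):
    """Return all strict supertypes of t by walking the graph upward."""
    seen = set()
    stack = list(SUPERTYPES.get(t, ()))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(SUPERTYPES.get(s, ()))
    return seen


def minimal_types(types):
    """Keep the types that are not a supertype of any other type in the set."""
    return {
        t for t in types
        if not any(t in ancestors(u) for u in types if u != t)
    }


def summarize(triples, types_of):
    """Count minimal type patterns (subject type, property, object type)."""
    akps = Counter()
    for s, p, o in triples:
        for s_type in minimal_types(types_of[s]):
            for o_type in minimal_types(types_of[o]):
                akps[(s_type, p, o_type)] += 1
    return akps


def inferred_akps(akp):
    """Patterns derivable from akp by generalizing subject and/or object."""
    s, p, o = akp
    return {
        (s2, p, o2)
        for s2 in {s} | ancestors(s)
        for o2 in {o} | ancestors(o)
    } - {akp}


# Example data taken from the text; literals are typed by their datatype.
types_of = {
    "George Clooney": {"Person", "Artist"},
    "Amal Clooney": {"Person", "Lawyer"},
    "Jim Brown": {"Person", "FootballPlayer"},
    "Literal": {"XMLSchema#Date"},
}
triples = [
    ("George Clooney", "hasWife", "Amal Clooney"),
    ("Jim Brown", "birthDate", "Literal"),
]

summary = summarize(triples, types_of)
for akp, count in sorted(summary.items()):
    print(akp, count)
for akp in sorted(inferred_akps(("Artist", "hasWife", "Lawyer"))):
    print("inferred:", akp)
```

On the example data the sketch keeps only the two minimal type patterns from the text, <Artist, hasWife, Lawyer> and <FootballPlayer, birthDate, XMLSchema#Date>, each with a count of 1. The inference step then recovers <Person, hasWife, Person> together with the two partially generalized patterns.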

Reference

ABSTAT won the “Best Paper Award” at the SUMPRE 2016 workshop (http://km.aifb.kit.edu/ws/sumpre2016/), co-located with ESWC 2016, on May 31st, 2016. For more details we invite the reader to refer to two published articles about ABSTAT:

Matteo Palmonari, Anisa Rula, Riccardo Porrini, Andrea Maurino, Blerina Spahiu, Vincenzo Ferme: ABSTAT: Linked Data Summaries with ABstraction and STATistics. ESWC (Satellite Events) 2015: 128-132 (http://link.springer.com/chapter/10.1007%2F978-3-319-25639-9_25)

Blerina Spahiu, Riccardo Porrini, Matteo Palmonari, Anisa Rula, Andrea Maurino: ABSTAT: Ontology-Driven Linked Data Summaries with Pattern Minimalization. ESWC (Satellite Events) 2016: 381-395 (http://link.springer.com/chapter/10.1007%2F978-3-319-47602-5_51)

Demo

http://abstat.disco.unimib.it