When talking about DATAGEN and other graph generators with social network characteristics, our attention is typically borrowed by the friendship subgraph and/or its structure. However, a social graph is more than a bunch of people being connected by friendship relations, but has a lot more of other things is worth to look at. With a quick view to commercial social networks like Facebook, Twitter or Google+, one can easily identify a lot of other elements such as text images or even video assets. More importantly, all these elements form other subgraphs within the social network!
The LDBC consortium are pleased to announce its Sixth Technical User Community (TUC) meeting.
This will be a two-day event at Universitat Politècnica de Catalunya, Barcelona on Thursday and Friday March 19/20, 2015.
The 5th LDBC Technical User Community (TUC) meeting took place in Athens on 14.11.2014 being well attended by both Graph and RDF databases industry and academia. In the morning session, members of the LDBC project gave an update on the status of the project and its benchmarks:
The SNB Driver part 1 post introduced, broadly, the challenges faced when developing a workload driver for the LDBC SNB benchmark. In this blog we'll drill down deeper into the details of what it means to execute "dependent queries" during benchmark execution, and how this is handled in the driver. First of all, as many driver-specific terms will be used, below is a listing of their definitions. There is no need to read them in detail, it is just there to serve as a point of reference.
The Semantic Publishing Instance Matching Benchmark (SPIMBench) is a novel benchmark for the assessment of instance matching techniques for RDF data with an associated schema. SPIMBench extends the state-of-the art instance matching benchmarks for RDF data in three main aspects: it allows for systematic scalability testing, supports a wider range of test cases including semantics-aware ones, and provides an enriched gold standard.
We are presently working on the SNB BI workload. Andrey Gubichev of TU Munchen and myself are going through the queries and are playing with two SQL based implementations, one on Virtuoso and the other on Hyper.
As discussed before, the BI workload has the same choke points as TPC-H as a base but pushes further in terms of graphiness and query complexity.
In previous posts (Getting started with SNB, DATAGEN: data generation for the Social Network Benchmark), Arnau Prat discussed the main features and characteristics of DATAGEN: realism, scalability, determinism, usability. DATAGEN is the social network data generator used by the three LDBC-SNB workloads, which produces data simulating the activity in a social network site duri