Until now we have discussed several aspects of the Semantic Publishing Benchmark (SPB) such as the difference in performance between virtual and real servers configuration, how to choose an appropriate query mix for a benchmark run and our experience with using SPB in the development process of GraphDB for finding performance issues.

In this post we provide a step-by-step guide on how to run SPB using the Sesame RDF data store on a fresh install …

Semantic Publishing Instance Matching Benchmark


The Semantic Publishing Instance Matching Benchmark (SPIMBench) is a novel benchmark for the assessment of instance matching techniques for RDF data with an associated schema. SPIMBench extends the state-of-the art instance matching benchmarks for RDF data in three main aspects: it allows for systematic scalability testing, supports a wider range of test cases including semantics-aware ones, and provides an enriched gold standard.

The SPIMBench …

We are presently working on the SNB BI workload. Andrey Gubichev of TU Munchen and myself are going through the queries and are playing with two SQL based implementations, one on Virtuoso and the other on Hyper.

As discussed before, the BI workload has the same choke points as TPC-H as a base but pushes further in terms of graphiness and query complexity.

There are obvious marketing applications for a SNB-like dataset. There are also security …

Sizing AWS Instances for the Semantic Publishing Benchmark


LDBC’s Semantic Publishing Benchmark (SPB) measures the performance of an RDF database in a load typical for metadata-based content publishing, such as the famous BBC Dynamic Semantic Publishing scenario. Such load combines tens of updates per second (e.g. adding metadata about new articles) with even higher volume of read requests (SPARQL queries collecting recent content and data to generate web page on a specific subject, e.g. Frank …

In previous posts (Getting started with snb, DATAGEN: data generation for the Social Network Benchmark), Arnau Prat discussed the main features and characteristics of DATAGEN: realism, scalability, determinism, usability. DATAGEN is the social network data generator used by the three LDBC-SNB workloads, which produces data simulating the activity in a social network site during a period of time. In this post, we conduct a series of experiments …

SNB Driver - Part 1


In this multi-part blog we consider the challenge of running the LDBC Social Network Interactive Benchmark (LDBC SNB) workload in parallel, i.e. the design of the workload driver that will issue the queries against the System Under Test (SUT). We go through design principles that were implemented for the LDBC SNB workload generator/load tester (simply referred to as driver). Software and documentation for this driver is available here: …