
The number of datasets published in the Web of Data as part of the Linked Data Cloud is constantly increasing. The Linked Data paradigm is based on the unconstrained publication of information by different publishers, and the interlinking of web resources through “same-as” links, which specify that two URIs correspond to the same real-world object. In the vast number of data sources participating in the Linked Data Cloud, this information is not …

In this post we will look at running the LDBC SNB on Virtuoso.

First, let’s recap what the benchmark is about:

  1. fairly frequent short updates, with no update contention worth mentioning

  2. short random lookups

  3. medium-complexity queries centered around a person’s social environment

The updates exist so as to invalidate strategies that rely too heavily on precomputation. The short lookups exist for the sake of realism; after all, an …

SNB and Graphs Related Presentations at GRADES '15


On May 31st, the GRADES workshop will take place in Melbourne as part of the ACM SIGMOD/PODS conference. GRADES started as an initiative of the Linked Data Benchmark Council at SIGMOD/PODS 2013, held in New York.

Among the papers published in this edition we have “Graphalytics: A Big Data Benchmark for Graph-Processing Platforms”, which presents a new benchmark that uses the Social Network Benchmark data generator of LDBC (that can …

SNB Interactive Part 2: Modeling Choices


SNB Interactive is the wild frontier, with very few rules. This is necessary, among other reasons, because there is no standard property graph data model, and because the contestants support a broad mix of programming models, ranging from in-process APIs to declarative query.

In the case of Virtuoso, we have played with SQL and SPARQL implementations. For a fixed schema and well-known workload, SQL will always win. The reason for this is that …

LDBC Participates in the 36th Edition of the ACM SIGMOD/PODS Conference


LDBC is presenting two papers at the next edition of the ACM SIGMOD/PODS conference held in Melbourne from May 31st to June 4th, 2015. The annual ACM SIGMOD/PODS conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools and experiences.

On the industry track, LDBC will be presenting the Social Network Benchmark Interactive …

This post is the first in a series of blog posts analyzing the LDBC Social Network Benchmark Interactive workload. It is written from the dual perspective of participating in the benchmark design and of building the OpenLink Virtuoso implementation of the same.

With two implementations of SNB Interactive at four different scales, we can take a first look at what the benchmark is really about. The hallmark of a benchmark implementation is that its …

Why Do We Need an LDBC SNB-Specific Workload Driver?


In a previous 3-part blog series we touched upon the difficulties of executing the LDBC SNB Interactive (SNB) workload while achieving good performance and scalability. What we didn’t discuss is why these difficulties were unique to SNB, and what aspects of the way we perform workload execution are scientific contributions, that is, novel solutions to previously unsolved problems. This post will highlight the differences between SNB and more …

Event Driven Post Generation in Datagen


As discussed in previous posts, one of the features that makes Datagen more realistic is that the activity volume of the simulated Persons is not uniform, but forms spikes. In this blog entry I want to explain in more depth how this is actually implemented inside the generator.

First, I will start with a few basics of how Datagen works internally. In Datagen, once the person graph has been created (persons and their relationships), …

The LDBC Datagen Community Structure


This blog entry is about one of the features of Datagen that makes it different from other synthetic graph generators that can be found in the literature: the community structure of the graph.

When generating synthetic graphs, one must pay attention not only to quantitative measures such as the number of nodes and edges, but also to more qualitative characteristics such as the degree distribution and the clustering coefficient. Real graphs, and …