LDBC SNB provides a data generator, which produces synthetic datasets, mimicking a social network’s activity during a period of time. Datagen is defined by the charasteristics of realism, scalability, determinism and usability. More than two years have elapsed since my last technical update on LDBC SNB Datagen, in which I discussed the reasons for moving the code to Apache Spark from the MapReduce-based Apache Hadoop implementation and the …

Speeding Up LDBC SNB Datagen


LDBC’s Social Network Benchmark [4] (LDBC SNB) is an industrial and academic initiative, formed by principal actors in the field of graph-like data management. Its goal is to define a framework where different graph-based technologies can be fairly tested and compared, that can drive the identification of systems’ bottlenecks and required functionalities, and can help researchers open new frontiers in high-performance graph data …

LDBC and Apache Flink


Apache Flink [1] is an open source platform for distributed stream and batch data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink also builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program optimization.

Flink offers multiple APIs to process data …

In this post we will look at running the LDBC SNB on Virtuoso.

First, let’s recap what the benchmark is about:

  1. fairly frequent short updates, with no update contention worth mentioning

  2. short random lookups

  3. medium complex queries centered around a person’s social environment

The updates exist so as to invalidate strategies that rely too heavily on precomputation. The short lookups exist for the sake of realism; after all, an …

SNB and Graphs Related Presentations at GRADES '15


Next 31st of May the GRADES workshop will take place in Melbourne within the ACM/SIGMOD presentation. GRADES started as an initiative of the Linked Data Benchmark Council in the SIGMOD/PODS 2013 held in New York.

Among the papers published in this edition we have “Graphalytics: A Big Data Benchmark for Graph-Processing Platforms”, which presents a new benchmark that uses the Social Network Benchmark data generator of LDBC (that can …