datagen

The LDBC Datagen Community Structure

Tags:
DATAGEN , SOCIAL NETWORK , SNB

This blog entry is about one of the features of DATAGEN that makes it different from other synthetic graph generators that can be found in the literature: the community structure of the graph.

When generating synthetic graphs, one must not only pay attention to quantitative measures such as the number of nodes and edges, but also to other more qualitative characteristics such as the degree distribution, clustering coefficient. Real graphs, and …

When talking about DATAGEN and other graph generators with social network characteristics, our attention is typically borrowed by the friendship subgraph and/or its structure. However, a social graph is more than a bunch of people being connected by friendship relations, but has a lot more of other things is worth to look at. With a quick view to commercial social networks like Facebook, Twitter or Google+, one can easily identify a lot of other …

Social Network Benchmark Goals

Tags:
SNB , DATAGEN , INTERACTIVE , BI , GRAPHALYTICS

Social Network interaction is amongst the most natural and widely spread activities in the internet society, and it has turned out to be a very useful way for people to socialise at different levels (friendship, professional, hobby, etc.). As such, Social Networks are well understood from the point of view of the data involved and the interaction required by their actors. Thus, the concepts of friends of friends, or retweet are well established …

DATAGEN: Data Generation for the Social Network Benchmark

Tags:
DATAGEN , SOCIAL NETWORK , SNB

As explained in a previous post, the LDBC Social Network Benchmark (LDBC-SNB) has the objective to provide a realistic yet challenging workload, consisting of a social network and a set of queries. Both have to be realistic, easy to understand and easy to generate. This post has the objective to discuss the main features of DATAGEN, the social network data generator provided by LDBC-SNB, which is an evolution of S3G2 [1].

One of the most …

Getting Started With SNB

Tags:
SNB , INTERACTIVE , DATAGEN

In a previous blog post titled “Is SNB like Facebook’s LinkBench?”, Peter Boncz discusses the design philosophy that shapes SNB and how it compares to other existing benchmarks such as LinkBench. In this post, I will briefly introduce the essential parts forming SNB, which are DATAGEN, the LDBC execution driver and the workloads.

DATAGEN

DATAGEN is the data generator used by all the workloads of SNB. Here we introduced the …

SNB Data Generator - Getting Started

Tags:
DATAGEN , SNB , SOCIAL NETWORK

In previous posts (this and this) we briefly introduced the design goals and philosophy behind DATAGEN, the data generator used in LDBC-SNB. In this post, I will explain how to use DATAGEN to generate the necessary datatsets to run LDBC-SNB. Of course, as DATAGEN is continuously under development, the instructions given in this tutorial might change in the future.

Getting and Configuring Hadoop

DATAGEN runs on top of hadoop 1.2.1 to be scale. …