Until now we have discussed several aspects of the
Semantic Publishing Benchmark (SPB) such as the
difference in performance between virtual and real servers configuration, how to
choose an appropriate query mix for a benchmark run and our experience with using SPB in the
development process of GraphDB for finding performance issues.
In this post we provide a step-by-step guide on how to run SPB using the
Sesame RDF data store on a fresh
install of Ubuntu Server 14.04.1. The scenario is easy to adapt to other RDF triple
stores which support the Sesame Framework used for querying and
analyzing RDF data.
We start with a fresh server installation, but before proceeding with
setup of the Sesame Data Store and SPB benchmark we need the following
pieces of software up and running:
- Apache Ant 1.8 or higher
- OpenJDK 6 or Oracle JDK 6 or higher
- Apache Tomcat 7 or higher
If you already have these components installed on your machine you can
directly proceed to the next section: Installing Sesame
Following are sample commands which can be used to install the required
sudo apt-get install git sudo apt-get install ant sudo apt-get install default-jdk sudo apt-get install tomcat7
Optionally Apache Tomcat Server can be downloaded as a zipped file and
extracted in a location of choice.
After a successful installation of Apache Tomcat you should be able to
get the default splash page “It works” when you open your web browser
and enter the following address: http://<your_ip_address>:8080
We will use current Sesame version 2.7.14. You can download it here or run following command:
wget \\ "http://sourceforge.net/projects/sesame/files/Sesame%202/2.7.14/openrdf-sesame-2.7.14-sdk.tar.gz/download" \\ -O openrdf-sesame-2.7.14-sdk.tar.gz
Then extract the Sesame tarball:
tar -xvzf openrdf-sesame-2.7.14-sdk.tar.gz
To deploy sesame you have to copy the two war files that are in
openrdf-sesame-2.7.14/war to /var/lib/tomcat7/webapps
From openrdf-sesame-2.7.14/war you can do it with command:
cp openrdf-*.war <tomcat_install>/webapps
Sesame applications write and store configuration files in a single
directory and the tomcat server needs permissions for it. You can find
more information about this directory
By default the configuration directory is: /usr/share/tomcat7/.aduna
Create the directory:
sudo mkdir /usr/share/tomcat7/.aduna
Then change the ownership:
sudo chown tomcat7 /usr/share/tomcat7/.aduna
And finally you should give the necessary permissions:
sudo chmod o+rwx /usr/share/tomcat7/.aduna
Now when you go
You should get a screen like this:
You can download the SPB code and find brief documentation on GitHub:
A detailed documentation is located here:
SPB offers many configuration options which control various features of
the benchmark e.g.:
- query mixes
- dataset size
- loading datasets
- number of agents
- validating results
- test conformance to OWL2-RL ruleset
- update rate of agents
Here we demonstrate how to generate a dataset and execute a simple test
run with it.
First download the SPB source code from the repository:
git clone https://github.com/ldbc/ldbc_spb_bm.git
Then in the ldbc_spb_bm directory build the project:
If you simply execute the command:
you’ll get a list of all available build configurations for the SPB test
driver, but for the purpose of this step-by-step guide, configuration
shown above is sufficient.
Depending on generated dataset size a bigger java heap size may be
required for the Sesame Store. You can change it by adding following
arguments to Tomcat’s startup files e.g. in catalina.sh:
export JAVA_OPTS="-d64 -Xmx4G"
To run the Benchmark you need to create a repository in the Sesame Data
Store, similar to the following screenshot:
Then we need to point the benchmark test driver to the SPARQL endpoint of that repository. This is done in ldbc_spb_bm/dist/test.properties file.
The default value of datasetSize in the properties is set to be 10M,
but for the purpose of this guide we will decrease it to 1M.
You need to change
Also the URLs of the SPARQL endpoint for the repository
First step, before measuring the performance of a triple store, is to
load the reference-knowledge data, generate a 1M dataset, load it into
the repository and finally generate query substitution parameters.
These are the settings to do that, following parameters will ‘instruct’
the SPB test driver to perform all the actions described above:
#Benchmark Operational Phases loadOntologies=true loadReferenceDatasets=true generateCreativeWorks=true loadCreativeWorks=true generateQuerySubstitutionParameters=true validateQueryResults=false warmUp=false runBenchmark=false runBenchmarkOnlineReplicationAndBackup=false checkConformance=false
To run the benchmark execute the following:
java -jar semantic_publishing_benchmark-basic-standard.jar test.properties
When the initial run has finished, we should have a 1M dataset loaded
into the repository and a set of files with query substitution
Next we we will measure the performance of Sesame Data Store by changing
some configuration properties:
#Benchmark Configuration Parameters warmupPeriodSeconds=60 benchmarkRunPeriodSeconds=300 ... #Benchmark Operational Phases loadOntologies=false loadReferenceDatasets=false generateCreativeWorks=false loadCreativeWorks=false generateQuerySubstitutionParameters=false validateQueryResults=false warmUp=true runBenchmark=true runBenchmarkOnlineReplicationAndBackup=false checkConformance=false
After the benchmark test run has finished result files are saved in
There you will find three types of results: the result summary of the
benchmark run (semantic_publishing_benchmark_results.log), brief
results and detailed results.
In semantic_publishing_benchmark_results.log you will find the results
distributed per seconds. They should be similar to the listing bellow:
.Benchmark Results for the 300-th second
Seconds : 300 (completed query mixes : 0) Editorial: 2 agents 9 inserts (avg : 22484 ms, min : 115 ms, max : 81389 ms) 0 updates (avg : 0 ms, min : 0 ms, max : 0 ms) 0 deletes (avg : 0 ms, min : 0 ms, max : 0 ms) 9 operations (9 CW Inserts (0 errors), 0 CW Updates (1 errors), 0 CW Deletions (2 errors)) 0.0300 average operations per second Aggregation: 8 agents 2 Q1 queries (avg : 319 ms, min : 188 ms, max : 451 ms, 0 errors) 3 Q2 queries (avg : 550 ms, min : 256 ms, max : 937 ms, 0 errors) 1 Q3 queries (avg : 58380 ms, min : 58380 ms, max : 58380 ms, 0 errors) 2 Q4 queries (avg : 65250 ms, min : 40024 ms, max : 90476 ms, 0 errors) 1 Q5 queries (avg : 84220 ms, min : 84220 ms, max : 84220 ms, 0 errors) 2 Q6 queries (avg : 34620 ms, min : 24499 ms, max : 44741 ms, 0 errors) 3 Q7 queries (avg : 5892 ms, min : 4410 ms, max : 8528 ms, 0 errors) 2 Q8 queries (avg : 3537 ms, min : 546 ms, max : 6528 ms, 0 errors) 4 Q9 queries (avg : 148573 ms, min : 139078 ms, max : 169559 ms, 0 errors)
This step-by-step guide gave an introduction on how to setup and run the
SPB on a Sesame Data Store. Further details can be found in the
reference documentation listed above.
If you have any troubles running the benchmark, don’t hesitate to
comment or use our social media channels.
In a future post we will go through some of the parameters of SPB and
check their performance implications.