The LDBC benchmark data sets are stored under SURF’s CWI repositories.
The data sets are stored on tape, so you may have to stage them before they can be downloaded. To do so, visit the repository of the data set and click “Request” for the files marked as offline. Staging a 20 GB file takes approx. 3-5 minutes, while staging a 200 GB file takes approx. 10-15 minutes.
To decompress a downloaded archive, use zstd:
tar -xv --use-compress-program=unzstd -f file.tar.zst
We provide the download-data-set.sh script, which attempts to download the data set and, if necessary, first stages it from tape to disk. Replace data_set_url with one of the URLs linked below in this README (right-click the link and select Copy Link Address).
./download-data-set.sh data_set_url
Example:
./download-data-set.sh https://repository.surfsara.nl/datasets/cwi/snb/files/social_network-csv_basic-longdateformatter/social_network-csv_basic-longdateformatter-sf0.1.tar.zst
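The download and decompression steps above can be combined in a small helper. The following is a minimal sketch, not part of the provided tooling: the function names archive_name_from_url and extract_archive are hypothetical, and it assumes that download-data-set.sh saves the archive in the current directory under the last path component of the URL.

```shell
# Hypothetical helper: derive the archive file name from a data set URL
# by keeping only the text after the last "/".
archive_name_from_url() {
    local url="$1"
    printf '%s\n' "${url##*/}"
}

# Hypothetical helper: extract a .tar.zst archive into a target directory
# (defaults to the current directory). Requires tar and zstd (unzstd).
extract_archive() {
    local archive="$1"
    local target_dir="${2:-.}"
    mkdir -p "$target_dir"
    tar -x --use-compress-program=unzstd -f "$archive" -C "$target_dir"
}

# Usage (assumes the script stores the archive in the current directory):
#   url="https://repository.surfsara.nl/datasets/cwi/snb/files/social_network-csv_basic-longdateformatter/social_network-csv_basic-longdateformatter-sf0.1.tar.zst"
#   ./download-data-set.sh "$url"
#   extract_archive "$(archive_name_from_url "$url")" data
```

If the script stores the archive elsewhere, pass its actual path to extract_archive instead of the derived name.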