Skip to content

GraphDB

What is GraphDB?

From the GraphDB website:

GraphDB is a family of highly efficient, robust, and scalable RDF databases. It streamlines the load and use of linked data cloud datasets, as well as your own resources. For easy use and compatibility with the industry standards, GraphDB implements the RDF4J framework interfaces, the W3C SPARQL Protocol specification, and supports all RDF serialization formats.

Prerequisites

As a reference, the following are the recommended minimum hardware requirements to run the LUBM Benchmark for 1000 universities on GraphDB 9.10.0 and above.

Prerequisites
CPU 2.7 GHz 4-Core Intel Core i7
Disk Space 80 GB of free disk space is required for the test dataset download and triplestore
Memory 32 GB of RAM is required for the GraphDB server to run

Also check the GraphDB System Requirements.

Installing GraphDB

Download the GraphDB Enterprise Edition 60-day trial. This is the most scalable and resilient version of GraphDB.

Unzip its contents to a local folder of your preference. Note that this location will be used to set the value of the parameter -d (distribution directory) in the next section.

Check the prerequisites in the GraphDB README file. You may need to install a different version of Java.

Executing the Benchmark

The Benchmark for GraphDB is fully automated.

Run the graphdb-execute-benchmark.sh script, as per example below:

./graphdb-execute-benchmark.sh \
    -m "Xms2460m" -x "Xmx7400m" \
    -i preload -d ~/Triplestores/Downloads/graphdb-ee-9.10.0 \
    -s ~/Triplestores/Servers/GraphDB-9.10.0 \
    -u 1000 -f ntriple -t 1800 -c full \
    -p gzip

Usage:

./graphdb-execute-benchmark.sh \
    -m <min-heap-size> -x <max-heap-size> \
    -i <data-load-interface> -d <graphdb-download> \
    -s <graphdb-server> \
    -u <universities> -f <file-format> -t <query-timeout> -c <test-coverage> \
    -p <file-compression>
    -m min Java heap (min)
    -x max Java heap (opt)
    -i data import method; values are preload or loadrdf.
    -d location of the GraphDB download (distribution directory)
    -s location where the GraphDB server will run and database files be stored.
    -u number of universities
    -f file format (ntriple or turtle)
    -t query timeout in seconds
    -c test coverage (loadonly or full). Loadonly will only import the test dataset whereas Full will also execute the sparql queries.
    -p file compression (zip, gzip or none)

Recommended values for the GraphDB memory usage, parameter -m, can be found here.

NOTE: It is not recommended executing any memory-intensive applications while the benchmark is running.

Analysing Test Results

Data import, optimisation, and query timings will be logged to the results/query-timings/graphdb-import-optmise-query-timings.log file and the query outputs to the appropriate files in the results/query-results folder.

The logs are cleaned in the beginning of each run.

Environment Clean up

In the end of the benchmark execution, there is a clean-up task that removes the triplestore server, its databases and data files. The test result logs are kept for analysis, but cleaned in the beginning of each execution. The product download is not deleted.