
Executing the LUBM Extended Version

Summary

This page contains details on how the LUBM Extended Version is set up and executed.

Execution Steps

The execution of the LUBM Benchmark has been automated and includes the following steps (a minimal sketch of the automation harness appears after the list):

  1. Install and configure each triplestore for optimum performance for each task to be tested (data import, inference materialisation, data reading and writing, etc.)
  2. Import the LUBM test dataset using the most efficient method available in each triplestore product
  3. Materialise Ontology Inferences when applicable
  4. Execute the 14 original queries supplied with the LUBM Benchmark (10 with reasoning, and 4 without)
  5. Import Rules to optimise queries 2 and 9
  6. Run extended (more complex) version of read queries
  7. Run update queries followed by select ones that read the affected data
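
A minimal sketch of how these steps might be sequenced by an automation harness is shown below. It is written in Python for illustration only; the step functions are placeholders, and the real benchmark drives each product with its own tooling.

    import time

    # Placeholder step functions; the real benchmark performs each step with
    # product-specific tooling (bulk loaders, admin APIs, SPARQL endpoints).
    def load_dataset():           pass   # step 2: bulk import of the LUBM data
    def materialise_inferences(): pass   # step 3: ontology inference materialisation
    def run_original_queries():   pass   # step 4: the 14 original LUBM queries
    def run_extended_queries():   pass   # step 6: the agnos.ai extended queries
    def run_update_queries():     pass   # step 7: updates plus follow-up selects

    def run_benchmark():
        """Execute each phase in order and record wall-clock timings."""
        timings = {}
        for name, step in [
            ("load", load_dataset),
            ("materialise", materialise_inferences),
            ("original queries", run_original_queries),
            ("extended queries", run_extended_queries),
            ("updates", run_update_queries),
        ]:
            start = time.perf_counter()
            step()
            timings[name] = time.perf_counter() - start
        return timings

    if __name__ == "__main__":
        print(run_benchmark())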

Triplestore Installation

Install the latest versions of each triplestore product to be tested.

Server Configuration

Configure servers with settings for optimum bulk load and query reading performance for each dataset size to be tested.

Triplestore Configuration

Create each triplestore with settings for optimum bulk-load and query read/write performance where appropriate, for example: disabling update statistics while loading data, setting the expected triplestore resource capacity and index sizes, persisting to disc, and tuning the Java heap and direct memory.
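
Concrete settings are product-specific. As one hedged illustration, a Java-based triplestore is typically tuned through JVM options such as the heap size and the direct memory limit; the launcher path and the JAVA_OPTS variable below are placeholders, not the configuration mechanism of any particular product.

    import os
    import subprocess

    # Example JVM tuning for a (hypothetical) Java-based triplestore: heap size
    # and direct memory limit. The launcher path and JAVA_OPTS variable are
    # placeholders; each product documents its own configuration mechanism.
    env = dict(os.environ)
    env["JAVA_OPTS"] = "-Xmx24g -XX:MaxDirectMemorySize=16g"

    subprocess.run(["./bin/triplestore-server"], env=env, check=True)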

Reasoning

Ten of the 14 queries in the LUBM benchmark rely on reasoning; these queries are identified as such in the test results. Some triplestores use materialisation-based reasoning to precompute and store the inferred triples, while others compute the inferences at query time.
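
As a concrete illustration (a simplified stand-in, not the official text of any benchmark query), the query below asks for all instances of ub:Student using the standard univ-bench namespace. In the generated data, individuals are typed as ub:UndergraduateStudent or ub:GraduateStudent, so the query only returns them when the store has either materialised the subclass inferences or applies them at query time.

    # Illustrative query (not the official benchmark text): all students.
    # Without reasoning this matches little or nothing, because the generated
    # data types individuals as ub:UndergraduateStudent or ub:GraduateStudent;
    # the subclass inferences are needed for them to match ub:Student.
    STUDENT_QUERY = """
    PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
    SELECT ?x WHERE { ?x a ub:Student }
    """

    print(STUDENT_QUERY)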

Details

Loading Data

Tests are executed for each combination of file format (NTRIPLE and TURTLE) and file compression (uncompressed, ZIP, GZIP, and BZ2). For LUBM 1000, the data is provided in a single file. However, tests can also be done for data split across smaller files.

The “auto update statistics” setting is initially disabled in all triplestores prior to loading the data to improve performance.
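
A minimal sketch of loading a single data file over the standard SPARQL 1.1 Graph Store Protocol is shown below, assuming a hypothetical endpoint URL and file name; in practice each product's own bulk-import tooling is used, as it is usually faster.

    import gzip
    import requests

    # Graph Store Protocol endpoint; the URL is an assumption, and real products
    # are usually loaded with their own (faster) bulk-import tools.
    GRAPH_STORE = "http://localhost:8080/rdf-graphs/service?default"

    def load_file(path, content_type):
        # GZIP files are decompressed locally; NTRIPLE/TURTLE bytes are streamed as-is.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rb") as data:
            response = requests.post(GRAPH_STORE, data=data,
                                     headers={"Content-Type": content_type})
        response.raise_for_status()

    load_file("lubm-1000.nt.gz", "application/n-triples")   # placeholder file name
    # For Turtle input, the content type would be "text/turtle".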

Importing the Ontology

Import the LUBM Ontology, which contains 168 axioms. This step can take a while on triplestores that use materialisation-based reasoning to precompute and store the inferred triples.
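
As a hedged sketch, the ontology can be parsed locally with rdflib as a quick sanity check before importing it into the store under test; the file name assumes a local copy of univ-bench.owl.

    from rdflib import Graph

    # Parse a local copy of the LUBM ontology (published as univ-bench.owl)
    # before importing it into the store under test.
    ontology = Graph()
    ontology.parse("univ-bench.owl", format="xml")   # RDF/XML serialisation
    print(len(ontology), "triples in the ontology")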

Updating Statistics

Update statistics after the data import for optimum query performance.

Executing Original Test Queries

Execute the original test queries provided by LUBM. Refer to the queries listed under the "Original" tab on the LUBM SPARQL Statements page.
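
The sketch below shows how a single read query can be executed and timed over the standard SPARQL 1.1 protocol. The endpoint URL is an assumption, and the query shown is a simplified stand-in rather than the official text of any benchmark query.

    import time
    import requests

    ENDPOINT = "http://localhost:8080/sparql"   # assumed query endpoint

    # Simplified stand-in for a benchmark query (graduate students and the
    # courses they take); not the official query text.
    QUERY = """
    PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
    SELECT ?x ?y WHERE { ?x a ub:GraduateStudent ; ub:takesCourse ?y }
    """

    start = time.perf_counter()
    response = requests.post(ENDPOINT, data={"query": QUERY},
                             headers={"Accept": "application/sparql-results+json"})
    response.raise_for_status()
    elapsed = time.perf_counter() - start
    print(len(response.json()["results"]["bindings"]), "rows in", round(elapsed, 3), "s")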

Materialising Rules

Rules are not included as part of the original LUBM Benchmark. They were added by agnos.ai to improve the performance of queries 2 and 9. This step only applies to triplestores that use materialisation-based reasoning.

More details on the rules can be found here.

Executing the Extended Test Queries

Execute the extended version of the test queries introduced by agnos.ai.

Refer to the queries listed under the "Extended" tab on the LUBM SPARQL Statements page.

Enabling Auto Update Statistics

The triplestores were created with the “auto update statistics” option turned off in order to maximise performance during the data import. This option is turned on before executing the data updates in the remaining steps of the benchmark, since it may trigger under different circumstances in each triplestore product during data modification and has a clear impact on write performance.

Executing Data Updates

The original LUBM Benchmark did not include tests for data modification. It is important to test updates in order to verify the impact of statistics updates as well as the overhead of inference materialisation.
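
As an illustration of the pattern (an update followed by a select that reads the affected data), the sketch below issues a SPARQL 1.1 update and then re-reads the inserted resource. The endpoint URLs and the inserted data are placeholders, not the actual benchmark update statements.

    import requests

    QUERY_ENDPOINT  = "http://localhost:8080/sparql"          # assumed
    UPDATE_ENDPOINT = "http://localhost:8080/sparql/update"   # assumed

    # Placeholder update; not one of the actual benchmark update statements.
    UPDATE = """
    PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
    INSERT DATA { <http://example.org/NewGraduateStudent1> a ub:GraduateStudent . }
    """

    # Follow-up select that reads the affected data; it also exposes the cost of
    # any statistics refresh or incremental inference triggered by the write.
    SELECT = """
    PREFIX ub: <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
    SELECT ?s WHERE { ?s a ub:GraduateStudent }
    """

    requests.post(UPDATE_ENDPOINT, data={"update": UPDATE}).raise_for_status()
    result = requests.post(QUERY_ENDPOINT, data={"query": SELECT},
                           headers={"Accept": "application/sparql-results+json"})
    result.raise_for_status()
    print(len(result.json()["results"]["bindings"]), "graduate students after the update")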