LUBM Extended Version
agnos.ai has extended the original LUBM (Lehigh University Benchmark) queries to reflect more realistic, business-like workloads and use case scenarios. It has also introduced data updates and inference rules that optimise queries.
This section explains why agnos.ai chose LUBM as the starting point for its benchmarking work and why the benchmark needed to be extended. It also explains in detail the steps required to execute the benchmark.
Why did we choose LUBM?
- it is the most popular RDF Graph database benchmark with results published by most vendors
- it supports reasoning, one of the main selling points of RDF Graph databases
- it has an intuitive ontology
- test datasets are easy to generate, and queries are straightforward to extend
- tests with relatively small datasets (134M triples / 24 GB) can expose significant performance bottlenecks without the need for very large datasets or extensive high-volume concurrent workloads, which can be time-consuming and expensive to set up
- it can easily reveal performance differences between triplestores that work in fundamentally different ways, e.g. forward vs backward chaining, or total materialisation vs query-time vs incremental reasoning
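The difference between forward and backward chaining mentioned in the last point can be sketched with a toy example. This is a minimal illustration in plain Python, not how any particular triplestore is implemented. The class names follow the LUBM ontology, while the individuals (`alice`, `bob`) and all function names are made up for this sketch.

```python
# Toy illustration only: class names follow the LUBM ontology,
# the individuals and helper functions are hypothetical.
SUBCLASS_OF = {
    "FullProfessor": "Professor",
    "Professor": "Faculty",
    "Faculty": "Employee",
}

ASSERTED_TYPES = {
    ("alice", "FullProfessor"),
    ("bob", "Faculty"),
}


def superclasses(cls):
    """Yield cls and every ancestor reachable through SUBCLASS_OF."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def materialise(asserted):
    """Forward chaining: derive every implied type up front, at load time."""
    return {(ind, sup) for ind, cls in asserted for sup in superclasses(cls)}


def instances_forward(materialised, target):
    """Querying materialised data is a plain lookup."""
    return {ind for ind, cls in materialised if cls == target}


def instances_backward(asserted, target):
    """Backward chaining: walk the hierarchy at query time, store nothing extra."""
    return {ind for ind, cls in asserted if target in superclasses(cls)}


MATERIALISED = materialise(ASSERTED_TYPES)

print(sorted(instances_forward(MATERIALISED, "Faculty")))     # ['alice', 'bob']
print(sorted(instances_backward(ASSERTED_TYPES, "Faculty")))  # ['alice', 'bob']
print(len(ASSERTED_TYPES), len(MATERIALISED))                 # 2 6
```

Both approaches return the same answer. The forward-chaining version stores more facts (6 instead of 2) and pays the inference cost on every load or update, while the backward-chaining version pays it on every query. This trade-off is what the benchmark is meant to expose.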
What was missing in the original LUBM Benchmark?
- queries were too simple to cover even some of the most basic features of the query language
- did not have tests for data modification, including data movement between named graphs
- despite supporting reasoning, it did not include inference rules
- did not cover realistic business-like use cases
- did not include basic business intelligence use cases with large data aggregations and sorting
What is in the agnos.ai extended LUBM?
The original LUBM Benchmark has been extended by agnos.ai engineers to address the shortcomings listed above and to allow for more realistic performance test case scenarios.
The new version of the benchmark contains the following:
- 14 original queries that came with LUBM
- 14 modified queries extended by agnos.ai to add complexity and cover realistic business-like use case scenarios
- 2 inference rules that optimise modified queries 2 and 9
- 4 new update queries to test data modification, which is important for measuring how quickly inferences are recalculated on triplestores that reason by materialisation
- 4 new select queries that read the data affected by the update queries, including reading data that has been saved to named graphs
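The update-then-read pattern in the last two items can be sketched as follows. This is a hypothetical in-memory model, not the actual benchmark queries. The `QuadStore` class, the graph name `urn:example:enrolments`, and the sample data are assumptions made for illustration, and only the property names (`takesCourse`, `memberOf`) come from the LUBM ontology.

```python
from collections import defaultdict

DEFAULT = "default"  # label for the default graph in this toy store


class QuadStore:
    """Minimal in-memory store: graph name -> set of (s, p, o) triples."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def insert(self, triple, graph=DEFAULT):
        self.graphs[graph].add(triple)

    def move(self, predicate, source, target):
        """Move every triple with the given predicate from source to target."""
        moved = {t for t in self.graphs[source] if t[1] == predicate}
        self.graphs[source] -= moved
        self.graphs[target] |= moved
        return len(moved)

    def select(self, predicate, graph=DEFAULT):
        """Return sorted (subject, object) pairs for a predicate in one graph."""
        return sorted((s, o) for s, p, o in self.graphs[graph] if p == predicate)


store = QuadStore()
store.insert(("student1", "takesCourse", "course1"))
store.insert(("student2", "takesCourse", "course1"))
store.insert(("student1", "memberOf", "dept1"))

# "Update query": move enrolment triples into a named graph.
moved = store.move("takesCourse", DEFAULT, "urn:example:enrolments")

# "Select query": read the data back from the named graph.
print(moved)                                                   # 2
print(store.select("takesCourse", "urn:example:enrolments"))
print(store.select("takesCourse"))                             # []
```

In a real triplestore the update would be a SPARQL Update request and the read a `SELECT` scoped to the named graph. On a materialising store, the update step is also where inferences have to be recalculated, which is the cost the update queries are meant to measure.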
What is next for the agnos.ai extended LUBM?
- add more complex workload scenarios
- add more metrics and automate their collection
- add `delete` use cases to test for performance and inference retraction (deletion of triples that are no longer justified)
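Inference retraction can be sketched with a simple justification-tracking scheme: every stored fact remembers which asserted facts support it, and it is removed once its last supporter is deleted. This is a simplified illustration for a non-recursive subclass rule, and production reasoners typically need more elaborate algorithms. The `MaterialisedTypes` class and the sample data are hypothetical.

```python
from collections import defaultdict

SUBCLASS_OF = {"FullProfessor": "Professor", "Professor": "Faculty"}


def implied(fact):
    """Return the fact itself plus every type it implies via SUBCLASS_OF."""
    ind, cls = fact
    out = []
    while cls is not None:
        out.append((ind, cls))
        cls = SUBCLASS_OF.get(cls)
    return out


class MaterialisedTypes:
    """Toy materialised store that tracks why each fact is present."""

    def __init__(self):
        # stored fact -> set of asserted facts that justify it
        self.support = defaultdict(set)

    def insert(self, fact):
        for derived in implied(fact):
            self.support[derived].add(fact)

    def delete(self, fact):
        """Remove an asserted fact and return the facts that were retracted."""
        retracted = []
        for derived in implied(fact):
            supporters = self.support.get(derived)
            if supporters is None or fact not in supporters:
                continue  # fact was never asserted, nothing to retract
            supporters.remove(fact)
            if not supporters:  # no justification left
                del self.support[derived]
                retracted.append(derived)
        return retracted

    def facts(self):
        return set(self.support)


store = MaterialisedTypes()
store.insert(("alice", "FullProfessor"))
store.insert(("alice", "Faculty"))  # independently asserted

retracted = store.delete(("alice", "FullProfessor"))
print(retracted)      # [('alice', 'FullProfessor'), ('alice', 'Professor')]
print(store.facts())  # {('alice', 'Faculty')}
```

`('alice', 'Faculty')` survives the deletion because it is still justified by its own assertion, while `('alice', 'Professor')` is retracted because its only justification is gone.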
Refer to Executing the LUBM Extended Version for more details.