First tests with SPARQL 1.1 Update
As of 15.11.2024, QLever has basic support for SPARQL 1.1 Update, see https://github.com/ad-freiburg/qlever/wiki/QLever-support-for-SPARQL-1.1-Update . The following is a log of a first performance test, carried out a few days before the last missing pull request was merged into the QLever master.
To test the functionality, you need the access token used for starting the server. This is needed for all privileged operations, which an update of course is (normal users should only have read-only access, and not be able to modify the dataset). The following tests assume access-token=wikidata_R8VkeqQRYnlW, which is not the access token for the official instance at https://qlever.cs.uni-freiburg.de/wikidata.
The tests were run on our local machine indus (AMD Ryzen 9 7950X 16-Core, 7.1T NVMe SSD) on wikidata.ssd/index.2024-10-31.
In the current (first) version, updates are processed at a speed of around 2 µs / triple. For each 1% change in the input data, query times slow down by about 2%.
Not bad for starters, and there is still a lot of room for improvement regarding both figures.
Each of the following updates or queries clears the cache before executing and outputs the elapsed time in seconds. In addition, each update outputs "Update successful" and each query outputs the result size. Each update also clears the cache after executing.
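The pattern just described (clear the cache, time the request, print the result size) can be wrapped in a small shell function. The following is only a sketch: the function name timed_query and the variable ENDPOINT are hypothetical and not part of QLever. The flags are the ones used in the commands on this page, and the sketch assumes that the qlever CLI, curl, jq, numfmt and GNU time at /usr/bin/time are installed.

```shell
# Hypothetical helper, not part of QLever. The endpoint URL is the one used
# by the commands on this page.
ENDPOINT="https://qlever.cs.uni-freiburg.de/api/wikidata-prut"

# Usage: timed_query "<SPARQL query>"
# Clears the cache, runs the query without sending the result rows (send=0),
# prints the elapsed time (on stderr) and the result size (on stdout).
timed_query() {
  qlever clear-cache 2> /dev/null &&
    /usr/bin/time -f "Elapsed time: %es" \
      curl -s "$ENDPOINT" \
        -H "Accept: application/qlever-results+json" \
        -d "send=0" \
        --data-urlencode "query=$1" |
    jq .resultsize | numfmt --grouping
}
```

With this in place, the first query below would read timed_query "PREFIX wdt: <http://www.wikidata.org/prop/direct/> SELECT ?s ?o WHERE { ?s wdt:P31 ?o }". The explicit one-liners are kept in this log so that each step can be copied on its own.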
Let us first start the server from scratch. This takes around 8s.
qlever start --kill-existing-with-same-port
The first query asks for the original size of the wdt:P31 predicate. It has 117,115,625 triples and the query takes around 1.5s.
qlever clear-cache 2> /dev/null && /usr/bin/time -f "Elapsed time: %es" curl -s https://qlever.cs.uni-freiburg.de/api/wikidata-prut -H "Accept: application/qlever-results+json" -d "send=0" --data-urlencode "query=PREFIX wdt: <http://www.wikidata.org/prop/direct/> SELECT ?s ?o WHERE { ?s wdt:P31 ?o }" | jq .resultsize | numfmt --grouping
The second query computes 17,115,625 random triples from wdt:P31. It takes around 11s.
qlever clear-cache 2> /dev/null && /usr/bin/time -f "Elapsed time: %es" curl -s https://qlever.cs.uni-freiburg.de/api/wikidata-prut -H "Accept: application/qlever-results+json" -d "send=0" --data-urlencode "query=PREFIX wdt: <http://www.wikidata.org/prop/direct/> SELECT ?s ?o WHERE { ?s wdt:P31 ?o } ORDER BY RAND() LIMIT 17115625" | jq .resultsize | numfmt --grouping
The third query removes 17,115,625 random triples from wdt:P31. It takes around 40s, depending on the random distribution of the triples over the predicate. Subtracting the time for the previous query, we can deduce that the update took around 30s for 17,115,625 triples, which is roughly 2 µs / triple or 0.5 M triples / second. Not bad for starters (there is still a lot of room for optimization over what we are currently doing).
qlever clear-cache 2> /dev/null && /usr/bin/time -f "\nElapsed time: %es" curl -s -H "Accept: application/qlever-results+json" https://qlever.cs.uni-freiburg.de/api/wikidata-prut --data-urlencode "update=PREFIX wdt: <http://www.wikidata.org/prop/direct/> DELETE { ?s wdt:P31 ?o } WHERE { { SELECT ?s ?o WHERE { ?s wdt:P31 ?o } ORDER BY RAND() LIMIT 17115625 } }" --data-urlencode "access-token=wikidata_R8VkeqQRYnlW" && qlever clear-cache 2> /dev/null
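The throughput figures quoted above can be rechecked with a few lines of shell arithmetic. This is a sketch that uses only the rounded timings from this log; the variable names are made up for illustration, and awk is used because plain shell arithmetic is integer-only.

```shell
triples=17115625   # triples deleted by the update
total_s=40         # elapsed time of the whole update request (rounded, from the log)
select_s=11        # elapsed time of the SELECT alone (second query, rounded)

# Difference of the two rounded timings; the text rounds this to "around 30s".
diff_s=$((total_s - select_s))
update_s=30

# Microseconds per triple and million triples per second for the update part.
us_per_triple=$(awk -v n="$triples" -v t="$update_s" 'BEGIN { printf "%.2f", t * 1e6 / n }')
mtriples_per_s=$(awk -v n="$triples" -v t="$update_s" 'BEGIN { printf "%.2f", n / t / 1e6 }')

echo "update time: ${diff_s}s measured difference, ~${update_s}s rounded"
echo "${us_per_triple} µs / triple, ${mtriples_per_s} M triples / s"
```

The exact values come out slightly better than the rounded "2 µs / triple" and "0.5 M triples / second" stated above, which is expected given that all timings are approximate.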
The fourth query is a repetition of the first query. It now says that the wdt:P31 predicate has 100,000,000 triples, which is correct. It takes around 2.0s, that is, around 30% more than the first query. This additional time is the overhead currently incurred by considering the delta triples. Note that the update affected around 15% of the data relevant for this query, which is a lot. Typical update queries affect only a small fraction of the data.
qlever clear-cache 2> /dev/null && /usr/bin/time -f "Elapsed time: %es" curl -s https://qlever.cs.uni-freiburg.de/api/wikidata-prut -H "Accept: application/qlever-results+json" -d "send=0" --data-urlencode "query=PREFIX wdt: <http://www.wikidata.org/prop/direct/> SELECT ?s ?o WHERE { ?s wdt:P31 ?o }" | jq .resultsize | numfmt --grouping
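As a sanity check on the numbers from the fourth query, the following sketch recomputes the remaining triple count, the fraction of the predicate affected by the update, and the resulting slowdown per 1% of changed data. It uses only the rounded figures from this log; the variable names are made up for illustration.

```shell
before=117115625   # size of wdt:P31 before the update
deleted=17115625   # triples removed by the update
after=$((before - deleted))

# Share of the predicate that was changed, in percent.
changed_pct=$(awk -v d="$deleted" -v b="$before" 'BEGIN { printf "%.1f", 100 * d / b }')

# Slowdown of the full scan: 1.5s before the update, 2.0s after it.
slowdown_pct=$(awk 'BEGIN { printf "%.1f", 100 * (2.0 - 1.5) / 1.5 }')

# Slowdown per 1% of changed data.
ratio=$(awk -v s="$slowdown_pct" -v c="$changed_pct" 'BEGIN { printf "%.1f", s / c }')

echo "remaining triples: $after"
echo "changed: ${changed_pct}%, slowdown: ${slowdown_pct}%, ratio: ${ratio}"
```

The ratio is consistent with the rule of thumb stated at the top of this page, namely a slowdown of about 2% for each 1% change in the data, keeping in mind that it rests on a single pair of rounded measurements.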