26 Nov 2025

Prototyping Journey Insights: QLever - Making Data Smarter, Faster, and More Sustainable

As our world becomes increasingly data-driven, the way information moves between systems matters as much as the data itself. Today, when graph databases exchange information using SPARQL, the standard query language of the Semantic Web, the process is surprisingly inefficient. Every federated exchange sends long text strings over the network, and the same strings are often sent again and again. That adds overhead and latency and raises energy use.

The QLever project, supported by the Prototype Fund, set out to change that. Its mission: to make SPARQL federation, and by extension open data exchange, dramatically faster and more sustainable.

“We realised that a lot of energy and time was being wasted not in the computation itself, but simply in how data was represented and transmitted,” explains the QLever team. “So we asked: what if we could make that exchange more compact — without changing how people use SPARQL?”


From Text to Binary: Rethinking Data Exchange

The team developed a new binary mapping mechanism that replaces long text strings with compact 64-bit integer identifiers. A shared lookup table keeps track of which number represents which string.

This simple but powerful shift turns SPARQL communication from text-based to binary, slashing network traffic and speeding up data transfer — especially in cases where the same identifiers appear repeatedly. The result: less bandwidth, lower latency, and reduced energy use.

“In essence, we taught the databases to ‘speak in numbers’ rather than words,” says the team. “It’s faster, lighter, and much more sustainable.”
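
To make the idea concrete, here is a minimal sketch of a string-to-ID mapping in Python. It is an illustration only, not QLever's actual implementation: the class `IdMapping`, the helpers `pack_ids` and `unpack_ids`, and the little-endian wire layout are all hypothetical. It also assumes both sides already share the lookup table and does not show how that table would be synchronised.

```python
import struct


class IdMapping:
    """Hypothetical shared lookup table: string <-> 64-bit integer ID."""

    def __init__(self):
        self._id_of = {}       # string -> ID
        self._string_of = []   # ID -> string (the ID is the list index)

    def encode(self, term: str) -> int:
        # Assign the next free ID the first time a string is seen.
        if term not in self._id_of:
            self._id_of[term] = len(self._string_of)
            self._string_of.append(term)
        return self._id_of[term]

    def decode(self, term_id: int) -> str:
        return self._string_of[term_id]


def pack_ids(ids):
    """Pack IDs as little-endian unsigned 64-bit integers (8 bytes each)."""
    return struct.pack(f"<{len(ids)}Q", *ids)


def unpack_ids(payload):
    """Inverse of pack_ids."""
    return list(struct.unpack(f"<{len(payload) // 8}Q", payload))


# Example terms; repeated identifiers are where the saving is largest.
terms = [
    "http://purl.uniprot.org/core/Protein",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://purl.uniprot.org/core/Protein",
    "http://purl.uniprot.org/core/Protein",
]

mapping = IdMapping()
ids = [mapping.encode(t) for t in terms]
payload = pack_ids(ids)

text_bytes = sum(len(t.encode("utf-8")) for t in terms)
binary_bytes = len(payload)
decoded = [mapping.decode(i) for i in unpack_ids(payload)]

print(f"text: {text_bytes} bytes, binary: {binary_bytes} bytes")
```

Each term costs a fixed 8 bytes on the wire regardless of how long the original string is, and the receiver recovers the strings by looking the IDs up in the shared table.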


Beyond Federation: A Broader Impact

Originally, this innovation was designed for federated queries across life sciences datasets such as UniProt and Rhea. But once the team tested it on large public data collections like OpenStreetMap and Wikidata, they discovered that the approach had far wider applications.

“That was the breakthrough moment,” they recall. “We realised that our method didn’t just make SPARQL federation faster — it improved QLever’s core storage and indexing efficiency too.”

This insight prompted a major pivot: instead of treating the new mapping as an experimental add-on, the team decided to integrate it directly into QLever’s main engine, bringing benefits to all users much sooner than planned.


Breaking Records — Sustainably

The impact was immediate and measurable. Using the new system, QLever indexed 800 billion OpenStreetMap triples in just 5.4 terabytes of disk space, which is under 7 bytes per triple. Commercial systems such as AWS Neptune require about 128 TB for 500 billion triples, or roughly 256 bytes per triple. Per triple, that makes the QLever index more than an order of magnitude smaller.
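
The comparison can be checked with a quick back-of-the-envelope calculation. The sketch below uses only the figures quoted above and assumes decimal terabytes (1 TB = 10^12 bytes). The function name `bytes_per_triple` is made up for this illustration.

```python
def bytes_per_triple(disk_bytes: float, triples: float) -> float:
    """Average on-disk footprint of one triple, in bytes."""
    return disk_bytes / triples


# Figures as quoted in the text; 1 TB is assumed to be 10**12 bytes.
qlever = bytes_per_triple(5.4e12, 800e9)    # 5.4 TB for 800 billion triples
neptune = bytes_per_triple(128e12, 500e9)   # 128 TB for 500 billion triples
ratio = neptune / qlever

print(f"QLever:  {qlever:.2f} bytes per triple")
print(f"Neptune: {neptune:.2f} bytes per triple")
print(f"Ratio:   {ratio:.1f}x")
```

Under these assumptions the per-triple footprint comes out at about 6.75 bytes versus 256 bytes, a factor of roughly 38.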

This wasn’t just a technical win; it was proof that efficient data infrastructure can also be sustainable infrastructure.

“Seeing those numbers was our real success moment,” says the team. “It showed that efficiency and sustainability go hand in hand — and that open-source tools can lead the way.”


What Comes Next

The next phase focuses on integrating the binary exchange mechanism into full SPARQL federation and showcasing its potential for distributed and bandwidth-limited environments, such as connected vehicles or open data platforms.

By combining technical excellence with environmental responsibility, QLever is setting a new benchmark for the Semantic Web community — demonstrating that the future of data isn’t just about speed or scale, but sustainability through design.