International Conference on Secure Systems Design and Technology Development (SysTechDev 2026)

Name: International Conference on Secure Systems Design and Technology Development (SysTechDev 2026)
Start: 2026-06-25T00:30:00+03:00
End: 2026-06-26T21:55:00+03:00
Location: ''Vasil Levski'' National Military University

25–26 Jun 2026

''Vasil Levski'' National Military University

Europe/Sofia timezone

Technical support

IMPROVING TEXT SEARCH MEMORY EFFICIENCY WITH NSW TEXT INDEXING

26 Jun 2026, 12:10

15m

Room 1210

Paper – Oral Presentation Information Technology Information Technology

Dmytro Dashenkov

Modern day full text search relies on a simple yet robust inverted index data structure to find documents matching a text query. However, this approach is limited by the required spelling checker, which often has memory requirements comparable to the index itself. In this work, a novel approach to text indexing is introduced. Based on the navigable small world data structure, it offers a two-in-one solution for search with built-in spelling correction, sacrificing some performance for a substantial memory economy. The proposed index, just like the conventional inverted index, uses separate tokens extracted from a document as the main building blocks. The tokens are organized in a graph, each node representing one token. The nodes list all the document identifiers for the documents where the token is found. The edges of the graph connect tokens that are similar in terms of edit distance. The proposed approach utilizes an arbitrary ranking function to find relevant documents, the state-of-the-art BM25 ranking function is recommended as a reasonable default. The function values are weighed to take the edit distance into account. The ability to process misspelled words allows avoiding maintaining a separate structure for spelling correction while still generating search results which account for possible misspellings in the query phrase. The search time losses are attributed to the neighbor exploration while traversing the navigable small world data structure, and thus can be mitigated by configuring the data structure to maintain a fairly low number of connections. Such a configuration also provides a hard limit to the number of edits allowed in a single query term. The paper discusses in detail the ways of picking the optimal parameters for the algorithm, as well as the limitations of the proposed approach.

Keywords: full text search, search, navigable small world, memory efficiency, text ranking, spelling correction.

Dmytro Dashenkov Serhii Smelyakov Anastasiya Chupryna Loreta Savulioniene Dainius Savulionis Paulius Sakalys

There are no materials yet.

International Conference on Secure Systems Design and Technology Development (SysTechDev 2026)

Technical support

IMPROVING TEXT SEARCH MEMORY EFFICIENCY WITH NSW TEXT INDEXING

Room 1210

Speaker

Description

Authors

Presentation materials

Choose timezone

International Conference on Secure Systems Design and Technology Development (SysTechDev 2026)

Technical support

Speaker

Description

Authors

Presentation materials