
Well, this post is in the context of an e2e encrypted DB, but it's subject to the same constraints: https://medium.com/@ZeroDB_/scalable-full-text-search-over-e...

If you understand btrees you understand the hardest part already :)

Basically, you need to design a search index that examines the fewest DB pages to find the result. The Lucene scoring method stores a mapping of term -> document[] sorted in relevance order. The main idea is that you only need to examine the first n documents for each term in the query to find the most relevant results. Picking n is sort of tricky, but if your index is stored this way it's possible to fulfill a large percentage of queries efficiently without downloading the whole index.
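To make that concrete, here's a minimal sketch (my own toy, not the gist linked below) of an impact-ordered index: each term maps to its documents pre-sorted by score, so a query only has to read the first n postings per term. Term frequency stands in for a real relevance score like BM25.

```python
from collections import defaultdict

def build_index(docs):
    # docs: {doc_id: text}. Score is raw term frequency, a stand-in
    # for a real relevance score.
    index = defaultdict(lambda: defaultdict(int))
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    # Store each posting list sorted by descending score: best docs first,
    # so a query can stop after reading the top of each list.
    return {
        term: sorted(postings.items(), key=lambda kv: -kv[1])
        for term, postings in index.items()
    }

def search(index, query, n=2):
    # Examine only the top-n postings per query term and sum the scores.
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id, score in index.get(term, [])[:n]:
            scores[doc_id] += score
    return sorted(scores.items(), key=lambda kv: -kv[1])

docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick quick brown dogs",
}
index = build_index(docs)
print(search(index, "quick brown"))  # -> [(3, 3), (1, 2)]
```

Because each posting list is already in score order, the query touches at most n entries per term instead of the whole list, which is exactly what keeps the number of DB pages read small.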

Here's a little Python implementation of what I mean by a "user space implementation". Note that it's a toy but it performs pretty well on some demo sklearn data sets: https://gist.github.com/petehunt/724eeb77189332db101ad7b0db8...
