
You could go a step further by putting the suffixes themselves into the trie and then identifying identical subtrees.
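To make that concrete, here is a rough sketch of the idea, not anyone's actual implementation. It assumes a nested-dict trie, a "$" end marker that never occurs in the input, and hypothetical helper names (build_suffix_trie, merge_identical_subtrees, count_nodes). Identical subtrees are detected by giving every structurally equal subtree the same id.

```python
def build_suffix_trie(words):
    """Insert every suffix of every word into a nested-dict trie."""
    root = {}
    for word in words:
        for start in range(len(word)):
            node = root
            for ch in word[start:]:
                node = node.setdefault(ch, {})
            node["$"] = {}  # end-of-suffix marker (assumed absent from the input)
    return root


def merge_identical_subtrees(node, registry):
    """Return an id for this subtree; structurally equal subtrees get the same id."""
    key = tuple(sorted((ch, merge_identical_subtrees(child, registry))
                       for ch, child in node.items()))
    if key not in registry:
        registry[key] = len(registry)
    return registry[key]


def count_nodes(node):
    """Count nodes in the unmerged trie, including the root."""
    return 1 + sum(count_nodes(child) for child in node.values())


root = build_suffix_trie(["banana"])
registry = {}
merge_identical_subtrees(root, registry)
# len(registry) is the number of distinct subtrees left after merging,
# count_nodes(root) is the number of nodes before merging.
```

Each key in the registry is one distinct subtree, so only that many nodes would need to be stored. In "banana", for example, the subtree reached via "na" is the same as the one reached via "ana", so both collapse to one node.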

If you can use gzip, there's bound to be a clever way of using a suffix array as well. That might end up being better, unless you can use an optimised binary format for the tree.
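For reference, a suffix array is just the list of suffix start positions in sorted order. The sketch below only shows that structure and a substring lookup over it; it is not the gzip-friendly scheme hinted at above, which I haven't worked out. The function names are hypothetical, and the naive sort is only meant for small inputs.

```python
def build_suffix_array(text):
    """Start indices of all suffixes of text, in lexicographic order (naive build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(text, suffix_array, pattern):
    """Binary-search the suffix array for a suffix that starts with pattern."""
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(suffix_array) and text.startswith(pattern, suffix_array[lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)                         # [5, 3, 1, 0, 4, 2]
print(contains(text, sa, "nan"))  # True
print(contains(text, sa, "nab"))  # False
```

The array itself is only a permutation of integers, so how well it compresses depends entirely on how it is encoded before gzip sees it.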


