Ditch Algolia, Create Your Own Content Search Engine
Build a full-text search engine with autocorrect for your website's content that connects directly to your database. Users need to be able to find the content they want on your site. Out-of-the-box solutions like Algolia are great, but they probably aren't the best use of resources for a small website. In this tutorial, we will use flexsearch to search documents and add some custom code to make the results noticeably better.
This tutorial is designed to be platform-neutral and can be used with any database, server, or framework. While we'll be focusing on setting up an express server, the search function can easily be adapted to work in any part of your application. After configuring the server, we'll implement the autocorrect feature and leverage flexsearch to locate relevant documents.
Install the dependencies with `npm i express flexsearch stopword closest-match`.
- express: the web framework that runs our server and its middleware
- flexsearch: implements a full-text search algorithm that allows for quick querying
- stopword: removes stopwords from the search query
- closest-match: used to create autocorrect
The basic express server uses `express.json()` to extract data from POST requests. We import the search function at the top of the file and pass it into a POST route after the JSON parser. This works as a standalone express app, and it can also be used with Svelte when built for Node.js. All of this code belongs in your main server file.
To use the search feature in your application, send a POST request to this server. Note that this cannot be used as a SvelteKit/React endpoint, because the entire set of searchable documents must be kept in memory. Otherwise, you would have to read the entire database and re-index the documents every time you run a search query.
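The POST route described above can be sketched as a plain handler function, so it can be tried without starting a server. This is a minimal sketch under assumptions, not the original server file: `makeSearchHandler`, the `query` body field, the `/search` path, and port 3000 are all hypothetical names, and `search` is assumed to be synchronous and to return an array.

```javascript
// Hypothetical helper: wraps a search function in an Express-style handler.
// `search` is assumed to take a query string and synchronously return an array.
function makeSearchHandler(search) {
  return function searchHandler(req, res) {
    // express.json() is assumed to have parsed the request body already.
    const query = req.body && req.body.query;
    if (typeof query !== "string" || query.trim() === "") {
      return res.status(400).json({ error: "Expected a non-empty 'query' string" });
    }
    try {
      return res.json({ results: search(query) });
    } catch (err) {
      return res.status(500).json({ error: "Search failed" });
    }
  };
}

// Wiring it into express (not executed here; assumes search.js exports `search`):
//
//   const express = require("express");
//   const { search } = require("./search.js");
//   const app = express();
//   app.post("/search", express.json(), makeSearchHandler(search));
//   app.listen(3000);
```

A client would then send something like `{ "query": "svelte stores" }` as a JSON body and read the `results` array from the response.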
Populating the Index
We will be using the flexsearch `Document` index because this site is based on a document model; if your data looks different, pick the index type that fits best. To start the server, we first need to load all of our documents from the database into flexsearch. I will use pseudocode for the database part, so adapt it to your database.
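The loading step could look roughly like the sketch below. It takes the index as a parameter so the loop can run without flexsearch installed; the commented lines show how a `Document` index is usually constructed in flexsearch 0.7. The `populateIndex` name and the `id`, `title`, and `content` fields are assumptions for illustration, and the database call remains pseudocode.

```javascript
// Hypothetical helper: adds every document to an index.
// `index` is anything with an add(doc) method, such as a flexsearch Document.
// `documents` is an array of plain objects already read from your database.
function populateIndex(index, documents) {
  for (const doc of documents) {
    index.add(doc);
  }
  return documents.length; // number of documents indexed
}

// With flexsearch (not executed here; field names are assumptions):
//
//   const { Document } = require("flexsearch");
//   const index = new Document({
//     document: { id: "id", index: ["title", "content"] },
//   });
//   const documents = await db.getAllDocuments(); // pseudocode for your database
//   populateIndex(index, documents);
```

Because this runs once at startup, the cost of reading the whole database is paid a single time rather than on every query.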
Our autocorrect function works by collecting every word from all of the documents and comparing each word of the search query against that list to find the most likely intended match. This relies on the assumption that every word in the documents is spelled correctly. An added benefit of this method is that if the user searches for a word that is not in the database, it will find the closest word that is and still return results.
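To make the idea concrete, here is a dependency-free sketch of the same approach. It swaps the closest-match package for a hand-written Levenshtein edit distance, so it is a stand-in and not the package's own code. The function names and the `title` and `content` fields are hypothetical, and the word splitting only handles ASCII letters and digits.

```javascript
// Collect every distinct lowercase word from the given text fields.
function buildVocabulary(documents, fields) {
  const words = new Set();
  for (const doc of documents) {
    for (const field of fields) {
      const text = String(doc[field] ?? "").toLowerCase();
      for (const word of text.match(/[a-z0-9]+/g) ?? []) words.add(word);
    }
  }
  return [...words];
}

// Levenshtein edit distance, computed row by row with dynamic programming.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the vocabulary word closest to `word`; the first candidate wins ties.
// With an empty vocabulary the word is returned unchanged.
function correctWord(word, vocabulary) {
  let best = word;
  let bestDistance = Infinity;
  for (const candidate of vocabulary) {
    const distance = editDistance(word, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}
```

Comparing against every word is linear in the vocabulary size per query word, which is usually acceptable for a small site but worth measuring on larger ones.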
Next, we need to create the search function itself. We will use the npm package stopword to remove stopwords (filler words) from the query. This improves the search results because common words like "the" no longer influence which documents match. All of this code goes into the same `search.js` file as above, so the documents have already been loaded into the index and the autocorrect word list.
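The query preparation described here might look like the following sketch. To stay runnable without installing anything, it uses a tiny hard-coded stop list in place of the stopword package, and it takes the autocorrect function as a parameter. `prepareQuery` and `mergeFieldResults` are hypothetical names. The second helper assumes the per-field result shape that a flexsearch `Document` search returns when results are not enriched, namely an array of `{ field, result }` objects holding ids.

```javascript
// Tiny illustrative stop list; the stopword package ships a much fuller one.
const STOPWORDS = new Set(["a", "an", "and", "the", "of", "to", "in", "is", "for", "on"]);

// Lowercase the query, drop stopwords, then autocorrect each remaining word.
// `correctWord` is any function that maps a word to its corrected form.
function prepareQuery(rawQuery, correctWord) {
  return rawQuery
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word !== "" && !STOPWORDS.has(word))
    .map((word) => correctWord(word))
    .join(" ");
}

// Flatten per-field results into unique ids, keeping first-seen order.
function mergeFieldResults(fieldResults) {
  const ids = new Set();
  for (const { result } of fieldResults) {
    for (const id of result) ids.add(id);
  }
  return [...ids];
}

// With the real packages (not executed here):
//
//   const { removeStopwords } = require("stopword");
//   const words = removeStopwords(rawQuery.toLowerCase().split(/\s+/));
//   const hits = index.search(prepareQuery(rawQuery, correct));
//   const ids = mergeFieldResults(hits);
```

Removing stopwords before autocorrecting keeps filler words from being "corrected" into real vocabulary terms, although the opposite order is also defensible.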