An experimental HTTP search API for the StackOverflow Look Back Tool.
WARNING: This is experimental. I really like using Algolia and I will continue using it. But I often like to run
my system locally and offline during my development workflow. This gives me control and a deep understanding of the system. I can
do things like mock error responses from the API, or use a tailored set of sample data. This search-api/ directory
is an attempt to implement a search API to substitute for Algolia. I'll base this implementation on my work in
lucene-playground/http-api.
This project implements a runnable Java program that embeds Lucene into a web server.
Follow these instructions to build and run a Lucene demo program:
- Use Java 17
- Build and run the program:
  - ./gradlew run
- Make some search requests
  - curl --request GET --url 'http://localhost:8080/?keyword=bash'
  - curl --request GET --url 'http://localhost:8080/?keyword=us*'
- Stop the server
  - Stop the server process with Ctrl + C.
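The same search requests can also be made from Java, which may be handy for a quick smoke test. This is only a minimal sketch: the class name SearchRequestExample and the helper buildSearchUri are made up for illustration, and the base URL and the keyword parameter are assumed from the curl examples above. It uses only the JDK's built-in HTTP client.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of this project.
class SearchRequestExample {

    // Assumed base URL, taken from the curl examples.
    static final String BASE_URL = "http://localhost:8080/";

    // Build the search URI, form-encoding the keyword so that spaces and '&' are safe.
    static URI buildSearchUri(String keyword) {
        String encoded = URLEncoder.encode(keyword, StandardCharsets.UTF_8);
        return URI.create(BASE_URL + "?keyword=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        String keyword = args.length > 0 ? args[0] : "bash";
        HttpRequest request = HttpRequest.newBuilder(buildSearchUri(keyword)).GET().build();
        // Requires the server to be running (./gradlew run).
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Encoding the keyword matters once a query contains spaces or reserved characters; a wildcard query like us* passes through unchanged.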
General clean-ups, TODOs and things I wish to implement for this project:
- DONE Read the sample posts
- DONE Index the sample posts (the htmlBody)
- DONE Index and search on question titles
  - DONE Serve the title in the API JSON response
  - DONE Index
  - DONE Search
- DONE Serve the tags (for question posts)
- Index and search on the tags (maybe figure out facets first?)
  - Index
  - Search (this might have to go away with facets because the shape is different than the "main content")
- Facet search
- DONE JSONify the API response (so it can be used by the search-ui).
- DONE Adapt search-ui to point optionally to Algolia or search-api. This should be decently possible; a lot of projects use the algolia-ui tools and don't use the Algolia API.
  - An example of adapting the Algolia client is https://github.com/typesense/typesense-instantsearch-adapter
- How to report highlighting data? Lucene has server-side highlighting, but in React, I don't really want to use dangerouslySetInnerHTML... although I totally could. That's probably best for "nike just do it".
- Do I really need to store any fields in the Lucene index besides the post ID? Because we already have access to the full post data outside of Lucene.
- Get highlighting to work everywhere (what are the API-side things needed?)
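On the highlighting question, one option worth considering is to report match positions as character offsets instead of pre-rendered HTML, so the UI can wrap the matches in elements itself. The sketch below only illustrates that response shape. It is not Lucene's highlighter: it is a plain case-insensitive substring scan, and the names HighlightRanges, Range and find are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of this project and not Lucene's highlighting.
class HighlightRanges {

    // A half-open [start, end) character range of one match within the original text.
    record Range(int start, int end) {}

    // Find every non-overlapping, case-insensitive occurrence of `term` in `text`.
    static List<Range> find(String text, String term) {
        List<Range> ranges = new ArrayList<>();
        if (term.isEmpty()) {
            return ranges;
        }
        int i = 0;
        while (i + term.length() <= text.length()) {
            // regionMatches compares in place, so offsets refer to the original text.
            if (text.regionMatches(true, i, term, 0, term.length())) {
                ranges.add(new Range(i, i + term.length()));
                i += term.length();
            } else {
                i++;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(find("Nike: just do it. Just do it now.", "just do it"));
    }
}
```

A client could serialize these ranges next to the post ID in the JSON response and split the text at the offsets when rendering, which would avoid injecting server-produced HTML.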