Recommender

Note

This project was created as a solution to a hiring assignment. Most of it was developed in roughly 2-3 man-days, so it is full of assumptions and simplifications; the point was to deliver something complete within that relatively short time span. This README is written in a way that attempts to explain my thought process.

This repository presents a possible solution to a simple car recommender system design. The solution primarily focuses on the back-end engineering aspects of the problem. The front-end was written entirely by AI (xAI Grok 3) and the data science solution is more or less just a naive placeholder. However, it was necessary to pick some data science solution as it dictates some parts of the presented back-end stack, such as the database and the event-handling solution.

The solution presented here consists of:

Data science solution
Architecture
Prototype

Data science solution

After some quick research, I essentially ruled out any complex ML solutions (I lack experience and intuition in this domain, which would have made them too time-consuming). I narrowed it down to two possible solutions (ideally a hybrid of both):

Content-based filtering
Collaborative filtering (optional)

Both solutions seem to boil down to the same two steps. First, compute a "preference" vector for the user, based either on a real feature set (content-based filtering) or on a latent feature set (collaborative filtering), and represent each car listing as a feature vector in the same space. Second, generate recommendations by finding the car listing feature vectors that are nearest to the user's "preference" vector (in this case, using Euclidean distance).
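The nearest-neighbour step can be sketched in a few lines of plain Python. This is a minimal illustration and not the code used in this repository: the `recommend` function, the three-dimensional feature layout, and the sample listings are all hypothetical, and a real deployment would likely delegate the search to the database's vector similarity index rather than scanning every listing.

```python
import math


def recommend(preference, listings, k=3):
    """Return the ids of the k listings closest to the preference vector."""
    # Rank all listings by Euclidean distance to the user's preference vector.
    ranked = sorted(
        listings.items(),
        key=lambda item: math.dist(preference, item[1]),
    )
    return [listing_id for listing_id, _ in ranked[:k]]


# Hypothetical features: [normalized price, normalized mileage, normalized age]
listings = {
    "skoda-octavia": [0.30, 0.40, 0.20],
    "skoda-fabia": [0.20, 0.55, 0.35],
    "bmw-x5": [0.90, 0.20, 0.10],
    "fiat-panda": [0.10, 0.80, 0.70],
}
user_preference = [0.25, 0.45, 0.25]

print(recommend(user_preference, listings, k=2))
```

A brute-force scan like this is linear in the number of listings, which is fine for a toy data set but is the reason a dedicated vector search is part of the stack.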

So, a high-level solution could look something like this:

See the following Jupyter notebook for more.
The actual vectorization is then done in init_db.py.
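As a rough idea of what vectorizing a listing can involve, the sketch below scales numeric attributes into the range 0 to 1 and one-hot encodes a categorical attribute. It is an assumption made purely for illustration and does not reproduce `init_db.py`: the `vectorize` and `min_max` helpers, the chosen attributes, and the value ranges are all hypothetical.

```python
def min_max(value, low, high):
    """Scale value into [0, 1] given the expected range [low, high]."""
    return (value - low) / (high - low)


def vectorize(listing, makes):
    """Build a feature vector: scaled price, scaled year, one-hot make."""
    one_hot = [1.0 if listing["make"] == make else 0.0 for make in makes]
    return [
        min_max(listing["price"], 0, 100_000),  # hypothetical price range
        min_max(listing["year"], 2000, 2025),   # hypothetical year range
        *one_hot,
    ]


makes = ["Skoda", "BMW", "Fiat"]
listing = {"make": "Skoda", "price": 25_000, "year": 2020}

print(vectorize(listing, makes))
```

Scaling matters here because Euclidean distance would otherwise be dominated by whichever attribute has the largest raw values (price, in this example).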

Architecture

This chapter presents a possible software stack and interactions between its components. This is by no means THE RIGHT SOLUTION; this is just A SOLUTION.

A possible back-end layer could look something like this. Naturally, the tech stack largely reflects my previous experience. Redis Pub/Sub was utilized for its simplicity. Bytewax is a relatively new stream processing framework that is fairly simple (compared to alternatives, which are mostly Java-based) and very fast (it is essentially a Python binding for an engine written in Rust).

A possible monitoring and observability solution could be based on Prometheus, Grafana, and Grafana Loki, along with a custom-made monitoring microservice that constructs and exposes advanced metrics and telemetry (which can then be collected by Prometheus and displayed on a Grafana dashboard). A production-ready solution would also include an error-tracking and tracing solution such as Sentry.

The diagram below hints at one way this stack could be deployed in a cloud-agnostic manner. There are countless ways the infrastructure and deployment model could be designed, all depending on various external and internal constraints.

Prototype

The idea behind the prototype was to implement a reasonable subset of the product stack presented in the architecture chapter.

In a nutshell, the goals were:

To have a functional API microservice that can query basic entities from the database (e.g., User, Listing) and emit events related to user activity.
To have a consumer for user activity events (e.g., content or collaborative pipelines) that then updates user preference vectors accordingly.
To have a functional vector similarity search.
To have a simple front-end app that can visualize all this functionality.
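The second goal, updating a user's preference vector from activity events, could for example be handled with an exponential moving average that nudges the vector toward each listing the user views. The sketch below is an assumption made for illustration only: the `update_preference` name, the `alpha` weight, and the event shape are hypothetical and are not taken from the prototype's pipelines.

```python
def update_preference(preference, listing_vector, alpha=0.2):
    """Move the preference vector a fraction alpha toward the viewed listing."""
    if len(preference) != len(listing_vector):
        raise ValueError("vectors must have the same dimension")
    return [
        (1 - alpha) * p + alpha * x
        for p, x in zip(preference, listing_vector)
    ]


# Hypothetical event emitted by the API when a user opens a listing detail.
event = {
    "user_id": 1,
    "type": "listing_viewed",
    "listing_vector": [1.0, 0.0, 0.5],
}

preference = [0.0, 0.0, 0.0]
preference = update_preference(preference, event["listing_vector"])
print(preference)
```

A larger `alpha` makes recommendations react faster to recent clicks; a smaller one keeps them more stable across a browsing session.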

Note

The front-end app was written almost entirely by AI (xAI Grok 3). Only minor fixes and adjustments were made by me.

The prompt used to generate the frontend app can be found in frontend-xai-prompt.md, and the generated response is available at frontend-xai-prompt-result.md.

How to run the stack

Prerequisites

Linux environment
Docker

Docker compose

Run the prototype stack:

docker compose up

Initialize and populate the database with data:

pip install -r dev-requirements.txt
source dev-env.sh
cd backend
python init_db.py

Open the front-end app (http://localhost:8080) in your browser and log in by entering the email of one of the dummy users, e.g. john1@example.com.

Check default recommendations.

Search for "Skoda" listings.

Click on a car listing to get its details.

Observe that recommendations change after clicking on a listing detail.
