Commit f49724b: Add DocIndexRetriever doc (#322)

Authored by: LiuWeijie <weijie.liu@intel.com>
Co-authored-by: Hao Ruan <hao.ruan@intel.com>, Malini Bhandaru <malini.bhandaru@intel.com>, Ying Hu <ying.hu@intel.com>

1 parent d717d47; 4 files changed, +676 -0 lines changed
.. _DocIndexRetriever_Guide:

DocIndexRetriever
####################

.. note:: This guide is in its early development and is a work-in-progress with
   placeholder content.

Overview
********

DocIndexRetriever is a widely adopted pattern for matching a user query against
a set of free-text records using a combination of retrieval methodologies. It is
an essential part of a RAG system: it bridges the knowledge gap by dynamically
fetching relevant information from external sources, ensuring that generated
responses remain factual and current. At the core of this architecture are
vector databases, which enable efficient, semantic retrieval of information.
These databases store data as vectors, allowing RAG to swiftly access the most
pertinent documents or data points based on semantic similarity.

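The vector-store idea above can be sketched in a few lines of plain Python. This is a toy illustration, not the actual OPEA implementation: documents and queries are represented as small hand-written vectors, and retrieval ranks documents by cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "vector DB": each record is (text, embedding vector).
docs = [
    ("OPEA deployment guide",  [0.9, 0.1, 0.0]),
    ("Cooking recipes",        [0.0, 0.2, 0.9]),
    ("RAG architecture notes", [0.8, 0.3, 0.1]),
]

def retrieve(query_vec, k=2):
    """Return the k document texts most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

print(retrieve([1.0, 0.2, 0.0]))
```

In a real deployment the embeddings come from an embedding model and the store is a dedicated vector database, but the ranking principle is the same.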
Purpose
*******

* **Enable document retrieval with LLMs**: DocIndexRetriever is designed to
  facilitate the retrieval of documents or information from a large corpus of
  text data using Large Language Models (LLMs).

Key Implementation Details
**************************

User Interface:
   The interface that interacts with users, accepting input queries and
   serving responses.
DocIndexRetriever Gateway:
   The agent that maintains connections between the user end and the service
   end, forwarding requests and responses to the appropriate nodes.
DocIndexRetriever MegaService:
   The central component that converts the user query to a vector
   representation, retrieves relevant documents from the vector database, and
   reranks them to select the most related documents.
Data Preparation MicroService:
   The component that prepares the data for the vector database.

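The gateway's role of holding the user-facing connection while forwarding work to the backend can be sketched as a minimal in-process dispatcher. All names here are illustrative assumptions; the real OPEA gateway is a separate networked service, not an in-process class.

```python
# Hypothetical stand-in for the MegaService backend.
def megaservice(query: str) -> dict:
    return {"query": query, "documents": ["doc-1", "doc-2"]}

class Gateway:
    """Keeps the user-facing connection; forwards requests to a backend."""

    def __init__(self, backend):
        self.backend = backend

    def handle(self, user_query: str) -> dict:
        response = self.backend(user_query)  # forward to the service end
        return response                      # serve the response to the user

gw = Gateway(megaservice)
print(gw.handle("what is RAG?"))
```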
How It Works
************

The DocIndexRetriever example is implemented using the component-level
microservices defined in `GenAIComps
<https://github.com/opea-project/GenAIComps>`_. The flow chart below shows the
information flow between the different microservices in this example.

.. mermaid::

   ---
   config:
     flowchart:
       nodeSpacing: 400
       rankSpacing: 100
       curve: linear
     themeVariables:
       fontSize: 50px
   ---
   flowchart LR
       %% Colors %%
       classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
       classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
       classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
       classDef invisible fill:transparent,stroke:transparent;
       style DocIndexRetriever-MegaService stroke:#000000

       %% Subgraphs %%
       subgraph DocIndexRetriever-MegaService["DocIndexRetriever MegaService "]
           direction LR
           EM([Embedding MicroService]):::blue
           RET([Retrieval MicroService]):::blue
           RER([Rerank MicroService]):::blue
       end
       subgraph UserInput[" User Input "]
           direction LR
           a([User Input Query]):::orchid
           Ingest([Ingest data]):::orchid
       end

       DP([Data Preparation MicroService]):::blue
       TEI_RER{{Reranking service<br>}}
       TEI_EM{{Embedding service<br>}}
       VDB{{Vector DB<br><br>}}
       R_RET{{Retriever service<br>}}
       GW([DocIndexRetriever Gateway<br>]):::orange

       %% Data Preparation flow
       %% Ingest data flow
       direction LR
       Ingest[Ingest data] --> DP
       DP <-.-> TEI_EM

       %% Questions interaction
       direction LR
       a[User Input Query] --> GW
       GW <==> DocIndexRetriever-MegaService
       EM ==> RET
       RET ==> RER

       %% Embedding service flow
       direction LR
       EM <-.-> TEI_EM
       RET <-.-> R_RET
       RER <-.-> TEI_RER

       direction TB
       %% Vector DB interaction
       R_RET <-.-> VDB
       DP <-.-> VDB

This diagram illustrates the flow of information in the DocIndexRetriever
system. First, the user provides documents to the system; these are ingested
by the Data Preparation MicroService, which prepares them for the vector
database. A User Input Query is then sent to the DocIndexRetriever Gateway,
which forwards it to the DocIndexRetriever MegaService. The MegaService uses
the Embedding MicroService to convert the query to a vector representation,
the Retrieval MicroService to retrieve relevant documents from the vector
database, and the Rerank MicroService to rerank those documents and select
the most related ones. The reranked documents are then sent back to the
DocIndexRetriever Gateway, which forwards them to the user.


The architecture follows a series of steps to process user queries and
generate responses:

1. **Embedding**: The Embedding MicroService converts the user query into a
   vector representation.
#. **Retriever**: The Retrieval MicroService retrieves relevant documents from
   the vector database based on the vector representation of the user query.
#. **Reranker**: The Rerank MicroService reranks the relevant documents to
   select the most related documents.
#. **Vector Database**: The Vector Database stores data as vectors, allowing
   the system to swiftly access the most pertinent documents or data points
   based on semantic similarity.
#. **Data Preparation**: The Data Preparation MicroService prepares the data
   for the vector database.

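The steps above can be sketched end to end in plain Python. This is a toy pipeline under simplified, assumed interfaces (a bag-of-characters "embedding" and a lexical-overlap reranker stand in for real models); in the actual example each stage is a separate microservice.

```python
def embed(text: str) -> list:
    """Data prep / embedding stage: crude bag-of-characters vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def similarity(a, b) -> float:
    """Cosine similarity, guarding against zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, index, k=3):
    """Retrieval stage: top-k documents by vector similarity."""
    return sorted(index, key=lambda d: similarity(query_vec, d["vec"]),
                  reverse=True)[:k]

def rerank(query: str, candidates, k=1):
    """Rerank stage: re-score candidates with a toy lexical-overlap scorer."""
    q_words = set(query.lower().split())
    def score(doc):
        return len(q_words & set(doc["text"].lower().split()))
    return sorted(candidates, key=score, reverse=True)[:k]

# Data preparation: embed the corpus into the "vector DB".
corpus = ["deploy on gaudi hardware",
          "rerank documents by relevance",
          "vector database retrieval"]
index = [{"text": t, "vec": embed(t)} for t in corpus]

query = "retrieval from the vector database"
candidates = retrieve(embed(query), index)   # embedding + retrieval
best = rerank(query, candidates)             # reranking
print(best[0]["text"])
```

The design point the sketch mirrors is the separation of stages: retrieval casts a wide net cheaply over the whole index, while the (typically more expensive) reranker only scores the handful of candidates retrieval returns.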
Deployment
**********

Here are some deployment options depending on your hardware and environment.

Single Node
+++++++++++++++

.. toctree::
   :maxdepth: 1

   Xeon Scalable Processor <deploy/xeon>
   Gaudi <deploy/gaudi>
