Commit 87baeb8: Update TEI docker image to 1.6 (#1650)

Signed-off-by: Wang, Xigui <xigui.wang@intel.com>
1 parent 0317929

43 files changed (+60 -60 lines)
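A bump like this is the same one-line substitution repeated across dozens of files, so it is usually applied mechanically rather than by hand. A minimal sketch of how such a repo-wide tag bump can be done (GNU grep/sed assumed; run from the repository root — on macOS/BSD, `sed -i ''` is needed instead):

```shell
# Find every YAML/shell/markdown file still pinning the old TEI CPU tag
# and rewrite it in place to the new tag. NUL-delimited to survive odd paths.
grep -rlZ --include='*.yaml' --include='*.sh' --include='*.md' \
    'text-embeddings-inference:cpu-1.5' . \
  | xargs -0 -r sed -i 's|text-embeddings-inference:cpu-1.5|text-embeddings-inference:cpu-1.6|g'
```

The `-r` flag on `xargs` keeps `sed` from running at all when nothing matches, so the command is safe to re-run after the bump is complete.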

AgentQnA/tests/step1_build_images.sh (+1 -1)

```diff
@@ -22,7 +22,7 @@ function build_docker_images_for_retrieval_tool(){
     echo "Build all the images with --no-cache..."
     service_list="doc-index-retriever dataprep embedding retriever reranking"
     docker compose -f build.yaml build ${service_list} --no-cache
-    docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.6

     docker images && sleep 1s
 }
```
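After applying a bump like the one above, a quick sanity check is to confirm that no compose file or script still references the old tag. A sketch (run from the repository root):

```shell
# Print any file:line that still pins the old CPU tag; an empty result
# (the else branch) means every reference has been moved to cpu-1.6.
if grep -rn --include='*.yaml' --include='*.sh' \
    'text-embeddings-inference:cpu-1.5' .; then
  echo 'stale cpu-1.5 references remain' >&2
else
  echo 'no stale cpu-1.5 references found'
fi
```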

ChatQnA/docker_compose/intel/cpu/aipc/compose.yaml (+2 -2)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -59,7 +59,7 @@ services:
       RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
     restart: unless-stopped
   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose.yaml (+2 -2)

```diff
@@ -33,7 +33,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -66,7 +66,7 @@ services:
      RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
     restart: unless-stopped
   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen.yaml (+2 -2)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -59,7 +59,7 @@ services:
       RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
     restart: unless-stopped
   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen_tgi.yaml (+2 -2)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -59,7 +59,7 @@ services:
       RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
     restart: unless-stopped
   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose_milvus.yaml (+2 -2)

```diff
@@ -108,7 +108,7 @@ services:
     restart: unless-stopped

   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -122,7 +122,7 @@ services:
     command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate

   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose_pinecone.yaml (+2 -2)

```diff
@@ -23,7 +23,7 @@ services:
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
       DATAPREP_COMPONENT_NAME: "OPEA_DATAPREP_PINECONE"
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -54,7 +54,7 @@ services:
       RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_PINECONE"
     restart: unless-stopped
   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose_tgi.yaml (+2 -2)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
@@ -59,7 +59,7 @@ services:
       RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
     restart: unless-stopped
   tei-reranking-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-reranking-server
     ports:
       - "8808:80"
```

ChatQnA/docker_compose/intel/cpu/xeon/compose_without_rerank.yaml (+1 -1)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "6006:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/README.md (+8 -8)

````diff
@@ -95,7 +95,7 @@ d560c232b120 opea/retriever:latest
 a1d7ca2d3787 ghcr.io/huggingface/tei-gaudi:1.5.0 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8808->80/tcp, [::]:8808->80/tcp tei-reranking-gaudi-server
 9a9f3fd4fd4c opea/vllm-gaudi:latest "python3 -m vllm.ent…" 2 minutes ago Exited (1) 2 minutes ago vllm-gaudi-server
 1ab9bbdf5182 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db
-9ee0789d819e ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8090->80/tcp, [::]:8090->80/tcp tei-embedding-gaudi-server
+9ee0789d819e ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8090->80/tcp, [::]:8090->80/tcp tei-embedding-gaudi-server
 ```

 ### Test the Pipeline
@@ -148,7 +148,7 @@ The default deployment utilizes Gaudi devices primarily for the `vllm-service`,
 | ---------------------------- | ----------------------------------------------------- | ------------ |
 | redis-vector-db              | redis/redis-stack:7.2.0-v9                            | No           |
 | dataprep-redis-service       | opea/dataprep:latest                                  | No           |
-| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No           |
+| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 | No           |
 | retriever                    | opea/retriever:latest                                 | No           |
 | tei-reranking-service        | ghcr.io/huggingface/tei-gaudi:1.5.0                   | 1 card       |
 | vllm-service                 | opea/vllm-gaudi:latest                                | Configurable |
@@ -164,7 +164,7 @@ The TGI (Text Generation Inference) deployment and the default deployment differ
 | ---------------------------- | ----------------------------------------------------- | -------------- |
 | redis-vector-db              | redis/redis-stack:7.2.0-v9                            | No             |
 | dataprep-redis-service       | opea/dataprep:latest                                  | No             |
-| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No             |
+| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 | No             |
 | retriever                    | opea/retriever:latest                                 | No             |
 | tei-reranking-service        | ghcr.io/huggingface/tei-gaudi:1.5.0                   | 1 card         |
 | **tgi-service**              | ghcr.io/huggingface/tgi-gaudi:2.0.6                   | Configurable   |
@@ -184,7 +184,7 @@ The TGI (Text Generation Inference) deployment and the default deployment differ
 | ---------------------------- | ----------------------------------------------------- | ------------ |
 | redis-vector-db              | redis/redis-stack:7.2.0-v9                            | No           |
 | dataprep-redis-service       | opea/dataprep:latest                                  | No           |
-| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No           |
+| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 | No           |
 | retriever                    | opea/retriever:latest                                 | No           |
 | tei-reranking-service        | ghcr.io/huggingface/tei-gaudi:1.5.0                   | 1 card       |
 | vllm-service                 | opea/vllm-gaudi:latest                                | Configurable |
@@ -203,7 +203,7 @@ The _compose_without_rerank.yaml_ Docker Compose file is distinct from the defau
 | ---------------------------- | ----------------------------------------------------- | -------------- |
 | redis-vector-db              | redis/redis-stack:7.2.0-v9                            | No             |
 | dataprep-redis-service       | opea/dataprep:latest                                  | No             |
-| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No             |
+| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 | No             |
 | retriever                    | opea/retriever:latest                                 | No             |
 | vllm-service                 | opea/vllm-gaudi:latest                                | Configurable   |
 | chatqna-gaudi-backend-server | opea/chatqna:latest                                   | No             |
@@ -222,7 +222,7 @@ The _compose_guardrails.yaml_ Docker Compose file introduces enhancements over t
 | dataprep-redis-service       | opea/dataprep:latest                                  | No           | No  |
 | _tgi-guardrails-service_     | ghcr.io/huggingface/tgi-gaudi:2.0.6                   | 1 card       | Yes |
 | _guardrails_                 | opea/guardrails:latest                                | No           | No  |
-| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No           | No  |
+| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 | No           | No  |
 | retriever                    | opea/retriever:latest                                 | No           | No  |
 | tei-reranking-service        | ghcr.io/huggingface/tei-gaudi:1.5.0                   | 1 card       | No  |
 | vllm-service                 | opea/vllm-gaudi:latest                                | Configurable | Yes |
@@ -258,7 +258,7 @@ The table provides a comprehensive overview of the ChatQnA services utilized acr
 | ---------------------------- | ----------------------------------------------------- | -------- | -------------------------------------------------------------------------------------------------- |
 | redis-vector-db              | redis/redis-stack:7.2.0-v9                            | No       | Acts as a Redis database for storing and managing data.                                             |
 | dataprep-redis-service       | opea/dataprep:latest                                  | No       | Prepares data and interacts with the Redis database.                                                |
-| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No       | Provides text embedding services, often using Hugging Face models.                                  |
+| tei-embedding-service        | ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 | No       | Provides text embedding services, often using Hugging Face models.                                  |
 | retriever                    | opea/retriever:latest                                 | No       | Retrieves data from the Redis database and interacts with embedding services.                       |
 | tei-reranking-service        | ghcr.io/huggingface/tei-gaudi:1.5.0                   | Yes      | Reranks text embeddings, typically using Gaudi hardware for enhanced performance.                   |
 | vllm-service                 | opea/vllm-gaudi:latest                                | No       | Handles large language model (LLM) tasks, utilizing Gaudi hardware.                                 |
@@ -284,7 +284,7 @@ ChatQnA now supports running the latest DeepSeek models, including [deepseek-ai/

 ### tei-embedding-service & tei-reranking-service

-The `ghcr.io/huggingface/text-embeddings-inference:cpu-1.5` image supporting `tei-embedding-service` and `tei-reranking-service` depends on the `EMBEDDING_MODEL_ID` or `RERANK_MODEL_ID` environment variables respectively to specify the embedding model and reranking model used for converting text into vector representations and rankings. This choice impacts the quality and relevance of the embeddings rerankings for various applications. Unlike the `vllm-service`, the `tei-embedding-service` and `tei-reranking-service` each typically acquires only one Gaudi device and does not use the `NUM_CARDS` parameter; embedding and reranking tasks generally do not require extensive parallel processing and one Gaudi per service is appropriate. The list of [supported embedding and reranking models](https://github.com/huggingface/tei-gaudi?tab=readme-ov-file#supported-models) can be found at the the [huggingface/tei-gaudi](https://github.com/huggingface/tei-gaudi?tab=readme-ov-file#supported-models) website.
+The `ghcr.io/huggingface/text-embeddings-inference:cpu-1.6` image supporting `tei-embedding-service` and `tei-reranking-service` depends on the `EMBEDDING_MODEL_ID` or `RERANK_MODEL_ID` environment variables respectively to specify the embedding model and reranking model used for converting text into vector representations and rankings. This choice impacts the quality and relevance of the embeddings rerankings for various applications. Unlike the `vllm-service`, the `tei-embedding-service` and `tei-reranking-service` each typically acquires only one Gaudi device and does not use the `NUM_CARDS` parameter; embedding and reranking tasks generally do not require extensive parallel processing and one Gaudi per service is appropriate. The list of [supported embedding and reranking models](https://github.com/huggingface/tei-gaudi?tab=readme-ov-file#supported-models) can be found at the the [huggingface/tei-gaudi](https://github.com/huggingface/tei-gaudi?tab=readme-ov-file#supported-models) website.

 ### tgi-gaurdrails-service

````
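As the README paragraph in the hunk above notes, the image reads its model choice from `EMBEDDING_MODEL_ID` or `RERANK_MODEL_ID` at deploy time, so the tag bump itself does not change which models run. A minimal sketch of selecting models before `docker compose up` (the model names below are illustrative examples, not values mandated by this commit):

```shell
# Hypothetical model choices; substitute any model from the supported list
# before bringing the compose stack up.
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
echo "embedding=${EMBEDDING_MODEL_ID} rerank=${RERANK_MODEL_ID}"
```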
ChatQnA/docker_compose/intel/hpu/gaudi/compose.yaml (+1 -1)

```diff
@@ -33,7 +33,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-gaudi-server
     ports:
       - "8090:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/compose_faqgen.yaml (+1 -1)

```diff
@@ -27,7 +27,7 @@ services:
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
       LOGFLAG: ${LOGFLAG}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-gaudi-server
     ports:
       - "8090:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/compose_faqgen_tgi.yaml (+1 -1)

```diff
@@ -27,7 +27,7 @@ services:
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
       LOGFLAG: ${LOGFLAG}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-gaudi-server
     ports:
       - "8090:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/compose_guardrails.yaml (+1 -1)

```diff
@@ -65,7 +65,7 @@ services:
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
     restart: unless-stopped
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-gaudi-server
     ports:
       - "8090:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/compose_tgi.yaml (+1 -1)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-gaudi-server
     ports:
       - "8090:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/compose_without_rerank.yaml (+1 -1)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-gaudi-server
     ports:
       - "8090:80"
```

ChatQnA/docker_compose/intel/hpu/gaudi/how_to_validate_service.md (+1 -1)

````diff
@@ -51,7 +51,7 @@ f810f3b4d329 opea/embedding:latest "python embed
 174bd43fa6b5 ghcr.io/huggingface/tei-gaudi:1.5.0 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8090->80/tcp, :::8090->80/tcp tei-embedding-gaudi-server
 05c40b636239 ghcr.io/huggingface/tgi-gaudi:2.0.6 "text-generation-lau…" 2 minutes ago Exited (1) About a minute ago tgi-gaudi-server
 74084469aa33 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db
-88399dbc9e43 ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8808->80/tcp, :::8808->80/tcp tei-reranking-gaudi-server
+88399dbc9e43 ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8808->80/tcp, :::8808->80/tcp tei-reranking-gaudi-server
 ```

 In this case, `ghcr.io/huggingface/tgi-gaudi:2.0.6` Existed.
````

ChatQnA/docker_compose/nvidia/gpu/compose.yaml (+1 -1)

```diff
@@ -26,7 +26,7 @@ services:
       TEI_ENDPOINT: http://tei-embedding-service:80
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
   tei-embedding-service:
-    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
     container_name: tei-embedding-server
     ports:
       - "8090:80"
```

0 commit comments