[E&A] Drafts initial conceptual docs for EIS #733

`explore-analyze/elastic-inference/eis.md` (97 additions, 2 deletions)
---
applies_to:
navigation_title: Elastic Inference Service (EIS)
---

# Elastic {{infer-cap}} Service [elastic-inference-service-eis]

The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster.
With EIS, you don't need to add, configure, and scale {{ml}} nodes to provide the infrastructure and resources that {{ml}} {{infer}} requires.
Instead, you can use {{ml}} models for ingest, search, and chat independently of your {{es}} infrastructure.

% TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) %

## AI features powered by EIS [ai-features-powered-by-eis]

Your Elastic deployment or project comes with a default `Elastic LLM` connector. This connector is used in the AI Assistant, Attack Discovery, Automatic Import, and the Search Playground.

% TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) %

## Available task types

EIS offers the following {{infer}} task types:

* `chat_completion`

## Default EIS endpoints [default-eis-inference-endpoints]

Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier to use chat completion via the {{infer}} API:

* `.rainbow-sprinkles-elastic`

::::{note}

* This endpoint is used by the `Elastic LLM` AI connector, which in turn powers the AI Assistant, Attack Discovery, Automatic Import, and the Search Playground.

::::
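
To confirm that the default endpoint is available in your deployment, you can retrieve it with the get {{infer}} API. A minimal sketch (this check is optional; the endpoint comes preconfigured):

```json
GET /_inference/chat_completion/.rainbow-sprinkles-elastic
```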

% TO DO: Link to the AI assistant documentation in the different solutions and possibly connector docs. %

### Examples

The following example demonstrates how to perform a `chat_completion` task through EIS by using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint.

```json
POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "Say yes if it works."
        }
    ],
    "temperature": 0.7,
    "max_completion_tokens": 300
}
```

The request returns the following response as a stream:

```json
event: message
data: {
    "id" : "unified-45ecde2b-6293-4fd6-a195-4252de76ee63",
    "choices" : [
        {
            "delta" : {
                "role" : "assistant"
            },
            "index" : 0
        }
    ],
    "model" : "rainbow-sprinkles",
    "object" : "chat.completion.chunk"
}

event: message
data: {
    "id" : "unified-45ecde2b-6293-4fd6-a195-4252de76ee63",
    "choices" : [
        {
            "delta" : {
                "content" : "Yes"
            },
            "index" : 0
        }
    ],
    "model" : "rainbow-sprinkles",
    "object" : "chat.completion.chunk"
}
```

## Regions [eis-regions]

All EIS requests are handled by one of these AWS regions:

* `us-east-1`
* `us-west-2`

However, projects and deployments can use the Elastic LLM regardless of their cloud provider or region; the request routing does not restrict where your deployments are located.

For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/).

Refer to the [{{infer-cap}} APIs](https://www.elastic.co/docs/api/doc/elasticsearch) for further information.

Creates an {{infer}} endpoint to perform an {{infer}} task with the `elastic` service.

## {{api-request-title}} [infer-service-elastic-api-request]

`PUT /_inference/<task_type>/<inference_id>`

## {{api-path-parms-title}} [infer-service-elastic-api-path-params]

`<inference_id>`
: (Required, string) The unique identifier of the {{infer}} endpoint.

`<task_type>`
: (Required, string) The type of the {{infer}} task that the model will perform.

    Available task types:

    * `chat_completion`,
    * `sparse_embedding`.

::::{note}
The `chat_completion` task type only supports streaming and only through the `_stream` API.

For more information on how to use the `chat_completion` task type, refer to the [chat completion documentation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-stream-inference).

::::
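
For example, a `chat_completion` endpoint created with this service is invoked through the `_stream` API. A minimal sketch, assuming the `chat-completion-endpoint` endpoint created in the example at the end of this page:

```json
POST /_inference/chat_completion/chat-completion-endpoint/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "Say yes if it works."
        }
    ]
}
```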

## {{api-request-body-title}} [infer-service-elastic-api-request-body]

`max_chunk_size`
: (Optional, integer) The maximum size of a chunk in words.

`service_settings`
: (Required, object) Settings used to install the {{infer}} model.

`model_id`
: (Required, string) The name of the model to use for the {{infer}} task.

## Elastic {{infer-cap}} Service example [inference-example-elastic]

The following example shows how to create an {{infer}} endpoint called `elser-model-eis` to perform a `sparse_embedding` task type.
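
A minimal sketch of such a request; the `model_id` value is illustrative, so substitute the name of the EIS model you want to use:

```json
PUT /_inference/sparse_embedding/elser-model-eis
{
    "service": "elastic",
    "service_settings": {
        "model_id": "elser_model_2"
    }
}
```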

The following example shows how to create an {{infer}} endpoint called `chat-completion-endpoint` to perform a `chat_completion` task type. The `model_id` value follows the `rainbow-sprinkles` model shown in the EIS examples earlier on this page:

```json
PUT /_inference/chat_completion/chat-completion-endpoint
{
    "service": "elastic",
    "service_settings": {
        "model_id": "rainbow-sprinkles"
    }
}
```
