[DOCS] Moving OVMS to top menu level #29050

Merged
2 changes: 1 addition & 1 deletion docs/CMakeLists.txt
@@ -84,7 +84,7 @@ function(build_docs)
list(APPEND commands COMMAND ${Python3_EXECUTABLE} ${FILE_HELPER_SCRIPT}
--filetype=md
--input_dir=${OVMS_DOCS_DIR}
- --output_dir=${SPHINX_SOURCE_DIR}/openvino-workflow/model-server
+ --output_dir=${SPHINX_SOURCE_DIR}/model-server
--exclude_dir=${SPHINX_SOURCE_DIR})
list(APPEND commands COMMAND ${CMAKE_COMMAND} -E cmake_echo_color --green "FINISHED preprocessing OVMS")
endif()
@@ -38,7 +38,7 @@ and TensorFlow models during training.

| **OpenVINO Model Server**
| :bdg-link-dark:`GitHub <https://github.com/openvinotoolkit/model_server>`
- :bdg-link-success:`User Guide <https://docs.openvino.ai/2025/openvino-workflow/model-server/ovms_what_is_openvino_model_server.html>`
+ :bdg-link-success:`User Guide <https://docs.openvino.ai/2025/model-server/ovms_what_is_openvino_model_server.html>`

A high-performance system that can be used to access the host models via request to the model
server.
@@ -17,7 +17,7 @@ In this release, one person performs the role of both the Model Developer and th
Overview
########

- The OpenVINO™ Security Add-on works with the :doc:`OpenVINO™ Model Server <../../../openvino-workflow/model-server/ovms_what_is_openvino_model_server>` on Intel® architecture. Together, the OpenVINO™ Security Add-on and the OpenVINO™ Model Server provide a way for Model Developers and Independent Software Vendors to use secure packaging and secure model execution to enable access control to the OpenVINO™ models, and for model Users to run inference within assigned limits.
+ The OpenVINO™ Security Add-on works with the :doc:`OpenVINO™ Model Server <../../../../model-server/ovms_what_is_openvino_model_server>` on Intel® architecture. Together, the OpenVINO™ Security Add-on and the OpenVINO™ Model Server provide a way for Model Developers and Independent Software Vendors to use secure packaging and secure model execution to enable access control to the OpenVINO™ models, and for model Users to run inference within assigned limits.

The OpenVINO™ Security Add-on consists of three components that run in Kernel-based Virtual Machines (KVMs). These components provide a way to run security-sensitive operations in an isolated environment. A brief description of the three components are as follows. Click each triangled line for more information about each.

@@ -18,7 +18,7 @@ Performance Benchmarks

This page presents benchmark results for the
`Intel® Distribution of OpenVINO™ toolkit <https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html>`__
- and :doc:`OpenVINO Model Server <../openvino-workflow/model-server/ovms_what_is_openvino_model_server>`, for a representative
+ and :doc:`OpenVINO Model Server <../../model-server/ovms_what_is_openvino_model_server>`, for a representative
selection of public neural networks and Intel® devices. The results may help you decide which
hardware to use in your applications or plan AI workload for the hardware you have already
implemented in your solutions. Click the buttons below to see the chosen benchmark data.
27 changes: 19 additions & 8 deletions docs/articles_en/openvino-workflow-generative.rst
@@ -55,14 +55,22 @@ options:
as well as conversion on the fly. For integration with the final product it may offer
lower performance, though.

- .. tab-item:: Base OpenVINO (not recommended)
+ .. tab-item:: OpenVINO™ Model Server

- Note that the base version of OpenVINO may also be used to run generative AI. Although it may
- offer a simpler environment, with fewer dependencies, it has significant limitations and a more
- demanding implementation process.
+ | - Easy and quick deployment of models to edge or cloud.
+ | - Includes endpoints for serving generative AI models.
+ | - Available in both Python and C++.
+ | - Allows client applications in any programming language that supports REST or gRPC.

- To learn more, refer to the article for the 2024.6 OpenVINO version:
- `Generative AI with Base OpenVINO <https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-native-ov.html>`__
+ :doc:`OpenVINO™ Model Server <model-server/ovms_what_is_openvino_model_server>`
+ provides a set of REST API endpoints dedicated to generative use cases. The endpoints
+ simplify writing AI applications, ensure scalability, and provide state-of-the-art
+ performance optimizations. They include OpenAI API for:
+ `text generation <https://openvino-doc.iotg.sclab.intel.com/seba-test-8/model-server/ovms_docs_rest_api_chat.html>`__,
+ `embeddings <https://openvino-doc.iotg.sclab.intel.com/seba-test-8/model-server/ovms_docs_rest_api_embeddings.html>`__,
+ and `reranking <https://openvino-doc.iotg.sclab.intel.com/seba-test-8/model-server/ovms_docs_rest_api_rerank.html>`__.
+ The model server supports deployments as containers or binary applications on Linux and Windows with CPU or GPU acceleration.
+ See the :doc:`demos <model-server/ovms_docs_demos>`.
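For a quick illustration of these endpoints, here is a minimal client sketch. It assumes an OVMS instance already running on localhost:8000 with its OpenAI-compatible API exposed under ``/v3``, and uses a placeholder model name; both depend on your deployment.

.. code-block:: python

   from openai import OpenAI

   # Assumptions: OVMS listens on localhost:8000 and exposes the OpenAI API
   # under /v3; "my-chat-model" is a placeholder for the served model's name.
   client = OpenAI(base_url="http://localhost:8000/v3", api_key="unused")

   response = client.chat.completions.create(
       model="my-chat-model",
       messages=[{"role": "user", "content": "What is OpenVINO?"}],
   )
   print(response.choices[0].message.content)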



@@ -94,10 +102,13 @@ The advantages of using OpenVINO for generative model deployment:
better performance than Python-based runtimes.


+ You can run Generative AI models, using native OpenVINO API, although it is not recommended.
+ If you want to learn how to do it, refer to
+ `the 24.6 documentation <https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-native-ov.html>`__.


Proceed to guides on:

* :doc:`OpenVINO GenAI <./openvino-workflow-generative/inference-with-genai>`
* :doc:`Hugging Face and Optimum Intel <./openvino-workflow-generative/inference-with-optimum-intel>`
* `Generative AI with Base OpenVINO <https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-native-ov.html>`__


9 changes: 4 additions & 5 deletions docs/articles_en/openvino-workflow.rst
@@ -13,8 +13,7 @@ OpenVINO Workflow
Model Preparation <openvino-workflow/model-preparation>
openvino-workflow/model-optimization
Running Inference <openvino-workflow/running-inference>
Deployment on a Local System <openvino-workflow/deployment-locally>
- Deployment on a Model Server <openvino-workflow/model-server/ovms_what_is_openvino_model_server>
openvino-workflow/torch-compile


@@ -86,11 +85,11 @@ OpenVINO uses the following functions for reading, converting, and saving models
and the quickest way of running a deep learning model.

| :doc:`Deployment Option 1. Using OpenVINO Runtime <openvino-workflow/deployment-locally>`
- | Deploy a model locally, reading the file directly from your application and utilizing about-openvino/additional-resources available to the system.
+ | Deploy a model locally, reading the file directly from your application and utilizing resources available to the system.
| Deployment on a local system uses the steps described in the section on running inference.
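As a sketch of this option, an application reads and compiles the model file directly; the file path, target device, and input below are placeholders that depend on your model.

.. code-block:: python

   import numpy as np
   import openvino as ov

   core = ov.Core()
   model = core.read_model("model.xml")          # placeholder path to an OpenVINO IR file
   compiled = core.compile_model(model, "CPU")   # compile for a local device

   # Placeholder input: a zero tensor matching the model's first input shape
   # (assumes the model has static input shapes).
   input_tensor = np.zeros(tuple(compiled.inputs[0].shape), dtype=np.float32)
   results = compiled(input_tensor)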

- | :doc:`Deployment Option 2. Using Model Server <openvino-workflow/model-server/ovms_what_is_openvino_model_server>`
- | Deploy a model remotely, connecting your application to an inference server and utilizing external about-openvino/additional-resources, with no impact on the app's performance.
+ | :doc:`Deployment Option 2. Using Model Server <../model-server/ovms_what_is_openvino_model_server>`
+ | Deploy a model remotely, connecting your application to an inference server and utilizing external resources, with no impact on the app's performance.
| Deployment on OpenVINO Model Server is quick and does not require any additional steps described in the section on running inference.
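A minimal sketch of the remote path, assuming an OVMS instance at localhost:8000 serving a hypothetical model named "my-model" through the TensorFlow-Serving-style REST API:

.. code-block:: python

   import requests

   # Assumptions: OVMS listens on localhost:8000; "my-model" is a placeholder name.
   status = requests.get("http://localhost:8000/v1/models/my-model")
   print(status.json())  # reports served model versions and their state

   # The input layout depends on the served model; this payload is dummy data.
   payload = {"instances": [[0.0, 0.0, 0.0]]}
   response = requests.post(
       "http://localhost:8000/v1/models/my-model:predict", json=payload
   )
   print(response.json())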

| :doc:`Deployment Option 3. Using torch.compile for PyTorch 2.0 <openvino-workflow/torch-compile>`
@@ -140,4 +140,4 @@ sequences.
You can find more examples demonstrating how to work with states in other articles:

* `LLaVA-NeXT Multimodal Chatbot notebook <../../notebooks/llava-next-multimodal-chatbot-with-output.html>`__
- * :doc:`Serving Stateful Models with OpenVINO Model Server <../../openvino-workflow/model-server/ovms_docs_stateful_models>`
+ * :doc:`Serving Stateful Models with OpenVINO Model Server <../../model-server/ovms_docs_stateful_models>`

Large diffs are not rendered by default.

5 changes: 3 additions & 2 deletions docs/sphinx_setup/index.rst
@@ -38,7 +38,7 @@ hardware and environments, on-premises and on-device, in the browser or in the c
<li id="ov-homepage-slide3" class="splide__slide">
<p class="ov-homepage-slide-title">Improved model serving</p>
<p class="ov-homepage-slide-subtitle">OpenVINO Model Server has improved parallel inference!</p>
- <a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/2025/openvino-workflow/model-server/ovms_what_is_openvino_model_server.html">Learn more</a>
+ <a class="ov-homepage-banner-btn" href="https://docs.openvino.ai/2025/model-server/ovms_what_is_openvino_model_server.html">Learn more</a>
</li>
<li id="ov-homepage-slide4" class="splide__slide">
<p class="ov-homepage-slide-title">OpenVINO via PyTorch 2.0 torch.compile()</p>
@@ -124,7 +124,7 @@ Places to Begin

Cloud-ready deployments for microservice applications.

- .. button-link:: openvino-workflow/model-server/ovms_what_is_openvino_model_server.html
+ .. button-link:: model-server/ovms_what_is_openvino_model_server.html
:color: primary
:outline:

@@ -195,5 +195,6 @@ Key Features
GET STARTED <get-started>
HOW TO USE - MAIN WORKFLOW <openvino-workflow>
HOW TO USE - GENERATIVE AI WORKFLOW <openvino-workflow-generative>
+ HOW TO USE - MODEL SERVING <model-server/ovms_what_is_openvino_model_server>
REFERENCE DOCUMENTATION <documentation>
ABOUT OPENVINO <about-openvino>