# Model Caching Overview {#openvino_docs_IE_DG_Model_caching_overview}

## Introduction

As described in the [Inference Engine Introduction](inference_engine_intro.md), a common application flow consists of the following steps:

1. **Create Inference Engine Core object**

2. **Read the Intermediate Representation** - Read an Intermediate Representation file into an object of the `InferenceEngine::CNNNetwork` class

3. **Prepare inputs and outputs**

4. **Set configuration** - Pass a device-specific loading configuration to the device

5. **Compile and Load Network to device** - Use the `InferenceEngine::Core::LoadNetwork()` method with a specific device

6. **Set input data**

7. **Execute**

Step #5 can potentially perform several time-consuming device-specific optimizations and network compilations,
and such delays can lead to a poor user experience at application startup. To avoid this, some devices offer
the Import/Export network capability, and it is possible to either use the [Compile tool](../../inference-engine/tools/compile_tool/README.md)
or enable model caching to export the compiled network automatically. Reusing cached networks can significantly reduce the network load time.
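
For orientation, the seven steps above might look like the following minimal C++ sketch. The model path (`model.xml`), the device name (`"CPU"`), and the input-precision line are illustrative placeholders, not values taken from this guide:

```cpp
#include <inference_engine.hpp>

int main() {
    // 1. Create the Inference Engine Core object
    InferenceEngine::Core ie;

    // 2. Read the Intermediate Representation (hypothetical model path)
    InferenceEngine::CNNNetwork cnnNet = ie.ReadNetwork("model.xml");

    // 3. Prepare inputs and outputs, for example set the input precision
    cnnNet.getInputsInfo().begin()->second->setPrecision(InferenceEngine::Precision::U8);

    // 4.-5. Pass a (possibly empty) device-specific config and compile the network on the device
    InferenceEngine::ExecutableNetwork execNet = ie.LoadNetwork(cnnNet, "CPU", {});

    // 6.-7. Set input data and execute
    InferenceEngine::InferRequest req = execNet.CreateInferRequest();
    req.Infer();
    return 0;
}
```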

## Set "CACHE_DIR" config option to enable model caching

To enable model caching, the application must specify a folder where cached blobs will be stored. This can be done as follows:

@snippet snippets/InferenceEngine_Caching0.cpp part0

With this code, if the device supports the Import/Export network capability, a cached blob is automatically created inside the `myCacheFolder` folder specified by the CACHE_DIR config set on the Core object. If the device does not support the Import/Export capability, the cache is simply not created and no error is thrown.
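
The referenced snippet lives in a separate source file; as an illustrative sketch (not the snippet itself), enabling the cache with the `Core::SetConfig` API might look like this, with a placeholder model path and device name:

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core ie;
    // Set CACHE_DIR globally on the Core object; compiled blobs are written to and read from this folder
    ie.SetConfig({{CONFIG_KEY(CACHE_DIR), "myCacheFolder"}});

    // The first LoadNetwork compiles the network and exports the blob;
    // subsequent runs import the cached blob instead of recompiling
    auto execNet = ie.LoadNetwork(ie.ReadNetwork("model.xml"), "GPU");
    return 0;
}
```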

Depending on your device, the total time for loading a network at application startup can be significantly reduced.
Also note that the very first LoadNetwork call (when the cache is not yet created) takes slightly longer, since it also exports the compiled blob into a cache file.

![caching_enabled]

## Even faster: use LoadNetwork(modelPath)

In some cases, applications do not need to customize inputs and outputs every time. Such applications always
call `cnnNet = ie.ReadNetwork(...)`, then `ie.LoadNetwork(cnnNet, ..)`, and this can be further optimized.
For such cases, a more convenient API that loads the network in a single call was introduced in the 2021.4 release:

@snippet snippets/InferenceEngine_Caching1.cpp part1
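
As an illustrative sketch (not the snippet itself), the single-call overload can be used roughly like this, again with a placeholder model path and device name:

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core ie;
    // Read and compile the model in a single call; no explicit CNNNetwork object is needed
    InferenceEngine::ExecutableNetwork execNet = ie.LoadNetwork("model.xml", "GPU");
    return 0;
}
```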

With model caching enabled, the total load time is even shorter, because in this case the ReadNetwork step is optimized as well:

@snippet snippets/InferenceEngine_Caching2.cpp part2

![caching_times]
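
Putting the two pieces together, a rough end-to-end sketch (assuming the `Core::SetConfig` and model-path `LoadNetwork` overloads used above, with a placeholder model path and device name) could look like this:

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core ie;
    // Enable caching once per Core object
    ie.SetConfig({{CONFIG_KEY(CACHE_DIR), "myCacheFolder"}});
    // On a cache hit, both reading and compiling the model are replaced by importing the cached blob
    InferenceEngine::ExecutableNetwork execNet = ie.LoadNetwork("model.xml", "GPU");
    return 0;
}
```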

## Advanced examples

Not every device supports the network Import/Export capability, and enabling caching for such devices has no effect.
To check in advance whether a particular device supports model caching, your application can use the following code:

@snippet snippets/InferenceEngine_Caching3.cpp part3
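
One possible way to perform such a check, sketched here with the metric API and a placeholder device name (an illustration, not necessarily the exact snippet), is:

```cpp
#include <algorithm>
#include <string>
#include <vector>
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core ie;
    const std::string deviceName = "GPU";  // placeholder device name

    // Ask the device which metrics it supports
    std::vector<std::string> supportedMetrics =
        ie.GetMetric(deviceName, METRIC_KEY(SUPPORTED_METRICS)).as<std::vector<std::string>>();

    // The device supports model caching only if it reports the IMPORT_EXPORT_SUPPORT metric as true
    bool cachingSupported =
        std::find(supportedMetrics.begin(), supportedMetrics.end(),
                  METRIC_KEY(IMPORT_EXPORT_SUPPORT)) != supportedMetrics.end() &&
        ie.GetMetric(deviceName, METRIC_KEY(IMPORT_EXPORT_SUPPORT)).as<bool>();

    return cachingSupported ? 0 : 1;
}
```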

[caching_enabled]: ../img/caching_enabled.png
[caching_times]: ../img/caching_times.png