Commit

更新文档 (Update docs)
liuxinwei committed Nov 13, 2024
1 parent 02ae06d commit 4896953
Showing 9 changed files with 1,685 additions and 142 deletions.
4 changes: 4 additions & 0 deletions doc/.gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,7 @@ datasets/
*.onnx
data/
*.onnxlib
*.npy
*.npz
*.torchscript
*.engine
2 changes: 1 addition & 1 deletion doc/ultralytics/.gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
runs/
.temp/
.temp/
103 changes: 102 additions & 1 deletion doc/ultralytics/modes/benchmark.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -8,11 +8,112 @@
"\n",
"Benchmark mode: profile a model's speed and accuracy across different deployment environments."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once your model is trained and validated, the next step is to evaluate its performance in a variety of real-world scenarios. Ultralytics YOLO11's Benchmark mode serves exactly this purpose, providing a robust framework to assess your model's speed and accuracy across a range of export formats."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Why is benchmarking crucial?\n",
"- Informed decisions: gain insight into the trade-offs between speed and accuracy.\n",
"- Resource allocation: understand how different export formats perform on different hardware.\n",
"- Optimization: learn which export format delivers the best performance for your specific use case.\n",
"- Cost efficiency: make more effective use of hardware resources based on benchmark results.\n",
"```\n",
"\n",
"```{admonition} Key metrics in Benchmark mode\n",
"- mAP50-95: used for [object detection](https://www.ultralytics.com/glossary/object-detection), segmentation, and pose estimation.\n",
"- accuracy_top5: used for [image classification](https://www.ultralytics.com/glossary/image-classification).\n",
"- Inference time: the time taken per image, in milliseconds.\n",
"```"
]
},
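The mAP50-95 metric above is the mean of average precision (AP) taken over IoU thresholds from 0.50 to 0.95 in steps of 0.05. A minimal sketch of that averaging, with made-up AP values purely for illustration:

```python
# mAP50-95: mean of AP over IoU thresholds 0.50, 0.55, ..., 0.95.
# The per-threshold AP values below are invented for illustration,
# not taken from any real model.
iou_thresholds = [0.50 + 0.05 * i for i in range(10)]
ap_per_threshold = [0.62, 0.60, 0.57, 0.53, 0.49, 0.44, 0.37, 0.28, 0.17, 0.05]

map50_95 = sum(ap_per_threshold) / len(ap_per_threshold)
print(f"mAP50-95 = {map50_95:.3f}")
```

AP typically drops as the IoU threshold tightens, which is why mAP50-95 is a stricter summary than mAP50 alone.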
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import set_env"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Example:\n",
"\n",
"```python\n",
"from ultralytics.utils.benchmarks import benchmark\n",
"\n",
"# Benchmark on GPU\n",
"benchmark(model=\"yolo11n.pt\", data=\"coco8.yaml\", imgsz=640, half=False, device=0)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Arguments\n",
"\n",
"Arguments such as `model`, `data`, `imgsz`, `half`, `device`, and `verbose` give users the flexibility to fine-tune a benchmark to their exact needs and to easily compare the performance of different export formats."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"| Key | Default Value | Description |\n",
"| --------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n",
"| `model` | `None` | Specifies the path to the model file. Accepts both `.pt` and `.yaml` formats, e.g., `\"yolo11n.pt\"` for pre-trained models or configuration files. |\n",
"| `data` | `None` | Path to a YAML file defining the dataset for benchmarking, typically including paths and settings for [validation data](https://www.ultralytics.com/glossary/validation-data). Example: `\"coco8.yaml\"`. |\n",
"| `imgsz` | `640` | The input image size for the model. Can be a single integer for square images or a tuple `(width, height)` for non-square, e.g., `(640, 480)`. |\n",
"| `half` | `False` | Enables FP16 (half-precision) inference, reducing memory usage and possibly increasing speed on compatible hardware. Use `half=True` to enable. |\n",
"| `int8` | `False` | Activates INT8 quantization for further optimized performance on supported devices, especially useful for edge devices. Set `int8=True` to use. |\n",
"| `device` | `None` | Defines the computation device(s) for benchmarking, such as `\"cpu\"`, `\"cuda:0\"`, or a list of devices like `\"cuda:0,1\"` for multi-GPU setups. |\n",
"| `verbose` | `False` | Controls the level of detail in logging output. A boolean value; set `verbose=True` for detailed logs or a float for thresholding errors. |\n"
]
},
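A benchmark run reports one row of metrics per export format; a hedged sketch of how such results might be compared afterwards (all numbers below are invented for illustration, not real measurements):

```python
# Hypothetical per-format benchmark results: (format, mAP50-95, inference ms/image).
results = [
    ("PyTorch", 0.505, 12.3),
    ("ONNX", 0.505, 9.1),
    ("TensorRT", 0.503, 4.2),
    ("OpenVINO", 0.504, 7.8),
]

# Pick the fastest format whose accuracy stays within 0.005 mAP of the best.
best_map = max(r[1] for r in results)
candidates = [r for r in results if best_map - r[1] <= 0.005]
fastest = min(candidates, key=lambda r: r[2])
print(f"Best trade-off: {fastest[0]} ({fastest[2]} ms/image, mAP {fastest[1]})")
```

This mirrors the speed-versus-accuracy decision the benchmark table is meant to support.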
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What are the benefits of exporting YOLO11 models to formats such as ONNX, TensorRT, and OpenVINO?\n",
"\n",
"Exporting YOLO11 models to different formats lets you optimize performance for your deployment environment. For example:\n",
"\n",
"- **ONNX**: up to 3x CPU speedup.\n",
"- **TensorRT**: up to 5x GPU speedup.\n",
"- **OpenVINO**: optimized specifically for Intel hardware.\n",
"\n",
"These formats improve your models' speed and accuracy, making them more efficient across a variety of real-world applications."
]
}
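To put those speedup figures in concrete terms, a small arithmetic sketch of what "3x" and "5x" mean for per-image latency (the 30 ms baseline is an invented example, not a measurement):

```python
# Illustrative baseline: 30 ms/image for the unexported model.
baseline_ms = 30.0
onnx_ms = baseline_ms / 3      # "up to 3x CPU speedup" with ONNX
tensorrt_ms = baseline_ms / 5  # "up to 5x GPU speedup" with TensorRT

for name, ms in [("ONNX (CPU)", onnx_ms), ("TensorRT (GPU)", tensorrt_ms)]:
    print(f"{name}: {ms:.1f} ms/image = {1000 / ms:.0f} FPS")
```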
],
"metadata": {
"kernelspec": {
"display_name": "xxx",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
Expand Down
47 changes: 17 additions & 30 deletions doc/ultralytics/modes/export.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -11,19 +11,11 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"_path = \"/media/pc/data/lxw/ai/ultralytics\" # directory containing the ultralytics API\n",
"sys.path.append(_path)\n",
"from pathlib import Path\n",
"from ultralytics import settings\n",
"temp_dir = Path(\"../.temp\") # set the cache directory\n",
"temp_dir.mkdir(exist_ok=True, parents=True)\n",
"# update the project settings\n",
"settings.update({'weights_dir': f'{temp_dir}/weights'})"
"import set_env"
]
},
{
Expand Down Expand Up @@ -85,21 +77,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"| Argument | Type | Default | Description |\n",
"| ----------- | ---------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n",
"| `format` | `str` | `'torchscript'` | Target format for the exported model, such as `'onnx'`, `'torchscript'`, `'tensorflow'`, or others, defining compatibility with various deployment environments. |\n",
"| `imgsz` | `int` or `tuple` | `640` | Desired image size for the model input. Can be an integer for square images or a tuple `(height, width)` for specific dimensions. |\n",
"| `keras` | `bool` | `False` | Enables export to Keras format for [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) SavedModel, providing compatibility with TensorFlow serving and APIs. |\n",
"| `optimize` | `bool` | `False` | Applies optimization for mobile devices when exporting to TorchScript, potentially reducing model size and improving performance. |\n",
"| `half` | `bool` | `False` | Enables FP16 (half-precision) quantization, reducing model size and potentially speeding up inference on supported hardware. |\n",
"| `int8` | `bool` | `False` | Activates INT8 quantization, further compressing the model and speeding up inference with minimal [accuracy](https://www.ultralytics.com/glossary/accuracy) loss, primarily for edge devices. |\n",
"| `dynamic` | `bool` | `False` | Allows dynamic input sizes for ONNX, TensorRT and OpenVINO exports, enhancing flexibility in handling varying image dimensions. |\n",
"| `simplify` | `bool` | `True` | Simplifies the model graph for ONNX exports with `onnxslim`, potentially improving performance and compatibility. |\n",
"| `opset` | `int` | `None` | Specifies the ONNX opset version for compatibility with different ONNX parsers and runtimes. If not set, uses the latest supported version. |\n",
"| `workspace` | `float` | `4.0` | Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance. |\n",
"| `nms` | `bool` | `False` | Adds Non-Maximum Suppression (NMS) to the CoreML export, essential for accurate and efficient detection post-processing. |\n",
"| `batch` | `int` | `1` | Specifies export model batch inference size or the max number of images the exported model will process concurrently in `predict` mode. |\n",
"| `device` | `str` | `None` | Specifies the device for exporting: GPU (`device=0`), CPU (`device=cpu`), MPS for Apple silicon (`device=mps`) or DLA for NVIDIA Jetson (`device=dla:0` or `device=dla:1`). |\n"
"| 参数 | 类型 | 默认值 | 描述 |\n",
"| ------------ | --------------- | ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n",
"| `format` | `str` | `'torchscript'` | 导出模型的目标格式,例如 `'onnx'`, `'torchscript'`, `'tensorflow'` 等,定义了与各种部署环境的兼容性。 |\n",
"| `imgsz` | `int` or `tuple` | `640` | 模型输入的期望图像大小。对于方形图像可以是整数,或为特定维度的元组 `(height, width)`。 |\n",
"| `keras` | `bool` | `False` | 启用导出到 Keras 格式,用于 [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) SavedModel,提供与 TensorFlow 服务和 API 的兼容性。 |\n",
"| `optimize` | `bool` | `False` | 在导出到 TorchScript 时应用针对移动设备的优化,可能减少模型大小并提高性能。 |\n",
"| `half` | `bool` | `False` | 启用 FP16(半精度)量化,减少模型大小并可能在支持的硬件上加快推理速度。 |\n",
"| `int8` | `bool` | `False` | 激活 INT8 量化,进一步压缩模型并在最小 [准确性](https://www.ultralytics.com/glossary/accuracy) 损失下加速推理,主要用于边缘设备。 |\n",
"| `dynamic` | `bool` | `False` | 允许 ONNX、TensorRT 和 OpenVINO 导出的动态输入尺寸,增强处理不同图像尺寸的灵活性。 |\n",
"| `simplify` | `bool` | `True` | 使用 `onnxslim` 简化 ONNX 导出的模型图,可能提高性能和兼容性。 |\n",
"| `opset` | `int` | `None` | 指定兼容不同 ONNX 解析器和运行时的 ONNX opset 版本。如果未设置,则使用最新支持的版本。 |\n",
"| `workspace` | `float` | `4.0` | 设置 TensorRT 优化的最大工作区大小(以 GiB 为单位),平衡内存使用和性能。 |\n",
"| `nms` | `bool` | `False` | 向 CoreML 导出添加非最大抑制(NMS),这对于准确和高效的检测后处理至关重要。 |\n",
"| `batch` | `int` | `1` | 指定导出模型的批处理推理大小或导出模型在 `predict` 模式下将同时处理的最大图像数量。 |\n",
"| `device` | `str` | `None` | 指定导出设备:GPU(`device=0`)、CPU(`device=cpu`)、Apple silicon 的 MPS(`device=mps`)或 NVIDIA Jetson 的 DLA(`device=dla:0` 或 `device=dla:1`)。 |"
]
},
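The defaults from the table above can be collected into a plain dict; a sketch of how user overrides (here `format="onnx"` and `half=True`) merge over them, just as keyword arguments would:

```python
# Export defaults as documented in the table above.
export_defaults = {
    "format": "torchscript",
    "imgsz": 640,
    "keras": False,
    "optimize": False,
    "half": False,
    "int8": False,
    "dynamic": False,
    "simplify": True,
    "opset": None,
    "workspace": 4.0,
    "nms": False,
    "batch": 1,
    "device": None,
}

# User overrides are merged over the defaults, as keyword arguments would be.
args = {**export_defaults, "format": "onnx", "half": True}
print(args["format"], args["half"], args["simplify"])
```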
{
Expand Down Expand Up @@ -218,11 +210,6 @@
"- `optimize`: applies optimizations targeted at mobile or otherwise constrained environments.\n",
"- `int8`: enables INT8 quantization, which is highly beneficial for edge deployments."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
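The INT8 quantization mentioned above maps floating-point weights onto 8-bit integers through a scale factor; a toy sketch of the symmetric scheme (purely illustrative, not the exporter's actual implementation):

```python
# Toy symmetric INT8 quantization: map floats in [-max|w|, max|w|] to [-127, 127].
weights = [0.25, -1.4, 0.75, 3.0, -0.1]
scale = max(abs(w) for w in weights) / 127  # one float scale stored per tensor

quantized = [round(w / scale) for w in weights]  # the int8 payload
dequantized = [q * scale for q in quantized]     # approximate reconstruction
print(quantized)
```

Storing one scale plus 8-bit integers instead of 32-bit floats is what shrinks the model and speeds up inference on INT8-capable hardware.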
],
"metadata": {
Expand Down