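Inference with existing models

MMDetection provides hundreds of pre-trained detection models in the Model Zoo. This note shows how to perform inference, that is, how to use a trained model to detect objects in images.

In MMDetection, a model is defined by a configuration file, and the existing model parameters are saved in a checkpoint file.

To start with, we recommend RTMDet with this configuration file and this checkpoint file. It is recommended to download the checkpoint file to the checkpoints directory.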

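High-level APIs for inference - `Inferencer`

In OpenMMLab, all inference operations are unified into a new interface, the Inferencer. The Inferencer is designed to expose a neat and simple API to users and shares a very similar interface across different OpenMMLab libraries. A notebook demo can be found in demo/inference_demo.ipynb.

Basic Usage

You can get inference results for an image with only 3 lines of code.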

```
from mmdet.apis import DetInferencer

# Initialize the DetInferencer
inferencer = DetInferencer('rtmdet_tiny_8xb32-300e_coco')

# Perform inference
inferencer('demo/demo.jpg', show=True)
```

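The resulting output will be displayed in a new window.

Note

If you are running MMDetection on a server without a GUI, or via an SSH tunnel with X11 forwarding disabled, the show option will not work. However, you can still save the visualizations to files by setting the out_dir argument. Read Dumping Results for details.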
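
The fallback described in the note can be sketched as follows. This is only a minimal sketch: the helper name `visualization_kwargs` is hypothetical, and checking the `DISPLAY` environment variable is a rough heuristic for an X11 display on Linux, not something MMDetection is known to do itself.

```python
import os


def visualization_kwargs(environ=None, out_dir='outputs/'):
    """Return keyword arguments for an Inferencer call.

    Hypothetical helper: uses `show=True` when a display seems to be
    available, and falls back to saving files via `out_dir` otherwise.
    """
    environ = os.environ if environ is None else environ
    if environ.get('DISPLAY'):
        # A display is (probably) reachable, so a pop-up window can work.
        return {'show': True}
    # No display: write the visualization to disk instead.
    return {'out_dir': out_dir}


# Usage (assumes `inferencer` is an already initialized DetInferencer):
# inferencer('demo/demo.jpg', **visualization_kwargs())
```

On a headless server the call then reduces to `inferencer('demo/demo.jpg', out_dir='outputs/')`.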

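Initialization

Each Inferencer must be initialized with a model. You can also choose the inference device during initialization.

Model Initialization

To run inference with a pre-trained MMDetection model, pass its name to the model argument. The weights will be automatically downloaded from OpenMMLab's model zoo and loaded.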
```
inferencer = DetInferencer(model='rtmdet_tiny_8xb32-300e_coco')
```
Listing all model names in MMDetection is easy.
```
# models is a list of model names, and they will be printed automatically
models = DetInferencer.list_models('mmdet')
```
You can load another set of weights by passing its path/URL to weights.
```
inferencer = DetInferencer(model='rtmdet_tiny_8xb32-300e_coco', weights='path/to/rtmdet.pth')
```
To load a custom config and weights, pass the path of the config file to model and the path of the weights to weights.
```
inferencer = DetInferencer(model='path/to/rtmdet_config.py', weights='path/to/rtmdet.pth')
```

By default, MMEngine dumps the config into the checkpoint file. If you have a checkpoint trained with MMEngine, you can therefore pass only its path to weights, without specifying model:

```
# It will raise an error if the config file cannot be found in the weight. Currently, within the MMDetection model repository, only the weights of ddq-detr-4scale_r50 can be loaded in this manner.
inferencer = DetInferencer(weights='https://download.openmmlab.com/mmdetection/v3.0/ddq/ddq-detr-4scale_r50_8xb2-12e_coco/ddq-detr-4scale_r50_8xb2-12e_coco_20230809_170711-42528127.pth')
```

Passing a config file path to model without specifying weights results in a randomly initialized model.
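
The combinations of model and weights described in this section can be summarized in a small sketch. The helper `describe_init` is hypothetical and exists only for illustration; in particular, treating any `model` string that ends in `.py` as a config path is an assumption made here, not the rule the library applies.

```python
def describe_init(model=None, weights=None):
    """Summarize which initialization mode a model/weights pair selects.

    Hypothetical helper for illustration; it treats any `model` ending in
    '.py' as a config file path and anything else as a model name.
    """
    if model is None and weights is None:
        raise ValueError('At least one of model or weights is required.')
    if model is None:
        return 'config read from the checkpoint, weights from `weights`'
    is_config_path = model.endswith('.py')
    if weights is not None:
        source = 'custom config' if is_config_path else 'model-zoo config'
        return f'{source}, weights from `weights`'
    if is_config_path:
        return 'custom config, randomly initialized weights'
    return 'model-zoo config, weights downloaded from the model zoo'


print(describe_init(model='path/to/rtmdet_config.py'))
```

The last case is the one to watch for: a config path without weights runs, but the predictions come from a randomly initialized model.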

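Device

Each Inferencer instance is bound to a device. By default, the best device is determined automatically by MMEngine. You can also change the device by specifying the device argument. For example, the following code creates an Inferencer on GPU 1.

```
inferencer = DetInferencer(model='rtmdet_tiny_8xb32-300e_coco', device='cuda:1')
```

To create an Inferencer on the CPU:

```
inferencer = DetInferencer(model='rtmdet_tiny_8xb32-300e_coco', device='cpu')
```

Refer to torch.device for all the supported forms.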
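
If you prefer to choose the device string yourself, one possible approach is sketched below. The helper `pick_device` is hypothetical; it falls back to the CPU when PyTorch is not importable, when CUDA is unavailable, or when the requested GPU index does not exist.

```python
def pick_device(preferred_gpu=0):
    """Return a device string such as 'cuda:1', or 'cpu' as a fallback."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return 'cpu'
    if torch.cuda.is_available() and preferred_gpu < torch.cuda.device_count():
        return f'cuda:{preferred_gpu}'
    return 'cpu'


device = pick_device(preferred_gpu=1)
print(device)

# Usage:
# inferencer = DetInferencer(model='rtmdet_tiny_8xb32-300e_coco', device=device)
```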

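Inference

Once the Inferencer is initialized, you can directly pass in the raw data to be inferred and get the inference results from the return value.

Input

The input can be one of the following types:

str: A path/URL to the image.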
```
inferencer('demo/demo.jpg')
```

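array: An image as a numpy array. It should be in BGR order.

```
import mmcv
array = mmcv.imread('demo/demo.jpg')
inferencer(array)
```

list: A list of the basic types above. Each element in the list will be processed separately.

```
inferencer(['img_1.jpg', 'img_2.jpg'])
# You can even mix the types
inferencer(['img_1.jpg', array])
```

str: A path to a directory. All images in the directory will be processed.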
```
inferencer('path/to/your_imgs/')
```
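
To make the accepted input types concrete, the sketch below flattens each of them into a plain list of items. It is a rough illustration, not the library's actual implementation: the function name `expand_inputs` and the set of image extensions are assumptions.

```python
import os
import tempfile

IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')  # assumed extension set


def expand_inputs(inputs):
    """Expand an input into a flat list of items to run inference on."""
    if isinstance(inputs, (list, tuple)):
        # A list of paths and/or arrays: each element is handled separately.
        return list(inputs)
    if isinstance(inputs, str) and os.path.isdir(inputs):
        # A directory: every image file inside it is processed.
        names = sorted(os.listdir(inputs))
        return [os.path.join(inputs, n) for n in names
                if n.lower().endswith(IMG_EXTENSIONS)]
    # A single path/URL string or a single array.
    return [inputs]


# Demo on a throwaway directory with two images and one non-image file.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ('b.png', 'a.jpg', 'notes.txt'):
        open(os.path.join(tmp_dir, name), 'w').close()
    found = [os.path.basename(p) for p in expand_inputs(tmp_dir)]

print(found)  # ['a.jpg', 'b.png']
```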

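Output

By default, each Inferencer returns the prediction results as a dictionary.

visualization contains the visualized predictions. It is an empty list by default unless return_vis=True is set.
predictions contains the prediction results in a JSON-serializable format.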

```
{
    'predictions': [
        # Each instance corresponds to an input image
        {
            'labels': [...],  # int list of length (N, )
            'scores': [...],  # float list of length (N, )
            'bboxes': [...],  # 2d list of shape (N, 4), format: [min_x, min_y, max_x, max_y]
        },
        ...
    ],
    'visualization': [
        array(..., dtype=uint8),
    ]
}
```
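
Because the predictions are plain lists, they can be post-processed without any MMDetection code. The sketch below keeps only the detections above a score threshold; the helper name `filter_by_score` and the sample values are made up for illustration.

```python
def filter_by_score(prediction, score_thr=0.5):
    """Keep only detections whose score is at least `score_thr`.

    `prediction` is one element of result['predictions'], with the
    'labels', 'scores' and 'bboxes' keys described in this section.
    """
    keep = [i for i, s in enumerate(prediction['scores']) if s >= score_thr]
    return {
        'labels': [prediction['labels'][i] for i in keep],
        'scores': [prediction['scores'][i] for i in keep],
        'bboxes': [prediction['bboxes'][i] for i in keep],
    }


# Made-up values in the documented format, for illustration only.
example = {
    'labels': [0, 2, 13],
    'scores': [0.91, 0.42, 0.67],
    'bboxes': [[10.0, 20.0, 50.0, 80.0],
               [0.0, 0.0, 5.0, 5.0],
               [30.0, 40.0, 90.0, 120.0]],
}
filtered = filter_by_score(example, score_thr=0.5)
print(filtered['labels'])  # [0, 13]
```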

If you want the raw outputs of the model, set return_datasamples to True to get the original DataSample objects, which will be stored in predictions.

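Dumping Results

Besides obtaining predictions from the return value, you can also export the predictions/visualizations to files by setting the out_dir and no_save_pred/no_save_vis arguments.

```
inferencer('demo/demo.jpg', out_dir='outputs/', no_save_pred=False)
```

The results are saved in the following directory structure: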

```
outputs
├── preds
│   └── demo.json
└── vis
    └── demo.jpg
```

Each file is named after the corresponding input image. If the input image is an array, the file name will be a number starting from 0.
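
A dumped prediction file can be read back with the standard library. The sketch below assumes that the JSON file holds the same `labels`, `scores` and `bboxes` keys as the returned prediction dictionary; the helper names and the sample values are hypothetical. It also converts the boxes to the `[x, y, width, height]` layout, which some tools expect.

```python
import json
import os
import tempfile


def xyxy_to_xywh(bbox):
    """Convert [min_x, min_y, max_x, max_y] to [x, y, width, height]."""
    min_x, min_y, max_x, max_y = bbox
    return [min_x, min_y, max_x - min_x, max_y - min_y]


def load_prediction(json_path):
    """Load one dumped prediction file and add width/height style boxes."""
    with open(json_path) as f:
        pred = json.load(f)
    pred['bboxes_xywh'] = [xyxy_to_xywh(b) for b in pred['bboxes']]
    return pred


# Stand-in for outputs/preds/demo.json, written with made-up values.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.json')
    with open(path, 'w') as f:
        json.dump({'labels': [0], 'scores': [0.9],
                   'bboxes': [[10, 20, 50, 80]]}, f)
    loaded = load_prediction(path)

print(loaded['bboxes_xywh'])  # [[10, 20, 40, 60]]
```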

Batch Inference

You can customize the batch size by setting batch_size. The default batch size is 1.
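
Conceptually, batch_size controls how many inputs are grouped into one forward pass. The grouping can be pictured with the sketch below; the `chunk` helper is hypothetical and only illustrates the idea, it is not how the Inferencer is implemented.

```python
def chunk(items, batch_size=1):
    """Yield consecutive groups of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError('batch_size must be a positive integer')
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


paths = ['img_1.jpg', 'img_2.jpg', 'img_3.jpg', 'img_4.jpg', 'img_5.jpg']
batches = list(chunk(paths, batch_size=2))
print(len(batches))  # 3

# The corresponding Inferencer call would be:
# inferencer(paths, batch_size=2)
```

The last batch may be smaller than batch_size when the number of inputs is not a multiple of it.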

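API

Here is the full list of arguments you can use.

DetInferencer.__init__()

Arguments	Type	Default	Description
`model`	str, optional	None	Path to the config file or the model name defined in the metafile. For example, it can be 'rtmdet-s', 'rtmdet_s_8xb32-300e_coco' or 'configs/rtmdet/rtmdet_s_8xb32-300e_coco.py'. If model is not specified, the user must provide `weights` saved by MMEngine, which contain the config string.
`weights`	str, optional	None	Path to the checkpoint. If it is not specified and `model` is a model name from the metafile, the weights will be loaded from the metafile.
`device`	str, optional	None	Device used for inference, accepting all strings allowed by `torch.device`, e.g. 'cuda:0' or 'cpu'. If None, the available device will be used automatically.
`scope`	str, optional	'mmdet'	The scope of the model.
`palette`	str	'none'	Color palette used for visualization. The order of priority is palette -> config -> checkpoint.
`show_progress`	bool	True	Whether to display a progress bar during inference.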

DetInferencer.__call__()

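Arguments	Type	Default	Description
`inputs`	str/list/tuple/np.array	required	It can be a path to an image or a folder, an np array, or a list/tuple (of image paths or np arrays).
`batch_size`	int	1	Inference batch size.
`return_vis`	bool	False	Whether to return the visualization result.
`show`	bool	False	Whether to display the visualization results in a popup window.
`wait_time`	float	0	The display interval.
`no_save_vis`	bool	False	Whether to force not to save the prediction visualization results.
`draw_pred`	bool	True	Whether to draw the predicted bounding boxes.
`pred_score_thr`	float	0.3	Minimum score of bounding boxes to draw.
`return_datasamples`	bool	False	Whether to return results as DataSamples. If False, the results will be packed into a dict.
`print_result`	bool	False	Whether to print the inference result to the console.
`no_save_pred`	bool	True	Whether to force not to save the prediction results.
`out_dir`	str	''	Output directory of results.
`texts`	str/list[str], optional	None	Text prompts.
`stuff_texts`	str/list[str], optional	None	Stuff text prompts for open panoptic tasks.
`custom_entities`	bool	False	Whether to use custom entities. Only used in GLIP.
`**kwargs`			Other keyword arguments passed to `preprocess`, `forward`, `visualize` and `postprocess`. Each key in kwargs should be in the corresponding set of `preprocess_kwargs`, `forward_kwargs`, `visualize_kwargs` and `postprocess_kwargs`.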
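
Several of these call arguments are typically combined. The sketch below wraps one such combination in a hypothetical helper, `detect_and_dump`; it takes the inferencer as a parameter, so a small stand-in class (`RecordingInferencer`, also hypothetical) is enough to run it without MMDetection installed.

```python
def detect_and_dump(inferencer, image, out_dir='outputs/', score_thr=0.5):
    """Run inference and save both predictions and visualizations.

    `inferencer` is any callable with the DetInferencer.__call__ interface.
    """
    return inferencer(
        image,
        out_dir=out_dir,           # where preds/ and vis/ are written
        no_save_pred=False,        # also dump predictions (default is True)
        pred_score_thr=score_thr,  # minimum score of the drawn boxes
    )


class RecordingInferencer:
    """Stand-in used here so the sketch runs without MMDetection."""

    def __call__(self, inputs, **kwargs):
        self.inputs, self.kwargs = inputs, kwargs
        return {'predictions': [], 'visualization': []}


stub = RecordingInferencer()
result = detect_and_dump(stub, 'demo/demo.jpg')
print(sorted(stub.kwargs))  # ['no_save_pred', 'out_dir', 'pred_score_thr']
```

With a real `DetInferencer` instance in place of the stub, the same call should write the files under `outputs/`.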

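Demos

We also provide four demo scripts, implemented with the high-level APIs and supporting code. The source code is available here.

Image demo

This script performs inference on a single image.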

```
python demo/image_demo.py \
    ${IMAGE_FILE} \
    ${CONFIG_FILE} \
    [--weights ${WEIGHTS}] \
    [--device ${GPU_ID}] \
    [--pred-score-thr ${SCORE_THR}]
```

Examples:

```
python demo/image_demo.py demo/demo.jpg \
    configs/rtmdet/rtmdet_l_8xb32-300e_coco.py \
    --weights checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth \
    --device cpu
```
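
When the demo has to be launched from another Python program, the command line can be assembled as a list of arguments. This is a sketch under the assumption that only the options shown in the synopsis are needed; the helper name `image_demo_command` is hypothetical.

```python
import shlex


def image_demo_command(image, model, weights=None, device=None,
                       pred_score_thr=None):
    """Build the argument list for demo/image_demo.py."""
    cmd = ['python', 'demo/image_demo.py', image, model]
    if weights is not None:
        cmd += ['--weights', weights]
    if device is not None:
        cmd += ['--device', device]
    if pred_score_thr is not None:
        cmd += ['--pred-score-thr', str(pred_score_thr)]
    return cmd


cmd = image_demo_command(
    'demo/demo.jpg',
    'configs/rtmdet/rtmdet_l_8xb32-300e_coco.py',
    weights='checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth',
    device='cpu',
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.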

Webcam demo

This is a live demo from a webcam.

```
python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA_ID}] \
    [--score-thr ${SCORE_THR}]
```

Examples:

```
python demo/webcam_demo.py \
    configs/rtmdet/rtmdet_l_8xb32-300e_coco.py \
    checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth
```

Video demo

This script performs inference on a video.

```
python demo/video_demo.py \
    ${VIDEO_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--out ${OUT_FILE}] \
    [--show] \
    [--wait-time ${WAIT_TIME}]
```

Examples:

```
python demo/video_demo.py demo/demo.mp4 \
    configs/rtmdet/rtmdet_l_8xb32-300e_coco.py \
    checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth \
    --out result.mp4
```

Video demo with GPU acceleration

This script performs inference on a video with GPU acceleration.

```
python demo/video_gpuaccel_demo.py \
    ${VIDEO_FILE} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--score-thr ${SCORE_THR}] \
    [--nvdecode] \
    [--out ${OUT_FILE}] \
    [--show] \
    [--wait-time ${WAIT_TIME}]
```

Examples:

```
python demo/video_gpuaccel_demo.py demo/demo.mp4 \
    configs/rtmdet/rtmdet_l_8xb32-300e_coco.py \
    checkpoints/rtmdet_l_8xb32-300e_coco_20220719_112030-5a0be7c4.pth \
    --nvdecode --out result.mp4
```

Large-image inference demo

This is a script for sliced inference on large images.

```
python demo/large_image_demo.py \
    ${IMG_PATH} \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    --device ${GPU_ID} \
    --show \
    --tta \
    --score-thr ${SCORE_THR} \
    --patch-size ${PATCH_SIZE} \
    --patch-overlap-ratio ${PATCH_OVERLAP_RATIO} \
    --merge-iou-thr ${MERGE_IOU_THR} \
    --merge-nms-type ${MERGE_NMS_TYPE} \
    --batch-size ${BATCH_SIZE} \
    --debug \
    --save-patch
```
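
The role of the patch size and the patch overlap ratio can be illustrated with a short sketch that computes the windows a large image would be cut into. This is an assumption-laden illustration of sliced inference in general, not the code used by demo/large_image_demo.py; the function names and the numeric values are chosen for the example.

```python
def axis_starts(length, patch, step):
    """Start offsets of the windows along one image axis."""
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch, step))
    starts.append(length - patch)  # last window is flush with the border
    return starts


def slice_windows(img_w, img_h, patch_size=640, overlap_ratio=0.25):
    """Return (x1, y1, x2, y2) windows that cover the whole image."""
    if not 0 <= overlap_ratio < 1:
        raise ValueError('overlap_ratio must be in [0, 1)')
    # Neighbouring windows share roughly patch_size * overlap_ratio pixels.
    step = max(1, int(patch_size * (1 - overlap_ratio)))
    windows = []
    for y in axis_starts(img_h, patch_size, step):
        for x in axis_starts(img_w, patch_size, step):
            windows.append((x, y,
                            min(x + patch_size, img_w),
                            min(y + patch_size, img_h)))
    return windows


windows = slice_windows(1000, 700, patch_size=640, overlap_ratio=0.25)
print(windows)
# [(0, 0, 640, 640), (360, 0, 1000, 640), (0, 60, 640, 700), (360, 60, 1000, 700)]
```

A larger overlap ratio produces more windows, which makes it less likely that an object is cut at a patch border, at the cost of more forward passes.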

Examples:

```
# inference without tta
wget -P checkpoint https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_fpn_2x_coco/faster_rcnn_r101_fpn_2x_coco_bbox_mAP-0.398_20200504_210455-1d2dac9c.pth

python demo/large_image_demo.py \
    demo/large_image.jpg \
    configs/faster_rcnn/faster-rcnn_r101_fpn_2x_coco.py \
    checkpoint/faster_rcnn_r101_fpn_2x_coco_bbox_mAP-0.398_20200504_210455-1d2dac9c.pth

# inference with tta
wget -P checkpoint https://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_1x_coco/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth

python demo/large_image_demo.py \
    demo/large_image.jpg \
    configs/retinanet/retinanet_r50_fpn_1x_coco.py \
    checkpoint/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth --tta
```
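
After sliced inference, detections from overlapping patches have to be merged, which is what the merge IoU threshold controls. The sketch below shows one common way to do this: shift patch-local boxes into image coordinates, then apply a plain greedy, class-agnostic NMS. It is an illustration with made-up values and hypothetical function names, and it does not claim to reproduce the merge strategies offered by the demo script.

```python
def to_image_coords(box, offset_x, offset_y):
    """Shift a patch-local [x1, y1, x2, y2] box by the patch offset."""
    x1, y1, x2, y2 = box
    return [x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y]


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thr=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# (patch offset, patch-local box, score); all values are made up.
detections = [
    ((0, 0), [400, 100, 500, 200], 0.9),
    ((360, 0), [42, 100, 142, 200], 0.8),   # same object, seen in patch 2
    ((360, 0), [200, 300, 260, 360], 0.7),
]
boxes = [to_image_coords(b, ox, oy) for (ox, oy), b, _ in detections]
scores = [s for _, _, s in detections]
kept = greedy_nms(boxes, scores, iou_thr=0.5)
print([boxes[i] for i in kept])
# [[400, 100, 500, 200], [560, 300, 620, 360]]
```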

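Multi-modal algorithm inference demo and evaluation

With the continued development of multimodal vision algorithms, MMDetection also supports such algorithms. This section uses the GLIP algorithm and model as an example to show how to use the demo and evaluation scripts for multimodal algorithms. In addition, MMDetection integrates a gradio_demo project, which lets developers quickly try all image-input tasks in MMDetection on a local device. Check the document for more details.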

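Preparation

First, make sure that the correct dependencies are installed:

```
# if installed from source
pip install -r requirements/multimodal.txt

# if installed from a wheel
mim install mmdet[multimodal]
```

MMDetection has already implemented GLIP and provides the weights, which you can download directly from the URL:

```
cd mmdetection
wget https://download.openmmlab.com/mmdetection/v3.0/glip/glip_tiny_a_mmdet-b3654169.pth
```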

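Inference

Once the model has been downloaded, you can run inference with the demo/image_demo.py script.

```
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts bench
```

The demo result will look similar to this.

To detect multiple targets, declare them after --texts in the format xx. xx.

```
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts 'bench. car'
```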
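
When the target names come from a Python list, the `xx. xx` prompt string can be built programmatically. The helpers below are hypothetical and only illustrate the format; `split_entities` is a naive inverse for round-trip checks, not the parser that MMDetection uses.

```python
def join_entities(names):
    """Join category names into the 'xx. xx' prompt format."""
    cleaned = [name.strip() for name in names if name.strip()]
    return '. '.join(cleaned)


def split_entities(prompt):
    """Naively split an 'xx. xx' prompt back into its category names."""
    return [part.strip() for part in prompt.split('.') if part.strip()]


prompt = join_entities(['bench', 'car'])
print(prompt)  # bench. car
```

The resulting string can then be passed as the value of --texts, quoted so that the shell treats it as a single argument.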

The result will look similar to this.

You can also use a sentence as the input prompt for the --texts field, for example:

```
python demo/image_demo.py demo/demo.jpg glip_tiny_a_mmdet-b3654169.pth --texts 'There are a lot of cars here.'
```