[Fix] Fix `dataset_analysis` ProgressBar incorrect when `RepeatDataset` (#338)

* Fix ProgressBar incorrect when RepeatDataset

* Fix lint

* Improve doc

* Improve doc

* Improve import

* Fix hard code

* Fix hard code
pull/363/head
HinGwenWoong 2022-12-06 10:21:41 +08:00 committed by GitHub
parent 46b2494453
commit 78a23ca8b9
3 changed files with 59 additions and 57 deletions
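
The loop change in `main()` (from iterating the dataset object directly to indexing over `range(len(dataset))`) suggests the progress bar and the loop disagreed on how many samples exist when the dataset is wrapped for repetition. The sketch below uses a toy wrapper, not the real `RepeatDataset` class. It assumes the wrapper reports a multiplied length and wraps indices with a modulo, which is one way such a mismatch can arise:

```python
from itertools import islice


class ToyRepeatDataset:
    """Toy stand-in for a repeat-style wrapper (hypothetical, for illustration)."""

    def __init__(self, items, times):
        self.items = list(items)
        self.times = times

    def __len__(self):
        # Reported length: every item is counted `times` times.
        return self.times * len(self.items)

    def __getitem__(self, index):
        # Indices wrap around, so no index ever raises IndexError.
        return self.items[index % len(self.items)]


dataset = ToyRepeatDataset(['a', 'b', 'c'], times=2)

# Index-based loop: visits exactly len(dataset) samples.
by_index = [dataset[i] for i in range(len(dataset))]

# Plain iteration falls back to __getitem__ with 0, 1, 2, ... and stops
# only on IndexError, which never happens here; islice caps the demo.
by_iteration = list(islice(dataset, 10))

print(len(dataset), len(by_index), len(by_iteration))  # 6 6 10
```

With the index-based loop, the number of iterations always matches the total passed to the progress bar, whatever the wrapper does inside `__getitem__`.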

@ -119,8 +119,11 @@ python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncb
### Visualize dataset analysis
`tools/analysis_tools/dataset_analysis.py` helps users get renderings of the script's four functions and saves the pictures to the `dataset_analysis` folder under the current running directory.
Description of the script's functions:
The data required by each sub-function is obtained through the data preparation in `main()`.
Function 1: Generated by the sub-function `show_bbox_num` to display the distribution of categories and bbox instances.
<img src="https://user-images.githubusercontent.com/90811472/200314770-4fb21626-72f2-4a4c-be5d-bf860ad830ec.jpg"/>
@ -143,17 +146,16 @@ Print List: Generated by the sub function `show_class_list` and `show_data_list`
```shell
python tools/analysis_tools/dataset_analysis.py ${CONFIG} \
[-h] \
[--type ${TYPE}] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--output-dir ${OUTPUT_DIR}]
[--type ${TYPE}] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--out-dir ${OUT_DIR}]
```
E.g.
1.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, By default,the data loadingt type is `train_dataset`, the area rule is `[0,32,96,1e5]`, generate a result graph containing all functions and save the graph to the current running directory `./dataset_analysis` folder:
1.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, By default,the data loading type is `train_dataset`, the area rule is `[0,32,96,1e5]`, generate a result graph containing all functions and save the graph to the current running directory `./dataset_analysis` folder:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py
@ -163,35 +165,35 @@ python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--val-dataset
--val-dataset
```
3.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, change the display of all generated classes to specific classes. Take the display of `person` classes as an example:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--class-name person
--class-name person
```
4.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, redefine the area rule through `--area-rule` . Take `30 70 125` as an example, the area rule becomes `[0,30,70,125,1e5]`
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--area-rule 30 70 125
--area-rule 30 70 125
```
5.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, change the display of four function renderings to only display `Function 1` as an example:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--func show_bbox_num
--func show_bbox_num
```
6.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, modify the picture saving address to `work_ir/dataset_analysis`:
6.Use `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` analyze the dataset, modify the picture saving address to `work_dirs/dataset_analysis`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--output-dir work_dir/dataset_analysis
--out-dir work_dirs/dataset_analysis
```
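
Example 4 above states that `--area-rule 30 70 125` turns the rule into `[0,30,70,125,1e5]`. A minimal sketch of that wrapping, and of how a value could then be assigned to an interval, is shown here. The helper names `build_area_rule` and `area_bin` are hypothetical, the sorting step is an added safeguard, and whether the script compares raw areas or side lengths against the thresholds is left to the caller:

```python
import argparse
from bisect import bisect_right


def build_area_rule(user_values=None):
    """Wrap user thresholds with the 0 and 1e5 bounds shown in the docs."""
    if user_values is None:
        return [0, 32, 96, 1e5]  # documented default
    return [0, *sorted(user_values), 1e5]


def area_bin(value, rule):
    """Return i such that value falls in [rule[i], rule[i + 1])."""
    index = bisect_right(rule, value) - 1
    # Clamp so out-of-range values land in the first or last interval.
    return min(max(index, 0), len(rule) - 2)


parser = argparse.ArgumentParser()
parser.add_argument('--area-rule', type=int, nargs='+', default=None)
args = parser.parse_args(['--area-rule', '30', '70', '125'])

rule = build_area_rule(args.area_rule)
print(rule)                # [0, 30, 70, 125, 100000.0]
print(area_bin(50, rule))  # 1, i.e. the interval [30, 70)
```

Three user thresholds therefore yield four intervals, which matches the `len(area_rule) - 1` legend columns used by `show_bbox_area`.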
## Dataset Conversion

@ -119,8 +119,11 @@ python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncb
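### Visualize dataset analysis
The script `tools/analysis_tools/dataset_analysis.py` helps users obtain result plots for four functions and saves the images to the `dataset_analysis` folder under the current running directory.
Description of the script's functions:
The data required by each sub-function is obtained through the data preparation in `main()`.
Function 1: Displays the distribution of categories and the number of bbox instances, generated by the sub-function `show_bbox_num`.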
<img src="https://user-images.githubusercontent.com/90811472/200314770-4fb21626-72f2-4a4c-be5d-bf860ad830ec.jpg"/>
@ -143,55 +146,55 @@ python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncb
```shell
python tools/analysis_tools/dataset_analysis.py ${CONFIG} \
[-h] \
[--val-dataset ${TYPE}] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--output-dir ${OUTPUT_DIR}]
[-h] \
[--val-dataset ${TYPE}] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--out-dir ${OUT_DIR}]
```
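Examples:
1.Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset. The defaults are: the data loading type is `train_dataset`, the area rule is `[0,32,96,1e5]`, and result plots covering all classes are generated and saved to the `./dataset_analysis` folder under the current running directory:
1. Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset. The defaults are: the data loading type is `train_dataset`, the area rule is `[0,32,96,1e5]`, and result plots covering all classes are generated and saved to the `./dataset_analysis` folder under the current running directory: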
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py
```
2.Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--val-dataset` to change the data loading type from the default `train_dataset` to `val_dataset`:
2. Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--val-dataset` to change the data loading type from the default `train_dataset` to `val_dataset`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--val-dataset
```
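3.Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--class-name` to display a specific class instead of all classes. Taking `person` as an example:
3. Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--class-name` to display a specific class instead of all classes. Taking `person` as an example: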
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--class-name person
```
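4.Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--area-rule` to redefine the area rule. Taking `30 70 125` as an example, the area rule becomes `[0,30,70,125,1e5]`:
4. Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--area-rule` to redefine the area rule. Taking `30 70 125` as an example, the area rule becomes `[0,30,70,125,1e5]`: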
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--area-rule 30 70 125
```
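5.Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--func` to display only `Function 1` instead of all four function plots:
5. Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--func` to display only `Function 1` instead of all four function plots: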
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--func show_bbox_num
```
6.Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--output-dir` to change where the images are saved. Taking `work_ir/dataset_analysis` as an example:
6. Use the `config` file `configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py` to analyze the dataset, using `--out-dir` to change where the images are saved. Taking `work_dirs/dataset_analysis` as an example:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/voc/yolov5_s-v61_fast_1xb64-50e_voc.py \
--output-dir work_dir/dataset_analysis
--out-dir work_dirs/dataset_analysis
```
## Dataset Conversion

@ -47,16 +47,16 @@ def parse_args():
        ],
        help='Dataset analysis function selection.')
    parser.add_argument(
        '--output-dir',
        default='./',
        '--out-dir',
        default='./dataset_analysis',
        type=str,
        help='Save address of dataset analysis visualization results,'
        'Save in "./dataset_analysis/" by default')
        help='Output directory of dataset analysis visualization results,'
        ' Save in "./dataset_analysis/" by default')
    args = parser.parse_args()
    return args
def show_bbox_num(cfg, args, fig_set, class_name, class_num):
def show_bbox_num(cfg, out_dir, fig_set, class_name, class_num):
    """Display the distribution map of categories and number of bbox
    instances."""
    print('\n\nDrawing bbox_num figure:')
@ -73,8 +73,7 @@ def show_bbox_num(cfg, args, fig_set, class_name, class_num):
    plt.ylabel('Num of instances')
    plt.title(cfg.dataset_type)
    # Save figuer
    out_dir = os.path.join(args.output_dir, 'dataset_analysis')
    # Save figure
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    out_name = fig_set['out_name']
@ -86,7 +85,7 @@ def show_bbox_num(cfg, args, fig_set, class_name, class_num):
    print(f'End and save in {out_dir}/{out_name}_bbox_num.jpg')
def show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name):
def show_bbox_wh(out_dir, fig_set, class_bbox_w, class_bbox_h, class_name):
    """Display the width and height distribution of categories and bbox
    instances."""
    print('\n\nDrawing bbox_wh figure:')
@ -97,7 +96,7 @@ def show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name):
    # Set the position of the map and label on the x-axis
    positions_w = list(range(0, 12 * len(class_name), 12))
    positions_h = list(range(6, 12 * len(class_name), 12))
    positions_x_lable = list(range(3, 12 * len(class_name) + 1, 12))
    positions_x_label = list(range(3, 12 * len(class_name) + 1, 12))
    ax.violinplot(
        class_bbox_w, positions_w, showmeans=True, showmedians=True, widths=4)
    ax.violinplot(
@ -152,7 +151,7 @@ def show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name):
fontsize=fig_set['fontsize'])
    # Draw Legend
    plt.setp(ax, xticks=positions_x_lable, xticklabels=class_name)
    plt.setp(ax, xticks=positions_x_label, xticklabels=class_name)
    labels = ['bbox_w', 'bbox_h']
    colors = ['steelblue', 'darkorange']
    patches = [
@ -164,8 +163,7 @@ def show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name):
    ax.set_position([box.x0, box.y0, box.width, box.height * 0.8])
    ax.legend(loc='upper center', handles=patches, ncol=2)
    # Save figuer
    out_dir = os.path.join(args.output_dir, 'dataset_analysis')
    # Save figure
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    out_name = fig_set['out_name']
@ -177,7 +175,7 @@ def show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name):
    print(f'End and save in {out_dir}/{out_name}_bbox_wh.jpg')
def show_bbox_wh_ratio(args, fig_set, class_name, class_bbox_ratio):
def show_bbox_wh_ratio(out_dir, fig_set, class_name, class_bbox_ratio):
    """Display the distribution map of category and bbox instance width and
    height ratio."""
    print('\n\nDrawing bbox_wh_ratio figure:')
@ -224,8 +222,7 @@ def show_bbox_wh_ratio(args, fig_set, class_name, class_bbox_ratio):
    # Set the position of the map and label on the x-axis
    plt.setp(ax, xticks=positions, xticklabels=class_name)
    # Save figuer
    out_dir = os.path.join(args.output_dir, 'dataset_analysis')
    # Save figure
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    out_name = fig_set['out_name']
@ -237,7 +234,7 @@ def show_bbox_wh_ratio(args, fig_set, class_name, class_bbox_ratio):
    print(f'End and save in {out_dir}/{out_name}_bbox_ratio.jpg')
def show_bbox_area(args, fig_set, area_rule, class_name, bbox_area_num):
def show_bbox_area(out_dir, fig_set, area_rule, class_name, bbox_area_num):
    """Display the distribution map of category and bbox instance area based on
    the rules of large, medium and small objects."""
    print('\n\nDrawing bbox_area figure:')
@ -285,8 +282,7 @@ def show_bbox_area(args, fig_set, area_rule, class_name, bbox_area_num):
    ax.set_position([box.x0, box.y0, box.width, box.height * 0.8])
    ax.legend(loc='upper center', handles=patches, ncol=len(area_rule) - 1)
    # Save figuer
    out_dir = os.path.join(args.output_dir, 'dataset_analysis')
    # Save figure
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    out_name = fig_set['out_name']
@ -383,9 +379,6 @@ def main():
        replace_pipeline_to_none(cfg.val_dataloader)
        dataset = DATASETS.build(cfg.val_dataloader.dataset)
    # Build lists to store data for all raw data
    data_list = dataset
    # 2.Prepare data
    # Drawing settings
    fig_all_set = {
@ -440,8 +433,8 @@ def main():
    # Get the quantity and bbox data corresponding to each category
    print('\nRead the information of each picture in the dataset:')
    progress_bar = ProgressBar(len(dataset))
    for img in data_list:
        for instance in img['instances']:
    for index in range(len(dataset)):
        for instance in dataset[index]['instances']:
            if instance[
                    'bbox_label'] in classes_idx and args.class_name is None:
                class_num[instance['bbox_label']] += 1
@ -481,18 +474,22 @@ def main():
    # 3.draw Dataset Information
    if args.func is None:
        show_bbox_num(cfg, args, fig_set, class_name, class_num)
        show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name)
        show_bbox_wh_ratio(args, fig_set, class_name, class_bbox_ratio)
        show_bbox_area(args, fig_set, area_rule, class_name, bbox_area_num)
        show_bbox_num(cfg, args.out_dir, fig_set, class_name, class_num)
        show_bbox_wh(args.out_dir, fig_set, class_bbox_w, class_bbox_h,
                     class_name)
        show_bbox_wh_ratio(args.out_dir, fig_set, class_name, class_bbox_ratio)
        show_bbox_area(args.out_dir, fig_set, area_rule, class_name,
                       bbox_area_num)
    elif args.func == 'show_bbox_num':
        show_bbox_num(cfg, args, fig_set, class_name, class_num)
        show_bbox_num(cfg, args.out_dir, fig_set, class_name, class_num)
    elif args.func == 'show_bbox_wh':
        show_bbox_wh(args, fig_set, class_bbox_w, class_bbox_h, class_name)
        show_bbox_wh(args.out_dir, fig_set, class_bbox_w, class_bbox_h,
                     class_name)
    elif args.func == 'show_bbox_wh_ratio':
        show_bbox_wh_ratio(args, fig_set, class_name, class_bbox_ratio)
        show_bbox_wh_ratio(args.out_dir, fig_set, class_name, class_bbox_ratio)
    elif args.func == 'show_bbox_area':
        show_bbox_area(args, fig_set, area_rule, class_name, bbox_area_num)
        show_bbox_area(args.out_dir, fig_set, area_rule, class_name,
                       bbox_area_num)
    else:
        raise RuntimeError(
            'Please enter the correct func name, e.g., show_bbox_num')
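
The `if`/`elif` chain above repeats every call twice: once in the "run everything" branch and once in its own branch. A possible alternative, not part of this commit, is a name-to-callable table. In the sketch below, `run_selected` and the stub drawers are hypothetical; in the real script each entry would wrap the corresponding `show_*` call with its arguments, for example via `functools.partial`:

```python
def run_selected(func_name, drawers):
    """Run one drawer by name, or all of them when func_name is None."""
    if func_name is None:
        return [draw() for draw in drawers.values()]
    if func_name not in drawers:
        raise RuntimeError(
            'Please enter the correct func name, e.g., show_bbox_num')
    return [drawers[func_name]()]


# Stub drawers standing in for the real plotting functions.
drawers = {
    'show_bbox_num': lambda: 'bbox_num',
    'show_bbox_wh': lambda: 'bbox_wh',
    'show_bbox_wh_ratio': lambda: 'bbox_wh_ratio',
    'show_bbox_area': lambda: 'bbox_area',
}

print(run_selected('show_bbox_num', drawers))  # ['bbox_num']
print(run_selected(None, drawers))
```

Each argument list is then written once, so a later signature change such as `args` to `out_dir` touches one line per function instead of two.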