# Dataset Conversion

The `tools/dataset_converters` folder currently contains three dataset conversion tools: `balloon2coco.py`, `yolo2coco.py`, and `labelme2coco.py`.

- `balloon2coco.py` converts the `balloon` dataset (a small dataset intended for beginners only) to COCO format.

```shell
python tools/dataset_converters/balloon2coco.py
```

- `yolo2coco.py` converts a dataset from `yolo-style` **.txt** format to COCO format. Use it as follows:

```shell
python tools/dataset_converters/yolo2coco.py /path/to/the/root/dir/of/your_dataset
```
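For orientation, a yolo-style label line is commonly written as `class_id cx cy w h`, with the box center and size normalized to the image dimensions, while COCO stores boxes as `[x_min, y_min, width, height]` in pixels. The following is a minimal sketch of that coordinate conversion, not the code used by `yolo2coco.py`; the function names `yolo_to_coco_bbox` and `parse_yolo_line` are hypothetical and introduced here for illustration only.

```python
def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x, center y, width, height)
    to a COCO box [x_min, y_min, width, height] in pixels."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


def parse_yolo_line(line, img_w, img_h):
    """Parse one label line of the assumed form 'class_id cx cy w h'."""
    class_id, cx, cy, w, h = line.split()
    bbox = yolo_to_coco_bbox(float(cx), float(cy), float(w), float(h),
                             img_w, img_h)
    return int(class_id), bbox


if __name__ == "__main__":
    # A box centered in a 100x200 image, covering half of each dimension.
    print(parse_yolo_line("3 0.5 0.5 0.5 0.5", img_w=100, img_h=200))
```

Because the label coordinates are normalized, the image width and height are needed to recover pixel values, which is why the image files must be available alongside the labels.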

Instructions:

1. `image_dir` is the root directory of the yolo-style dataset that you pass to the script. It should contain `images`, `labels`, and `classes.txt`. `classes.txt` declares the classes of the dataset, one class per line. The root directory should be structured as in this example:

```bash
.
└── $ROOT_PATH
    ├── classes.txt
    ├── labels
    │    ├── a.txt
    │    ├── b.txt
    │    └── ...
    ├── images
    │    ├── a.jpg
    │    ├── b.png
    │    └── ...
    └── ...
```

2. The script automatically checks whether `train.txt`, `val.txt`, and `test.txt` already exist under `image_dir`. If they are found, the script organizes the dataset accordingly. Otherwise, it converts the whole dataset into a single file. The image paths in these files must be **ABSOLUTE** paths.
3. By default, the script creates a folder called `annotations` in the `image_dir` directory to store the converted JSON files. If `train.txt`, `val.txt`, and `test.txt` are not found, the output file is `result.json`. Otherwise, the corresponding JSON files are generated, named `train.json`, `val.json`, and `test.json`. The resulting directory may look similar to this:

```bash
.
└── $ROOT_PATH
    ├── annotations
    │    ├── result.json
    │    └── ...
    ├── classes.txt
    ├── labels
    │    ├── a.txt
    │    ├── b.txt
    │    └── ...
    ├── images
    │    ├── a.jpg
    │    ├── b.png
    │    └── ...
    └── ...
```
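
If the dataset does not yet have split files, they can be generated before running the converter. The sketch below writes `train.txt` and `val.txt` with absolute image paths, as required above. It assumes one image path per line, and the helper name `write_split_files`, the suffix list, and the 80/20 split ratio are hypothetical choices made for illustration, not part of MMYOLO.

```python
import random
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_split_files(root, val_ratio=0.2, seed=0):
    """Write train.txt and val.txt under `root`, one absolute path per line."""
    root = Path(root).resolve()  # resolve() makes every path absolute
    images = sorted(
        p for p in (root / "images").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(images)
    n_val = int(len(images) * val_ratio)
    splits = {"val.txt": images[:n_val], "train.txt": images[n_val:]}
    for name, paths in splits.items():
        (root / name).write_text("".join(f"{p}\n" for p in paths))
    return {name: len(paths) for name, paths in splits.items()}


if __name__ == "__main__":
    # Placeholder path; replace with the root directory of your dataset.
    print(write_split_files("/path/to/the/root/dir/of/your_dataset"))
```

A `test.txt` file can be produced the same way by carving a third slice out of the shuffled list.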