Tutorial 1: Adding New Dataset
In this tutorial, we introduce the basic steps to create your customized dataset.
If your algorithm does not need a customized dataset, you can use the off-the-shelf datasets under datasets. To use them, however, you must first convert your data to the format the existing dataset expects.
An example of a customized dataset
Assume your dataset's annotation file has the following format:
000001.jpg 0
000002.jpg 1
To write a new dataset, you need to implement:
DataSource: inherited from BaseDataSource and responsible for loading the annotation files and reading images.
Dataset: inherited from BaseDataset and responsible for applying transformations to images and packing these images.
Creating the DataSource
Assuming your DataSource is named NewDataSource, create a file named new_data_source.py under mmselfsup/datasets/data_sources and implement NewDataSource in it.
import mmcv
import numpy as np

from ..builder import DATASOURCES
from .base import BaseDataSource


@DATASOURCES.register_module()
class NewDataSource(BaseDataSource):

    def load_annotations(self):
        assert isinstance(self.ann_file, str)
        data_infos = []
        # write your code here
        return data_infos
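As one possible way to fill in load_annotations for the annotation format shown earlier, the sketch below parses "filename label" lines into a list of dicts. It is written as a standalone function so that it runs without mmselfsup installed. The helper name parse_annotation_lines, the data_prefix argument, and the dict keys img_prefix, img_info and gt_label are assumptions chosen for illustration, so check BaseDataSource in your version for the keys it actually expects.

```python
def parse_annotation_lines(lines, data_prefix=None):
    """Turn 'filename label' lines into a list of sample dicts.

    Hypothetical helper: the dict keys are illustrative assumptions,
    not a confirmed mmselfsup contract.
    """
    data_infos = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace: 'name.jpg 0' -> name, label
        filename, label = line.rsplit(maxsplit=1)
        data_infos.append({
            'img_prefix': data_prefix,
            'img_info': {'filename': filename},
            'gt_label': int(label),
        })
    return data_infos


# The two annotation lines from the example above.
sample_lines = ['000001.jpg 0', '000002.jpg 1']
infos = parse_annotation_lines(sample_lines, data_prefix='data/images')
```

Inside NewDataSource.load_annotations, the lines would typically come from reading self.ann_file, and the returned list plays the role of data_infos.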
Then, add NewDataSource in mmselfsup/datasets/data_sources/__init__.py.
from .base import BaseDataSource
...
from .new_data_source import NewDataSource
__all__ = [
    'BaseDataSource', ..., 'NewDataSource'
]
Creating the Dataset
Assuming your Dataset is named NewDataset, create a file named new_dataset.py under mmselfsup/datasets and implement NewDataset in it.
# Copyright (c) OpenMMLab. All rights reserved.
import torch
from mmcv.utils import build_from_cfg
from torchvision.transforms import Compose

from .base import BaseDataset
from .builder import DATASETS, PIPELINES, build_datasource
from .utils import to_numpy


@DATASETS.register_module()
class NewDataset(BaseDataset):

    def __init__(self, data_source, num_views, pipelines, prefetch=False):
        # write your code here
        pass

    def __getitem__(self, idx):
        # write your code here; it should produce `img`
        return dict(img=img)

    def evaluate(self, results, logger=None):
        return NotImplemented
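To make the roles of num_views and pipelines more concrete, here is a framework-free sketch of what __init__ and __getitem__ might do. It assumes that each entry of pipelines is a callable and that the i-th pipeline is applied num_views[i] times per sample. The tutorial does not spell out these semantics, so treat them as an assumption. The class name MultiViewSketch and the toy transforms double and negate are hypothetical.

```python
class MultiViewSketch:
    """Framework-free stand-in for a multi-view dataset (illustrative only)."""

    def __init__(self, samples, num_views, pipelines):
        assert len(num_views) == len(pipelines)
        self.samples = samples
        # Repeat each pipeline as many times as its entry in num_views.
        self.transforms = []
        for pipeline, n in zip(pipelines, num_views):
            self.transforms.extend([pipeline] * n)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img = self.samples[idx]
        # One output view per entry in self.transforms.
        views = [transform(img) for transform in self.transforms]
        return dict(img=views)


def double(x):
    """Toy 'augmentation' standing in for a real image pipeline."""
    return x * 2


def negate(x):
    """Second toy 'augmentation'."""
    return -x


dataset = MultiViewSketch(
    samples=[1, 2, 3], num_views=[2, 1], pipelines=[double, negate])
item = dataset[1]
```

In a real NewDataset the samples would come from the data source and the pipelines would be image transforms, but the bookkeeping would look similar.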
Then, add NewDataset in mmselfsup/datasets/__init__.py.
from .base import BaseDataset
...
from .new_dataset import NewDataset
__all__ = [
    'BaseDataset', ..., 'NewDataset'
]
Modify config file
To use NewDataset, modify the config as follows:
# train_pipeline and prefetch are expected to be defined earlier in the config
data = dict(
    train=dict(
        type='NewDataset',
        data_source=dict(
            type='NewDataSource',
        ),
        num_views=[2],
        pipelines=[train_pipeline],
        prefetch=prefetch,
    ))
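The type strings in the config only resolve if the corresponding classes have been registered, which is why the register_module() decorators and the __init__.py imports matter. The sketch below imitates that mechanism with a toy registry. ToyRegistry, TOY_DATASOURCES, TOY_DATASETS and the ann_file value 'train.txt' are hypothetical stand-ins, and this is not mmcv's actual Registry implementation.

```python
class ToyRegistry:
    """Minimal name-to-class registry (illustrative, not mmcv's Registry)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            # Runs when the class is defined, i.e. at import time.
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is not mutated
        cls_name = args.pop('type')
        if cls_name not in self._modules:
            raise KeyError(f'{cls_name} is not registered in {self.name}')
        return self._modules[cls_name](**args)


TOY_DATASOURCES = ToyRegistry('datasource')
TOY_DATASETS = ToyRegistry('dataset')


@TOY_DATASOURCES.register_module()
class NewDataSource:
    def __init__(self, ann_file=None):
        self.ann_file = ann_file


@TOY_DATASETS.register_module()
class NewDataset:
    def __init__(self, data_source, num_views, pipelines, prefetch=False):
        # The nested dict is itself resolved through a registry.
        self.data_source = TOY_DATASOURCES.build(data_source)
        self.num_views = num_views
        self.pipelines = pipelines
        self.prefetch = prefetch


train_cfg = dict(
    type='NewDataset',
    data_source=dict(type='NewDataSource', ann_file='train.txt'),
    num_views=[2],
    pipelines=[[]],
    prefetch=False,
)
train_dataset = TOY_DATASETS.build(train_cfg)
```

If a class is never imported, its decorator never runs, and looking up its name fails much as the KeyError branch does here.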