# Customized Modulated Pretraining


MQ-Det supports modulated training on any dataset in COCO format. Let's take COCO as an example.

To conduct customized modulated training, follow these steps.

1. Add the customized dataset information in two places of the code:

   1. L120: add it to `DatasetCatalog`

   2. L394: add it to the factory

NOTE: make sure the dataset name contains "_grounding".

Here we add a new dataset, `coco_grounding_train_for_obj365`.
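Below is a minimal sketch of the two additions. It assumes the GLIP-style `maskrcnn_benchmark/config/paths_catalog.py` layout that MQ-Det builds on; the factory name `CocoGrounding` and the path values here are illustrative assumptions, not verbatim repo code.

```python
# Sketch of registering a new grounding dataset (names and paths are
# assumptions; adapt to the actual catalog in your checkout).

# Around L120: add an entry to the dataset catalog.
# NOTE: the dataset name must contain "_grounding".
DATASETS = {
    # ... existing entries ...
    "coco_grounding_train_for_obj365": {
        "img_dir": "coco/train2017",
        "ann_file": "coco/annotations/instances_train2017.json",
    },
}

# Around L394: resolve the new name to a dataset factory.
def get(name):
    if "coco_grounding" in name:
        attrs = DATASETS[name]
        return dict(
            factory="CocoGrounding",  # grounding-style COCO loader (assumed name)
            args=dict(img_folder=attrs["img_dir"], ann_file=attrs["ann_file"]),
        )
    raise RuntimeError(f"Dataset not available: {name}")
```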

2. Acquire customized config files

You can modify the official pretraining config file to obtain a customized one. Here we provide an example, modified from `mq-glip-t.yaml`, with a "NOTE" comment on every modification. You can adapt it to your own needs by following the "NOTE" comments in the file.

Make sure to set `DATASETS.TRAIN` and `VISION_QUERY.QUERY_BANK_PATH` correctly.

Here we use a new config file, `configs/pretrain/mq-glip-t_coco.yaml`.
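As a rough sketch, the customized parts of such a config might look like the excerpt below (field values are illustrative; the real file contains many more settings):

```yaml
# Illustrative excerpt of configs/pretrain/mq-glip-t_coco.yaml.
DATASETS:
  # NOTE: train on the customized grounding dataset registered in step 1.
  TRAIN: ("coco_grounding_train_for_obj365",)
VISION_QUERY:
  # NOTE: must point to the query bank extracted in step 3.
  QUERY_BANK_PATH: "MODEL/coco_query_5000_sel_tiny.pth"
```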

3. Extract vision queries

```bash
python tools/train_net.py \
    --config-file configs/pretrain/mq-glip-t_coco.yaml \
    --extract_query \
    VISION_QUERY.QUERY_BANK_PATH "" \
    VISION_QUERY.QUERY_BANK_SAVE_PATH MODEL/coco_query_5000_sel_tiny.pth
```

This produces a new query bank at `MODEL/coco_query_5000_sel_tiny.pth`. Make sure that `VISION_QUERY.QUERY_BANK_PATH` in the config file points to this query bank.

You can set `VISION_QUERY.MAX_QUERY_NUMBER` (the number of queries stored per category in the bank; default 5000) to control the bank size.
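If you want to sanity-check the extracted bank before training, a quick inspection like the following can help. The bank's internal structure is an assumption here; we only rely on it being a `torch.save`d object:

```python
# Inspect the extracted query bank (structure assumed: a torch-serialized
# object, typically a dict keyed by category).
import torch

bank = torch.load("MODEL/coco_query_5000_sel_tiny.pth", map_location="cpu")
print(type(bank))

if isinstance(bank, dict):
    # Print a few entries to confirm categories and query tensor shapes.
    for key, value in list(bank.items())[:5]:
        print(key, getattr(value, "shape", type(value)))
```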

4. Conduct modulated pretraining

```bash
python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py \
    --config-file configs/pretrain/mq-glip-t_coco.yaml \
    --use-tensorboard \
    OUTPUT_DIR 'OUTPUT/MQ-GLIP-TINY-COCO/'
```

You can set `VISION_QUERY.NUM_QUERY_PER_CLASS` (default 5) to control the number of vision queries used for each category in one forward pass during training.
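For example, an illustrative override appended in the same key-value style as the commands above (the value 10 is arbitrary):

```bash
# Train with 10 vision queries per category per forward pass.
python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py \
    --config-file configs/pretrain/mq-glip-t_coco.yaml \
    --use-tensorboard \
    VISION_QUERY.NUM_QUERY_PER_CLASS 10 \
    OUTPUT_DIR 'OUTPUT/MQ-GLIP-TINY-COCO/'
```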