mmpretrain/docs/en/api/models.rst


.. role:: hidden
   :class: hidden-section

.. module:: mmpretrain.models

mmpretrain.models
===================================

The ``models`` package contains several sub-packages for addressing the different components of a model.

- :mod:`~mmpretrain.models.classifiers`: The top-level module which defines the whole process of a classification model.
- :mod:`~mmpretrain.models.selfsup`: The top-level module which defines the whole process of a self-supervised learning model.
- :mod:`~mmpretrain.models.retrievers`: The top-level module which defines the whole process of a retrieval model.
- :mod:`~mmpretrain.models.backbones`: Usually a feature extraction network, e.g., ResNet, MobileNet.
- :mod:`~mmpretrain.models.necks`: The component between backbones and heads, e.g., GlobalAveragePooling.
- :mod:`~mmpretrain.models.heads`: The component for specific tasks.
- :mod:`~mmpretrain.models.losses`: Loss functions.
- :mod:`~mmpretrain.models.utils`: Some helper functions and common components used in various networks.

  - :mod:`~mmpretrain.models.utils.data_preprocessor`: The component placed before the model to preprocess the inputs, e.g., ClsDataPreprocessor.
  - :ref:`components`: Common components used in various networks.
  - :ref:`helpers`: Helper functions.
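
These sub-packages correspond to the pieces of a typical model config. Below is a minimal,
illustrative sketch; the exact fields depend on the chosen modules, see the generated pages
of each class for the full argument lists:

.. code-block:: python

   # An ImageClassifier is composed of a backbone, an optional neck and a head,
   # each configured by a dict whose ``type`` names a class from the sub-packages above.
   model = dict(
       type='ImageClassifier',
       backbone=dict(type='ResNet', depth=50),
       neck=dict(type='GlobalAveragePooling'),
       head=dict(
           type='LinearClsHead',
           num_classes=1000,
           in_channels=2048,
           loss=dict(type='CrossEntropyLoss'),
       ),
   )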

Build Functions
---------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   build_classifier
   build_backbone
   build_neck
   build_head
   build_loss
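
Each build function follows the OpenMMLab registry convention: it takes a config ``dict``
whose ``type`` key names a registered class and returns the instantiated module. A small
sketch, reusing the ``model`` config dict from the example above:

.. code-block:: python

   from mmpretrain.models import build_backbone, build_classifier

   # Build a single component from its config dict ...
   backbone = build_backbone(dict(type='ResNet', depth=50))

   # ... or the whole classifier from the composed ``model`` config.
   classifier = build_classifier(model)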

.. module:: mmpretrain.models.classifiers

Classifiers
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   BaseClassifier
   ImageClassifier
   TimmClassifier
   HuggingFaceClassifier

.. module:: mmpretrain.models.selfsup

Self-supervised Algorithms
--------------------------

.. _selfsup_algorithms:

.. autosummary::
   :toctree: generated
   :nosignatures:

   BaseSelfSupervisor
   BEiT
   BYOL
   BarlowTwins
   CAE
   DenseCL
   EVA
   iTPN
   MAE
   MILAN
   MaskFeat
   MixMIM
   MoCo
   MoCoV3
   SimCLR
   SimMIM
   SimSiam
   SparK
   SwAV

.. _selfsup_backbones:

Some of the above algorithms modify the backbone module to accept extra inputs
such as ``mask``. Here is a list of these **modified backbone** modules.

.. autosummary::
   :toctree: generated
   :nosignatures:

   BEiTPretrainViT
   CAEPretrainViT
   iTPNHiViT
   MAEHiViT
   MAEViT
   MILANViT
   MaskFeatViT
   MixMIMPretrainTransformer
   MoCoV3ViT
   SimMIMSwinTransformer

.. _target_generators:

Some self-supervised algorithms need an external **target generator** to
produce the optimization target. Here is a list of target generators.

.. autosummary::
   :toctree: generated
   :nosignatures:

   VQKD
   DALLEEncoder
   HOGGenerator
   CLIPGenerator

.. module:: mmpretrain.models.retrievers

Retrievers
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   BaseRetriever
   ImageToImageRetriever

.. module:: mmpretrain.models.multimodal

Multi-Modality Algorithms
--------------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   Blip2Caption
   Blip2Retrieval
   Blip2VQA
   BlipCaption
   BlipGrounding
   BlipNLVR
   BlipRetrieval
   BlipVQA
   Flamingo
   OFA
   MiniGPT4
   Llava
   Otter

.. module:: mmpretrain.models.backbones

Backbones
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   AlexNet
   BEiTViT
   CSPDarkNet
   CSPNet
   CSPResNeXt
   CSPResNet
   Conformer
   ConvMixer
   ConvNeXt
   DaViT
   DeiT3
   DenseNet
   DistilledVisionTransformer
   EdgeNeXt
   EfficientFormer
   EfficientNet
   EfficientNetV2
   HiViT
   HRNet
   HorNet
   InceptionV3
   LeNet5
   LeViT
   MViT
   MlpMixer
   MobileNetV2
   MobileNetV3
   MobileOne
   MobileViT
   PCPVT
   PoolFormer
   PyramidVig
   RegNet
   RepLKNet
   RepMLPNet
   RepVGG
   Res2Net
   ResNeSt
   ResNeXt
   ResNet
   ResNetV1c
   ResNetV1d
   ResNet_CIFAR
   RevVisionTransformer
   SEResNeXt
   SEResNet
   SVT
   ShuffleNetV1
   ShuffleNetV2
   SparseResNet
   SparseConvNeXt
   SwinTransformer
   SwinTransformerV2
   T2T_ViT
   TIMMBackbone
   TNT
   VAN
   VGG
   Vig
   VisionTransformer
   ViTSAM
   XCiT
   ViTEVA02
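
Backbones can also be instantiated directly; they return a tuple of feature maps, one per
requested output stage. A minimal sketch (the output indices and shapes depend on the
backbone and the input size):

.. code-block:: python

   import torch
   from mmpretrain.models import ResNet

   backbone = ResNet(depth=50, out_indices=(3, ))
   backbone.eval()

   with torch.no_grad():
       feats = backbone(torch.rand(1, 3, 224, 224))

   # A tuple with one feature map per stage in ``out_indices``.
   print(feats[-1].shape)  # torch.Size([1, 2048, 7, 7])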

.. module:: mmpretrain.models.necks

Necks
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   BEiTV2Neck
   CAENeck
   ClsBatchNormNeck
   DenseCLNeck
   GeneralizedMeanPooling
   GlobalAveragePooling
   HRFuseScales
   LinearNeck
   MAEPretrainDecoder
   MILANPretrainDecoder
   MixMIMPretrainDecoder
   MoCoV2Neck
   NonLinearNeck
   SimMIMLinearDecoder
   SwAVNeck
   iTPNPretrainDecoder
   SparKLightDecoder
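
A neck consumes the tuple of feature maps produced by a backbone and returns a tuple as
well. A small sketch with ``GlobalAveragePooling`` (shapes are illustrative):

.. code-block:: python

   import torch
   from mmpretrain.models import GlobalAveragePooling

   neck = GlobalAveragePooling()

   # Pool a (1, 2048, 7, 7) feature map into a flat (1, 2048) vector.
   pooled = neck((torch.rand(1, 2048, 7, 7), ))
   print(pooled[-1].shape)  # torch.Size([1, 2048])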

.. module:: mmpretrain.models.heads

Heads
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   ArcFaceClsHead
   BEiTV1Head
   BEiTV2Head
   CAEHead
   CSRAClsHead
   ClsHead
   ConformerHead
   ContrastiveHead
   DeiTClsHead
   EfficientFormerClsHead
   LatentCrossCorrelationHead
   LatentPredictHead
   LeViTClsHead
   LinearClsHead
   MAEPretrainHead
   MIMHead
   MixMIMPretrainHead
   MoCoV3Head
   MultiLabelClsHead
   MultiLabelLinearClsHead
   MultiTaskHead
   SimMIMHead
   StackedLinearClsHead
   SwAVHead
   VigClsHead
   VisionTransformerClsHead
   iTPNClipHead
   SparKPretrainHead
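
Classification heads take the tuple output of the neck and produce raw logits in
``forward``, while loss computation and prediction post-processing are exposed through
separate ``loss`` and ``predict`` methods. A minimal sketch with ``LinearClsHead``
(argument values are illustrative):

.. code-block:: python

   import torch
   from mmpretrain.models import LinearClsHead

   head = LinearClsHead(num_classes=1000, in_channels=2048)

   # ``forward`` consumes the tuple from the neck and returns raw class scores.
   cls_score = head((torch.rand(1, 2048), ))
   print(cls_score.shape)  # torch.Size([1, 1000])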

.. module:: mmpretrain.models.losses

Losses
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   AsymmetricLoss
   CAELoss
   CosineSimilarityLoss
   CrossCorrelationLoss
   CrossEntropyLoss
   FocalLoss
   LabelSmoothLoss
   PixelReconstructionLoss
   SeesawLoss
   SwAVLoss
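
Loss modules are ordinary ``nn.Module`` subclasses. For the classification losses, the
forward call typically takes the predicted scores and the target labels; the exact
keyword arguments differ per loss, so the snippet below is only a sketch:

.. code-block:: python

   import torch
   from mmpretrain.models import CrossEntropyLoss, LabelSmoothLoss

   cls_score = torch.rand(4, 10)          # raw logits for 4 samples, 10 classes
   label = torch.randint(0, 10, (4, ))    # ground-truth class indices

   ce = CrossEntropyLoss(loss_weight=1.0)
   ls = LabelSmoothLoss(label_smooth_val=0.1)
   print(ce(cls_score, label), ls(cls_score, label))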

.. module:: mmpretrain.models.utils

models.utils
------------

This package includes some helper functions and common components used in various networks.

.. _components:

Common Components
^^^^^^^^^^^^^^^^^

.. autosummary::
   :toctree: generated
   :nosignatures:

   ConditionalPositionEncoding
   CosineEMA
   HybridEmbed
   InvertedResidual
   LayerScale
   MultiheadAttention
   PatchEmbed
   PatchMerging
   SELayer
   ShiftWindowMSA
   WindowMSA
   WindowMSAV2

.. _helpers:

Helper Functions
^^^^^^^^^^^^^^^^

.. autosummary::
   :toctree: generated
   :nosignatures:

   channel_shuffle
   is_tracing
   make_divisible
   resize_pos_embed
   resize_relative_position_bias_table
   to_ntuple
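
Two of these helpers are self-contained enough for a quick sketch (values are illustrative):

.. code-block:: python

   from mmpretrain.models.utils import make_divisible, to_ntuple

   # Round a channel count to a multiple of 8 without dropping below 90% of the
   # original value, a common trick in efficient architectures such as MobileNet.
   make_divisible(37, 8)   # -> 40

   # Expand a scalar hyper-parameter into an n-tuple.
   to_ntuple(2)(7)         # -> (7, 7)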