.. role:: hidden
    :class: hidden-section

.. module:: mmpretrain.models

mmpretrain.models
===================================

The ``models`` package contains several sub-packages for addressing the different components of a model.

- :mod:`~mmpretrain.models.classifiers`: The top-level module which defines the whole process of a classification model.
- :mod:`~mmpretrain.models.selfsup`: The top-level module which defines the whole process of a self-supervised learning model.
- :mod:`~mmpretrain.models.retrievers`: The top-level module which defines the whole process of a retrieval model.
- :mod:`~mmpretrain.models.backbones`: Usually a feature extraction network, e.g., ResNet, MobileNet.
- :mod:`~mmpretrain.models.necks`: The component between backbones and heads, e.g., GlobalAveragePooling.
- :mod:`~mmpretrain.models.heads`: The component for specific tasks.
- :mod:`~mmpretrain.models.losses`: Loss functions.
- :mod:`~mmpretrain.models.utils`: Some helper functions and common components used in various networks.

  - :mod:`~mmpretrain.models.utils.data_preprocessor`: The component that preprocesses the inputs before they reach the model, e.g., ClsDataPreprocessor.
  - :ref:`components`: Common components used in various networks.
  - :ref:`helpers`: Helper functions.

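To make the division of labour concrete, the sketch below writes a classifier as a nested config dictionary, with one entry per component slot described in the list above. The class names are taken from the tables on this page; the argument values (``depth=18``, ``in_channels=512``, ``num_classes=1000``) and the ``component_types`` helper are illustrative assumptions, not settings prescribed here.

```python
# A nested config: the top-level classifier owns a backbone, a neck and a head,
# and the head owns its loss. All argument values are illustrative assumptions.
model_cfg = dict(
    type='ImageClassifier',
    backbone=dict(type='ResNet', depth=18),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=1000,
        in_channels=512,
        loss=dict(type='CrossEntropyLoss'),
    ),
)


def component_types(cfg, prefix='model'):
    """Walk the config and report which class each component slot names."""
    found = {prefix: cfg['type']}
    for key, value in cfg.items():
        if isinstance(value, dict) and 'type' in value:
            found.update(component_types(value, f'{prefix}.{key}'))
    return found


for slot, cls_name in component_types(model_cfg).items():
    print(f'{slot}: {cls_name}')
```

The sketch only inspects the dictionary, so it runs without mmpretrain installed. Turning such a dictionary into an actual model is the job of the build functions.
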
Build Functions
---------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   build_classifier
   build_backbone
   build_neck
   build_head
   build_loss

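The build functions follow the config-driven pattern common to OpenMMLab projects: the ``type`` key of a dictionary selects a registered class, and the remaining keys become constructor arguments. The toy registry below is a stand-in to show the idea, not mmpretrain's registry; ``TOY_MODELS``, ``register``, ``build`` and ``ToyBackbone`` are hypothetical names.

```python
TOY_MODELS = {}  # hypothetical registry: class name -> class


def register(cls):
    """Record a class under its own name so configs can refer to it."""
    TOY_MODELS[cls.__name__] = cls
    return cls


def build(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)             # copy so the caller's config is not mutated
    cls_name = args.pop('type')
    if cls_name not in TOY_MODELS:
        raise KeyError(f'{cls_name} is not registered')
    return TOY_MODELS[cls_name](**args)


@register
class ToyBackbone:
    """Placeholder component that only stores its constructor argument."""

    def __init__(self, depth):
        self.depth = depth


backbone = build(dict(type='ToyBackbone', depth=18))
print(type(backbone).__name__, backbone.depth)  # ToyBackbone 18
```

Keeping the class name in the config rather than in code is what lets one build call serve every backbone, neck, head or loss that has been registered.
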
.. module:: mmpretrain.models.classifiers

Classifiers
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   BaseClassifier
   ImageClassifier
   TimmClassifier
   HuggingFaceClassifier

.. module:: mmpretrain.models.selfsup

Self-supervised Algorithms
--------------------------

.. _selfsup_algorithms:

.. autosummary::
   :toctree: generated
   :nosignatures:

   BaseSelfSupervisor
   BEiT
   BYOL
   BarlowTwins
   CAE
   DenseCL
   EVA
   MAE
   MILAN
   MaskFeat
   MixMIM
   MoCo
   MoCoV3
   SimCLR
   SimMIM
   SimSiam
   SwAV

.. _selfsup_backbones:

Some of the above algorithms modify the backbone module to accept extra inputs
like ``mask``. Here is a list of these **modified backbone** modules.

.. autosummary::
   :toctree: generated
   :nosignatures:

   BEiTPretrainViT
   CAEPretrainViT
   MAEViT
   MILANViT
   MaskFeatViT
   MixMIMPretrainTransformer
   MoCoV3ViT
   SimMIMSwinTransformer

.. _target_generators:

Some self-supervised algorithms need an external **target generator** to
generate the optimization target. Here is a list of target generators.

.. autosummary::
   :toctree: generated
   :nosignatures:

   VQKD
   DALLEEncoder
   HOGGenerator
   CLIPGenerator

.. module:: mmpretrain.models.retrievers

Retrievers
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   BaseRetriever
   ImageToImageRetriever

.. module:: mmpretrain.models.multimodal

Multi-Modality Algorithms
--------------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   Blip2Caption
   Blip2Retrieval
   Blip2VQA
   BlipCaption
   BlipGrounding
   BlipNLVR
   BlipRetrieval
   BlipVQA
   Flamingo
   OFA

.. module:: mmpretrain.models.backbones

Backbones
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   AlexNet
   BEiTViT
   CSPDarkNet
   CSPNet
   CSPResNeXt
   CSPResNet
   Conformer
   ConvMixer
   ConvNeXt
   DaViT
   DeiT3
   DenseNet
   DistilledVisionTransformer
   EdgeNeXt
   EfficientFormer
   EfficientNet
   EfficientNetV2
   HRNet
   HorNet
   InceptionV3
   LeNet5
   LeViT
   MViT
   MlpMixer
   MobileNetV2
   MobileNetV3
   MobileOne
   MobileViT
   PCPVT
   PoolFormer
   PyramidVig
   RegNet
   RepLKNet
   RepMLPNet
   RepVGG
   Res2Net
   ResNeSt
   ResNeXt
   ResNet
   ResNetV1c
   ResNetV1d
   ResNet_CIFAR
   RevVisionTransformer
   SEResNeXt
   SEResNet
   SVT
   ShuffleNetV1
   ShuffleNetV2
   SwinTransformer
   SwinTransformerV2
   T2T_ViT
   TIMMBackbone
   TNT
   VAN
   VGG
   Vig
   VisionTransformer
   ViTSAM
   XCiT
   ViTEVA02

.. module:: mmpretrain.models.necks

Necks
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   BEiTV2Neck
   CAENeck
   ClsBatchNormNeck
   DenseCLNeck
   GeneralizedMeanPooling
   GlobalAveragePooling
   HRFuseScales
   LinearNeck
   MAEPretrainDecoder
   MILANPretrainDecoder
   MixMIMPretrainDecoder
   MoCoV2Neck
   NonLinearNeck
   SimMIMLinearDecoder
   SwAVNeck

.. module:: mmpretrain.models.heads

Heads
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   ArcFaceClsHead
   BEiTV1Head
   BEiTV2Head
   CAEHead
   CSRAClsHead
   ClsHead
   ConformerHead
   ContrastiveHead
   DeiTClsHead
   EfficientFormerClsHead
   LatentCrossCorrelationHead
   LatentPredictHead
   LeViTClsHead
   LinearClsHead
   MAEPretrainHead
   MIMHead
   MixMIMPretrainHead
   MoCoV3Head
   MultiLabelClsHead
   MultiLabelLinearClsHead
   MultiTaskHead
   SimMIMHead
   StackedLinearClsHead
   SwAVHead
   VigClsHead
   VisionTransformerClsHead

.. module:: mmpretrain.models.losses

Losses
------------------

.. autosummary::
   :toctree: generated
   :nosignatures:

   AsymmetricLoss
   CAELoss
   CosineSimilarityLoss
   CrossCorrelationLoss
   CrossEntropyLoss
   FocalLoss
   LabelSmoothLoss
   PixelReconstructionLoss
   SeesawLoss
   SwAVLoss

.. module:: mmpretrain.models.utils

models.utils
------------

This package includes some helper functions and common components used in various networks.

.. _components:

Common Components
^^^^^^^^^^^^^^^^^

.. autosummary::
   :toctree: generated
   :nosignatures:

   ConditionalPositionEncoding
   CosineEMA
   HybridEmbed
   InvertedResidual
   LayerScale
   MultiheadAttention
   PatchEmbed
   PatchMerging
   SELayer
   ShiftWindowMSA
   WindowMSA
   WindowMSAV2

.. _helpers:

Helper Functions
^^^^^^^^^^^^^^^^

.. autosummary::
   :toctree: generated
   :nosignatures:

   channel_shuffle
   is_tracing
   make_divisible
   resize_pos_embed
   resize_relative_position_bias_table
   to_ntuple
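
Two of the helpers listed above are small enough to illustrate inline. The functions below are standalone sketches of the usual rounding and broadcasting conventions, assumed here to be what ``make_divisible`` and ``to_ntuple`` provide. The ``_sketch`` names, the ``min_value`` and ``min_ratio`` parameters and their defaults are choices made for this example, so the generated API pages remain the reference for the actual signatures.

```python
from collections.abc import Iterable
from itertools import repeat


def make_divisible_sketch(value, divisor, min_value=None, min_ratio=0.9):
    """Round ``value`` to a nearby multiple of ``divisor``.

    The result is never below ``min_value`` and, when rounding would shrink
    the value below ``min_ratio * value``, one more ``divisor`` is added.
    """
    if min_value is None:
        min_value = divisor
    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if new_value < min_ratio * value:
        new_value += divisor
    return new_value


def to_ntuple_sketch(n):
    """Return a function that broadcasts a scalar to an ``n``-tuple."""
    def parse(x):
        if isinstance(x, Iterable):
            return tuple(x)          # already a sequence: keep its items
        return tuple(repeat(x, n))   # scalar: repeat it n times
    return parse


print(make_divisible_sketch(30, 8))    # 32
print(make_divisible_sketch(10, 8))    # 16
print(to_ntuple_sketch(2)(3))          # (3, 3)
print(to_ntuple_sketch(2)((7, 5)))     # (7, 5)
```

Rounding channel counts to a multiple of a divisor and expanding a single kernel or patch size to one value per dimension are both common steps when a network's widths or shapes are derived from a config.
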