Commit Graph

14 Commits (17a886cb5825cd8c26df4e65f7112d404b99fe12)

Author SHA1 Message Date
ZhangYiqin da1da48eb6
[Enhance] Add iTPN Supports for Non-three channel image (#1735)
* Add channel argments to mae_head

When trying iTPN pretrain, it only supports images with 3 channels. One of the restrictions is from MAEHead.

* Transfer other argments from iTPNHiViT to HiViT

The HiViT supports specifying channels, but the iTPNHiViT class can't pass channel argments to it. This is one of the reasons that iTPNHiViT implementation only support images with 3 channels.

* Update itpn.py

Fix hint problem
2023-09-04 13:11:16 +08:00
fangyixiao18 58a2243d99 Merge branch 'main' into dev 2023-07-28 15:35:55 +08:00
marouane amzil e7fc25cf64
[Fix] Fix nested predict for multi-task prediction. (#1716)
* fix: multi task predict

* change the loop

---------

Co-authored-by: Pierre Colle <piercus@gmail.com>
2023-07-28 13:44:12 +08:00
Nripesh Niketan 4d1dbafaa2
[Enhance] Add GPU Acceleration Apple silicon mac (#1699)
* Add GPU Acceleration Apple silicon mac

* lint fix

* Update launch.py

* Use  to refactor the device selection.

* Update launch.py

---------

Co-authored-by: mzr1996 <mzr1996@163.com>
2023-07-26 17:51:00 +08:00
Yixiao Fang a1cfe888e2
[Feature] Support SparK. (#1531)
* add spark configs

* fix configs

* remove repeat aug

* add module codes

* support lr layer decay of resnet

* update

* fix lint

* add metafile and readme

* fix lint

* add models and logs

* refactor codes

* fix lint

* update model rst

* update name

* add docstring

* add ut

* fix lint

---------

Co-authored-by: Ma Zerun <mzr1996@163.com>
2023-06-19 11:27:50 +08:00
Yixiao Fang e4c4a81b56
[Feature] Support iTPN and HiViT (#1584)
* hivit added

* Update hivit.py

* Update hivit.py

* Add files via upload

* Update __init__.py

* Add files via upload

* Update __init__.py

* Add files via upload

* Update hivit.py

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Update itpn.py

* Add files via upload

* Update __init__.py

* Update mae_hivit-base-p16.py

* Delete mim_itpn-base-p16.py

* Add files via upload

* Update itpn_hivit-base-p16.py

* Update itpn.py

* Update hivit.py

* Update __init__.py

* Update mae.py

* Delete hivit.py

* Update __init__.py

* Delete configs/itpn directory

* Add files via upload

* Add files via upload

* Delete configs/hivit directory

* Add files via upload

* refactor and add metafile and readme

* update clip

* add ut

* update ut

* update

* update docstring

* update model.rst

---------

Co-authored-by: 田运杰 <48153283+sunsmarterjie@users.noreply.github.com>
2023-05-26 12:08:34 +08:00
Ma Zerun 6847d20d57
[Feature] Support multiple multi-modal algorithms and inferencers. (#1561)
* [Feat] Migrate blip caption to mmpretrain. (#50)

* Migrate blip caption to mmpretrain

* minor fix

* support train

* [Feature] Support OFA caption task. (#51)

* [Feature] Support OFA caption task.

* Remove duplicated files.

* [Feature] Support OFA vqa task. (#58)

* [Feature] Support OFA vqa task.

* Fix lint.

* [Feat] Add BLIP retrieval to mmpretrain. (#55)

* init

* minor fix for train

* fix according to comments

* refactor

* Update Blip retrieval. (#62)

* [Feature] Support OFA visual grounding task. (#59)

* [Feature] Support OFA visual grounding task.

* minor add TODO

---------

Co-authored-by: yingfhu <yingfhu@gmail.com>

* [Feat] Add flamingos coco caption and vqa. (#60)

* first init

* init flamingo coco

* add vqa

* minor fix

* remove unnecessary modules

* Update config

* Use `ApplyToList`.

---------

Co-authored-by: mzr1996 <mzr1996@163.com>

* [Feature]: BLIP2 coco retrieval  (#53)

* [Feature]: Add blip2 retriever

* [Feature]: Add blip2 all modules

* [Feature]: Refine model

* [Feature]: x1

* [Feature]: Runnable coco ret

* [Feature]: Runnable version

* [Feature]: Fix lint

* [Fix]: Fix lint

* [Feature]: Use 364 img size

* [Feature]: Refactor blip2

* [Fix]: Fix lint

* refactor files

* minor fix

* minor fix

---------

Co-authored-by: yingfhu <yingfhu@gmail.com>

* Remove

* fix blip caption inputs (#68)

* [Feat] Add BLIP NLVR support. (#67)

* first init

* init flamingo coco

* add vqa

* add nlvr

* refactor nlvr

* minor fix

* minor fix

* Update dataset

---------

Co-authored-by: mzr1996 <mzr1996@163.com>

* [Feature]: BLIP2 Caption (#70)

* [Feature]: Add language model

* [Feature]: blip2 caption forward

* [Feature]: Reproduce the results

* [Feature]: Refactor caption

* refine config

---------

Co-authored-by: yingfhu <yingfhu@gmail.com>

* [Feat] Migrate BLIP VQA to mmpretrain (#69)

* reformat

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* change

* refactor code

---------

Co-authored-by: yingfhu <yingfhu@gmail.com>

* Update RefCOCO dataset

* [Fix] fix lint

* [Feature] Implement inference APIs for multi-modal tasks. (#65)

* [Feature] Implement inference APIs for multi-modal tasks.

* [Project] Add gradio demo.

* [Improve] Update requirements

* Update flamingo

* Update blip

* Add NLVR inferencer

* Update flamingo

* Update hugging face model register

* Update ofa vqa

* Update BLIP-vqa (#71)

* Update blip-vqa docstring (#72)

* Refine flamingo docstring (#73)

* [Feature]: BLIP2 VQA (#61)

* [Feature]: VQA forward

* [Feature]: Reproduce accuracy

* [Fix]: Fix lint

* [Fix]: Add blank line

* minor fix

---------

Co-authored-by: yingfhu <yingfhu@gmail.com>

* [Feature]: BLIP2 docstring (#74)

* [Feature]: Add caption docstring

* [Feature]: Add docstring to blip2 vqa

* [Feature]: Add docstring to retrieval

* Update BLIP-2 metafile and README (#75)

* [Feature]: Add readme and docstring

* Update blip2 results

---------

Co-authored-by: mzr1996 <mzr1996@163.com>

* [Feature] BLIP Visual Grounding on MMPretrain Branch (#66)

* blip grounding merge with mmpretrain

* remove commit

* blip grounding test and inference api

* refcoco dataset

* refcoco dataset refine config

* rebasing

* gitignore

* rebasing

* minor edit

* minor edit

* Update blip-vqa docstring (#72)

* rebasing

* Revert "minor edit"

This reverts commit 639cec757c215e654625ed0979319e60f0be9044.

* blip grounding final

* precommit

* refine config

* refine config

* Update blip visual grounding

---------

Co-authored-by: Yiqin Wang 王逸钦 <wyq1217@outlook.com>
Co-authored-by: mzr1996 <mzr1996@163.com>

* Update visual grounding metric

* Update OFA docstring, README and metafiles. (#76)

* [Docs] Update installation docs and gradio demo docs. (#77)

* Update OFA name

* Update Visual Grounding Visualizer

* Integrate accelerate support

* Fix imports.

* Fix timm backbone

* Update imports

* Update README

* Update circle ci

* Update flamingo config

* Add gradio demo README

* [Feature]: Add scienceqa (#1571)

* [Feature]: Add scienceqa

* [Feature]: Change param name

* Update docs

* Update video

---------

Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
Co-authored-by: yingfhu <yingfhu@gmail.com>
Co-authored-by: Yuan Liu <30762564+YuanLiuuuuuu@users.noreply.github.com>
Co-authored-by: Yiqin Wang 王逸钦 <wyq1217@outlook.com>
Co-authored-by: Rongjie Li <limo97@163.com>
2023-05-19 16:50:04 +08:00
Ezra-Yu 7f4eccbecf
[Fix] Fix multi-task-head loss potential bug (#1530)
* fix bug

* add comments
2023-05-06 18:04:34 +08:00
Wangbo Zhao(黑色枷锁) e954cf0aaf
[Fix] Fix the bug in binary cross entropy loss (#1499)
* [Fix] Fix the bug in binary cross entropy loss

 Fix the bug in binary cross entropy loss when using multi-label datasets e.g.VOC2007

* update ci

---------

Co-authored-by: Ezra-Yu <18586273+Ezra-Yu@users.noreply.github.com>
2023-04-19 13:53:31 +08:00
Ma Zerun dbf3df21a3
[Refactor] Use `out_type` to specify ViT-like backbone output. (#1408)
* [Refactor] Use  to specify ViT-like backbone output.

* Fix ClsBatchNormNeck

* Update mmpretrain/models/necks/mae_neck.py

---------

Co-authored-by: Yixiao Fang <36138628+fangyixiao18@users.noreply.github.com>
2023-03-09 11:02:58 +08:00
Yixiao Fang 08dc8c75d3
[Refactor] Add selfsup algorithms. (#1389)
* remove basehead

* add moco series

* add byol simclr simsiam

* add ut

* update configs

* add simsiam hook

* add and refactor beit

* update ut

* add cae

* update extract_feat

* refactor cae

* add mae

* refactor data preprocessor

* update heads

* add maskfeat

* add milan

* add simmim

* add mixmim

* fix lint

* fix ut

* fix lint

* add eva

* add densecl

* add barlowtwins

* add swav

* fix lint

* update readtherdocs rst

* update docs

* update

* Decrease UT memory usage

* Fix docstring

* update DALLEEncoder

* Update model docs

* refactor dalle encoder

* update docstring

* fix ut

* fix config error

* add val_cfg and test_cfg

* refactor clip generator

* fix lint

* pass check

* fix ut

* add lars

* update type of BEiT in configs

* Use MMEngine style momentum in EMA.

* apply mmpretrain solarize

---------

Co-authored-by: mzr1996 <mzr1996@163.com>
2023-03-06 16:53:15 +08:00
Yixiao Fang 63d9f27fde
[Refactor] Add necks, heads and losses for the self-supervised task. (#1376)
* add necks

* refactor linear neck

* rename simmim neck

* add heads

* add losses

* fix

* add unittest

* update

* update cae

* remove mim head

* update config
2023-02-28 10:05:00 +08:00
Ma Zerun 36bea13fca
[Refactor] Refactor ClsDatasample to a union DataSample. (#1371)
* [Refactor] Refactor ClsDatasample to a union DataSample.

* Add  method

* Fix docstring

* Update docstring.
2023-02-23 10:07:53 +08:00
mzr1996 0979e78573 Rename the package name to `mmpretrain`. 2023-02-17 15:20:55 +08:00