mmclassification/README_zh-CN.md at b058912c0cc79312cf656ca98b4068c30a9f9cdd


OpenMMLab website (HOT) | OpenMMLab platform (TRY IT OUT)

📘 Documentation | 🛠️ Installation | 👀 Model Zoo | 🆕 Changelog | 🤔 Report an Issue

English | 简体中文

Introduction

MMPreTrain is an open-source deep learning pre-training toolbox based on PyTorch. It is part of the OpenMMLab project.

The main branch currently supports PyTorch 1.8 and later.
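The version requirement above can be checked programmatically. The following is a minimal sketch, not part of MMPreTrain: the helper name `meets_minimum` is hypothetical, and it compares only the major and minor components of a version string such as `1.10.1` or `2.0.0+cu117`.

```python
import re


def meets_minimum(version: str, minimum: tuple = (1, 8)) -> bool:
    """Return True if `version` is at least `minimum` (major, minor).

    Hypothetical helper for illustration; not an MMPreTrain API.
    Anything after the minor number (patch, "+cu113", ...) is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

With PyTorch installed, the check would typically be called as `meets_minimum(torch.__version__)`.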

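Major features

Support for diverse backbones and pretrained models
Support for multiple training strategies (supervised learning, unsupervised learning, multi-modal learning, etc.)
A variety of training tricks
A large number of training configs
High efficiency and extensibility
Powerful toolkits for model analysis and experiments
Support for several out-of-the-box inference tasks
- Image classification
- Image caption
- Visual question answering
- Visual grounding
- Retrieval (image-to-image, image-to-text, text-to-image)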
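The inference tasks listed above are each served by a dedicated inferencer class. The sketch below shows one way to look such a class up by task name. The class names are assumed to be those exported from `mmpretrain.apis` and should be checked against the installed version; the dictionary `TASK_TO_INFERENCER` and the helper `load_inferencer_class` are hypothetical names introduced here for illustration.

```python
import importlib

# Assumed mapping from task names to inferencer class names in
# `mmpretrain.apis`; verify against the installed MMPreTrain version.
TASK_TO_INFERENCER = {
    "image classification": "ImageClassificationInferencer",
    "image caption": "ImageCaptionInferencer",
    "visual question answering": "VisualQuestionAnsweringInferencer",
    "visual grounding": "VisualGroundingInferencer",
    "image-to-image retrieval": "ImageRetrievalInferencer",
    "image-to-text retrieval": "ImageToTextRetrievalInferencer",
    "text-to-image retrieval": "TextToImageRetrievalInferencer",
}


def load_inferencer_class(task: str):
    """Return the inferencer class for `task`, importing MMPreTrain lazily.

    Raises KeyError for an unknown task and ImportError if MMPreTrain
    is not installed.
    """
    class_name = TASK_TO_INFERENCER[task.lower()]
    apis = importlib.import_module("mmpretrain.apis")
    return getattr(apis, class_name)
```

The import is deferred so that the mapping can be inspected without MMPreTrain installed. Constructing an inferencer additionally requires a model name, which is omitted here because the available names depend on the installed version.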

https://github.com/open-mmlab/mmpretrain/assets/26739999/e4dcd3a2-f895-4d1b-a351-fbc74a04e904

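What's new

🌟 v1.0.0rc7 was released on 2023/4/7

Integrated self-supervised learning algorithms from MMSelfSup, such as MAE and BEiT
Added support for RIFormer, a simple but effective vision backbone that removes the token mixer
Added support for t-SNE visualization
Refactored the data pipeline visualization

Updates in previous versions

Added support for the LeViT, XCiT, ViG, ConvNeXt-V2, EVA, RevViT, EfficientnetV2, CLIP, TinyViT, and MixMIM backbones
Reproduced the training accuracy of ConvNeXt and RepVGG
Added support for confusion matrix calculation and plotting
Added support for multi-task training and testing
Added support for test-time augmentation (TTA)
Updated the main API interfaces to make it easy to get the models predefined in MMPreTrain
Refactored the BEiT backbone and added inference support for v1 and v2 models

This release introduces a brand-new, highly extensible training and testing engine, which is still under development. You are welcome to try it by following the documentation.

The new version also contains changes that are incompatible with earlier versions. See the migration guide for details.

For the release history and update details, see the changelog.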

Installation

Below are the quick steps for installation:

conda create -n open-mmlab python=3.8 pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -y
conda activate open-mmlab
pip3 install openmim
git clone https://github.com/open-mmlab/mmpretrain.git
cd mmpretrain
mim install -e .

For more detailed steps, see the installation guide.

If you need multi-modal models, install the extra dependencies as follows:

mim install -e ".[multimodal]"
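After installing the extra dependencies, it can be useful to confirm that they are importable. The following is a minimal sketch using only the standard library. The helper `missing_modules` is hypothetical, and the module names in `ASSUMED_MULTIMODAL_MODULES` are assumptions about what the `multimodal` extra pulls in; the authoritative list is the project's own requirements files.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names in `module_names` that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed examples of packages installed by the "multimodal" extra;
# check the project's requirements files for the real list.
ASSUMED_MULTIMODAL_MODULES = ["transformers", "pycocotools"]

if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MULTIMODAL_MODULES)
    print("missing:", missing if missing else "none")
```

The helper only checks that a module can be located; it does not import it or verify its version.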

User guides

We provide a series of basic tutorials for new users:

For more information, please refer to our documentation.

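Model zoo

Results and models are available in the model zoo.

Overview

Supported backbones

Self-supervised learning

Others

Image retrieval task:

ArcFace (CVPR'2019)

Training and testing tips: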

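Contributing

We welcome any contribution that helps improve MMPreTrain. See the contributing guide to learn how to take part.

Acknowledgement

MMPreTrain is an open-source project contributed to by many universities and companies. We thank all the contributors who reproduced algorithms and added new features, as well as the users who gave valuable feedback. We hope the toolbox and benchmark give the community flexible tools for reproducing existing algorithms and developing new models, and in doing so keep contributing to the open-source community.

Citation

If you use the code or benchmarks of this project in your research, please cite MMPreTrain with the following BibTeX entry.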

@misc{2023mmpretrain,
    title={OpenMMLab's Pre-training Toolbox and Benchmark},
    author={MMPreTrain Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmpretrain}},
    year={2023}
}

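License

This project is released under the Apache 2.0 license.

Other projects in OpenMMLab

MMEngine: OpenMMLab foundational library for training deep learning models
MMCV: OpenMMLab foundational library for computer vision
MIM: the unified entry point for OpenMMLab projects, algorithms, and models
MMEval: a unified, open, cross-framework algorithm evaluation library
MMPreTrain: OpenMMLab deep learning pre-training toolbox
MMDetection: OpenMMLab object detection toolbox
MMDetection3D: OpenMMLab next-generation platform for general 3D object detection
MMRotate: OpenMMLab rotated object detection toolbox and benchmark
MMYOLO: OpenMMLab YOLO series toolbox and benchmark
MMSegmentation: OpenMMLab semantic segmentation toolbox
MMOCR: OpenMMLab end-to-end text detection, recognition, and understanding toolbox
MMPose: OpenMMLab pose estimation toolbox
MMHuman3D: OpenMMLab human parametric model toolbox and benchmark
MMSelfSup: OpenMMLab self-supervised learning toolbox and benchmark
MMRazor: OpenMMLab model compression toolbox and benchmark
MMFewShot: OpenMMLab few-shot learning toolbox and benchmark
MMAction2: OpenMMLab next-generation video understanding toolbox
MMTracking: OpenMMLab unified video perception platform
MMFlow: OpenMMLab optical flow estimation toolbox and benchmark
MMagic: OpenMMLab next-generation AI-generated content (AIGC) toolbox
MMGeneration: OpenMMLab image and video generation model toolbox
MMDeploy: OpenMMLab model deployment framework
Playground: a collection and showcase of cutting-edge, interesting community projects related to OpenMMLab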

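Join the OpenMMLab community

Scan the QR codes below to follow the OpenMMLab team's official Zhihu account, join the team's official QQ group, or contact the official OpenMMLab WeChat assistant.

In the OpenMMLab community, we will:

📢 Share cutting-edge core technologies of AI frameworks
💻 Explain the source code of commonly used PyTorch modules
📰 Publish news about OpenMMLab
🚀 Introduce cutting-edge algorithms developed by OpenMMLab
🏃 Provide faster question answering and feedback channels
🔥 Offer a platform for in-depth exchange with developers from all industries

Packed with practical content 📘 and waiting for you 💗. The OpenMMLab community looks forward to having you 👬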
