mzr1996 2023-07-20 10:21:15 +08:00
parent ae7a7b7560
commit 60d780f99e
6 changed files with 11 additions and 11 deletions

View File

@@ -21,7 +21,7 @@ Instruction tuning large language models (LLMs) using machine-generated instruct
According to the license of LLaMA, we cannot provide the merged checkpoint directly. Please use the below
script to download and obtain the merged checkpoint.
```shell
python tools/model_converters/llava-delta2mmpre.py huggyllama/llama-7b liuhaotian/LLaVA-Lightning-7B-delta-v1-1 ./LLaVA-Lightning-7B-delta-v1-1.pth
```

View File

@@ -7,7 +7,7 @@
- Support inference of more **multi-modal** algorithms, such as **LLaVA**, **MiniGPT-4**, **Otter**, etc.
- Support around **10 multi-modal datasets**!
- Add **iTPN**, **SparK** self-supervised learning algorithms.
- Provide examples of [New Config](https://github.com/open-mmlab/mmpretrain/tree/main/mmpretrain/configs/) and [DeepSpeed/FSDP](https://github.com/open-mmlab/mmpretrain/tree/main/configs/mae/benchmarks/).
### New Features

View File

@@ -1,10 +1,10 @@
# Shape Bias Tool Usage
Shape bias measures how much a model relies on shapes, rather than textures, to recognize the semantics in images. For more details,
we refer interested readers to this [paper](https://arxiv.org/abs/2106.07411). MMPretrain provides an off-the-shelf toolbox to
obtain the shape bias of a classification model. You can follow the steps below:
## Prepare the dataset
First, download the [cue-conflict](https://github.com/bethgelab/model-vs-human/releases/download/v0.1/cue-conflict.tar.gz) dataset to the `data` folder
and then unzip it (a scripted version of this step is sketched after the directory tree below). After that, your `data` folder should have the following structure:
@@ -18,7 +18,7 @@ data
| |── truck
```
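For reference, the download-and-unpack step can also be scripted. Below is a minimal Python sketch, equivalent to a `wget` plus `tar` invocation; the archive path is just an example:

```python
import tarfile
import urllib.request
from pathlib import Path

url = ('https://github.com/bethgelab/model-vs-human/releases/'
       'download/v0.1/cue-conflict.tar.gz')
Path('data').mkdir(exist_ok=True)
archive = 'data/cue-conflict.tar.gz'

# Download the archive, then extract it into the data folder.
urllib.request.urlretrieve(url, archive)
with tarfile.open(archive) as tar:
    tar.extractall('data')
```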
## Modify the config for classification
We run the shape-bias tool on a ViT-base model with masked autoencoder pretraining. Its config file is `configs/mae/benchmarks/vit-base-p16_8xb128-coslr-100e_in1k.py`, and its checkpoint can be downloaded from [this link](https://download.openmmlab.com/mmselfsup/1.x/mae/mae_vit-base-p16_8xb512-fp16-coslr-1600e_in1k/vit-base-p16_ft-8xb128-coslr-100e_in1k/vit-base-p16_ft-8xb128-coslr-100e_in1k_20220825-cf70aa21.pth). Replace the original `test_pipeline`, `test_dataloader` and `test_evaluator` with the following configurations:
@@ -55,7 +55,7 @@ test_evaluator = dict(
Please note that you should customize the `csv_dir` and `model_name` above; a sketch of what the evaluator section might look like follows. In this example, the modified config file is saved as `vit-base-p16_8xb128-coslr-100e_in1k_shape-bias.py` in the folder `configs/mae/benchmarks/`.
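As an illustration only, the evaluator section of such a modified config might look like the sketch below. The metric type `ShapeBiasMetric` and the field values are assumptions inferred from the `csv_dir`/`model_name` fields mentioned above, so verify them against your MMPretrain version:

```python
# Hypothetical evaluator section of the modified config file.
test_evaluator = dict(
    type='ShapeBiasMetric',          # assumed metric class name
    csv_dir='work_dirs/shape_bias',  # directory where the result csv is written
    model_name='mae_vit-base-p16',   # tag embedded in the output csv filename
)
```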
## Run inference with the modified config file
Then run inference on the `cue-conflict` dataset with your modified config file.
@@ -77,7 +77,7 @@ bash tools/dist_test.sh configs/mae/benchmarks/vit-base-p16_8xb128-coslr-100e_in
After that, you should obtain a csv file in the `csv_dir` folder, named `cue-conflict_model-name_session-1.csv`. Besides this file, you should also download these [csv files](https://github.com/bethgelab/model-vs-human/tree/master/raw-data/cue-conflict) to the
`csv_dir`.
## Plot shape bias
Then we can start to plot the shape bias:
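The repository's own plotting command is elided in this diff. As a hedged illustration, the sketch below computes the headline shape-bias number directly from the result csv with pandas; the column names (`imagename`, `object_response`) and the `{shape}N-{texture}M.png` filename pattern are assumptions based on the model-vs-human csv format, so verify them against your files:

```python
import re
import pandas as pd

# Result csv written by the test run into csv_dir (name pattern from above).
df = pd.read_csv('csv_dir/cue-conflict_model-name_session-1.csv')

def shape_texture(name):
    # e.g. 'airplane1-bicycle2.png' -> ('airplane', 'bicycle')
    return re.match(r'([a-z_]+)\d+-([a-z_]+)\d+\.png', name).groups()

shapes = df['imagename'].map(lambda n: shape_texture(n)[0])
textures = df['imagename'].map(lambda n: shape_texture(n)[1])

# Only true cue-conflict trials (shape and texture disagree) are informative.
conflict = shapes != textures
shape_hits = (df['object_response'][conflict] == shapes[conflict]).sum()
texture_hits = (df['object_response'][conflict] == textures[conflict]).sum()
print('shape bias:', shape_hits / (shape_hits + texture_hits))
```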

View File

@@ -23,7 +23,7 @@ class Flamingo(BaseModel):
zeroshot_prompt (str): Prompt used for zero-shot inference.
    Defaults to '<image>Output:'.
shot_prompt_tmpl (str): Prompt used for few-shot inference.
    Defaults to ``<image>Output:{caption}<|endofchunk|>``.
final_prompt_tmpl (str): Final part of prompt used for inference.
    Defaults to '<image>Output:'.
generation_cfg (dict): The extra generation config, accepts the keyword
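As a small illustration, the few-shot template above is filled with ordinary Python string formatting; the caption value here is made up:

```python
# The few-shot template from the docstring, filled with a sample caption.
shot_prompt_tmpl = '<image>Output:{caption}<|endofchunk|>'
print(shot_prompt_tmpl.format(caption='A dog running on the beach.'))
# -> <image>Output:A dog running on the beach.<|endofchunk|>
```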

View File

@@ -36,7 +36,7 @@ class MiniGPT4(BaseModel):
raw_prompts (list): Prompts for training. Defaults to None.
max_txt_len (int): Max token length while doing tokenization. Defaults
    to 32.
end_sym (str): End symbol of the sequence. Defaults to '\\n'.
generation_cfg (dict): The config of text generation. Defaults to
    dict().
data_preprocessor (:obj:`BaseDataPreprocessor`): Used for

View File

@@ -20,8 +20,8 @@ class Otter(Flamingo):
zeroshot_prompt (str): Prompt used for zero-shot inference.
    Defaults to an.
shot_prompt_tmpl (str): Prompt used for few-shot inference.
    Defaults to ``<image>User:Please describe the image.
    GPT:<answer>{caption}<|endofchunk|>``.
final_prompt_tmpl (str): Final part of prompt used for inference.
    Defaults to '<image>User:Please describe the image. GPT:<answer>'.
generation_cfg (dict): The extra generation config, accepts the keyword