Mirror of https://github.com/huggingface/pytorch-image-models.git (synced 2025-06-03 15:01:08 +08:00)

Fix typos

This commit is contained in:
parent 3677f67902
commit 8d81fdf3d9

@@ -10,7 +10,7 @@ Code linting and auto-format (black) are not currently in place but open to cons
 
 A few specific differences from Google style (or black)
 1. Line length is 120 char. Going over is okay in some cases (e.g. I prefer not to break URL across lines).
-2. Hanging indents are always prefered, please avoid aligning arguments with closing brackets or braces.
+2. Hanging indents are always preferred, please avoid aligning arguments with closing brackets or braces.
 
 Example, from Google guide, but this is a NO here:
 ```

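To illustrate the hanging-indent preference stated in the hunk above, here is a small sketch written for this note, not taken from the repository; `some_function` and its arguments are placeholders:

```python
def some_function(first_argument, second_argument, third_argument, fourth_argument):
    return first_argument, second_argument, third_argument, fourth_argument

# Aligning continuation arguments with the opening bracket (the Google-guide style that is a NO here):
result = some_function(first_argument=1, second_argument=2,
                       third_argument=3, fourth_argument=4)

# Hanging indent (preferred in this project):
result = some_function(
    first_argument=1,
    second_argument=2,
    third_argument=3,
    fourth_argument=4,
)
```
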
@@ -238,7 +238,7 @@ Add a set of new very well trained ResNet & ResNet-V2 18/34 (basic block) weight
 ### May 14, 2024
 * Support loading PaliGemma jax weights into SigLIP ViT models with average pooling.
 * Add Hiera models from Meta (https://github.com/facebookresearch/hiera).
-* Add `normalize=` flag for transorms, return non-normalized torch.Tensor with original dytpe (for `chug`)
+* Add `normalize=` flag for transforms, return non-normalized torch.Tensor with original dytpe (for `chug`)
 * Version 1.0.3 release
 
 ### May 11, 2024

@@ -93,7 +93,7 @@
 ### May 14, 2024
 * Support loading PaliGemma jax weights into SigLIP ViT models with average pooling.
 * Add Hiera models from Meta (https://github.com/facebookresearch/hiera).
-* Add `normalize=` flag for transorms, return non-normalized torch.Tensor with original dytpe (for `chug`)
+* Add `normalize=` flag for transforms, return non-normalized torch.Tensor with original dytpe (for `chug`)
 * Version 1.0.3 release
 
 ### May 11, 2024

@@ -125,7 +125,7 @@
 ### April 11, 2024
 * Prepping for a long overdue 1.0 release, things have been stable for a while now.
 * Significant feature that's been missing for a while, `features_only=True` support for ViT models with flat hidden states or non-std module layouts (so far covering `'vit_*', 'twins_*', 'deit*', 'beit*', 'mvitv2*', 'eva*', 'samvit_*', 'flexivit*'`)
-* Above feature support achieved through a new `forward_intermediates()` API that can be used with a feature wrapping module or direclty.
+* Above feature support achieved through a new `forward_intermediates()` API that can be used with a feature wrapping module or directly.
 ```python
 model = timm.create_model('vit_base_patch16_224')
 final_feat, intermediates = model.forward_intermediates(input)

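For readers skimming the changelog, a slightly fuller sketch of how the `forward_intermediates()` call in the snippet above can be exercised; the dummy input and the shape printing are illustrative additions, not part of the changelog:

```python
import timm
import torch

# Build a ViT and pull intermediate hidden states alongside the final features.
model = timm.create_model('vit_base_patch16_224', pretrained=False).eval()

x = torch.randn(1, 3, 224, 224)  # dummy batch standing in for a real image tensor
with torch.no_grad():
    final_feat, intermediates = model.forward_intermediates(x)

print('final:', final_feat.shape)
for i, feat in enumerate(intermediates):
    print(f'intermediate {i}:', feat.shape)  # one tensor per returned block
```
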
@@ -360,7 +360,7 @@ Datasets & transform refactoring
 * 0.8.15dev0
 
 ### Feb 20, 2023
-* Add 320x320 `convnext_large_mlp.clip_laion2b_ft_320` and `convnext_lage_mlp.clip_laion2b_ft_soup_320` CLIP image tower weights for features & fine-tune
+* Add 320x320 `convnext_large_mlp.clip_laion2b_ft_320` and `convnext_large_mlp.clip_laion2b_ft_soup_320` CLIP image tower weights for features & fine-tune
 * 0.8.13dev0 pypi release for latest changes w/ move to huggingface org
 
 ### Feb 16, 2023

@@ -745,7 +745,7 @@ More models, more fixes
 * Add 'group matching' API to all models to allow grouping model parameters for application of 'layer-wise' LR decay, lr scale added to LR scheduler
 * Gradient checkpointing support added to many models
 * `forward_head(x, pre_logits=False)` fn added to all models to allow separate calls of `forward_features` + `forward_head`
-* All vision transformer and vision MLP models update to return non-pooled / non-token selected features from `foward_features`, for consistency with CNN models, token selection or pooling now applied in `forward_head`
+* All vision transformer and vision MLP models update to return non-pooled / non-token selected features from `forward_features`, for consistency with CNN models, token selection or pooling now applied in `forward_head`
 
 ### Feb 2, 2022
 * [Chris Hughes](https://github.com/Chris-hughes10) posted an exhaustive run through of `timm` on his blog yesterday. Well worth a read. [Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide](https://towardsdatascience.com/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055)

@@ -1058,7 +1058,7 @@ More models, more fixes
 * Add 'group matching' API to all models to allow grouping model parameters for application of 'layer-wise' LR decay, lr scale added to LR scheduler
 * Gradient checkpointing support added to many models
 * `forward_head(x, pre_logits=False)` fn added to all models to allow separate calls of `forward_features` + `forward_head`
-* All vision transformer and vision MLP models update to return non-pooled / non-token selected features from `foward_features`, for consistency with CNN models, token selection or pooling now applied in `forward_head`
+* All vision transformer and vision MLP models update to return non-pooled / non-token selected features from `forward_features`, for consistency with CNN models, token selection or pooling now applied in `forward_head`
 
 ### Feb 2, 2022
 * [Chris Hughes](https://github.com/Chris-hughes10) posted an exhaustive run through of `timm` on his blog yesterday. Well worth a read. [Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide](https://towardsdatascience.com/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055)

@@ -1,6 +1,6 @@
 # Adversarial Inception v3
 
-**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
+**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
 
 This particular model was trained for study of adversarial examples (adversarial training).
 

@@ -1,6 +1,6 @@
 # (Gluon) Inception v3
 
-**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
+**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
 
 The weights from this model were ported from [Gluon](https://cv.gluon.ai/model_zoo/classification.html).
 

@@ -1,6 +1,6 @@
 # Inception v3
 
-**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
+**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
 
 ## How do I use this model on an image?
 

@@ -1,6 +1,6 @@
 # (Tensorflow) Inception v3
 
-**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifer](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
+**Inception v3** is a convolutional neural network architecture from the Inception family that makes several improvements including using [Label Smoothing](https://paperswithcode.com/method/label-smoothing), Factorized 7 x 7 convolutions, and the use of an [auxiliary classifier](https://paperswithcode.com/method/auxiliary-classifier) to propagate label information lower down the network (along with the use of batch normalization for layers in the sidehead). The key building block is an [Inception Module](https://paperswithcode.com/method/inception-v3-module).
 
 The weights from this model were ported from [Tensorflow/Models](https://github.com/tensorflow/models).
 

@@ -954,7 +954,7 @@ def augment_and_mix_transform(config_str: str, hparams: Optional[Dict] = None):
 Args:
 config_str (str): String defining configuration of random augmentation. Consists of multiple sections separated
 by dashes ('-'). The first section defines the specific variant of rand augment (currently only 'rand').
-The remaining sections, not order sepecific determine
+The remaining sections, not order specific determine
 'm' - integer magnitude (severity) of augmentation mix (default: 3)
 'w' - integer width of augmentation chain (default: 3)
 'd' - integer depth of augmentation chain (-1 is random [1, 3], default: -1)

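As a usage sketch of the config string format documented in this hunk: the import path, the `'augmix'` variant prefix, and the `img_mean` hyper-parameter below are assumptions for illustration, not something this diff specifies:

```python
from PIL import Image
from timm.data.auto_augment import augment_and_mix_transform

# 'm5' -> magnitude 5, 'w4' -> chain width 4, 'd2' -> chain depth 2, per the keys above.
tf = augment_and_mix_transform(
    'augmix-m5-w4-d2',
    hparams=dict(img_mean=(128, 128, 128)),  # assumed fill value for geometric ops
)

img = Image.new('RGB', (224, 224), color=(127, 127, 127))  # stand-in for a real training image
augmented = tf(img)
```
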
@@ -52,7 +52,7 @@ class ImageNetInfo(DatasetInfo):
 subset = re.sub(r'[-_\s]', '', subset.lower())
 assert subset in _SUBSETS, f'Unknown imagenet subset {subset}.'
 
-# WordNet synsets (part-of-speach + offset) are the unique class label names for ImageNet classifiers
+# WordNet synsets (part-of-speech + offset) are the unique class label names for ImageNet classifiers
 synset_file = _SUBSETS[subset]
 synset_data = pkgutil.get_data(__name__, os.path.join('_info', synset_file))
 self._synsets = synset_data.decode('utf-8').splitlines()

@@ -80,7 +80,7 @@ class ReaderHfids(Reader):
 self.num_samples = split_info.num_examples
 else:
 raise ValueError(
-"Dataset length is unknown, please pass `num_samples` explicitely. "
+"Dataset length is unknown, please pass `num_samples` explicitly. "
 "The number of steps needs to be known in advance for the learning rate scheduler."
 )
 

@@ -25,7 +25,7 @@ def find_images_and_targets(
 """ Walk folder recursively to discover images and map them to classes by folder names.
 
 Args:
-folder: root of folder to recrusively search
+folder: root of folder to recursively search
 types: types (file extensions) to search for in path
 class_to_idx: specify mapping for class (folder name) to class index if set
 leaf_name_only: use only leaf-name of folder walk for class names

@@ -124,7 +124,7 @@ def _parse_split_info(split: str, info: Dict):
 
 
 def log_and_continue(exn):
-"""Call in an exception handler to ignore exceptions, isssue a warning, and continue."""
+"""Call in an exception handler to ignore exceptions, issue a warning, and continue."""
 _logger.warning(f'Handling webdataset error ({repr(exn)}). Ignoring.')
 # NOTE: try force an exit on errors that are clearly code / config and not transient
 if isinstance(exn, TypeError):

@@ -277,7 +277,7 @@ class ReaderWds(Reader):
 target_img_mode: str = '',
 filename_key: str = 'filename',
 sample_shuffle_size: Optional[int] = None,
-smaple_initial_size: Optional[int] = None,
+sample_initial_size: Optional[int] = None,
 ):
 super().__init__()
 if wds is None:

@@ -290,7 +290,7 @@ class ReaderWds(Reader):
 self.common_seed = seed  # a seed that's fixed across all worker / distributed instances
 self.shard_shuffle_size = 500
 self.sample_shuffle_size = sample_shuffle_size or SAMPLE_SHUFFLE_SIZE
-self.sample_initial_size = smaple_initial_size or SAMPLE_INITIAL_SIZE
+self.sample_initial_size = sample_initial_size or SAMPLE_INITIAL_SIZE
 
 self.input_key = input_key
 self.input_img_mode = input_img_mode

@@ -47,7 +47,7 @@ def sigmoid(x, inplace: bool = False):
 return x.sigmoid_() if inplace else x.sigmoid()
 
 
-# PyTorch has this, but not with a consistent inplace argmument interface
+# PyTorch has this, but not with a consistent inplace argument interface
 class Sigmoid(nn.Module):
 def __init__(self, inplace: bool = False):
 super(Sigmoid, self).__init__()

@@ -61,7 +61,7 @@ def tanh(x, inplace: bool = False):
 return x.tanh_() if inplace else x.tanh()
 
 
-# PyTorch has this, but not with a consistent inplace argmument interface
+# PyTorch has this, but not with a consistent inplace argument interface
 class Tanh(nn.Module):
 def __init__(self, inplace: bool = False):
 super(Tanh, self).__init__()

@@ -16,7 +16,7 @@ class MultiQueryAttentionV2(nn.Module):
 Fast Transformer Decoding: One Write-Head is All You Need
 https://arxiv.org/pdf/1911.02150.pdf
 
-This is an acceletor optimized version - removing multiple unneccessary
+This is an acceletor optimized version - removing multiple unnecessary
 tensor transpose by re-arranging indices according to the following rules: 1)
 contracted indices are at the end, 2) other indices have the same order in the
 input and output tensores.

@@ -87,7 +87,7 @@ class MultiQueryAttention2d(nn.Module):
 2. query_strides: horizontal & vertical strides on Query only.
 
 This is an optimized version.
-1. Projections in Attention is explict written out as 1x1 Conv2D.
+1. Projections in Attention is explicit written out as 1x1 Conv2D.
 2. Additional reshapes are introduced to bring a up to 3x speed up.
 """
 fused_attn: torch.jit.Final[bool]

@@ -1,7 +1,7 @@
-""" NormAct (Normalizaiton + Activation Layer) Factory
+""" NormAct (Normalization + Activation Layer) Factory
 
 Create norm + act combo modules that attempt to be backwards compatible with separate norm + act
-isntances in models. Where these are used it will be possible to swap separate BN + act layers with
+instances in models. Where these are used it will be possible to swap separate BN + act layers with
 combined modules like IABN or EvoNorms.
 
 Hacked together by / Copyright 2020 Ross Wightman

@@ -78,7 +78,7 @@ def trunc_normal_tf_(tensor, mean=0., std=1., a=-2., b=2.):
 
 NOTE: this 'tf' variant behaves closer to Tensorflow / JAX impl where the
 bounds [a, b] are applied when sampling the normal distribution with mean=0, std=1.0
-and the result is subsquently scaled and shifted by the mean and std args.
+and the result is subsequently scaled and shifted by the mean and std args.
 
 Args:
 tensor: an n-dimensional `torch.Tensor`

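A minimal sketch of the sample-then-rescale behavior the docstring above describes, built on `torch.nn.init.trunc_normal_`; the helper name is invented for this note and this is an illustration of the idea, not the timm implementation:

```python
import torch
from torch.nn.init import trunc_normal_

def trunc_normal_tf_like(tensor: torch.Tensor, mean=0., std=1., a=-2., b=2.):
    with torch.no_grad():
        # Sample a truncated standard normal: bounds [a, b] applied at mean=0, std=1.
        trunc_normal_(tensor, mean=0., std=1., a=a, b=b)
        # Then scale and shift by the requested std / mean afterwards.
        tensor.mul_(std).add_(mean)
    return tensor

w = trunc_normal_tf_like(torch.empty(768, 768), std=0.02)
```
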
@@ -490,7 +490,7 @@ class MobileAttention(nn.Module):
 # https://arxiv.org/abs/2102.10882
 # 1. Rather than adding one CPE before the attention blocks, we add a CPE
 # into every attention block.
-# 2. We replace the expensive Conv2D by a Seperable DW Conv.
+# 2. We replace the expensive Conv2D by a Separable DW Conv.
 if use_cpe:
 self.conv_cpe_dw = create_conv2d(
 in_chs, in_chs,

@@ -32,7 +32,7 @@ def feature_take_indices(
 ) -> Tuple[List[int], int]:
 """ Determine the absolute feature indices to 'take' from.
 
-Note: This function can be called in forwar() so must be torchscript compatible,
+Note: This function can be called in forward() so must be torchscript compatible,
 which requires some incomplete typing and workaround hacks.
 
 Args:

@@ -611,7 +611,7 @@ class RepVggBlock(nn.Module):
 return kernel_final, bias_final
 
 def _fuse_bn_tensor(self, branch) -> Tuple[torch.Tensor, torch.Tensor]:
-""" Method to fuse batchnorm layer with preceeding conv layer.
+""" Method to fuse batchnorm layer with preceding conv layer.
 Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L95
 """
 if isinstance(branch, ConvNormAct):

@@ -800,7 +800,7 @@ class MobileOneBlock(nn.Module):
 return kernel_final, bias_final
 
 def _fuse_bn_tensor(self, branch) -> Tuple[torch.Tensor, torch.Tensor]:
-""" Method to fuse batchnorm layer with preceeding conv layer.
+""" Method to fuse batchnorm layer with preceding conv layer.
 Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L95
 """
 if isinstance(branch, ConvNormAct):

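For context on what the `_fuse_bn_tensor` methods touched above compute, here is a generic sketch of folding a BatchNorm into the preceding convolution's kernel and bias; this is the standard re-parameterization math with a helper name of our own, not the exact timm code:

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return a conv whose output matches conv followed by bn (in eval mode)."""
    std = torch.sqrt(bn.running_var + bn.eps)
    scale = bn.weight / std  # gamma / sqrt(var + eps), per output channel
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding,
                      groups=conv.groups, bias=True)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.data = bn.bias + (bias - bn.running_mean) * scale
    return fused

# Quick check: the fused conv matches conv + bn when the BN uses its running stats.
conv, bn = nn.Conv2d(8, 16, 3, padding=1, bias=False), nn.BatchNorm2d(16)
bn.eval()
x = torch.randn(1, 8, 32, 32)
assert torch.allclose(bn(conv(x)), fuse_conv_bn(conv, bn)(x), atol=1e-5)
```
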
@@ -21,7 +21,7 @@ Modifications and additions for timm hacked together by / Copyright 2021, Ross W
 
 
 """
-Modifed from Timm. https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/vision_transformer.py
+Modified from Timm. https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/vision_transformer.py
 
 """
 from functools import partial

@@ -246,7 +246,7 @@ class LocalWindowAttention(torch.nn.Module):
 def forward(self, x):
 H = W = self.resolution
 B, C, H_, W_ = x.shape
-# Only check this for classifcation models
+# Only check this for classification models
 _assert(H == H_, f'input feature has wrong size, expect {(H, W)}, got {(H_, W_)}')
 _assert(W == W_, f'input feature has wrong size, expect {(H, W)}, got {(H_, W_)}')
 if H <= self.window_resolution and W <= self.window_resolution:

@@ -231,7 +231,7 @@ class MobileOneBlock(nn.Module):
 def _fuse_bn_tensor(
 self, branch: Union[nn.Sequential, nn.BatchNorm2d]
 ) -> Tuple[torch.Tensor, torch.Tensor]:
-"""Method to fuse batchnorm layer with preceeding conv layer.
+"""Method to fuse batchnorm layer with preceding conv layer.
 Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L95
 
 Args:

@@ -78,7 +78,7 @@ class FocalModulation(nn.Module):
 x = self.f(x)
 q, ctx, gates = torch.split(x, self.input_split, 1)
 
-# context aggreation
+# context aggregation
 ctx_all = 0
 for l, focal_layer in enumerate(self.focal_layers):
 ctx = focal_layer(ctx)

@@ -353,7 +353,7 @@ class FocalNet(nn.Module):
 focal_levels: How many focal levels at all stages. Note that this excludes the finest-grain level.
 focal_windows: The focal window size at all stages.
 use_overlap_down: Whether to use convolutional embedding.
-use_post_norm: Whether to use layernorm after modulation (it helps stablize training of large models)
+use_post_norm: Whether to use layernorm after modulation (it helps stabilize training of large models)
 layerscale_value: Value for layer scale.
 drop_rate: Dropout rate.
 drop_path_rate: Stochastic depth rate.

@@ -618,7 +618,7 @@ class MetaFormer(nn.Module):
 return x
 
 
-# this works but it's long and breaks backwards compatability with weights from the poolformer-only impl
+# this works but it's long and breaks backwards compatibility with weights from the poolformer-only impl
 def checkpoint_filter_fn(state_dict, model):
 if 'stem.conv.weight' in state_dict:
 return state_dict

@@ -175,7 +175,7 @@ def create_shortcut(
 class Bottleneck(nn.Module):
 """ RegNet Bottleneck
 
-This is almost exactly the same as a ResNet Bottlneck. The main difference is the SE block is moved from
+This is almost exactly the same as a ResNet Bottleneck. The main difference is the SE block is moved from
 after conv3 to after conv2. Otherwise, it's just redefining the arguments for groups/bottleneck channels.
 """
 

@@ -250,7 +250,7 @@ class Bottleneck(nn.Module):
 class PreBottleneck(nn.Module):
 """ RegNet Bottleneck
 
-This is almost exactly the same as a ResNet Bottlneck. The main difference is the SE block is moved from
+This is almost exactly the same as a ResNet Bottleneck. The main difference is the SE block is moved from
 after conv3 to after conv2. Otherwise, it's just redefining the arguments for groups/bottleneck channels.
 """
 

@@ -4,7 +4,7 @@ A PyTorch implementation of ResNetV2 adapted from the Google Big-Transfer (BiT)
 at https://github.com/google-research/big_transfer to match timm interfaces. The BiT weights have
 been included here as pretrained models from their original .NPZ checkpoints.
 
-Additionally, supports non pre-activation bottleneck for use as a backbone for Vision Transfomers (ViT) and
+Additionally, supports non pre-activation bottleneck for use as a backbone for Vision Transformers (ViT) and
 extra padding support to allow porting of official Hybrid ResNet pretrained weights from
 https://github.com/google-research/vision_transformer
 

@@ -436,7 +436,7 @@ class ResNetV2(nn.Module):
 stem_chs (int): stem width (default: 64)
 stem_type (str): stem type (default: '' == 7x7)
 avg_down (bool): average pooling in residual downsampling (default: False)
-preact (bool): pre-activiation (default: True)
+preact (bool): pre-activation (default: True)
 act_layer (Union[str, nn.Module]): activation layer
 norm_layer (Union[str, nn.Module]): normalization layer
 conv_layer (nn.Module): convolution module

@@ -280,7 +280,7 @@ class PatchEmbed(nn.Module):
 
 
 class Twins(nn.Module):
-""" Twins Vision Transfomer (Revisiting Spatial Attention)
+""" Twins Vision Transformer (Revisiting Spatial Attention)
 
 Adapted from PVT (PyramidVisionTransformer) class at https://github.com/whai362/PVT.git
 """

@@ -364,7 +364,7 @@ class VisionTransformerSAM(nn.Module):
 img_size: Input image size.
 patch_size: Patch size.
 in_chans: Number of image input channels.
-num_classes: Mumber of classes for classification head.
+num_classes: Number of classes for classification head.
 global_pool: Type of global pooling for final sequence (default: 'token').
 embed_dim: Transformer embedding dimension.
 depth: Depth of transformer.

@@ -667,7 +667,7 @@ def _cfg(url='', **kwargs):
 
 default_cfgs = generate_default_cfgs({
 
-# Segment-Anyhing Model (SAM) pretrained - https://github.com/facebookresearch/segment-anything (no classifier head, for fine-tune/features only)
+# Segment-Anything Model (SAM) pretrained - https://github.com/facebookresearch/segment-anything (no classifier head, for fine-tune/features only)
 'samvit_base_patch16.sa1b': _cfg(
 url='https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth',
 hf_hub_id='timm/',

@@ -507,7 +507,7 @@ class VOLO(nn.Module):
 )
 r = patch_size
 
-# inital positional encoding, we add positional encoding after outlooker blocks
+# initial positional encoding, we add positional encoding after outlooker blocks
 patch_grid = (img_size[0] // patch_size // pooling_scale, img_size[1] // patch_size // pooling_scale)
 self.pos_embed = nn.Parameter(torch.zeros(1, patch_grid[0], patch_grid[1], embed_dims[-1]))
 self.pos_drop = nn.Dropout(p=pos_drop_rate)

@@ -35,7 +35,7 @@ class AdaBelief(Optimizer):
 
 For a complete table of recommended hyperparameters, see https://github.com/juntang-zhuang/Adabelief-Optimizer'
 For example train/args for EfficientNet see these gists
-- link to train_scipt: https://gist.github.com/juntang-zhuang/0a501dd51c02278d952cf159bc233037
+- link to train_script: https://gist.github.com/juntang-zhuang/0a501dd51c02278d952cf159bc233037
 - link to args.yaml: https://gist.github.com/juntang-zhuang/517ce3c27022b908bb93f78e4f786dc3
 """
 

@@ -80,7 +80,7 @@ class Adahessian(torch.optim.Optimizer):
 
 def zero_hessian(self):
 """
-Zeros out the accumalated hessian traces.
+Zeros out the accumulated hessian traces.
 """
 
 for p in self.get_params():

@@ -10,7 +10,7 @@ import torch
 def set_jit_legacy():
 """ Set JIT executor to legacy w/ support for op fusion
 This is hopefully a temporary need in 1.5/1.5.1/1.6 to restore performance due to changes
-in the JIT exectutor. These API are not supported so could change.
+in the JIT executor. These API are not supported so could change.
 """
 #
 assert hasattr(torch._C, '_jit_set_profiling_executor'), "Old JIT behavior doesn't exist!"

@@ -62,7 +62,7 @@ class ActivationStatsHook:
 Inspiration from https://docs.fast.ai/callback.hook.html.
 
 Refer to https://gist.github.com/amaarora/6e56942fcb46e67ba203f3009b30d950 for an example
-on how to plot Signal Propogation Plots using `ActivationStatsHook`.
+on how to plot Signal Propagation Plots using `ActivationStatsHook`.
 """
 
 def __init__(self, model, hook_fn_locs, hook_fns):

@@ -96,7 +96,7 @@ def extract_spp_stats(
 hook_fns,
 input_shape=[8, 3, 224, 224]):
 """Extract average square channel mean and variance of activations during
-forward pass to plot Signal Propogation Plots (SPP).
+forward pass to plot Signal Propagation Plots (SPP).
 
 Paper: https://arxiv.org/abs/2101.08692
 

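A rough usage sketch of the SPP extraction described in the two hunks above; the import path, the hook function, its `(module, input, output)` argument convention, and the wildcard `hook_fn_locs` pattern are all assumptions for illustration rather than anything this diff specifies:

```python
import timm
import torch
from timm.utils.model import extract_spp_stats  # assumed location of the helper

def avg_sq_ch_mean(module, input, output):
    # Mean (over channels) of the squared per-channel activation mean.
    return torch.mean(output.mean(dim=[0, 2, 3]) ** 2).item()

model = timm.create_model('resnet18', pretrained=False)
stats = extract_spp_stats(
    model,
    hook_fn_locs=['layer?.?.conv?'],   # wildcard pattern selecting where to hook (assumed convention)
    hook_fns=[avg_sq_ch_mean],
    input_shape=[2, 3, 224, 224],
)
print(stats.keys())
```
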