* Move 'test time pool' to Module that can be used by any model, remove from DPN
* Remove ResNext model file and combine with ResNet
* Remove fbresnet200 as it was an old conversion and pretrained performance not worth param count
* Cleanup adaptive avgmax pooling and add back conctat variant
* Factor out checkpoint load fn