* add non_local module
* rewrite non local module comments
* perfect docstring and adjust init function
* not to init norm layer
* Correct initialize when there is a norm
* set normal method for both embedded_gaussian and dot_product
* add building bricks of cnn
* add unit tests
* use registry for building bricks
* minor updates
* add scale layer
* add test for scale
* add doc string
Co-authored-by: Jiarui XU <xvjiarui0826@gmail.com>