* add focal loss
* apply class wise sum
* fix docstring
* do not apply sum over classes and fix docstring
* fix docstring
* fix weight shape
* fix weight shape