## CUDA ops

We implement common CUDA ops used in detection, segmentation, etc.

- AssignScoreWithK
- BallQuery
- BBoxOverlaps
- CARAFE
- CrissCrossAttention
- ContextBlock
- CornerPool
- Deformable Convolution v1/v2
- Deformable RoIPool
- DynamicScatter
- GatherPoints
- FurthestPointSample
- FurthestPointSampleWithDist
- GeneralizedAttention
- GroupPoints
- KNN
- MaskedConv
- NMS
- PSAMask
- RoIPointPool3d
- RoIPool
- RoIAlign
- RoIAwarePool3d
- SimpleRoIAlign
- SigmoidFocalLoss
- SoftmaxFocalLoss
- SoftNMS
- Synchronized BatchNorm
- Voxelization
- ThreeInterpolate
- ThreeNN
- Weight standardization
- Correlation
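
To give a feel for what one of the listed ops computes, here is a minimal pure-Python sketch of greedy NMS. It is a reference for the general technique only, not this library's CUDA implementation: the function names `iou` and `nms_reference` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and details such as coordinate offsets or tie-breaking in the actual op may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_reference(boxes, scores, iou_threshold):
    """Greedy NMS: return indices of kept boxes, highest score first.

    Hypothetical helper for illustration; assumes a box is suppressed
    when its IoU with an already kept box exceeds iou_threshold.
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep the box only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms_reference(boxes, scores, iou_threshold=0.5)
print(kept)
```

In this example the first two boxes overlap heavily, so only the higher-scoring one survives alongside the distant third box. The sketch compares each candidate against every kept box, so its worst-case cost grows quadratically with the number of boxes; it is meant for checking results on small inputs.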