Summary: get_rel_pos combined tensors that lived on different devices,
which prevented CUDA graphs from optimizing that code path correctly.
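
For illustration only, a minimal sketch of the general pattern, not the actual
diff: it assumes the index tensors were being created on the default (CPU)
device while rel_pos lived on the GPU, and that rel_pos already has
2 * max(q_size, k_size) - 1 rows so no interpolation is needed. The name
get_rel_pos_sketch is hypothetical.

```python
import torch


def get_rel_pos_sketch(q_size: int, k_size: int, rel_pos: torch.Tensor) -> torch.Tensor:
    """Hypothetical simplified lookup of relative positional embeddings.

    Assumes rel_pos has shape (2 * max(q_size, k_size) - 1, C).
    """
    device = rel_pos.device
    q_scale = max(k_size / q_size, 1.0)
    k_scale = max(q_size / k_size, 1.0)

    # Create the coordinate tensors directly on rel_pos's device, so the
    # indexing below never mixes a CPU index with a GPU table.
    q_coords = torch.arange(q_size, device=device)[:, None] * q_scale
    k_coords = torch.arange(k_size, device=device)[None, :] * k_scale
    relative_coords = (q_coords - k_coords) + (k_size - 1) * k_scale

    # Result has shape (q_size, k_size, C) and stays on rel_pos's device.
    return rel_pos[relative_coords.long()]
```

Keeping every operand on one device avoids an implicit host-to-device copy
inside the function, which is the kind of operation that tends to interfere
with CUDA graph capture.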
ghstack-source-id: 2256f130bb8249403710e1048ef69385ff71aed2
Pull Request resolved: https://github.com/facebookresearch/segment-anything/pull/393