Commit Graph

13 Commits (bc3d4eb403b9f40c23ed5bedd8875bba911e3cdc)

Author SHA1 Message Date
Alexander Soare bc3d4eb403 wip -rebase 2021-11-12 20:45:05 +00:00
Ross Wightman c02334d9fa Add weights for regnetz_d and haloregnetz_c, update regnetz_c weights. Add commented PyTorch XLA code for halo attention 2021-10-19 12:32:09 -07:00
Ross Wightman 02daf2ab94 Add option to include relative pos embedding in the attention scaling as per references. See discussion #912 2021-10-12 15:37:01 -07:00
Ross Wightman e2b8d44ff0 Halo, bottleneck attn, lambda layer additions and cleanup along w/ experimental model defs
* align interfaces of halo, bottleneck attn and lambda layer
* add qk_ratio to all of above, control q/k dim relative to output dim
* add experimental haloregnetz, and trionet (lambda + halo + bottle) models
2021-10-06 16:32:48 -07:00
Ross Wightman 007bc39323 Some halo and bottleneck attn code cleanup, add halonet50ts weights, use optimal crop ratios 2021-10-02 15:51:42 -07:00
Ross Wightman 5bd04714e4 Cleanup weight init for byob/byoanet and related 2021-09-05 15:34:05 -07:00
Ross Wightman 8642401e88 Swap botnet 26/50 weights/models after realizing a mistake in arch def, now figuring out why they were so low... 2021-09-05 15:17:19 -07:00
Ross Wightman 492c0a4e20 Update HaloAttn comment 2021-09-01 17:14:31 -07:00
Ross Wightman 3b9032ea48 Use Tensor.unfold().unfold() for HaloAttn, fast like as_strided but more clarity 2021-08-27 12:45:53 -07:00
Ross Wightman 8449ba210c Improve performance of HaloAttn, change default dim calc. Some cleanup / fixes for byoanet. Rename resnet26ts to tfs to distinguish (extra fc). 2021-08-26 21:56:44 -07:00
Ross Wightman 0721559511 Improved (hopefully) init for SA/SA-like layers used in ByoaNets 2021-05-04 21:40:39 -07:00
Ross Wightman e15c3886ba Defaul lambda r=7. Define '26t' stage 4/5 256x256 variants for all of bot/halo/lambda nets for experiment. Add resnet50t for exp. Fix a few comments. 2021-04-29 10:58:49 -07:00
Ross Wightman ce62f96d4d ByoaNet with bottleneck transformer, lambda resnet, and halo net experiments 2021-04-12 09:38:02 -07:00