SpirvShader: Optimize stores with static equal offsets

This pattern is heavily used in dEQP-VK.ssbo.*. Avoiding generating the
scatter is profitable on all non-AVX512-capable targets, because LLVM's
ScalarizeMaskedMemIntrin is incredibly slow there.

Reduces runtime of dEQP-VK.ssbo.layout.random.all_shared_buffer.5 from
24s to 14s on my Threadripper (on top of a stack of other optimizations).

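As a rough illustration of the idea (not the SwiftShader implementation), the sketch below contrasts a scalarized masked scatter with a single scalar store for the case where every lane is known to target the same offset. The lane count `kLanes`, the function names `scatterStore` and `storeEqualOffsets`, and the choice of the highest active lane as the winner are all assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int kLanes = 4;  // hypothetical SIMD width, for illustration only

// General case: a masked scatter scalarized into one conditional store per lane.
void scatterStore(uint32_t *base,
                  const std::array<size_t, kLanes> &offsets,
                  const std::array<uint32_t, kLanes> &values,
                  const std::array<bool, kLanes> &mask)
{
    for (int i = 0; i < kLanes; i++)
    {
        if (mask[i]) { base[offsets[i]] = values[i]; }
    }
}

// Fast path (assumed): all lanes share one statically known offset, so only
// one write can survive. Pick the highest active lane, which matches the
// result of the loop above, and issue a single scalar store.
void storeEqualOffsets(uint32_t *base, size_t offset,
                       const std::array<uint32_t, kLanes> &values,
                       const std::array<bool, kLanes> &mask)
{
    for (int i = kLanes - 1; i >= 0; i--)
    {
        if (mask[i])
        {
            base[offset] = values[i];
            return;
        }
    }
    // No active lane: nothing is written.
}
```

With equal offsets, both functions leave memory in the same state, but the second issues at most one store and needs no scatter at all.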
Bug: b/135609394
Change-Id: I2d6840522a5bd30b4fd532b9c7e2a4712879caa9
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/33289
Tested-by: Chris Forbes <chrisforbes@google.com>
Presubmit-Ready: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Ben Clayton <bclayton@google.com>