Add a fast multisample resolve code path

For whole-image 4x8-bit normalized format multisample resolves, use a
specialized code path instead of a generic blit routine.
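
As an illustration of why a specialized path can beat a generic blit, here
is a minimal sketch of a 4-sample resolve for packed 4x8-bit pixels. It is
not the code from this change. The names average4x8 and resolve4x are
hypothetical, and the sketch assumes the four sample planes are stored back
to back, each holding pixelCount 32-bit pixels. Averaging is done per byte
with bit tricks, so channels are never unpacked.

```cpp
#include <cstddef>
#include <cstdint>

// Per-byte average of two packed 4x8-bit pixels, rounding up.
// Uses ceil((x + y) / 2) == (x | y) - ((x ^ y) >> 1) in each byte; the mask
// drops the bit that the shift would otherwise carry in from the next byte.
inline uint32_t average4x8(uint32_t x, uint32_t y)
{
    return (x | y) - (((x ^ y) >> 1) & 0x7F7F7F7Fu);
}

// Resolves 4 samples per pixel into one pixel.
// Assumption (hypothetical layout): 'samples' holds 4 planes of 'pixelCount'
// pixels each, stored consecutively. 'dst' receives 'pixelCount' pixels.
void resolve4x(const uint32_t *samples, size_t pixelCount, uint32_t *dst)
{
    const uint32_t *s0 = samples;
    const uint32_t *s1 = s0 + pixelCount;
    const uint32_t *s2 = s1 + pixelCount;
    const uint32_t *s3 = s2 + pixelCount;

    for(size_t i = 0; i < pixelCount; i++)
    {
        // Two levels of pairwise averaging. Each level rounds up, so the
        // result can differ slightly from an exactly rounded 4-way mean.
        uint32_t a = average4x8(s0[i], s1[i]);
        uint32_t b = average4x8(s2[i], s3[i]);
        dst[i] = average4x8(a, b);
    }
}
```

The inner loop touches each sample once and has no per-format dispatch,
which is the kind of work a generic blit routine cannot avoid.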

Benchmark results:

Run on (48 X 2594 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x24)
  L1 Instruction 32 KiB (x24)
  L2 Unified 256 KiB (x24)
  L3 Unified 30720 KiB (x2)
---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
(LLVM, before)
Triangle/Hello             1.02 ms        0.500 ms         1000
Triangle/Multisample       19.3 ms        0.984 ms         1000
(LLVM, after)
Triangle/Hello            0.845 ms        0.439 ms         1673
Triangle/Multisample       6.95 ms        0.781 ms         1000
(Subzero, before)
Triangle/Hello             1.15 ms        0.516 ms         1120
Triangle/Multisample       40.3 ms        0.469 ms          100
(Subzero, after)
Triangle/Hello             1.19 ms        0.474 ms         1120
Triangle/Multisample       11.8 ms        0.920 ms          747

Bug: b/147802090
Change-Id: I15729552f01a509a5cfce20cd7de06d0b764cf0a
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/47969
Presubmit-Ready: Nicolas Capens <nicolascapens@google.com>
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Alexis Hétu <sugoi@google.com>