Optimize multisample resolve with SSE2 instructions
a2e6c1a1 · Nicolas Capens authored
Benchmark results:

Run on (48 X 2594 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x24)
  L1 Instruction 32 KiB (x24)
  L2 Unified 256 KiB (x24)
  L3 Unified 30720 KiB (x2)
---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
(LLVM, before)
Triangle/Hello            0.845 ms        0.439 ms         1673
Triangle/Multisample       6.95 ms        0.781 ms         1000

(LLVM, after)
Triangle/Hello            0.861 ms        0.450 ms         1493
Triangle/Multisample       4.03 ms        0.753 ms          747

(Subzero, before)
Triangle/Hello             1.19 ms        0.474 ms         1120
Triangle/Multisample       11.8 ms        0.920 ms          747

(Subzero, after)
Triangle/Hello            0.907 ms        0.486 ms         1673
Triangle/Multisample       4.62 ms        0.781 ms         1000

Bug: b/147802090
Change-Id: Iea8498f2b745c86cf578db5c0f7ef2329b73c736
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/47970
Presubmit-Ready: Nicolas Capens <nicolascapens@google.com>
Tested-by: Nicolas Capens <nicolascapens@google.com>
Reviewed-by: Alexis Hétu <sugoi@google.com>
Kokoro-Result: kokoro <noreply+kokoro@google.com>
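
For readers unfamiliar with the technique named in the subject line, the following is a minimal sketch of how a 4x multisample resolve of 8-bit channels can be written with SSE2 byte averaging. It is not the code from this change: the function name `resolve4x`, the planar sample layout, and the choice of pairwise rounded averages (`_mm_avg_epu8`) are all assumptions made for illustration, and the result can differ by one from an exact four-way mean because each average rounds up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#else
#define SKETCH_HAS_SSE2 0
#endif

// Rounded-up average of two bytes, the same arithmetic PAVGB performs per lane.
static inline uint8_t avgRoundUp(uint8_t a, uint8_t b)
{
	return static_cast<uint8_t>((a + b + 1) >> 1);
}

// Hypothetical helper (not a SwiftShader function): combines four sample
// planes of 8-bit data into dst by averaging them pairwise.
void resolve4x(uint8_t *dst, const uint8_t *s0, const uint8_t *s1,
               const uint8_t *s2, const uint8_t *s3, size_t bytes)
{
	assert(dst && s0 && s1 && s2 && s3);

	size_t i = 0;
#if SKETCH_HAS_SSE2
	// Process 16 bytes (four RGBA8 pixels) per iteration.
	for(; i + 16 <= bytes; i += 16)
	{
		__m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s0 + i));
		__m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s1 + i));
		__m128i c = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s2 + i));
		__m128i d = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s3 + i));

		__m128i ab = _mm_avg_epu8(a, b);
		__m128i cd = _mm_avg_epu8(c, d);
		_mm_storeu_si128(reinterpret_cast<__m128i *>(dst + i), _mm_avg_epu8(ab, cd));
	}
#endif
	// Scalar tail (and full fallback on targets without SSE2).
	for(; i < bytes; i++)
	{
		dst[i] = avgRoundUp(avgRoundUp(s0[i], s1[i]), avgRoundUp(s2[i], s3[i]));
	}
}
```

The scalar tail uses the same rounding as the vector path, so results do not depend on whether a byte falls inside a full 16-byte group.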