Optimize transcendentals for Subzero · 9c14bda0 · Antonio Maiorano authored
With this change, we can now select implementations of most
transcendentals from either the "emulated" or "optimal" namespaces. The
emulated versions generally call the math.h standard function on each
component for vector types, while the optimal versions typically
implement some approximation in Reactor to produce vectorized code.

Most of the optimal versions were taken directly from ShaderCore.cpp,
except for ASin, for which I implemented an 8-term approximation.

The new versions are faster, and pass all deqp precision tests.

Here's a table of benchmarks that shows the performance improvements.
Note that Asin and Acos now take a Precision parameter for Full and
Relaxed precision:

                    Before     After
rr_Sin              48.6 ns    10.6 ns
rr_Cos              67.1 ns    9.62 ns
rr_Tan              75.5 ns    19.4 ns
rr_Asin_fullp       24.2 ns    23.0 ns
rr_Asin_relaxedp    N/A        9.31 ns
rr_Acos_fullp       14.3 ns    6.35 ns
rr_Acos_relaxedp    N/A        4.56 ns
rr_Atan             66.8 ns    12.9 ns
rr_Sinh             79.7 ns    11.5 ns
rr_Cosh             80.1 ns    11.5 ns
rr_Tanh             62.9 ns    12.1 ns
rr_Asinh            104 ns     9.44 ns
rr_Acosh            14.4 ns    10.2 ns
rr_Atanh            170 ns     9.81 ns
rr_Atan2            73.5 ns    22.8 ns
rr_Pow              87.9 ns    16.3 ns
rr_Exp              40.2 ns    5.72 ns
rr_Log              44.0 ns    7.35 ns
rr_Exp2             101 ns     5.38 ns
rr_Log2             106 ns     9.24 ns

Bug: b/147818976
Change-Id: I791893bd9f005dbbae4770fb474de338a04845be
Reviewed-on: https://swiftshader-review.googlesource.com/c/SwiftShader/+/48588
Kokoro-Result: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nicolas Capens <nicolascapens@google.com>
Tested-by: Antonio Maiorano <amaiorano@google.com>
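
The split between the "emulated" and "optimal" namespaces described above can be sketched in plain C++ as follows. This is a minimal illustration only, not the code from this change: `Vec4` is a hypothetical stand-in for a 4-wide float vector (it is not Reactor's type), the parabolic sine approximation is a commonly used one chosen here for brevity and is assumed rather than taken from ShaderCore.cpp, and `impl` and `maxAbsDifference` are made-up names. The approximation is only valid for inputs in [-pi, pi]; range reduction is left out.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a 4-wide float vector (not Reactor's type).
struct Vec4
{
	std::array<float, 4> lane;
};

namespace emulated {

// Calls the standard library function once per component.
inline Vec4 Sin(const Vec4 &x)
{
	Vec4 r{};
	for(std::size_t i = 0; i < 4; i++)
	{
		r.lane[i] = std::sin(x.lane[i]);
	}
	return r;
}

}  // namespace emulated

namespace optimal {

// Branch-free parabolic approximation of sin(x) for x in [-pi, pi].
// Every lane runs the same arithmetic, which is the property that lets a
// vector backend emit SIMD code. Illustrative only (assumed approximation).
inline Vec4 Sin(const Vec4 &x)
{
	constexpr float kPi = 3.14159265358979f;
	constexpr float A = 4.0f / kPi;
	constexpr float B = -4.0f / (kPi * kPi);
	constexpr float P = 0.225f;  // weight of the refinement step

	Vec4 r{};
	for(std::size_t i = 0; i < 4; i++)
	{
		float v = x.lane[i];
		float y = A * v + B * v * std::fabs(v);      // first parabola
		r.lane[i] = P * (y * std::fabs(y) - y) + y;  // refinement
	}
	return r;
}

}  // namespace optimal

// Pick one family at compile time (hypothetical alias).
namespace impl = optimal;

// Largest per-lane absolute difference between two vectors.
inline float maxAbsDifference(const Vec4 &a, const Vec4 &b)
{
	float m = 0.0f;
	for(std::size_t i = 0; i < 4; i++)
	{
		m = std::fmax(m, std::fabs(a.lane[i] - b.lane[i]));
	}
	return m;
}
```

A caller would write `impl::Sin(x)` and switch families by changing the alias. Comparing `emulated::Sin` against `optimal::Sin` with `maxAbsDifference` is a simple way to check how far an approximation drifts from the math.h result on a given input range.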