The problem is that given code like this: a = b + c d = a + e ... ... (use of a) ... Lowering may produce code like this, at least on x86: T1 = b T1 += c a = T1 T2 = a T2 += e d = T2 ... ... (use of a) ... If "a" has a long live range, it may not get a register, resulting in clumsy code in the middle of the sequence like "a=reg; reg=a". Normally one might expect store forwarding to make the clumsy code fast, but it does presumably add an extra instruction-retirement cycle to the critical path in a pointer-chasing loop, and makes a big difference on some benchmarks. The simple fix here is, at the end of lowering "a=b+c", keep track of the final "a=T1" assignment. Then, when lowering "d=a+e" and we look up "a", we can substitute "T1". This slightly increases the live range of T1, but it does a great job of avoiding the redundant reload of the register from the stack location. A more general fix (in the future) might be to do live range splitting and let the register allocator handle it. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=kschimpf@google.com Review URL: https://codereview.chromium.org/1385433002 .
| Name |
Last commit
|
Last update |
|---|---|---|
| bloat | Loading commit data... | |
| crosstest | Loading commit data... | |
| pydir | Loading commit data... | |
| runtime | Loading commit data... | |
| src | Loading commit data... | |
| tests_lit | Loading commit data... | |
| unittest | Loading commit data... | |
| .dir-locals.el | Loading commit data... | |
| .gitignore | Loading commit data... | |
| ALLOCATION.rst | Loading commit data... | |
| CMakeLists.txt | Loading commit data... | |
| DESIGN.rst | Loading commit data... | |
| Doxyfile | Loading commit data... | |
| LICENSE.TXT | Loading commit data... | |
| LOWERING.rst | Loading commit data... | |
| Makefile | Loading commit data... | |
| Makefile.standalone | Loading commit data... | |
| OWNERS | Loading commit data... | |
| README.rst | Loading commit data... | |
| codereview.settings | Loading commit data... |