Commits · cbd3dbc6f00a7784f523132910cc9adbdfa09bf2 · Chen Yisong / swiftshader

09 Aug, 2016 1 commit

Subzero: Implemented codegen for poisoning and unpoisoning stack redzones · cbd3dbc6

authored Aug 09, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2194853003 .


08 Aug, 2016 2 commits

Subzero: More documentation for the NACLENV arg passthrough mechanism. · a64156e8

Jim Stichnoth authored Aug 08, 2016

BUG= none

Review URL: https://codereview.chromium.org/2215623002 .


Subzero: Embed the revision string into translated output. · 54cf1a2f

authored Aug 08, 2016

Modify the Makefiles to pass in the current git hash, which is embedded into the translated output. As a side effect, it is also embedded into the Subzero translator binary. This is useful for two reasons:

1. The PNaCl component update process is somewhat manual, making it tricky long after the fact to know exactly which revision was pushed, e.g. when trying to reproduce a bug or crash.

2. A translated binary can be inspected to make sure Chrome used the expected revision of Subzero. (And also to verify that pnacl-sz was used rather than pnacl-llc.)

The revision string is suppressed for lit tests, because a number of tests seem overly strict about global initializer expectations.

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2218363002 .


05 Aug, 2016 3 commits

Subzero: Use Cfg::getOptLevel() instead of ClFlags version. · 386b52ed

authored Aug 05, 2016

The opt level (O2 versus Om1) should be tested using Cfg::getOptLevel() instead of getFlags().getOptLevel() whenever possible.

This is because if you run "-Om1 -force-O2=foo", and you're compiling foo, the first form tells you O2 while the second form tells you Om1.
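
The difference between the two queries can be illustrated with a minimal sketch. This is not Subzero code: the `Flags` and `Cfg` classes, their fields, and the treatment of -force-O2 as a set of function names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Flags:
    # Hypothetical stand-in for the global command-line flags.
    opt_level: str = "Om1"                       # global optimization level
    force_o2: set = field(default_factory=set)   # functions named by -force-O2


@dataclass
class Cfg:
    # Hypothetical stand-in for a per-function control flow graph.
    function_name: str
    flags: Flags

    def get_opt_level(self) -> str:
        # Per-function answer: a forced function is compiled at O2
        # regardless of the global level.
        if self.function_name in self.flags.force_o2:
            return "O2"
        return self.flags.opt_level


# Models running with "-Om1 -force-O2=foo".
flags = Flags(opt_level="Om1", force_o2={"foo"})
```

In this model, asking the Cfg for `foo` yields O2, while reading the global flag still yields Om1, which is why the per-function query is the one to test.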

BUG= none
R=eholk@chromium.org

Review URL: https://codereview.chromium.org/2210773002 .


Subzero: Use the memset inline threshold for memset. · 35e16002

authored Aug 05, 2016

Memset lowering was using the memcpy inline threshold instead of the memset threshold.

Using the memset threshold as specified (16) seems to make spec2k performance slightly worse, so change it to the original value (8).

BUG= none
R=eholk@chromium.org

Review URL: https://codereview.chromium.org/2217983003 .


Documentation for LCSE, LICM, Short-Circuit, Global-Splitting · a41e9a14

authored Aug 05, 2016

LCSE is local common sub-expression elimination.
LICM is loop invariant code motion.
Short circuit splits basic blocks and introduces early jumps.
Global Splitting is a post regalloc live range splitting pass.
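
For readers unfamiliar with the first of these passes, local common sub-expression elimination can be sketched as follows. This is a generic textbook version over a toy tuple-based instruction format, not the Subzero implementation; the function name and instruction layout are assumptions.

```python
def local_cse(block):
    """Toy local CSE over one basic block.

    block is a list of (dest, op, lhs, rhs) tuples. A repeated
    (op, lhs, rhs) whose operands have not been redefined is replaced
    by a copy from the variable that already holds the value.
    """
    available = {}  # (op, lhs, rhs) -> variable holding that value
    out = []
    for dest, op, lhs, rhs in block:
        key = (op, lhs, rhs)
        prior = available.get(key)
        if prior is not None:
            out.append((dest, "copy", prior, None))
        else:
            out.append((dest, op, lhs, rhs))
        # dest is redefined here: drop every entry that reads dest
        # or whose value was held in dest.
        available = {k: v for k, v in available.items()
                     if v != dest and dest not in (k[1], k[2])}
        if prior is None and dest not in (lhs, rhs):
            available[key] = dest
    return out
```

The invalidation step is what keeps the pass local and safe: once an operand is overwritten, an earlier result computed from it can no longer be reused.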

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2217773003 .


04 Aug, 2016 6 commits

Aggressive LEA · 5b7e1c06

authored Aug 04, 2016

Convert adds with a constant operand to lea when -aggressive-lea is set.

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2135403002 .


Float Constant CSE · 5bcc6caf

authored Aug 04, 2016

Load multiple uses of a floating point constant (between two call
instructions or block start/end) into a variable before its first use.
  t1 = b + 1.0
  t2 = c + 1.0
Gets transformed to:
  t0 = 1.0
  t0_1 = t0
  t1 = b + t0_1
  t2 = c + t0_1
Call instructions reset the procedure, but the same variable is reused,
just in case it got a register. We are assuming floating point registers
are not callee-saved in general. Example, continuing from before:
  result = call <some function>
  t3 = d + 1.0
Gets transformed to:
  result = call <some function>
  t0_2 = t0
  t3 = d + t0_2
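
The rewrite above can be reproduced with a toy sketch. It is not the Subzero pass: the tuple instruction format and the function name are assumptions, and this version rewrites every use of the constant rather than checking for multiple uses first.

```python
def float_const_cse(block, const):
    """Toy version of the transformation on one basic block.

    block is a list of ("call", dest) or (op, dest, lhs, rhs) tuples.
    The constant is loaded once into t0; each region between calls
    gets its own copy t0_N, created just before the region's first use.
    """
    out = []
    region = 0         # number of per-region copies created so far
    copy_name = None   # copy of t0 for the current region, if any
    defined_t0 = False
    for inst in block:
        if inst[0] == "call":
            out.append(inst)
            copy_name = None  # a call starts a new region
            continue
        op, dest, lhs, rhs = inst
        if const in (lhs, rhs):
            if not defined_t0:
                out.append(("mov", "t0", const, None))
                defined_t0 = True
            if copy_name is None:
                region += 1
                copy_name = f"t0_{region}"
                out.append(("mov", copy_name, "t0", None))
            lhs = copy_name if lhs == const else lhs
            rhs = copy_name if rhs == const else rhs
        out.append((op, dest, lhs, rhs))
    return out
```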

BUG= none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2208523002 .


Live Range Splitting after initial Register Allocation · 7cd926d6

authored Aug 04, 2016

After register allocation is done once, this pass targets
the variables that did not get registers and breaks them into
multiple variables with shorter live ranges (each spanning at
most a basic block). After discarding the new variables with
too few uses, the register allocator is run again and
the new variables that manage to get registers are inserted.

BUG=None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2172313002 .


Subzero: Improved quality of ASan error messages · fb068e84

authored Aug 04, 2016

Added load/store and stack/heap/global information.

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2211733002 .


Subzero: Fix sign issues for inlined memset lowering. · 59ce6153

authored Aug 04, 2016

For certain cases of inlined memset lowering, the 8-bit value wasn't being properly spread/replicated into the 32-bit immediate to be stored.

Specifically, if the 8-bit value is between -128 and -1 (i.e. 0x80 to 0xff), the spread value would be something like 0xffffff80 instead of 0x80808080.
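
The arithmetic can be checked with a short sketch. The two helper functions are hypothetical: the first only models the symptom described above (sign extension instead of replication), not the actual lowering code.

```python
def spread_byte_buggy(value: int) -> int:
    # Models the symptom: the low 8 bits are sign-extended to 32 bits,
    # so 0x80..0xff come out as 0xffffff80..0xffffffff.
    signed = ((value & 0xFF) ^ 0x80) - 0x80
    return signed & 0xFFFFFFFF


def spread_byte(value: int) -> int:
    # Intended behavior: mask to 8 bits, then replicate the byte
    # into all four byte lanes of the 32-bit immediate.
    return (value & 0xFF) * 0x01010101
```

Masking before replicating is the important step: it makes -128 and 0x80 behave identically, so the sign of the incoming value no longer matters.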

BUG= b/30502279
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2215553002 .


Subzero: Fix formatting. · 58c66b93

authored Aug 03, 2016

Previous CL forgot to "make format".

BUG= none

Review URL: https://codereview.chromium.org/2206743003 .


03 Aug, 2016 1 commit

Subzero: removed loops from ASan access checking · 7a934724

authored Aug 03, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2209563002 .


02 Aug, 2016 1 commit

SubZero: Adding support for all Reg pairs in getI64PairFirst/SecondGPRNum · eec5621d

authored Aug 02, 2016

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2197233002 .

Patch from Mohit Bhakkad <mohit.bhakkad@imgtec.com>.


01 Aug, 2016 2 commits

Enable Local CSE by default · 53c8fbdf

authored Aug 01, 2016

Reduce the default number of iterations to 1.
Put the optional code behind the -lcse-no-ssa flag, which is disabled by
default. This brings the overhead of enabling local CSE down to about 2%.

BUG=
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2185193002 .


Subzero: Local variable splitting. · b9a84728

authored Aug 01, 2016

The linear-scan register allocator takes an all-or-nothing approach -- either the variable's entire live range gets a register, or none of it does.

To help with this, we add a pass that splits successive uses of a variable within a basic block into a chain of linked variables. This gives the register allocator the chance to allocate registers to subsets of the original live range.

The split variables are linked to each other so that if they don't get a register, they share a stack slot with the original variable, and redundant writes to that stack slot are recognized and elided.
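
One plausible shape for the splitting and linking described so far is sketched below. It is a toy model under stated assumptions, not the Subzero pass: the tuple instruction format, the `_N` naming scheme, and both function names are invented for illustration.

```python
def split_uses(block, var):
    """Split successive uses of var in one basic block into a chain.

    block is a list of (dest, op, srcs) tuples. Before each use of var,
    a fresh variable is copied from the previous link in the chain and
    the use is rewritten to read the fresh variable.
    Returns (new_block, links), where links maps each split variable
    to the variable it is linked to.
    """
    out, links = [], {}
    current, n = var, 0
    for dest, op, srcs in block:
        if var in srcs:
            n += 1
            new = f"{var}_{n}"
            out.append((new, "mov", (current,)))
            links[new] = current
            current = new
            srcs = tuple(new if s == var else s for s in srcs)
        out.append((dest, op, srcs))
        if dest == var:
            current = var  # a redefinition restarts the chain
    return out, links


def root(links, v):
    # Follow the links back to the original variable, whose stack slot
    # the whole chain would share if none of them gets a register.
    while v in links:
        v = links[v]
    return v
```

Because every split variable resolves to the same root, a copy between two links that both live on the stack is a write of a slot onto itself, which is the kind of redundant write the text says can be elided.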

This pass is executed after target lowering and right before register allocation. As such, it has to deal with some idiosyncrasies of target lowering, specifically the possibility of intra-block control flow. We experimented with doing this as a pre-lowering pass. However, the transformations interfered with some of the target lowering's pattern matching, such as bool folding, so we concluded that post-lowering was a better place for it.

Note: Some of the lit tests are overly specific about registers, and in these cases it was the path of least resistance to just disable local variable splitting.

BUG= none
R=eholk@chromium.org, jpp@chromium.org

Review URL: https://codereview.chromium.org/2177033002 .


27 Jul, 2016 1 commit

Subzero: Removed unnecessary global access checks · 181a9bcb

authored Jul 27, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2183683003 .


26 Jul, 2016 1 commit

Subzero: Elide checks of known valid accesses of locals · ac27c516

authored Jul 26, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2183643002 .


25 Jul, 2016 1 commit

Bisection debugging helper script · 34e88480

authored Jul 25, 2016

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2162123002 .


21 Jul, 2016 4 commits

Subzero: small cleanups · 0aa3f710

authored Jul 21, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2165393002 .


Selectively invert ICMP operands for better address optimization · 0c704176

authored Jul 21, 2016

Results in lower code size and more loads folded into cmp instructions.
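
Assuming "invert" here means swapping the two operands of the compare, the condition has to be mirrored at the same time for the result to stay the same. The sketch below shows only that equivalence for signed conditions; the condition names and helper are hypothetical, and the heuristic that decides when to swap is not modeled.

```python
import operator

# Signed integer conditions and their Python equivalents.
OPS = {
    "eq": operator.eq, "ne": operator.ne,
    "slt": operator.lt, "sle": operator.le,
    "sgt": operator.gt, "sge": operator.ge,
}

# Condition to use after the operands have been swapped.
MIRROR = {
    "eq": "eq", "ne": "ne",
    "slt": "sgt", "sle": "sge",
    "sgt": "slt", "sge": "sle",
}


def swap_icmp(cond, a, b):
    # (a cond b) is equivalent to (b MIRROR[cond] a).
    return MIRROR[cond], b, a
```

Swapping lets the operand that comes from a load end up in the position where the target's cmp instruction accepts a memory operand, which is consistent with the folding benefit the message reports.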

BUG=none
R=eholk@chromium.org, jpp@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2124973005 .


[Subzero][MIPS32] Fix stack offset assignment of spilled variables on MIPS32 · 752e59fa

authored Jul 21, 2016

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2166643003 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.


Subzero: Instrumented realloc · 1608a913

authored Jul 20, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2148413003 .


20 Jul, 2016 1 commit

Subzero: Fixed deadlock when _start is first function · 2c9992a5

authored Jul 20, 2016

It was previously the case that instrumentStart in ASanInstrumentation would block until instrumentGlobals had completed. This was because instrumentStart depends on the global redzones having been inserted. However, instrumentGlobals was not called until the first function was popped off the emit queue, and when _start was the first function, it was not placed on the emit queue until after it had been instrumented and lowered. instrumentStart was waiting for instrumentGlobals, which could not happen until instrumentStart completed.

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2165493002 .


19 Jul, 2016 2 commits

Improve LoopAnalyzer Interface · adf352bc

authored Jul 19, 2016

Make LoopAnalyzer compute loop bodies and depth only.
Move the logic for finding loop headers and pre-headers to LoopInfo, which provides a visitor to iterate over the loops and easy access to that information.
This does not change the core algorithm.

BUG=None
R=jpp@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2149803005 .


Subzero: Fix lowering for x86 div/rem instructions. · 017a5538

authored Jul 19, 2016

The x86 lowering sequences for sdiv/udiv/srem/urem all have a problem, in that they don't reflect the fact that two registers are affected by the instruction.

For example, the urem instruction:
dest = src0 urem src1
lowers to something like this:
t1:eax = src0
t2:edx = 0
t2:edx = (t1:eax and t2:edx) div src1
dest = t2:edx

The problem is that there is no indication that the div instruction smashes eax. As such, it's possible that the register allocator could erroneously assume that src0 is still available in eax after the div instruction.

To fix this, we make use of the FakeDef instruction. In this example, we change the div instruction to "officially" produce eax as its result, then fakedef edx in terms of eax. This means that as long as the urem result is actually used, the definitions of eax and edx will be preserved, but if the urem result is unused, then the whole sequence can be dead-code eliminated.

t1:eax = src0
t2:edx = 0
t1:eax = (t1:eax and t2:edx) div src1 # dest var changed to t1:eax
t2:edx = fakedef t1:eax # fakedef instruction added
dest = t2:edx
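
The effect of the change can be checked on a toy model of the two sequences. This is a sketch only: instructions are reduced to (dest, srcs) pairs over register names, and the dead-code elimination is a simple backward liveness walk rather than Subzero's own.

```python
def dce(insts, live_out):
    """Keep an instruction only if its dest is live after it."""
    live = set(live_out)
    kept = []
    for dest, srcs in reversed(insts):
        if dest in live:
            kept.append((dest, srcs))
            live.discard(dest)
            live.update(srcs)
    kept.reverse()
    return kept


def last_def(insts, reg):
    """Index of the last instruction that defines reg, or None."""
    idx = None
    for i, (dest, _srcs) in enumerate(insts):
        if dest == reg:
            idx = i
    return idx


original = [
    ("eax", ("src0",)),                # t1:eax = src0
    ("edx", ()),                       # t2:edx = 0
    ("edx", ("eax", "edx", "src1")),   # t2:edx = (eax and edx) div src1
    ("dest", ("edx",)),                # dest = t2:edx
]

fixed = [
    ("eax", ("src0",)),                # t1:eax = src0
    ("edx", ()),                       # t2:edx = 0
    ("eax", ("eax", "edx", "src1")),   # t1:eax = (eax and edx) div src1
    ("edx", ("eax",)),                 # t2:edx = fakedef t1:eax
    ("dest", ("edx",)),                # dest = t2:edx
]
```

In the original sequence the last visible definition of eax is the copy from src0, which is exactly the wrong conclusion; in the fixed sequence it is the div. The fixed sequence is also kept whole when dest is live and removed whole when it is not.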

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2158213002 .


14 Jul, 2016 2 commits

[Subzero][MIPS32] Implement post lower legalizer for MIPS32 · 5674c915

authored Jul 14, 2016

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2148593003 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.


implemented wrapper script to replace calls to calloc() · f0f80654

authored Jul 14, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2145213002 .


13 Jul, 2016 2 commits

Updates in preparation of wrapper script · f6c41e46

authored Jul 13, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2145063003 .


SubZero: Correct parenthesis for mem operands with labels in MIPS32 · 6a661ced

authored Jul 13, 2016

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2143243003 .

Patch from Mohit Bhakkad <mohit.bhakkad@imgtec.com>.


12 Jul, 2016 3 commits

Loop Invariant Code Motion · f47d520c

authored Jul 12, 2016

Implemented behind the new -licm flag.
Hoists invariant arithmetic instructions from loop bodies to pre-headers.
Does not trigger for loops where headers have two incoming edges from
outside the loop.
Also enables multi block address optimization, because most of the
instructions hoisted are address calculations coming from gep.

Does not touch memory operations.
This algorithm does not seem to work well for load-hoisting.
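
The hoisting rule described above can be sketched on a toy loop. This is a generic illustration, not the Subzero pass: it assumes each variable is defined at most once in the loop, uses an invented tuple instruction format, and treats only a small hypothetical set of opcodes as arithmetic.

```python
ARITH = {"add", "sub", "mul"}  # memory operations are never hoisted


def licm(pre_header, body):
    """Hoist loop-invariant arithmetic from body into pre_header.

    body is a list of (dest, op, srcs) tuples. An instruction is
    invariant when none of its sources is defined inside the loop,
    or when all such sources have themselves been hoisted.
    """
    defined_in_loop = {dest for dest, _op, _srcs in body}
    hoisted, remaining = [], list(body)
    changed = True
    while changed:
        changed = False
        for inst in list(remaining):
            dest, op, srcs = inst
            if op in ARITH and not any(s in defined_in_loop for s in srcs):
                remaining.remove(inst)
                hoisted.append(inst)
                defined_in_loop.discard(dest)
                changed = True
    return pre_header + hoisted, remaining
```

In the test loop used to exercise this sketch, the address calculation (a multiply followed by an add) moves to the pre-header while the load and the induction update stay in the body, matching the observation that most hoisted instructions are address calculations.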

BUG=none
R=jpp@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2138443002 .


[Subzero][MIPS32] Implements variable alloca for MIPS32 · c930d59b

authored Jul 12, 2016

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2138383002 .

Patch from Sagar Thakur <sagar.thakur@imgtec.com>.


Subzero, MIPS32: Handling floating point instructions fadd, fsub, fmul, fdiv · ab6a04f6

authored Jul 11, 2016

This patch adds handling of the floating point instructions
fadd, fsub, fmul and fdiv. Regarding frem, MIPS32 does not have
an instruction that calculates the partial remainder, so it has to be
emulated with a set of instructions. Emulating frem will be addressed
in a separate patch, once the floating point format conversion
instructions are fully implemented.

BUG=
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2027773002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.


10 Jul, 2016 1 commit

Subzero: Allow deeper levels of variable splitting. · fe62f0a2

authored Jul 10, 2016

This fixes some existing problems with the Variable::LinkedTo splitting/linking mechanism. The problem was that if B is linked to A, and B needs a stack slot, but A doesn't get a stack slot, B's stack offset would never get initialized. This could happen if A ends up with no explicit references in the code, or A's live range gets truncated such that it actually has a register while B doesn't.

It gets even more complicated if you have a link chain like A<--B<--C<--D etc. where some of them have stack slots (which should ultimately all be the same slot) and some don't.

The solution here is that if B is linked to the root A, and B has a stack slot but A doesn't, we can do a tree rotation so that B is the new root and A links to B.

In addition, we initialize Variable::StackOffset to an invalid value and always make sure a value used is valid. Earlier attempts at extending the variable splitting would sometimes silently fail because the default StackOffset value of 0 ended up being used.
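
The rotation and the invalid-offset check can be sketched as follows. This is a simplified model, not the real Variable class: the `has_slot` field, the method names, and the use of `None` as the invalid value are assumptions.

```python
class Variable:
    def __init__(self, name, linked_to=None, has_slot=False):
        self.name = name
        self.linked_to = linked_to   # next variable toward the root
        self.has_slot = has_slot     # hypothetical: needs a stack slot
        self.stack_offset = None     # invalid until explicitly assigned

    def root(self):
        v = self
        while v.linked_to is not None:
            v = v.linked_to
        return v

    def get_stack_offset(self):
        # Fail loudly instead of silently using a default of 0.
        if self.stack_offset is None:
            raise ValueError(f"{self.name}: stack offset not initialized")
        return self.stack_offset


def rotate_to_root(b):
    """If b links to root a, and b has a slot but a does not, make b
    the new root and link a to b."""
    a = b.linked_to
    if a is not None and a.linked_to is None and b.has_slot and not a.has_slot:
        b.linked_to = None
        a.linked_to = b
```

After the rotation, anything else that was linked to the old root still reaches the new root by following one extra link, so the whole chain resolves to the variable that actually owns the stack slot.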

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/2116213002 .


07 Jul, 2016 2 commits

Blacklisted instrumenting _Balloc. · 3f97afb1

authored Jul 07, 2016

Increases number of spec2k tests that run successfully with ASan from 2 to 6.

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2128383002 .


SubZero: legalize for f32/f64 constants in MIPS32 · d1e97776

authored Jul 07, 2016

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2123723002 .

Patch from Mohit Bhakkad <mohit.bhakkad@imgtec.com>.


06 Jul, 2016 3 commits

Subzero, MIPS32: Extend InstMIPS32Mov to support different data types · 36847bdd

authored Jul 06, 2016

This patch extends the InstMIPS32Mov instruction to support different data types, and to emit the proper low-level instruction depending on operand properties and data types.

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/2122043002 .

Patch from Srdjan Obucina <Srdjan.Obucina@imgtec.com>.


Fixed instruction corruption bug for multiple returns. · a7e5a951

authored Jul 06, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2128643002 .


Implemented loose checking for potential widened loads · cf062799

authored Jul 06, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/2115693002 .


30 Jun, 2016 1 commit

Implemented aligning and poisoning global redzones · aedc5e49

authored Jun 30, 2016

BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4374
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/2108083002 .
