Commits · 253dc8a870cbd144ba217a44efbd07e6bcd71e97 · Chen Yisong / swiftshader

22 Jun, 2015 1 commit

Add constant blinding/pooling option for X8632 code translation. · 253dc8a8

authored Jun 22, 2015

GOAL:
The goal is to remove the ability of an attacker to control immediates emitted into the text section.

OPTION:
The option -randomize-pool-immediates is set to none by default (-randomize-pool-immediates=none). To turn on constant blinding, set -randomize-pool-immediates=randomize; to turn on constant pooling, use -randomize-pool-immediates=pool.

Not all constant integers in the input pexe file will be randomized or pooled. The signed representation of a candidate constant integer must be between -randomizeOrPoolImmediatesThreshold/2 and +randomizeOrPoolImmediatesThreshold/2. This threshold can be set with the command-line option "-randomize-pool-threshold". By default it is set to 0xffff.

The constants introduced by instruction lowering (e.g. constants in shifting, masking) and argument lowering are not blinded in this way. The mask used for sandboxing is not affected either.

APPROACH:
We use GAS syntax in these examples.

Constant blinding for immediates:
Original:
add 0x1234, eax
After:
mov 0x1234+cookie, temp_reg
lea -cookie(temp_reg), temp_reg
add temp_reg, eax

Constant blinding for memory addressing offsets:
Original:
mov 0x1234(eax, esi, 1), ebx
After:
lea 0x1234+cookie(eax), temp_reg
mov -cookie(temp_reg, esi, 1), ebx

We use "lea" here because it does not affect the flags register, so it is safer when transforming instructions that involve immediates.
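
The arithmetic behind the two blinding sequences above can be sketched in C++. This is only an illustration, assuming 32-bit wraparound arithmetic; the names `BlindedImmediate`, `blindImmediate`, and `unblind` are hypothetical and are not taken from the Subzero sources.

```cpp
#include <cstdint>

// Hypothetical illustration: the value that reaches the instruction stream
// is Immediate + Cookie, never the attacker-chosen Immediate itself.
struct BlindedImmediate {
  uint32_t Emitted; // what "mov 0x1234+cookie, temp_reg" would carry
  uint32_t Cookie;  // removed again by "lea -cookie(temp_reg), temp_reg"
};

inline BlindedImmediate blindImmediate(uint32_t Immediate, uint32_t Cookie) {
  // Unsigned arithmetic wraps modulo 2^32, like a 32-bit register.
  return {Immediate + Cookie, Cookie};
}

inline uint32_t unblind(const BlindedImmediate &B) {
  // Subtracting the cookie recovers the original immediate at run time.
  return B.Emitted - B.Cookie;
}
```

Because addition and subtraction wrap consistently, the round trip recovers the original value even when `Immediate + Cookie` overflows.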

Constant pooling for immediates:
Original:
add 0x1234, eax
After:
mov [memory label of 0x1234], temp_reg
add temp_reg, eax

Constant pooling for addressing offsets:
Original:
mov 0x1234, eax
After:
mov [memory label of 0x1234], temp_reg
mov temp_reg, eax

Note that in both cases temp_reg may be assigned "eax", depending on the
liveness analysis, so this approach may not require an extra register.
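
The pooling idea can be sketched as a table from constant value to memory label. This is a minimal illustration; the class name `ConstantPool` and the label format are made up for the example and do not describe the actual Subzero data structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical pool: each distinct constant gets one label in a read-only
// data section, and instructions refer to the label instead of the value.
class ConstantPool {
public:
  // Returns the label for Value, creating a new entry on first use.
  const std::string &labelFor(int32_t Value) {
    auto It = Labels.find(Value);
    if (It == Labels.end()) {
      // Label format is invented for this sketch.
      std::string Label = ".L$pooled$" + std::to_string(Labels.size());
      It = Labels.emplace(Value, Label).first;
    }
    return It->second;
  }

  std::size_t size() const { return Labels.size(); }

private:
  std::map<int32_t, std::string> Labels;
};
```

Requesting the same constant twice returns the same label, so repeated uses of one value share a single pool entry.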

IMPLEMENTATION:
Processing:
TargetX8632::randomizeOrPoolImmediate(Constant *Immediate, int32_t RegNum);
TargetX8632::randomizeOrPoolImmediate(OperandX8632Mem *Memoperand, int32_t RegNum);

Checking eligibility:
ConstantInteger32::shouldBeRandomizedOrPooled(const GlobalContext *Ctx);

ISSUES:
1. bool Ice::TargetX8632::RandomizationPoolingPaused is used to guard some translation phases to disable constant blinding/pooling temporarily. Helper class BoolFlagSaver is added to latch the value of RandomizationPoolingPaused.

Known phases that need to be guarded are: doLoadOpt() and advancedPhiLowering(). However, during advancedPhiLowering(), if the destination variable has a physical register allocated, constant blinding and pooling are allowed. Stopping blinding/pooling for doLoadOpt() won't hurt our randomization or pooling as the optimized addressing operands will be processed again in genCode() phase.

2. i8 and i16 constants are now collected in separate constant pools instead of sharing the same constant pool with i32 constants. This requires emitting two more pools during constant lowering, and hence creates two more read-only data sections in the resulting ELF and ASM. No runtime issues have been observed so far.
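
The flag-latching helper mentioned in issue 1 follows a common RAII pattern, sketched here. The class below is a hypothetical stand-in named `ScopedBoolSaver`; it is not the interface of the BoolFlagSaver class added by this change.

```cpp
// Hypothetical sketch of a scoped flag saver: it remembers the old value,
// installs a new one, and restores the old value when the scope ends.
class ScopedBoolSaver {
public:
  ScopedBoolSaver(bool &F, bool NewValue) : Flag(F), OldValue(F) {
    Flag = NewValue;
  }
  ~ScopedBoolSaver() { Flag = OldValue; }

  // Copying would restore the flag twice, so it is disallowed.
  ScopedBoolSaver(const ScopedBoolSaver &) = delete;
  ScopedBoolSaver &operator=(const ScopedBoolSaver &) = delete;

private:
  bool &Flag;
  const bool OldValue;
};
```

A phase that must not blind or pool constants would construct such an object on entry, and the previous setting comes back automatically on every exit path.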

BUG=
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1185703004.

253dc8a8

18 Jun, 2015 4 commits

ARM: Assign "actuals" at call site to the appropriate GPR/stack slot. · b0a8c24e

authored Jun 18, 2015

Actually assign arguments to r0-r3 at the call site. Previously
this was left unhandled. There was only logic for pulling
formal parameters out of r0-r3.

Refactor the GPR counter and move it into a class so that the
rounding up for i64 arguments is in one place for callsites
and for pulling out of parameters. We might be able to use a
similar pattern to count the FP/SIMD registers later.
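
A counter of this kind might look like the sketch below. It assumes the usual ARM procedure-call rule that a 64-bit argument occupies an even/odd register pair and that once an argument spills to the stack no later argument uses a GPR; the class name `GPRArgCounter` and its methods are hypothetical.

```cpp
#include <cstdint>

// Hypothetical counter for the four ARM argument registers r0-r3.
class GPRArgCounter {
public:
  static constexpr int32_t NumArgRegs = 4;
  static constexpr int32_t NoReg = -1;

  // Returns the register number for a 32-bit argument, or NoReg if the
  // argument has to go on the stack.
  int32_t nextI32() {
    if (Used >= NumArgRegs)
      return NoReg;
    return Used++;
  }

  // Returns the low register of the pair for a 64-bit argument, rounding
  // the counter up to an even register first, or NoReg if no pair is left.
  int32_t nextI64() {
    int32_t Start = (Used + 1) & ~1; // round up to an even register
    if (Start + 2 > NumArgRegs) {
      Used = NumArgRegs; // later arguments also go on the stack
      return NoReg;
    }
    Used = Start + 2;
    return Start;
  }

private:
  int32_t Used = 0;
};
```

Keeping the rounding inside `nextI64()` means the call-site code and the formal-parameter code cannot disagree about which register is skipped.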

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1187513006.

b0a8c24e

Subzero: Add more kinds of RMW lowering. · cac003e8

authored Jun 18, 2015

Specifically: sub, and, or, xor; for all integer types.

Turns out that RMW is not possible for fadd/fsub/fmul/fdiv as well as operations on vector types, because the corresponding x86 instructions require the result to be in a physical register.

Refactors the assembler's implementations of add/or/adc/sbb/and/sub/xor/cmp to avoid repetition.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1186713010

cac003e8

Subzero: Correct the cross test's diagnostic message for a test failure. · a9eeb420
Jim Stichnoth authored Jun 17, 2015
```
BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1195553002
```
a9eeb420

Subzero: Transform suitable Load/Arith/Store sequences into RMW ops. · e4f65d86

authored Jun 17, 2015

Search for sequences of Load/Arith/Store instructions that can be transformed into single non-atomic Read-Modify-Write instructions. Corresponding operands must match up, and it is limited to the operator/type combinations that have simple lowerings.

For suitable sequences, an RMW pseudo-instruction is added. Extra variables are attached to the RMW instruction and the original Store instruction, to make it easy to figure out whether to retain the original Store instruction or the new RMW instruction (but never both).

The RMW instructions are similar to their non-RMW counterparts, except that the RMW instruction has no Dest variable - the Src[0] operand doubles as the memory-operand dest.

The x86-32 integrated assembler has some new forms of existing instructions added.

Note: this CL puts the machinery in place to identify, lower, and emit RMW operations only for the "add" instruction operating on i32/i16/i8 operands. The next CL will fill in the rest of the options.
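
The operand matching described above can be illustrated on a toy IR. Everything in this sketch (the `Inst` layout, the `Kind` enum, `isRMWCandidate`) is invented for illustration; the real pass also has to consider liveness and operator/type restrictions, which are omitted here.

```cpp
#include <cstddef>
#include <vector>

// Toy IR, invented for illustration only.
enum class Kind { Load, Add, Store, Other };

struct Inst {
  Kind K;
  int Dest; // variable defined by the instruction (-1 if none)
  int Src0; // first source variable, or the stored value for Store
  int Src1; // second source variable (-1 if none)
  int Addr; // address variable for Load/Store (-1 otherwise)
};

// Returns true if Insts[I..I+2] has the shape
//   a = load addr;  b = add a, x;  store b, addr
inline bool isRMWCandidate(const std::vector<Inst> &Insts, std::size_t I) {
  if (I + 2 >= Insts.size())
    return false;
  const Inst &L = Insts[I];
  const Inst &A = Insts[I + 1];
  const Inst &S = Insts[I + 2];
  return L.K == Kind::Load && A.K == Kind::Add && S.K == Kind::Store &&
         A.Src0 == L.Dest && // the add consumes the loaded value
         S.Src0 == A.Dest && // the store writes the add result
         S.Addr == L.Addr;   // load and store use the same address
}
```

A sequence that passes this check corresponds to a single memory-destination add such as `add [addr], x`.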

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1182603004

e4f65d86

17 Jun, 2015 2 commits

Fix a bug that would cause subzero to fail when --threads=0. · 8b1a7051

authored Jun 17, 2015

Creates a single TargetDataLowering.

BUG= None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1179313004.

8b1a7051

Set up crosstest to run simple loop in Om1 on ARM. · 8e32fed5

authored Jun 17, 2015

We can't run O2 yet because some of the advanced Phi lowering
hooks aren't implemented for O2 yet.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1160873006.

8e32fed5

16 Jun, 2015 1 commit

Add a basic enum for ARM InstructionSet / cpu features. · d062f73a

authored Jun 15, 2015

That way, we don't have to use -mattr=sse2 for ARM in
cross tests, etc.

Default to NEON for now. Also put in an entry for HW
divide in ARM mode. There are many more possible features,
though, e.g.:

https://github.com/llvm-mirror/llvm/blob/master/lib/Target/ARM/ARM.td

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1191573003.

d062f73a

15 Jun, 2015 2 commits

Move lowerGlobal() from target-specific code to emitGlobal() in generic code. · 58eea4d1

authored Jun 15, 2015

Emitting the global initializers is mostly the same across
each architecture (same filling, alignment, etc.). The only difference
is in assembler-directive quirks. E.g., on ARM for ".align N" N is
the exponent for a power of 2, while on x86 N is the actual number
of bytes. To avoid target-specific directives, use .p2align which
is always a power of 2. Similarly, use % instead of @. Either one
may be a comment character for *some* architecture, but for the
architectures we care about % is not a comment character while @
is sometimes (ARM).
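
Since .p2align takes the exponent rather than the byte count, generic code has to convert a byte alignment into a power-of-two exponent. A minimal sketch, with the hypothetical helper name `p2alignDirective`:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: turns a byte alignment into a ".p2align" directive,
// whose operand is the exponent of the power of two on every target.
inline std::string p2alignDirective(uint32_t AlignBytes) {
  assert(AlignBytes != 0 && (AlignBytes & (AlignBytes - 1)) == 0 &&
         "alignment must be a power of two");
  uint32_t Exponent = 0;
  while ((1u << Exponent) < AlignBytes)
    ++Exponent;
  return ".p2align " + std::to_string(Exponent);
}
```

For a 16-byte alignment this produces `.p2align 4`, which means the same thing to the x86 and ARM assemblers.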

Usually MIPS uses ".space N" for ".zero", but the assembler seems
to accept ".zero" so don't change that for now.

May need to adjust .long in the future too.
.word for AArch64 and .4byte for MIPS?

Potentially we can refactor the lowerGlobals() dispatcher
(ELF vs ASM vs IASM). The only thing target-specific about that
is *probably* just the relocation type.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1188603002.

58eea4d1

Removes const qualification for two methods in TargetDataLowering. · 0f86d03c

authored Jun 15, 2015

Removes const qualifier for TargetDataLowering::lowerGlobals() and TargetDataLowering::lowerConstants()

BUG= None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1177873003.

0f86d03c

12 Jun, 2015 3 commits

Build ARM SZ runtime files. Use le32-nacl-objcopy in various places. · 050deaa6

authored Jun 12, 2015

Use PNaCl built binutils, which is known to support ARM and MIPS.
Otherwise the system-provided binutils may or may not have that
support (mine did not and perhaps expected a prefix like
arm-xxx-objcopy for the version that did support arm).

Split off from CL to run crosstests for ARM under qemu.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=jpp@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1185703006.

050deaa6

Subzero: Strength-reduce mul by certain constants. · 0933c0cf

authored Jun 12, 2015

These all appear to some degree in spec2k.

This is implemented for i8/i16/i32 types. It is done as part of core lowering, so in theory all optimization levels could benefit, but it is explicitly disabled for Om1/O0 to keep things simple there.

While clang appears to strength-reduce udiv/urem by a constant power of 2, for some reason it does not always strength-reduce multiplies (given that they appear in the spec2k bitcode).

For multiplies by 3, 5, or 9, we can make use of the lea instruction. We can do combinations of shift and lea to multiply by other constants, e.g. 100=5*5*4. If too many operations would be required, just give up and use the mul instruction.
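
One way to picture the decomposition is the sketch below, which factors the constant into 9s, 5s, 3s, and a power of two and gives up when too many operations are needed. The names `MulPlan`, `planMul`, and `applyPlan`, and the limit of three operations, are assumptions made for this example, not details taken from the change.

```cpp
#include <cstdint>

// Hypothetical plan: how many lea-by-9/5/3 steps and what final shift.
struct MulPlan {
  bool Ok;
  uint32_t Count9, Count5, Count3, Shift;
};

inline MulPlan planMul(uint32_t C, uint32_t MaxOps = 3) {
  MulPlan P = {false, 0, 0, 0, 0};
  if (C == 0)
    return P;
  while (C % 9 == 0) { ++P.Count9; C /= 9; }
  while (C % 5 == 0) { ++P.Count5; C /= 5; }
  while (C % 3 == 0) { ++P.Count3; C /= 3; }
  while (C % 2 == 0) { ++P.Shift; C /= 2; }
  if (C != 1)
    return P; // leftover factor: fall back to the mul instruction
  // One operation per lea, plus a single shift if any is needed.
  uint32_t Ops = P.Count9 + P.Count5 + P.Count3 + (P.Shift ? 1 : 0);
  P.Ok = Ops <= MaxOps;
  return P;
}

// Evaluates the plan the way the lea/shl sequence would.
inline uint32_t applyPlan(const MulPlan &P, uint32_t X) {
  for (uint32_t I = 0; I < P.Count9; ++I) X = X + X * 8; // lea (x,x,8)
  for (uint32_t I = 0; I < P.Count5; ++I) X = X + X * 4; // lea (x,x,4)
  for (uint32_t I = 0; I < P.Count3; ++I) X = X + X * 2; // lea (x,x,2)
  return X << P.Shift;                                   // shl
}
```

For 100 the plan is two lea-by-5 steps followed by a shift by 2, matching the 5*5*4 example; a constant such as 7 leaves a remainder and falls back to mul.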

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
R=jpp@chromium.org, jvoung@chromium.org

Review URL: https://codereview.chromium.org/1146803002

0933c0cf

Subzero: Fix compilation error in MINIMAL=1 or NOASSERT=1 mode. · 326534a3
Jim Stichnoth authored Jun 11, 2015
```
BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/1182673003
```
326534a3

11 Jun, 2015 5 commits

Emit ARM build-attributes in the file scope (as header). · fb79284d

authored Jun 11, 2015

The ARM linker will check that .o files declare compatible
build attributes (e.g., all claim hard-float calling convention,
all claim VFP-vX, etc.). Thus, in order to set up cross tests that
link LLC-generated code against Subzero-generated code,
we need the build attributes to be compatible.

Pick ARMv7, hard-float calling convention, and neon, etc. which
we use for PNaCl LLVM.

Will probably have to reorganize to keep in sync once the ELF
writer also emits this.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1171563002.

fb79284d

Unittest fixes. · 8eefffad

authored Jun 11, 2015

Adjusts the expected unittest output.

BUG= None
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1173353003.

8eefffad

First patch for Mips subzero compiler · 6da4cef7

authored Jun 11, 2015

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4167

Move issue https://codereview.chromium.org/1159823004/ here so that
it's under the proper email.

Review URL: https://codereview.chromium.org/1169533003

6da4cef7

Subzero: Fix lit and cross tests broken in . · d9f1f9fc

authored Jun 11, 2015

1. The data symbol __Sz_block_profile_info should never be mangled (for cross tests), similar to runtime helper calls. Add a SuppressMangling override for such variable declarations.

2. When cross tests contain more than one translated object file, we end up with multiple definitions of __Sz_block_profile_info . Work around this by making that symbol weak.

3. Don't try to attach global inits to an EmitterWorkItem that represents a translation error.

4. Update one lit test to reflect the additional profiling value in the data section.

5. Update one lit test to reflect that global initializers are emitted at the end instead of the beginning.

The check-unit test is still broken and will be fixed in a separate CL.

BUG= none
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/1180883002

d9f1f9fc

Fixes a bug in that caused IceAssembler to use Allocator before it was initialized. · 1a9043e7

authored Jun 11, 2015

Initializes IceAssembler::Allocator before IceAssembler::Buffer.
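
In C++, non-static data members are initialized in declaration order, regardless of the order written in the constructor's initializer list, so a fix of this kind usually comes down to declaring the allocator first. The types below are toy stand-ins, not the real IceAssembler members.

```cpp
// Hypothetical stand-ins for an allocator and a buffer that uses it.
struct ToyAllocator {
  bool Ready = false;
  ToyAllocator() { Ready = true; }
};

struct ToyBuffer {
  bool AllocatorWasReady;
  explicit ToyBuffer(const ToyAllocator &A) : AllocatorWasReady(A.Ready) {}
};

// Members are constructed in declaration order, so Allocator must be
// declared before Buffer for Buffer's constructor to use it safely.
class ToyAssembler {
public:
  ToyAssembler() : Allocator(), Buffer(Allocator) {}
  bool bufferSawReadyAllocator() const { return Buffer.AllocatorWasReady; }

private:
  ToyAllocator Allocator; // declared, and therefore initialized, first
  ToyBuffer Buffer;       // may use Allocator in its constructor
};
```

Swapping the two member declarations would make `Buffer` read an allocator whose constructor has not run yet, which is undefined behavior.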

BUG= None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1177843006.

1a9043e7

10 Jun, 2015 2 commits

Renames the assembler* files. · aff4ccf9

authored Jun 10, 2015

Renames the assembler* files to IceAssembler*. Fixes whatever breaks.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1179563004.

aff4ccf9

Subzero: Basic Block Profiler. · f8b4cc84

authored Jun 09, 2015

BUG= None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1147023007.

f8b4cc84

08 Jun, 2015 1 commit

Clean up unit munging unit tests using common NaCl API. · cbb1d3d7

authored Jun 08, 2015

Simplify the munging unit tests to follow the new NaCl utilities
for munging tests.

Note that this CL takes advantage of changes added by
CL https://codereview.chromium.org/1140153004

BUG=None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1149423011

cbb1d3d7

05 Jun, 2015 4 commits

Merge branch 'master' of… · 8af4aac7

authored Jun 05, 2015

Merge branch 'master' of https://chromium.googlesource.com/native_client/pnacl-subzero into subzero-ownership

8af4aac7

Subzero: adding jpp@chromium.org to OWNERS. · af9032fc
John Porto authored Jun 05, 2015
```
BUG= None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1149213006
```
af9032fc

Subzero: adding jpp@chromium.org to OWNERS. · 09a18657
John Porto authored Jun 05, 2015

09a18657

Subzero ARM32: Lower shift and zext, sext, and trunc. · 66c3d5ec

authored Jun 04, 2015

Sext, etc. usually uses shifts (especially for i1 and i64)
so implement shift, then implement those casts.

Implement just enough of bitcast to handle accessing
global addresses (used by some tests). Otherwise,
most other bitcasts are from GPR to FP and FP regs
aren't modeled yet.

Generally following the GCC style for 64-bit shifts.
This takes advantage of the flexible second operand in an "orr",
and takes advantage of the shift-beyond bitwidth saturation.
LLVM is almost the same, but only seems to take advantage
on one side of the 32-bits, not the other side. Should really
get some of the execution tests running to test this behavior!

Fix InstARM32Str::dump(). Str doesn't have a Dest, so use Src.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1143323013

66c3d5ec

04 Jun, 2015 2 commits

Subzero: Legalize FP constants directly into memory operands. · 03ffa585

authored Jun 04, 2015

Previously, the legalize() function would always force a floating point constant into an xmm register before it could be used in an instruction. This uses an extra register unnecessarily when the instruction allows a memory operand for that operand.

We improve this by lowering the FP constant operand to an OperandX8632Mem that wraps a ConstantRelocatable representing the label for the constant pool entry, e.g. [.L$float$0]. (This may end up being copied into an xmm register if the instruction doesn't allow a memory operand for that operand.)

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1163943005

03ffa585

Use report_fatal_error before destroying input object on error. · 2f7f2b7e

authored Jun 03, 2015

The input object may be a QueueStreamer, which the compile
server will still have a reference to (even though
downstream the memory object API and parser API thinks it
has a unique_ptr). Terminate the thread quickly on error,
instead of free'ing and causing a use-after-free.

Also set up a report_fatal_error handler which has access
to the server's state. This allows the server to record the
error and stop pushing bytes to the QueueStreamer.
Otherwise the QueueStreamer can get full without a consumer
still active to unblock.

Unfortunately the fatal error handler only terminates the
current thread, and not all worker threads. NaCl doesn't
have support for signals or pthread_kill.
E.g., with pthread_kill(std_thread.native_handle(), SIGABRT).
So, other worker/emitter threads will have to hang waiting on
more input or something.

Random clang-format edits from 3.7.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4163
TEST= tbd:

I manually ran the translator on a dummy text file (invalid bitcode
header), and observed that this no longer crashes. Instead the SRPC
calls finish and I see:

3> [17812,4147750656:14:23:02.025382] Streaming file at 100000 bps
[17812,4147750656:14:23:12.511574] RPC call failed: Rpc application returned an error.
[17812,4147750656:14:23:12.511625] StreamChunk failed
[17812,4147750656:14:23:12.511655] stream_file: SendDataChunk failed, but returning without failing. Expect call to StreamEnd.
4> rpc call initiated StreamEnd::isss
[17812,4147750656:14:23:12.511931] RPC call failed: Rpc application returned an error.
rpc call complete StreamEnd::isss
output 0:  i(0)
output 1:  s("")
output 2:  s("")
output 3:  s("Invalid PNaCl bitcode header")
[17812,4147750656:14:23:12.512102] Command [rpc] failed.

R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1168543002

2f7f2b7e

03 Jun, 2015 2 commits

Subzero: Improve/refactor folding loads into the next instruction. · 8e6bf6e1

authored Jun 03, 2015

This is turned into a separate (O2-only) pass that looks for opportunities:
1. A Load instruction, or an AtomicLoad intrinsic that would be lowered just like a Load instruction
2. Followed immediately by an instruction with a whitelisted kind that uses the Load dest variable as one of its operands
3. Where the whitelisted instruction ends the live range of the Load dest variable.

In such cases, the original two instructions are deleted and a new instruction is added that folds the load into the whitelisted instruction.

We also do some work to splice the liveness information (Inst::LiveRangesEnded and Inst::isLastUse()) into the new instruction, so that the target lowering pass might still take advantage. Currently this is used quite sparingly, but in the future we could use that along with operator commutativity to choose among different lowering sequences to reduce register pressure.

The whitelisted instruction kinds are chosen based primarily on whether the main operation's native instruction can use a memory operand - e.g., arithmetic (add/sub/imul/etc), compare (cmp/ucomiss), cast (movsx/movzx/etc). Notably, call and ret are not included because arg passing is done through simple assignments which normal lowering is sufficient for.

BUG= none
R=jvoung@chromium.org, mtrofin@chromium.org

Review URL: https://codereview.chromium.org/1169493002

8e6bf6e1

Subzero: Change pnacl_newlib ==> pnacl_newlib_raw in scripts. · bb9d11a5
Jim Stichnoth authored Jun 03, 2015
```
BUG= none
R=jvoung@chromium.org, kschimpf@google.com

Review URL: https://codereview.chromium.org/1162903003
```
bb9d11a5

02 Jun, 2015 1 commit

Subzero ARM: lowerLoad and lowerStore. · befd03ab

authored Jun 02, 2015

Thought about leaving "mov" simple and not handling memory operands,
but then we'd have to duplicate some of the lowerAssign code
for lowerLoad =/

BUG=  https://code.google.com/p/nativeclient/issues/detail?id=4076
R=kschimpf@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1152703006

befd03ab

01 Jun, 2015 4 commits

Subzero: Changes needed for LLVM 3.7 integration. · e5b58fbe

authored Jun 01, 2015

1. Change Makefile.standalone from 3.6 to 3.7.

2. Update to new load instruction .ll syntax.  This includes changing InstLoad::dump() to match.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1161543005

e5b58fbe

Subzero: Remove a compile-time warning. · 0769299d
Jim Stichnoth authored Jun 01, 2015
```
BUG= none
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/1161353002
```
0769299d

Subzero ARM: addProlog/addEpilogue -- share some code with x86. · 0fa6c5a0

authored Jun 01, 2015

Split out some of the addProlog code from x86 and
reuse that for ARM. Mainly, the code that doesn't
concern preserved registers or stack arguments is split out.

ARM push and pop take a whole list of registers (not
necessarily consecutive, but should be in ascending order).
There is also "vpush" for callee-saved float/vector
registers but we do not handle that yet (the register
numbers for that have to be consecutive).

Enable some of the int-arg.ll tests, which relied on
addPrologue's finishArgumentLowering to pull from the
correct argument stack slot.

Test some of the frame pointer usage (push/pop) when
handling a variable sized alloca.

Also change the classification of LR, and PC so that
they are not "CalleeSave". We don't want to push LR
if it isn't overwritten by another call. It will certainly be
"used" by the return however. The prologue code only checks
if a CalleeSave register is used somewhere before deciding
to preserve it. We could make that stricter and check if
the register is also written to, but there are some
additional writes that are not visible till after the
push/pop are generated (e.g., copy from argument stack slot
to the argument register). Instead, keep checking use
only, and handle LR as a special case (IsLeafFunction).

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1159013002

0fa6c5a0

Subzero: Fold the load instruction into the next cast instruction. · c77f817f

authored May 31, 2015

This is similar to the way a load instruction may be folded into the next arithmetic instruction.

Usually the effect is to improve a sequence like:
mov ax, WORD PTR [mem]
movsx eax, ax
into this:
movsx eax, WORD PTR [mem]
without actually improving register allocation, though other kinds of casts may have different improvements.

Existing tests needed to be fixed when they "inadvertently" did a cast to i32 return type and triggered the optimization when it wasn't wanted. These were fixed by inserting a "dummy" instruction between the load and the cast.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1152783006

c77f817f

27 May, 2015 3 commits

Use ldr for movs out of stack slots (instead of mov reg, [sp/fp]). · c207d51e

authored May 27, 2015

So far we've been using ldr/str (32-bit) to load/store
the whole stack slot, independent of the variable type.

Toggle on some tests that didn't have an Om1 variant
previously. Didn't toggle everything since there are still
some problems with liveness from code being unimplemented.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1144923008

c207d51e

Subzero: More asm-verbose fixes. · b82baf2f

authored May 27, 2015

It turns out that code deleted in 9a05aea8 actually had a legitimate purpose, so it is added back, this time with more extensive comments justifying it.

Also, takes the instruction's IsDestNonKillable flag into account when updating the live register usage count (along with extra comments on why that is necessary).

Furthermore, removes an unnecessary assert that otherwise fails when --asm-verbose is used with --filetype=iasm or --filetype=obj.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1158113002

b82baf2f

Remove the FrameSizeLocals field which appears to be unused (write-only). · 0d9faeac

authored May 27, 2015

Might have gotten replaced by some other field, but don't
quite remember. Spotted while looking for ways to share
the addProlog() code between targets.

BUG=none
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1158713005

0d9faeac

26 May, 2015 2 commits

Subzero: Fix/improve -asm-verbose output. · 9a05aea8

authored May 26, 2015

Fixes a bug where a num-uses counter wasn't being updated because of C
operator && semantics.  The code was something like "if (A && --B) ..."
but we want --B to happen even when A is false.
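
The short-circuit pitfall can be reproduced in isolation. The function names here are hypothetical; the two functions only mirror the shape of the condition described above.

```cpp
// Buggy shape: when A is false, "--B" is never evaluated, so the
// counter silently stays unchanged.
inline bool shortCircuited(bool A, int &B) { return A && --B; }

// Fixed shape: the decrement happens unconditionally, and only then
// is the combined condition tested.
inline bool alwaysDecrement(bool A, int &B) {
  --B;
  return A && B != 0;
}
```

With `A == false`, the first version leaves `B` untouched while the second still decrements it, which is the behavior the fix is after.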

Sorts the LiveIn and LiveOut lists by regnum so that the lists always
display the set of registers in a consistent/familiar order.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1152813003

9a05aea8

Subzero ARM: lower alloca instruction. · 55500dbc

authored May 26, 2015

Lower alloca in a way similar to x86. Subtract the stack
and align if needed, then copy that stack address to dest.
Sometimes use "bic" for the mask, sometimes use "and",
depending on what fits better.
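
The stack adjustment amounts to a subtract followed by rounding down to the alignment. A minimal sketch with a hypothetical helper name, assuming a downward-growing stack and a power-of-two alignment:

```cpp
#include <cassert>
#include <cstdint>

// Sketch: subtract the requested size, then clear the low bits so the
// new stack pointer is rounded down to the alignment.
inline uint32_t allocaAdjust(uint32_t SP, uint32_t Size, uint32_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  uint32_t NewSP = SP - Size;
  // "bic" would clear the bits in (Align - 1); "and" would keep the bits
  // in ~(Align - 1). Both give the same result.
  return NewSP & ~(Align - 1);
}
```

The choice between the two instructions only affects which form of the mask has to be encoded as an immediate.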

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1156713003

55500dbc

22 May, 2015 1 commit

Subzero ARM: do lowerIcmp, lowerBr, and a bit of lowerCall. · 3bfd99a3

authored May 22, 2015

Allow instructions to be predicated and use that in lower icmp
and branch. Tracking the predicate for almost every instruction
is a bit overkill, but technically possible. Add that to most of
the instruction constructors except ret and call for now.

This doesn't yet do compare + branch fusing, but it does handle
the branch fallthrough to avoid branching twice.

I can't yet test 8bit and 16bit, since those come from "trunc"
and "trunc" is not lowered yet (or load, which also isn't
handled yet).

Adds basic "call(void)" lowering, just to get the call markers
showing up in tests.

64bit.pnacl.ll no longer explodes with liveness consistency errors,
so risk running that and backfill some of the 64bit arith tests.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1151663004

3bfd99a3