Commits · 83ccadcf9ee6408bde0dbfe28a764de813a7bf31 · Chen Yisong / swiftshader

12 Aug, 2015 1 commit

Remove error-recovery TODO comments from bitcode parser. · 83ccadcf

authored Aug 12, 2015

The bitcode parser contained a number of TODO commments about
removing error recovery once the parser was implemented. This
was based on the fact that (1) pnacl-sz in the browser didn't
need it, and (2) it would be trivial to remove.

The advantage of leaving error recovery in is that multiple errors can
be found at once.

It turns out it isn't so easy to remove, since several methods assume
that recovery can be applied, and hence do not need propagate back up
an (optional) error code. Parsing backs out relatively quickly anyway,
since the code exits between bitcode records anyway.

BUG=None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1286513002 .

83ccadcf

10 Aug, 2015 2 commits

Subzero: Misc fixes/cleanup. · 992f91dd

authored Aug 10, 2015

1. Fix MINIMAL build.
  (a) Add a void cast to a var only used in asserts.
  (b) Use "REQUIRES:" instead of "REQUIRES" in a .ll file.
2. Use StrError instead of StrDump for errors.
3. Use a lambda instead of a functor because C++11.
4. Explicit check for -filetype=obj in a non-dump-enabled build, to avoid cryptic downstream error messages.
5. Run "make format" which was neglected earlier.

BUG= none
R=kschimpf@google.com

Review URL: https://codereview.chromium.org/1284493003.

992f91dd

Fix processing of local variable indices in fuction blocks. · c6acf08f

authored Aug 10, 2015

The previous code used a vector to hold local values associated with
indices in the bitcode file. The problem was that the vector would be
expanded to match the index of a "variable index forward reference".
If the index was very large, the program would freeze the computer
trying to allocate an array large enough to contain the index.

This patch fixes this by using a local unordered map instead of a
vector.  Hence, forward index references just add a sinle entry into
the map.

Note that this fix doesn't have a corresponding issue. However, the
problem was made apparent from the problems noted in issues 4257 and
4261.

BUG=None
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1282523002 .

c6acf08f

09 Aug, 2015 1 commit

Add the ARM32 FP register table entries, simple arith, and args. · 86ebec12

authored Aug 09, 2015

Lower some instructions, without much guarantee of
correctness. *Running* generated code will be risky
because the register allocator isn't aware of register
aliasing.

Fill in v{add,div,mul,sub}.f{32,64}, vmov, vldr
and vsqrt.f{32,64}. I tried to make the nacl-other-intrinsics
test not explode, so added vsqrt too. That was pretty
easy for sqrt, but then fabs tests also exploded. Those are not
truly fixed but are currently "fixed" by adding a FakeDef to
satisfy liveness.

Propagate float/double arguments to the right register
in lowerArguments, lowerCall, and propagate to s0/d0/q0
for lowerReturn. May need to double check the calling convention.
Currently can't test call-ret because vpush/vpop for prologues
and epilogues isn't done.

Legalize FP immediates to make the nacl-other-intrinsics sqrt
test happy. Use the correct type of load (vldr (.32 and .64 are
optional) instead of ldr{b,h,,d}).

Whether or not the float/vector instructions can be
predicated is a bit interesting. The float/double ones
can, but the SIMD versions cannot. E.g.

vadd<cond>.f32 s0, s0, s1 is okay
vadd<cond>.f32 q0, q0, q1 is not okay.

For now, just omit conditions from instructions that may
end up being reused for SIMD.

Split up the fp.pnacl.ll test into multiple ones so that
parts of lowering can be tested incrementally.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1266263003 .

86ebec12

08 Aug, 2015 1 commit

Subzero: Fix a memory leak. · f4fbf7fd

authored Aug 08, 2015

The problem was that Translator and subclasses need to have virtual destructors since they are used as unique_ptr<>. As a result, only the Translator base class destructor was being invoked.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4290
TEST= make -f Makefile.standalone ASAN=1 check-lit
R=ascull@google.com

Review URL: https://codereview.chromium.org/1281003003.

f4fbf7fd

07 Aug, 2015 3 commits

Fix processing of global variable indices in the global vars block. · aa0ce790

authored Aug 07, 2015

The code used to use a vector to hold global variables associated with
indices. The problem was that the count record in the global vars
block would generate variables for the given count (even if very
large).

To fix this, we created a local unordered map to associate indices
with defined/referenced globals. After processing the global vars
block, this unordered map is used to verify the size makes sense, and
then install the recognized global variables into the (top-level)
contents.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4257
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1278793006 .

aa0ce790

Inline memcpy for small constant sizes. · 9df4a379

authored Aug 07, 2015

Combined with memset inlining, this has shown an improvement of over 11% on
the eon benchmark. This the only C++ program in spec2k and it seems C++
programs have a significantly larger number of memset/memcpy calls. Other
benchmarks also showed improvement of ~5% (perlbmk, parser) while most had
a 1-2% improvement.

This commit also includes a refactoring of lowerMemset which is much more
readable and also removed the fake use of the destination pointer register.

BUG=
R=jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1279833005.

9df4a379

Subzero: Completely remove tracking of stack pointer live range. · f9df4523

authored Aug 06, 2015

Specifically, if a variable is marked with IgnoreLiveness=true, then:
  1. Completely avoid adding any segments to its live range
  2. Assert that no one tries to add segments to its live range

This is done in part by incorporating Variable::IgnoreLiveness into Liveness::RangeMask.

Also, change a functor into a lambda because C++11.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1273823003.

f9df4523

06 Aug, 2015 4 commits

Subzero: Fix szbuild.py linking. · 88ab5ca4

authored Aug 06, 2015

The symbol __Sz_block_profile_info needs to be explicitly globalized when combining llc and Subzero .o output, i.e. when the --include or --exclude options are used.

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/1275983003.

88ab5ca4

Fix translator handling of basic block indices. · ac7d7344

authored Aug 06, 2015

The code used to use a vector to hold basic blocks associated with
indices. This had two problem: 1) The "number of blocks" record would
generate a vector of the given size (even if very large); and 2)
Indices would expand the vector to define the index (even if very
large).

In most cases, such large values are incorrect. To fix this, this
patch uses a map from block index, to the corresponding basic block.
Only after all basic block indices have been processed, we check that
the size makes sense, and convert it to a vector.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4261
R=jpp@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1275953002 .

ac7d7344

Introduce the ability to insert IACA (Intel Architecture Code Analyzer) marks. · 2c862522

authored Aug 06, 2015

The IACI marks identify the code which should be analyzed with the IACA.

The generated binaries are not executable due to the marks. This feature should
only be used during develpoment when analyzing generated code so it is protected
behind the --allow-iaca-marks flag.

ScopedIacaMark is a helper class which opens mark and closes it at the end of
the scope. This is useful when there are many returns as you don't have to write
`_iaca_end()` before them all.

BUG=
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1260093003.

2c862522

Subzero. Moves code around in preparations for 64-bit lowering. · 729b5f6c

authored Aug 06, 2015

Specifically, it moves

lowerArguments
lowerRet
addProlog
addEpilog

from the x86 lowering template to the concrete lowering implementations.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1261383002.

729b5f6c

05 Aug, 2015 5 commits

Subzero: Fix an Om1 crash from memset lowering. · f6f9825e

authored Aug 05, 2015

With a certain combination of memset arguments, legalizeToReg() is called but its result is unused. Om1 register allocation does not like this because it sees an infinite-weight variable with a definition but no uses. The simplest fix is to add a fake use.

The problem shows up when building spec2k with -Om1.

BUG= none
R=ascull@google.com

Review URL: https://codereview.chromium.org/1272823004.

f6f9825e

Subzero: Slight improvement to phi lowering. · 552490c2

authored Aug 05, 2015

When doing post phi lowering register allocation, the original code limited register allocation to pre-colored or infinite-weight variables with a non-empty live range within the new edge-split nodes.  This limitation ends up missing some opportunities.  Specifically, when a temporary is introduced to break a dependency cycle, e.g.:
  // a = phi(b)
  // b = phi(a)
  t = a
  a = b
  b = t
then t was always stack-allocated, even if a physical register was available.

In the new design, the RangeMask bitvector specifies which variables should have their live ranges tracked and computed.  For normal liveness analysis, all variables are tracked.  For post phi lowering liveness analysis, all variables created from phi lowering, plus all pre-colored variables, plus all infinite-weight variables, are tracked.

The result is slightly better code quality, and sometimes the frame size is 1 or 2 words smaller.

The hope was to narrow the 10% translation-time degradation in pnacl-llc.pexe compared to the old, hackish way of phi lowering, but that goal still proves to be elusive.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1271923002.

552490c2

Subzero. Implements x86-64 lowerCall. · e0d9afa8

authored Aug 05, 2015

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
R=jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1266673003.

e0d9afa8

Clarify which type "Label" refers to (generic vs X86) · c2ec5817

authored Aug 05, 2015

There is a generic Label that can be used as-is for ARM
and MIPS, and there is an x86 derived class Label which
adds the concept of near and far (for 8-bit vs 32-bit
jumps). Previously, one method "getOrCreateCfgNodeLabel"
would say that it returns a Label when it means the generic
one in some cases and the x86 one in other cases.

Split that into getCfgNodeLabel and getOrCreateCfgNodeLabel
where getCfgNodeLabel returns the generic one and
getOrCreateCfgNodeLabel is part of the x86 code and
returns the x86 one.

BUG=none
R=ascull@google.com, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1265023003 .

c2ec5817

Order jump tables for deterministic or randomized emission. · 1eda90a1
Andrew Scull authored Aug 04, 2015
```
BUG=
R=stichnot@chromium.org, jvoung, stichnot

Review URL: https://codereview.chromium.org/1260183008.
```
1eda90a1

04 Aug, 2015 2 commits

Subzero: Fix x86 lowering for shift-by-relocatable-constant. · c5c8957b

authored Aug 04, 2015

Trying to shift by a ConstantRelocatable causes an assertion failure in the integrated assembler, because the shift value must be Imm8. Maybe this is possible to express via ELF relocations, but it doesn't seem worth it. Especially since ConstantRelocatables refer to addresses and it doesn't make sense to shift by an address.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4256
R=jvoung@chromium.org, kschimpf@google.com

Review URL: https://codereview.chromium.org/1262003003.

c5c8957b

Inline memset when there is a constant value and count. · 713dbdec
Andrew Scull authored Aug 04, 2015
```
BUG=
R=jvoung@chromium.org, jvoung, stichnot

Review URL: https://codereview.chromium.org/1255053008.
```
713dbdec

02 Aug, 2015 1 commit

Subzero: Expand the liveness consistency check. · b3bfcbcb

authored Aug 01, 2015

Liveness analysis includes a consistency check on each node, to verify that variables referenced in only one block do not appear to be live coming into a block (and are therefore live across multiple blocks). This check was disabled in the entry block because there might be function arguments that are referenced only in the entry block but are still live coming in.

It seems that this entry-block exclusion has been largely unnecessary for some time. This is because input arguments and other special variables are now pre-marked as multi-block. The exclusion masks problems in some single-block lit tests, so it's best if it can be removed.

This CL removes the exclusion, and fixes some minor issues uncovered in the MIPS and ARM target lowering. A key issue is that when implementing a new target lowering and using --skip-unimplemented to make progress with existing tests, it may be necessary to add FakeDef instructions to avoid liveness inconsistency errors.

Note that when this patch is applied to 448c16f0, it correctly identifies the liveness consistency error (as shown by a "make check-lit" failure) that was fixed in 59f2d925.

BUG= none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/1265093002.

b3bfcbcb

31 Jul, 2015 5 commits

Subzero. Buildable, non-functional TargetLoweringX8664. · 453660ff

authored Jul 31, 2015

This CL adds a TargetLoweringX8664 that inherits from TargetX86Base, but
other than that it does nothing to generate runnable code.

Things that need to be addressed in follow up CLs:
1) lowerCall
2) lowerArguments
3) lowerRet
4) addPrologue
5) addEpilogue
6) Native 64-bit arithmetic
7) 32- to 64-bit addressing

(7) will be particularly interesting. Pointers in Pexes are always
32-bit wide, so pexes have a de facto 32-bit address space. In
Sandboxed mode that's solved by using RZP (i.e., r15) as a base
register. For native codegen, we still need to decide what to do
-- very likely we will start targeting X32.

NOTE: This CL also

s/IceType_ForceRexW/RexTypeForceRexW/g

because I forgot to do it in the X8664 assembler cl.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1257643004.

453660ff

Subzero. Misc fixes. · 59f2d925

authored Jul 31, 2015

This CL disables the X86 assembler tests by default. They take too long
to compile, so there's very little point in running them with the other
unittests.

This CL fixes a bug introduced in
https://codereview.chromium.org/1260163003/ that caused liveness
analysis to assert due to a uninitialized Variable.

BUG=
R=jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1266863002.

59f2d925

ARM: Add a postRA pass to legalize stack offsets. Greedy approach (reserve IP). · 28068adb

authored Jul 31, 2015

Make a post-register allocation and post-addProlog pass to
go through variables with stack offsets and legalize them
in case the offsets are not encodeable. The naive approach
is to reserve IP, and use IP to movw/movt the offset, then
add/sub the frame/stack pointer to IP and use IP as the new
base instead of the frame/stack pointer. We do some amount
of CSE within a basic block, and share the IP base pointer
when it is (a) within range for later stack references,
and (b) IP hasn't been clobbered (e.g., by a function call).
I chose to do this greedy approach for both Om1 and O2,
since it should just be a linear pass, and it reduces the
amount of variables/instructions created compared to the
super-naive peephole approach (so might be faster?).

Introduce a test-only flag and use that to artificially
bloat the stack frame so that spill offsets are out
of range for ARM. Use that flag for cross tests to
stress this new code a bit more (than would have been
stressed by simply doing a lit test + FileCheck).

Also, the previous version of emitVariable() was using the
Var's type to determine the range (only +/- 255 for i16,
vs +/- 4095 for i32), even though mov's emit() always
uses a full 32-bit "ldr" instead of a 16-bit "ldrh".
Use a common legality check, which uses the stackSlotType
instead of the Var's type. This previously caused the
test_bitmanip to spuriously complain, even though the
offsets for Om1 were "only" in the 300 byte range. With this
fixed, we can then enable the test_bitmanip test too.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1241763002 .

28068adb

Add -reorder-basic-blocks option and fix nop insertion · 969f6a33

authored Jul 31, 2015

1. Basic block reordering can be enabled with -reorder-basic-blocks option enabled.

Blocks will be sorted according to the Reversed Post Traversal Order, but the next
node to visit among all candidate children nodes is selected 'randomly'.

Example:
  A
 / \
B   C
 \ /
  D

This CFG can generate two possible layouts:
A-B-C-D or A-C-B-D

2. Fix nop insetion

Add checks to avoiding insertions in empty basic blocks(dead blocks) and bundle locked instructions.

BUG=
R=jpp@chromium.org, jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1255303004.

969f6a33

Fix a -Wcovered-switch-default warning in emitJumpTables. · c2648c2d

authored Jul 30, 2015

The Subzero build inside of the LLVM build system turns this on.

BUG=none
R=ascull@google.com

Review URL: https://codereview.chromium.org/1264913005 .

c2648c2d

30 Jul, 2015 2 commits

Iasm and obj lowering for advanced switch lowering. · 86df4e9e

authored Jul 30, 2015

Jump table emission is delayed until offsets are known. X86 local jumps can be
near or far. Sanboxing is applied to indirect jumps from jump table.

BUG=
R=stichnot@chromium.org, jvoung

Review URL: https://codereview.chromium.org/1257283004.

86df4e9e

Subzero: Cleanly implement register allocation after phi lowering. · a3f57b9a

authored Jul 30, 2015

After finding a valid linearization of phi assignments, the old approach calls a complicated target-specific method that lowers and ad-hoc register allocates the phi assignments.

In the new approach, we use existing target lowering to lower assignments into mov/whatever instructions, and enhance the register allocator to be able to forcibly spill and reload a register if one is needed but none are available.

The new approach incrementally updates liveness and live ranges for newly added nodes and variable uses, to avoid having to expensively recompute it all.

Advanced phi lowering is enabled now on ARM, and constant blinding no longer needs to be disabled during phi lowering.

Some of the metadata regarding which CfgNode a local variable belongs to, needed to be made non-const, in order to add spill/fill instructions to a CfgNode during register allocation.

Most of the testing came from spec2k. There are some minor differences in the output regarding stack frame offsets, probably related to the order that new nodes are phi-lowered. The changes related to constant blinding were tested by running spec with "-randomize-pool-immediates=randomize -randomize-pool-threshold=8".

Unfortunately, this appears to add about 10% to the translation time for 176.gcc. The cost is clear in the -timing output so it can be investigated later. There is a TODO suggesting the possible cause and solution, for later investigation.

BUG= none
R=jvoung@chromium.org

Review URL: https://codereview.chromium.org/1253833002.

a3f57b9a

28 Jul, 2015 2 commits

Removes references to ah. · 448c16f0

authored Jul 28, 2015

AH is a thorn in the flesh for our X86-64 backend. The assembler was
designed to always encode the low 8-bit registers, so %ah would become
%spl. While it is true we **could** force %spl to always be encoded as
%ah, that would not work if the instruction has a rex prefix.

This CL removes references to %ah from TargetX86Base. There used to be
2 uses of ah in the target lowering:

1) To zero-extend %al before an unsigned div:
    mov <<src0>>, %al
    mov 0, %ah
    div <<src1>>

This pattern has been changed to
    xor %eax, %eax
    mov <<src0>>, %al
    div <<src1>>

2) To access the 8-bit remainder for 8-bit division:
    mov %ah, <<dest>>

This pattern has been changed to
    shr $8, %eax
    mov %al, <<Dest>>

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1260163003.

448c16f0

Adds the x86-64 assembler. · 2fea26ca

authored Jul 28, 2015

As part of this CL, x86-32 assembler tests are also introduced. They were implemented before the x86 base assembler template was modified for x86-64 support.

BUG=https://code.google.com/p/nativeclient/issues/detail?id=4077
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1224173006.

2fea26ca

23 Jul, 2015 4 commits

Handle UINT64_MAX edge case in switch lowering. · 016c56d9
Andrew Scull authored Jul 23, 2015
```
BUG=
R=jvoung@chromium.org, jvoung

Review URL: https://codereview.chromium.org/1247833003.
```
016c56d9

Avoid heap allocation for binary search work stack. · 8447bbae

authored Jul 23, 2015

During switch lowering a binary search tree is created. The height of this
tree is usually small so no need for heap allocation.

BUG=
R=jvoung@chromium.org, jvoung, stichnot

Review URL: https://codereview.chromium.org/1240323005.

8447bbae

Modify how textual bitcode is injected into pnacl-sz. · cb6e95aa

authored Jul 23, 2015

Added command line flag "--bitcode-as-text", and triggered
the acceptance of textual bitcode on this flag. To make sure
this isn't added to the sandboxed translator, the reading of
bitcode text is also dependent on the translator not being a
minimal build.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4222
R=jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1215463014 .

cb6e95aa

Remove jumps over empty blocks. · 713278ab

authored Jul 22, 2015

After contracting a node reorder it as unreachable even if it
hasn't yet been placed.

BUG=
R=jvoung@chromium.org, jvoung, stichnot

Review URL: https://codereview.chromium.org/1246173003.

713278ab

21 Jul, 2015 5 commits

Make ARM RegNames[] static like X86 (no ARM syms in X86-only build). · 0dab0324

authored Jul 21, 2015

The X86 code was switch out here:
https://codereview.chromium.org/1216933015/diff/150001/src/IceTargetLoweringX86Base.h

The important bit might be that it's static const char * instead of
static IceString. This removes static ctor/dtor for that array,
which LTO doesn't seem to be able to optimize out, leaving ARM
and MIPS symbols in the X86-only build. After changing it to static
const char *, LTO is able to optimize out the ARM and MIPS
symbols in the x86-only build, saving about 3KB of .text and
few bytes of .rodata.

BUG=none
R=jpp@chromium.org

Review URL: https://codereview.chromium.org/1246013004 .

0dab0324

Changes the TargetX8632 to inherit from TargetX86Base<TargetX8632>. · 5aeed955

authored Jul 21, 2015

Previously, TargetX8632 was defined as

class TargetX8632 : public TargetLowering;

and its create method would do

TargetX8632 *TargetX8632::create() {
  return TargetX86Base<TargetX8632>::create()
}

TargetX86Base<M> was defined was

template <class M> class TargetX86Base : public M;

which meant TargetX8632 had no way to access methods defined in
TargetX86Base<M>. This used to not be a problem, but with the X8664
backend around the corner it became obvious that the actual TargetX86
targets (e.g., X8632. X8664SysV, X8664Win) would need access to some
methods in TargetX86Base (e.g., _mov, _fld, _fstp etc.)

This CL changes the class hierarchy to something like

TargetLowering <-- TargetX86Base<X8632> <-- X8632
               <-- TargetX86Base<X8664SysV> <-- X8664SysV (TODO)
               <-- TargetX86Base<X8664Win> <-- X8664Win (TODO)

One problem with this new design is that TargetX86Base<M> needs to be
able to invoke methods in the actual backends. For example, each
backend will have its own way of lowering llvm.nacl.read.tp. This
creates a chicken/egg problem that is solved with (you guessed)
template machinery (some would call it voodoo.)

In this CL, as a proof of concept, we introduce the

   TargetX86Base::dispatchToConcrete

template method. It is a very simple method: it downcasts "this" from
the template base class (TargetX86Base<TargetX8664>) to the actual
(concrete) class (TargetX8632), and then it invokes the requested
method. It uses perfect forwarding for passing arguments to the method
being invoked, and returns whatever that method returns.

A simple proof-of-concept for using dispatchToConcrete is introduced
with this CL: it is used to invoke createNaClReadTPSrcOperand on the
concrete target class. In a way, dispatchToConcrete is a poor man's
virtual method call, without the virtual method call overhead.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
R=jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1217443024.

5aeed955

Only run adv-switch test when asm is allowed. · 8c8f3bc1

authored Jul 21, 2015

BUG=
R=stichnot@chromium.org, jvoung@chromium.org

Review URL: https://codereview.chromium.org/1248823003.

8c8f3bc1

Rename legalizeToVar to the more accurate legalizeToReg. · 97f460dc
Andrew Scull authored Jul 21, 2015
```
BUG=
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1245063003.
```
97f460dc

Fix --filetype=iasm non-pc-rel fixup offsets (double counted). · b7db1a52

authored Jul 21, 2015

For pc-rel fixups, we have a ConstantRelocatable referring
to Foo+0, and and the offset "-4" is encoded in the code
buffer (but not the ConstantRelocatable object). Thus we
need to load from the code buffer in order to
get that "-4" instead of just taking the +0 from Foo+0.

For non-pc-rel fixups, we have the ConstantRelocatable
with a true offset, and we also write that offset into the
code buffer (for ELF REL and not RELA, it expects the
offset in the code buffer). In this case, we want to choose
one and not double-count.

BUG=none
176.gcc seemed to be failing when compiled with --filetype=iasm...
load address for 64-bit pointers were +8 instead of +4

R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1241313002 .

b7db1a52

20 Jul, 2015 1 commit

Introduction of improved switch lowering. · 87f80c12

authored Jul 20, 2015

This includes the high level analysis of switches, the x86 lowering,
the repointing of targets in jump tables and ASM emission of jump
tables.

The technique uses jump tables, range test and binary search with
worst case O(lg n) which improves the previous worst case of O(n)
from a sequential search.

Use is hidden by the --adv-switch flag as the IAS emission still
needs to be implemented.

BUG=None
R=jvoung@chromium.org, stichnot@chromium.org

Review URL: https://codereview.chromium.org/1234803007.

87f80c12

16 Jul, 2015 1 commit

Factor out prelowerPhi for 32-bit targets. Disable adv phi lowering for ARM. · 53483691

authored Jul 16, 2015

This way, prelowerPhi can be shared between 32-bit targets (split 64-bit
values into 32-bit ones, and legalize undef). Suggestions from template
experts on how to share prelowerPhi welcome. I'm not particularly happy
with the first pass in that legalizeUndef has to be made public (though
other methods used are also public). Also the methods required from the
template type TargetT aren't clear without looking through the code.

The current advanced phi lowering code depends on lowerPhiAssignments.
That is a special case of lowerAssign that does some adhoc register
allocation. The current adhoc register allocation doesn't work as
well when a target may need to spill more than one register.
Disable that optimization for ARM for now, until we have a better
way that works for ARM, and enable O2 cross testing on ARM.

BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
R=stichnot@chromium.org

Review URL: https://codereview.chromium.org/1223133007 .

53483691