1. 29 Oct, 2015 2 commits
  2. 28 Oct, 2015 2 commits
    • Sets the stage for enabling the use of the 8-bit high registers, but doesn't yet… · 5bff61c4
      Jim Stichnoth authored
      Sets the stage for enabling the use of the 8-bit high registers, but doesn't yet turn it on because more work is needed for correctness.
      
      In the lowering, typing is tightened up so that we don't specify e.g. eax when we really mean ax or al.  This gets rid of the ShiftHack hack.  The one exception is the pinsr instruction which always requires an r32 register even if the memory operand is m8 or m16.
      
      The x86 assembler unit tests are fixed, by not passing a GlobalContext arg to the Assembler ctor.
      
      Many constexpr and "auto *" upgrades are applied.  Sorry for not putting this into a separate CL - a few local fixes got out of hand...
      
      Tested in the following ways:
      - "make check-lit" - some .ll CHECK line changes due to register randomization
      - "make check-xtest"
      - "make check-xtest" with forced filetype=asm (via local .py hack)
      - spec2k with all -filetype options
      - compare before-and-after spec2k filetype=asm output - a few differences where the correct narrow register is used instead of the full-width register
      
      To do in the next CL:
      
      1. Add new register classes:
        (a) 32-bit GPR truncable to 8-bit (eax, ecx, edx, ebx)
        (b) 16-bit GPR truncable to 8-bit (ax, cx, dx, bx)
        (c) 8-bit truncable from 16/32-bit (al, bl, cl, dl)
        (d) 8-bit "mov"able from ah/bh/ch/dh
      
      2. Enable use of ah/bh/ch/dh for x86-32.
      
      3. Enable use of ah (but skip bh/ch/dh) for x86-64.
      
      4. Statically initialize register tables in the TargetLowering subclass.
      
      BUG= none
      R=jpp@chromium.org, kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1419903002 .
    • Subzero. ARM32. Implements the Availability Optimization. · 562233c8
      John Porto authored
      Implements the Availability optimization:
      
      a = b
      x = f(a, c)
      
      becomes
      
      a = b
      x = f(b, c)
      
      This only triggers if b is an infinite-weight temporary, and it
      prevents a potential spill at the cost of higher register pressure.
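
The transformation above can be sketched on a toy IR. This is only an illustration under stated assumptions: the Inst struct, the "mov" opcode string, and forwardAvailableCopies() are hypothetical names, not Subzero's data structures, and the sketch handles a single straight-line block.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical toy IR: Dest = Op(Srcs...). A plain copy uses Op == "mov".
struct Inst {
  std::string Op;
  std::string Dest;
  std::vector<std::string> Srcs;
};

// Within one straight-line block, rewrites uses of "a" into "b" after a copy
// "a = b", but only when "b" is listed in InfiniteWeight (assumed input).
void forwardAvailableCopies(
    std::vector<Inst> &Block,
    const std::unordered_set<std::string> &InfiniteWeight) {
  std::unordered_map<std::string, std::string> Available; // a -> b
  for (Inst &I : Block) {
    // Replace each source that has an available copy source.
    for (std::string &Src : I.Srcs) {
      auto It = Available.find(Src);
      if (It != Available.end())
        Src = It->second;
    }
    // Redefining I.Dest invalidates every recorded copy that mentions it.
    for (auto It = Available.begin(); It != Available.end();) {
      if (It->first == I.Dest || It->second == I.Dest)
        It = Available.erase(It);
      else
        ++It;
    }
    // Record a new copy only if its source is an infinite-weight temporary.
    if (I.Op == "mov" && I.Srcs.size() == 1 && I.Srcs[0] != I.Dest &&
        InfiniteWeight.count(I.Srcs[0]) != 0)
      Available[I.Dest] = I.Srcs[0];
  }
}
```

With the block { a = b; x = f(a, c) } and b marked as infinite-weight, the second instruction becomes x = f(b, c). If b is not infinite-weight, the block is left unchanged.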
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1424873003 .
  3. 27 Oct, 2015 5 commits
  4. 23 Oct, 2015 1 commit
  5. 22 Oct, 2015 1 commit
  6. 21 Oct, 2015 1 commit
  7. 17 Oct, 2015 1 commit
  8. 16 Oct, 2015 5 commits
  9. 15 Oct, 2015 2 commits
    • Subzero: Various fixes in preparation for x86-32 register aliasing. · 1fb030c6
      Jim Stichnoth authored
      1. Helper function sameVarOrReg() also needs to return true if the two physical registers alias or overlap.  Otherwise advanced phi lowering may pick an incorrect ordering.
      
      2. With -asm-verbose, redundant truncation assignments expressed as _mov instructions, like "mov cl, ecx", need to have their register use counts updated properly, so that the LIVEEND= annotations are correct.
      
      3. The register allocator should consider suitably typed aliases when choosing a register preference.
      
      4. When evicting a variable, the register allocator should decrement the use count of all aliases.
      
      5. When saving/restoring callee-save registers in the prolog/epilog, map each register to its "canonical" register (e.g. %bl --> %ebx) and make sure each canonical register is only considered once.
      
      6. Remove some unnecessary Variable::setMustHaveReg() calls.
      
      7. When assigning bool results as a constant 0 or 1, use an 8-bit constant instead of 32-bit so that only the 8-bit register gets assigned.
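
Item 5 above can be illustrated with a small sketch. The register names are real x86-32 names, but canonicalReg() and calleeSavesToPreserve() are hypothetical helpers written for this example, and the mapping covers only the callee-save registers.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: maps a register name to the full-width x86-32
// register that contains it. Unknown names map to themselves.
std::string canonicalReg(const std::string &Reg) {
  if (Reg == "bl" || Reg == "bh" || Reg == "bx" || Reg == "ebx")
    return "ebx";
  if (Reg == "si" || Reg == "esi")
    return "esi";
  if (Reg == "di" || Reg == "edi")
    return "edi";
  if (Reg == "bp" || Reg == "ebp")
    return "ebp";
  return Reg;
}

// Given the callee-save registers a function uses (possibly as narrow
// aliases), returns each canonical register exactly once, in first-use order.
std::vector<std::string>
calleeSavesToPreserve(const std::vector<std::string> &UsedCalleeSaves) {
  std::set<std::string> Seen;
  std::vector<std::string> Result;
  for (const std::string &Reg : UsedCalleeSaves) {
    std::string Canon = canonicalReg(Reg);
    if (Seen.insert(Canon).second) // true only on first insertion
      Result.push_back(Canon);
  }
  return Result;
}
```

For a function that uses bl, ebx, esi, and bx, this yields ebx and esi, so the prolog pushes ebx once rather than three times.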
      
      BUG= none
      TEST= make check, plus spec2k -asm-verbose output is unchanged
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1405643003 .
    • Optimize 64-bit compares with zero · 5c87542a
      David Sehr authored
      Comparisons with zero can be done with no branches in most cases and with
      simpler sequences of operations.
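
As a rough illustration of why zero is a special case: on a 32-bit target a 64-bit value lives in two 32-bit halves, and several comparisons against zero reduce to simple bit operations on those halves. The sketch below shows that idea only. The Halves struct and helper names are hypothetical, and this is not necessarily the instruction sequence this CL emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

Halves split(int64_t V) {
  uint64_t U = static_cast<uint64_t>(V);
  return {static_cast<uint32_t>(U), static_cast<uint32_t>(U >> 32)};
}

// x == 0: both halves are zero, so their bitwise OR is zero.
bool isZero(Halves H) { return (H.Lo | H.Hi) == 0; }

// x < 0 (signed): only the sign bit of the high half matters.
bool isNegative(Halves H) { return (H.Hi >> 31) != 0; }

// x > 0 (signed): neither negative nor zero.
bool isPositive(Halves H) { return !isNegative(H) && !isZero(H); }
```

A general 64-bit compare typically needs to compare the high halves, branch, and then compare the low halves. The forms above avoid that for equality and sign tests.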
      
      BUG=
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1406593003 .
  10. 14 Oct, 2015 1 commit
  11. 13 Oct, 2015 2 commits
  12. 12 Oct, 2015 1 commit
    • Subzero: Consider all instruction variables for register preference. · 28b71be4
      Jim Stichnoth authored
      The original code only looked at top-level source operands in the defining instruction, with a TODO to instead consider all inner variables in the instruction.
      
      The primary reason is so that we end up with more instructions like
        mov eax, eax
      which are later elided as redundant assignments.
      
      A secondary reason is to foster more instructions like:
        mov ecx, [ecx]
      rather than
        mov eax, [ecx]
      where ecx's live range ends.  This hopefully keeps eax (in the latter case) free for longer and may allow some other variable to get a register.  By considering all instruction variables, we enable this.
      
      BUG= none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/1392383003 .
  13. 09 Oct, 2015 5 commits
  14. 08 Oct, 2015 2 commits
  15. 07 Oct, 2015 3 commits
    • Create local copy of Dart assembler code. · 3e53dc99
      Karl Schimpf authored
      Creates a local copy of the Dart assembler code before it is merged
      into our code base. The goal of these files is to track code as it is
      moved from the Dart implementation into our code base.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4334
      R=jpp@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1394613002 .
    • Make sure that all globals are internal, except for "start" functions. · 57d31ac7
      Karl Schimpf authored
      The existing code, when run on a fuzzed example, triggers a runtime
      assertion failure. The reason is that the input defines "memmove" as
      an external global. However, the code generator can generate calls to
      "memmove" which assumes it is internal (see PNaCl ABI). As a result,
      the assertion that checks that global names are unique (for memmove)
      fails.
      
      This code fixes the problem by checking that global names are
      internal, unless they are one of the "start" functions,
      or the function is an intrinsic. To allow for
      non-PNaCl ABI input, a flag was added to allow functions to be
      external. However, in such cases the external can't be one of
      Subzero's runtime helper functions.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4330
      R=jpp@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1387963002 .
    • Generate better two address code by using commutativity · 487bad02
      David Sehr authored
      For operations such as
          t0 = t1 + t2
      Subzero's pattern for arithmetic operations generates two address code that
      looks like
          movl ...t1..., %ecx
          addl ...t2..., %ecx // t0 is in %ecx
      
      When register pressure is high this sometimes becomes:
          movl ...t2..., SPILL
          movl ...t1..., %ecx
          addl SPILL, %ecx // t0 is in %ecx
      
      This CL takes advantage of cases where the use of t2 is the last one, so the
      register that held t2 before the operation can be reused.  The optimization
      simply swaps the (commutative) operation to
          t0 = t2 + t1
      which then generates code as
          movl ...t2..., %ecx
          addl ...t1..., %ecx // t0 is in %ecx
      
      This optimization is used for any commutative operation, which now includes
      Fadd and Fmul, which were erroneously marked as non-commutative.  See the
      rationale in IceInst.def for the IEEE wordings.
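
The operand swap described above can be sketched as follows. BinOp, isCommutative(), and maybeSwapOperands() are hypothetical names for this illustration. The LastUse set stands in for liveness information, and the extra condition that the first operand is not also at its last use is an assumption of this sketch, not something the CL states.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical toy instruction: Dest = Src0 <Op> Src1.
struct BinOp {
  std::string Op;
  std::string Src0;
  std::string Src1;
};

// Assumed opcode spellings; fadd and fmul are treated as commutative.
bool isCommutative(const std::string &Op) {
  return Op == "add" || Op == "mul" || Op == "and" || Op == "or" ||
         Op == "xor" || Op == "fadd" || Op == "fmul";
}

// If the second operand dies at this instruction and the first does not,
// swap them so the two-address lowering can reuse the dying register.
// Returns true when a swap happened.
bool maybeSwapOperands(BinOp &I,
                       const std::unordered_set<std::string> &LastUse) {
  if (!isCommutative(I.Op))
    return false;
  if (LastUse.count(I.Src1) != 0 && LastUse.count(I.Src0) == 0) {
    std::swap(I.Src0, I.Src1);
    return true;
  }
  return false;
}
```

For t0 = t1 + t2 where this is the last use of t2, the instruction becomes t0 = t2 + t1. A non-commutative operation such as sub is never swapped.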
      
      BUG=
      R=jfb@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1371703003 .
  16. 06 Oct, 2015 2 commits
  17. 05 Oct, 2015 3 commits
    • Subzero: Improve lowering sequence for "a=b*b". · ebbb5912
      Jim Stichnoth authored
      Originally, the lowering sequence looked like:
        T = b
        T *= b
        a = T
      Now it looks like:
        T = b
        T *= T
        a = T
      
      If "b" gets a register and its live range ends after this instruction, then the new lowering sequence allows its register to be reused for "T".  This decreases register pressure, and removes an instruction (register move) from what could be a critical path.
      
      This optimization is actually applicable for most arithmetic operations whose source operands are identical, but mul/fmul are the only ones that seem at all likely in practice.
      
      BUG= none
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1377213004 .
    • Subzero: Fix nondeterministic behavior in constant pool creation. · b36757e1
      Jim Stichnoth authored
      This issue was discovered as the result of a spurious "make check-lit" failure in undef.ll.
      
      The problem is that constant pool label strings depend on the order the constants are created, and this order can be different with multithreaded translation.
      
      Even -filetype=obj is affected by this, because the label string is put into the ELF .o file.  This means that different runs of Subzero on the same input could potentially produce slightly different output.
      
      The solution is to base the label name on the actual value of the constant.  We do this by using the hex representation of the constant, rather than the sequence number of the constant within the pool.  This actually simplifies things a bit, as we no longer need to track the sequence number.
      
      In addition, for floating-point constant labels in asm-verbose mode, include a human-readable rendering of the value in the label name.
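
A minimal sketch of value-based label naming, assuming a 32-bit float constant. The "$F" prefix and the floatPoolLabel() helper are made up for this example and do not reflect the label format Subzero actually uses. The point is only that the label depends on the bits of the constant, not on creation order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Builds a label from the bit pattern of Value, so the same constant gets the
// same label regardless of when (or on which thread) it was created.
std::string floatPoolLabel(float Value) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits)); // well-defined type punning
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "$F%08x", static_cast<unsigned>(Bits));
  return Buf;
}
```

Here floatPoolLabel(1.0f) yields "$F3f800000" on every run, and 0.0f and -0.0f get distinct labels because their bit patterns differ.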
      
      BUG= none
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1386593004 .
    • Subzero: With -asm-verbose, make the predecessor list more compact. · 9a63babb
      Jim Stichnoth authored
      Instead of a comment like this:
      
        # preds=.Lfv_update_nonbon$split___114___115_0,.Lfv_update_nonbon$split___138___115_1
      
      remove some redundancy and make the comment like this:
      
        # preds=$split___114___115_0,$split___138___115_1
      
      This makes it slightly easier to read, and less likely to exceed 80 columns.
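
The shortening shown above amounts to stripping the common ".L<function name>" prefix from each label. A small sketch, with compactPreds() as a hypothetical helper rather than the function this CL changes:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Formats a "# preds=" comment, dropping the ".L<FuncName>" prefix from each
// label that has it. Labels without the prefix are emitted unchanged.
std::string compactPreds(const std::string &FuncName,
                         const std::vector<std::string> &Labels) {
  const std::string Prefix = ".L" + FuncName;
  std::string Out = "# preds=";
  bool First = true;
  for (const std::string &Label : Labels) {
    if (!First)
      Out += ",";
    First = false;
    if (Label.compare(0, Prefix.size(), Prefix) == 0)
      Out += Label.substr(Prefix.size());
    else
      Out += Label;
  }
  return Out;
}
```

Called with the function name fv_update_nonbon and the two labels from the first comment above, it produces the second, shorter comment.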
      
      BUG= none
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1380323003 .
  18. 02 Oct, 2015 1 commit