1. 09 Nov, 2015 4 commits
    • Fixes LDR and STR instructions. Two types of mistakes were being made. · b9f27229
      Karl Schimpf authored
      First, the width was not being correctly defined for non-vector
      instructions.
      
      Second, the order of the width/condition was incorrect when the
      instruction was prefixed with a V. That is, for V prefixed instructions,
      the order is predicate/width while for non-V prefixed instructions the
      order is width/predicate.
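
The ordering rule above can be illustrated with a minimal sketch. The helper name `buildMnemonic` and its string-based interface are assumptions made for illustration; they are not Subzero's emitter API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Subzero): assembles an ARM32 load/store
// mnemonic from an opcode, a width suffix, and a predicate (condition) suffix.
std::string buildMnemonic(const std::string &Opcode, const std::string &Width,
                          const std::string &Predicate) {
  assert(!Opcode.empty());
  const bool IsVPrefixed = Opcode[0] == 'v';
  // V-prefixed opcodes take predicate then width: "vldr" + "eq" + ".32".
  // All other opcodes take width then predicate: "ldr" + "b" + "eq".
  return IsVPrefixed ? Opcode + Predicate + Width
                     : Opcode + Width + Predicate;
}
```

With these assumptions, `buildMnemonic("ldr", "b", "eq")` gives `ldrbeq`, while `buildMnemonic("vldr", ".32", "eq")` gives `vldreq.32`.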
      
      Also fixes a bug in the target lowering, which did not always convert
      the results of a compare to i1.
      
      BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4334
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1415953007 .
    • Subzero: Refactor x86 register representation to actively use aliases. · c59288b3
      Jim Stichnoth authored
      Sets up additional register attributes, plus the notion of register classes, to enable robust usage of the high 8-bit GPRs (ah/bh/ch/dh), for both x86-32 and x86-64.  (Note that the x86-64 changes are currently untested.)
      
      We add a Register Class field to the Variable class.  The default register class is a value corresponding to the variable's type, but the target can extend the set of register class values, and the target lowering can assign different register classes as needed.  The register allocator uses the register class instead of the type to determine the set of registers to draw from.
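
A rough sketch of the idea described above, under stated assumptions: the enum values, `RC_i8NoHigh`, `registersFor`, and the register tables are all hypothetical and only illustrate how a class that defaults to the type can be overridden by the lowering and then consulted by the allocator.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, much-reduced type list.
enum Type : uint8_t { Ty_i8, Ty_i32, Ty_NUM };

// A register class is a small integer. Values below Ty_NUM mirror the types;
// a target may define additional values after them.
using RegClass = uint8_t;
constexpr RegClass RC_i8NoHigh = Ty_NUM; // assumed target-specific class

class Variable {
public:
  explicit Variable(Type Ty) : Ty(Ty), RC(static_cast<RegClass>(Ty)) {}
  Type getType() const { return Ty; }
  RegClass getRegClass() const { return RC; }
  void setRegClass(RegClass NewRC) { RC = NewRC; }

private:
  Type Ty;
  RegClass RC; // defaults to the value corresponding to the type
};

// The allocator looks up candidate registers by class, not by type.
const std::vector<std::string> &registersFor(const Variable &Var) {
  static const std::map<RegClass, std::vector<std::string>> Table = {
      {Ty_i8, {"al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"}},
      {Ty_i32, {"eax", "ecx", "edx", "ebx"}},
      {RC_i8NoHigh, {"al", "cl", "dl", "bl"}},
  };
  auto It = Table.find(Var.getRegClass());
  assert(It != Table.end() && "unknown register class");
  return It->second;
}
```

In this sketch a lowering that must avoid the high 8-bit registers would call `setRegClass(RC_i8NoHigh)` on the variable, and the allocator would then draw from the narrower pool without the variable's type changing.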
      
      For x86-64, the high 8-bit registers are not included in the general register allocation pool, but there are explicit references to ah for lowering the div/rem instructions.
      
      The target lowering is modified as needed to make sure types are appropriate and register use in instructions is legalized.
      
      Some other fixes and cleanups are included in this CL:
      
      * Makefile.standalone changes.  Source files are reordered so that the more expensive compiles are done earlier, speeding up parallel builds by decreasing fragmentation.  A dependency error is fixed for check-spec.
      
      * A bug is fixed in advanced phi lowering.  When a temporary is introduced to break a cycle, we were neglecting to update the predecessor count for one of the operands, leading to an assertion failure.  (Applying that fix to master resulted in no changes to spec2k code generation.)  A consistency check is added to help find future problems like this.  Also, refactored iteration over the Phi descriptor array to use range-based for loops and avoid directly indexing the array.
      
      * Removed most of the "IceType_" prefixes in x-macro tables for brevity.
      
      * Fix a correctness TODO in the register allocator.  This had no effect on spec2k code generation in master or in this CL, so we were probably just lucky.
      
      * Made some much-needed s/Dest->getType()/Ty/ changes for brevity, in the target lowering sections that needed other changes.
      
      BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4095
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/1427973003 .
    • Subzero: Fix a bug in advanced phi lowering. · ea15bbe7
      Jim Stichnoth authored
      When a temporary is introduced to break a cycle, we neglected to update the predecessor count for one of the operands, leading to a possible assertion failure.
      
      This problem isn't currently seen in master, but it arises when we enable register aliases, as in https://codereview.chromium.org/1427973003/ .  No changes are seen in spec2k code generation as a result of this fix.
      
      A consistency check is added to help find future problems like this.
      
      Also, refactored iteration over the Phi descriptor array to use range-based for loops and avoid directly indexing the array.
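
The bookkeeping involved can be sketched as follows. This is a simplified model, not Subzero's code: `Move`, `sequentialize`, and the `t0`-style temporary names are assumptions, and each destination is assumed to appear only once. `NumPred` counts the pending moves that still read a move's destination, so a move may only be emitted once its count reaches zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical descriptor for one parallel assignment Dest = Src.
struct Move {
  Move(std::string D, std::string S) : Dest(std::move(D)), Src(std::move(S)) {}
  std::string Dest, Src;
  int NumPred = 0; // pending moves that still read Dest
  bool Done = false;
};

// Orders parallel moves into a sequence, breaking cycles with temporaries.
std::vector<std::string> sequentialize(std::vector<Move> Moves) {
  for (Move &M : Moves)
    for (const Move &Other : Moves)
      if (&M != &Other && Other.Src == M.Dest)
        ++M.NumPred;

  // One read of Name has been satisfied: update the move that writes Name.
  auto releaseReaderOf = [&Moves](const std::string &Name) {
    for (Move &M : Moves)
      if (!M.Done && M.Dest == Name) {
        assert(M.NumPred > 0 && "predecessor count out of sync");
        --M.NumPred;
      }
  };

  std::vector<std::string> Out;
  std::size_t Remaining = Moves.size();
  int TmpCount = 0;
  while (Remaining > 0) {
    Move *Ready = nullptr;
    Move *FirstPending = nullptr;
    for (Move &M : Moves) {
      if (M.Done)
        continue;
      if (FirstPending == nullptr)
        FirstPending = &M;
      if (M.NumPred == 0) {
        Ready = &M;
        break;
      }
    }
    if (Ready == nullptr) {
      // Every pending move is blocked, so there is a cycle. Copy one source
      // into a temporary and read the temporary instead.
      assert(FirstPending != nullptr);
      const std::string Tmp = "t" + std::to_string(TmpCount++);
      Out.push_back(Tmp + " = " + FirstPending->Src);
      // The count of the move writing the old source must be updated here;
      // skipping this kind of update is what trips the consistency check.
      releaseReaderOf(FirstPending->Src);
      FirstPending->Src = Tmp;
      continue;
    }
    Out.push_back(Ready->Dest + " = " + Ready->Src);
    Ready->Done = true;
    --Remaining;
    releaseReaderOf(Ready->Src);
  }
  return Out;
}
```

For the swap `a = b, b = a` this sketch emits `t0 = b`, `b = a`, `a = t0`. The commit message does not say which operand's count was missed, so the comment above marks only the general kind of update.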
      
      BUG= none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/1435543002 .
    • Subzero: Recognize single-block loops during loop depth analysis. · f49b2396
      Jim Stichnoth authored
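
The commit gives no further detail. As a minimal sketch, assuming a single-block loop means a block that lists itself as a successor, such blocks could be found as below. The adjacency-list representation and the name `findSingleBlockLoops` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Succs[B] holds the successor block indices of block B (assumed layout).
// Returns every block that branches directly back to itself.
std::vector<std::size_t>
findSingleBlockLoops(const std::vector<std::vector<std::size_t>> &Succs) {
  std::vector<std::size_t> Loops;
  for (std::size_t Block = 0; Block < Succs.size(); ++Block) {
    for (std::size_t Succ : Succs[Block]) {
      assert(Succ < Succs.size() && "successor index out of range");
      if (Succ == Block) {
        Loops.push_back(Block);
        break; // record each block at most once
      }
    }
  }
  return Loops;
}
```
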
      BUG= none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/1416113007 .
  2. 06 Nov, 2015 4 commits
  3. 05 Nov, 2015 3 commits
  4. 04 Nov, 2015 5 commits
  5. 02 Nov, 2015 2 commits
  6. 31 Oct, 2015 1 commit
  7. 30 Oct, 2015 12 commits
  8. 29 Oct, 2015 3 commits
  9. 28 Oct, 2015 2 commits
    • Sets the stage for enabling the use of the 8-bit high registers, but doesn't yet… · 5bff61c4
      Jim Stichnoth authored
      Sets the stage for enabling the use of the 8-bit high registers, but doesn't yet turn it on because more work is needed for correctness.
      
      In the lowering, typing is tightened up so that we don't specify e.g. eax when we really mean ax or al.  This gets rid of the ShiftHack hack.  The one exception is the pinsr instruction which always requires an r32 register even if the memory operand is m8 or m16.
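
To illustrate what naming the narrow register means, here is a small sketch that maps a 32-bit base register and an operand width to the matching register name. It covers only eax/ecx/edx/ebx, and `Width` and `registerName` are names invented for this example.

```cpp
#include <cassert>
#include <string>

enum class Width { I8, I16, I32 };

// Hypothetical helper: returns the register name of the requested width for
// one of eax, ecx, edx, ebx (for example eax -> ax -> al).
std::string registerName(const std::string &Base32, Width W) {
  assert(Base32.size() == 3 && Base32[0] == 'e' && Base32[2] == 'x');
  switch (W) {
  case Width::I32:
    return Base32; // "eax"
  case Width::I16:
    return Base32.substr(1); // "ax"
  case Width::I8:
    return std::string(1, Base32[1]) + "l"; // "al"
  }
  return Base32; // unreachable for valid Width values
}
```

An instruction operating on a 16-bit value would then be emitted with `registerName("eax", Width::I16)`, that is `ax`, instead of the full-width name.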
      
      The x86 assembler unit tests are fixed, by not passing a GlobalContext arg to the Assembler ctor.
      
      Many constexpr and "auto *" upgrades are applied.  Sorry for not putting this into a separate CL - a few local fixes got out of hand...
      
      Tested in the following ways:
      - "make check-lit" - some .ll CHECK line changes due to register randomization
      - "make check-xtest"
      - "make check-xtest" with forced filetype=asm (via local .py hack)
      - spec2k with all -filetype options
      - compare before-and-after spec2k filetype=asm output - a few differences where the correct narrow register is used instead of the full-width register
      
      To do in the next CL:
      
      1. Add new register classes:
        (a) 32-bit GPR truncable to 8-bit (eax, ecx, edx, ebx)
        (b) 16-bit GPR truncable to 8-bit (ax, cx, dx, bx)
        (c) 8-bit truncable from 16/32-bit (al, bl, cl, dl)
        (d) 8-bit "mov"able from ah/bh/ch/dh
      
      2. Enable use of ah/bh/ch/dh for x86-32.
      
      3. Enable use of ah (but skip bh/ch/dh) for x86-64.
      
      4. Statically initialize register tables in the TargetLowering subclass.
      
      BUG= none
      R=jpp@chromium.org, kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1419903002 .
    • Subzero. ARM32. Implements the Availability Optimization. · 562233c8
      John Porto authored
      Implements the Availability optimization:
      
      a = b
      x = f(a, c)
      
      becomes
      
      a = b
      x = f(b, c)
      
      This only triggers if b is an infinite-weight temporary, and it
      prevents a potential spill at the cost of higher register pressure.
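
The transformation can be sketched on a straight-line block as follows. This is a generalised illustration under assumptions, not the ARM32 lowering code: `Inst`, `applyAvailability`, and the set of infinite-weight temporaries are hypothetical, and the real implementation may track far less state.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical instruction: Dest = op(Srcs...). IsCopy marks "Dest = Src".
struct Inst {
  std::string Dest;
  std::vector<std::string> Srcs;
  bool IsCopy;
};

// After "a = b", later uses of a are rewritten to b, provided b is an
// infinite-weight temporary and neither a nor b has been redefined.
void applyAvailability(std::vector<Inst> &Block,
                       const std::set<std::string> &InfWeightTemps) {
  std::map<std::string, std::string> Avail; // copy Dest -> still-valid Src
  for (Inst &I : Block) {
    assert(!I.IsCopy || I.Srcs.size() == 1);
    // Rewrite uses first, so an instruction sees the state before its write.
    for (std::string &S : I.Srcs) {
      auto It = Avail.find(S);
      if (It != Avail.end())
        S = It->second;
    }
    // Writing I.Dest invalidates every mapping that mentions it.
    for (auto It = Avail.begin(); It != Avail.end();) {
      if (It->first == I.Dest || It->second == I.Dest)
        It = Avail.erase(It);
      else
        ++It;
    }
    if (I.IsCopy && I.Srcs[0] != I.Dest && InfWeightTemps.count(I.Srcs[0]))
      Avail[I.Dest] = I.Srcs[0];
  }
}
```

Applied to `a = b; x = f(a, c)` with `b` in the infinite-weight set, the second instruction becomes `x = f(b, c)`. If `b` is not in the set, the block is left unchanged.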
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1424873003 .
  10. 27 Oct, 2015 4 commits