  1. 09 Aug, 2016 1 commit
  2. 08 Aug, 2016 2 commits
  3. 05 Aug, 2016 3 commits
  4. 04 Aug, 2016 6 commits
  5. 03 Aug, 2016 1 commit
  6. 02 Aug, 2016 1 commit
  7. 01 Aug, 2016 2 commits
    • Enable Local CSE by default · 53c8fbdf
      Manasij Mukherjee authored
      Reduce the default number of iterations to 1.
      Put the optional code behind the -lcse-no-ssa flag, which is disabled by
      default. This brings the overhead of enabling local CSE down to about 2%.
      
      BUG=
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/2185193002 .
    • Subzero: Local variable splitting. · b9a84728
      Jim Stichnoth authored
      The linear-scan register allocator takes an all-or-nothing approach -- either the variable's entire live range gets a register, or none of it does.
      
      To help with this, we add a pass that splits successive uses of a variable within a basic block into a chain of linked variables.  This gives the register allocator the chance to allocate registers to subsets of the original live range.
      
      The split variables are linked to each other so that if they don't get a register, they share a stack slot with the original variable, and redundant writes to that stack slot are recognized and elided.
      
      This pass is executed after target lowering and right before register allocation.  As such, it has to deal with some idiosyncrasies of target lowering, specifically the possibility of intra-block control flow.  We experimented with doing this as a pre-lowering pass.  However, the transformations interfered with some of the target lowering's pattern matching, such as bool folding, so we concluded that post-lowering was a better place for it.
      
      Note: Some of the lit tests are overly specific about registers, and in these cases it was the path of least resistance to just disable local variable splitting.
      
      BUG= none
      R=eholk@chromium.org, jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/2177033002 .
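The splitting idea in the commit above can be illustrated with a minimal Python sketch. It is not Subzero's implementation: `Inst`, `split_uses`, `root_of`, and the `a.1`-style naming are hypothetical, and the sketch ignores intra-block control flow and leaves the final (dead) copy in place. It only shows how successive uses within one block become a chain of linked variables whose root identifies the shared stack slot.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Inst:
    """Hypothetical three-address instruction: dest = op(srcs)."""
    op: str
    dest: Optional[str]
    srcs: List[str]


def split_uses(block, candidates):
    """Split successive uses of each candidate variable into a linked chain.

    After every use of a candidate, a copy into a fresh variable is inserted
    and later uses read that copy instead. Returns the rewritten block and a
    map from each split variable to the variable it is linked to.
    """
    latest = {}     # original name -> newest split name
    linked_to = {}  # split name -> previous link in the chain
    count = {}      # original name -> number of splits created so far
    out = []
    for inst in block:
        # Rewrite sources so they read the newest link of each chain.
        out.append(Inst(inst.op, inst.dest,
                        [latest.get(s, s) for s in inst.srcs]))
        if inst.dest is not None:
            # A redefinition ends the chain; later uses read the new value.
            latest.pop(inst.dest, None)
        for orig in dict.fromkeys(inst.srcs):  # de-duplicate, keep order
            if orig not in candidates or orig == inst.dest:
                continue
            prev = latest.get(orig, orig)
            count[orig] = count.get(orig, 0) + 1
            new = f"{orig}.{count[orig]}"
            out.append(Inst("copy", new, [prev]))
            linked_to[new] = prev
            latest[orig] = new
    return out, linked_to


def root_of(name, linked_to):
    """Follow links back to the original variable (the shared stack slot)."""
    while name in linked_to:
        name = linked_to[name]
    return name


block = [
    Inst("add", "t", ["a", "b"]),
    Inst("mul", "u", ["a", "t"]),
    Inst("sub", "v", ["a", "u"]),
]
split, links = split_uses(block, {"a"})
for inst in split:
    print(inst.dest, "=", inst.op, *inst.srcs)
```

Each link (`a.1`, `a.2`, ...) has a shorter live range than `a`, so an allocator could give a register to some links and not others, while `root_of` shows that every link falls back to the same slot as `a`.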
  8. 27 Jul, 2016 1 commit
  9. 26 Jul, 2016 1 commit
  10. 25 Jul, 2016 1 commit
  11. 21 Jul, 2016 4 commits
  12. 20 Jul, 2016 1 commit
  13. 19 Jul, 2016 2 commits
    • Improve LoopAnalyzer Interface · adf352bc
      Manasij Mukherjee authored
      Make LoopAnalyzer compute loop bodies and depth only.
      Move the logic for finding loop headers and pre-headers to LoopInfo, which provides a visitor to iterate over the loops and easy access to this information.
      This does not change the core algorithm.
      
      BUG=None
      R=jpp@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/2149803005 .
    • Subzero: Fix lowering for x86 div/rem instructions. · 017a5538
      Jim Stichnoth authored
      The x86 lowering sequences for sdiv/udiv/srem/urem all have a problem, in that they don't reflect the fact that two registers are affected by the instruction.
      
      For example, the urem instruction:
        dest = src0 urem src1
      lowers to something like this:
        t1:eax = src0
        t2:edx = 0
        t2:edx = (t1:eax and t2:edx) div src1
        dest = t2:edx
      
      The problem is that there is no indication that the div instruction smashes eax.  As such, it's possible that the register allocator could erroneously assume that src0 is still available in eax after the div instruction.
      
      To fix this, we make use of the FakeDef instruction.  In this example, we change the div instruction to "officially" produce eax as its result, then fakedef edx in terms of eax.  This means that as long as the urem result is actually used, the definitions of eax and edx will be preserved, but if the urem result is unused, then the whole sequence can be dead-code eliminated.
      
        t1:eax = src0
        t2:edx = 0
        t1:eax = (t1:eax and t2:edx) div src1  # dest var changed to t1:eax
        t2:edx = fakedef t1:eax                # fakedef instruction added
        dest = t2:edx
      
      BUG= none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/2158213002 .
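To make the effect of the FakeDef change concrete, here is a small Python sketch that models the two lowering sequences from the commit message as `(dest, op, srcs)` tuples. The tuple encoding and the helpers `eliminate_dead` and `defs_of_division` are assumptions made for illustration, not Subzero code; the dead-code pass is a plain backward liveness sweep over straight-line code.

```python
def eliminate_dead(insts, live_out):
    """Backward sweep: keep an instruction only if its dest is live."""
    live = set(live_out)
    kept = []
    for dest, op, srcs in reversed(insts):
        if dest not in live:
            continue  # result is never read, so drop the instruction
        live.discard(dest)
        live.update(srcs)
        kept.append((dest, op, srcs))
    kept.reverse()
    return kept


def defs_of_division(insts):
    """Variables the IR says are written by the div (and its fakedef)."""
    return {dest for dest, op, _ in insts if op in ("div", "fakedef")}


# Original lowering: only edx appears as a result of the div.
OLD = [
    ("t1:eax", "mov", ["src0"]),
    ("t2:edx", "mov", []),
    ("t2:edx", "div", ["t1:eax", "t2:edx", "src1"]),
    ("dest", "mov", ["t2:edx"]),
]

# Fixed lowering: div defines eax, and fakedef defines edx in terms of eax.
NEW = [
    ("t1:eax", "mov", ["src0"]),
    ("t2:edx", "mov", []),
    ("t1:eax", "div", ["t1:eax", "t2:edx", "src1"]),
    ("t2:edx", "fakedef", ["t1:eax"]),
    ("dest", "mov", ["t2:edx"]),
]

print(defs_of_division(OLD))           # eax is missing from this set
print(defs_of_division(NEW))           # both eax and edx are visible
print(eliminate_dead(NEW, {"dest"}))   # result used: whole chain kept
print(eliminate_dead(NEW, set()))      # result unused: everything removed
```

In this model the old sequence never reports a write to eax, which is the gap the commit describes, while the new sequence keeps both definitions alive exactly when `dest` is used.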
  14. 14 Jul, 2016 2 commits
  15. 13 Jul, 2016 2 commits
  16. 12 Jul, 2016 3 commits
  17. 10 Jul, 2016 1 commit
    • Subzero: Allow deeper levels of variable splitting. · fe62f0a2
      Jim Stichnoth authored
      This fixes some existing problems with the Variable::LinkedTo splitting/linking mechanism.  The problem was that if B is linked to A, and B needs a stack slot, but A doesn't get a stack slot, B's stack offset would never get initialized.  This could happen if A ends up with no explicit references in the code, or A's live range gets truncated such that it actually has a register while B doesn't.
      
      It gets even more complicated if you have a link chain like A<--B<--C<--D etc. where some of them have stack slots (which should ultimately all be the same slot) and some don't.
      
      The solution here is that if B is linked to the root A, and B has a stack slot but A doesn't, we can do a tree rotation so that B is the new root and A links to B.
      
      In addition, we initialize Variable::StackOffset to an invalid value and always check that a value is valid before it is used.  Earlier attempts at extending the variable splitting would sometimes silently fail because the default StackOffset value of 0 ended up being used.
      
      BUG= none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/2116213002 .
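The rotation and the invalid-offset check described above can be sketched in Python as follows. This is an illustrative model only: the `Variable` class, `INVALID_OFFSET`, `rotate_roots`, `assign_offsets`, and the 4-byte slot size are hypothetical stand-ins, not the actual Variable::LinkedTo or Variable::StackOffset code.

```python
INVALID_OFFSET = -(2 ** 31)  # hypothetical sentinel for "never assigned"


class Variable:
    def __init__(self, name, needs_slot=False):
        self.name = name
        self.needs_slot = needs_slot
        self.linked_to = None
        self._stack_offset = INVALID_OFFSET

    def has_stack_offset(self):
        return self._stack_offset != INVALID_OFFSET

    @property
    def stack_offset(self):
        # Fail loudly instead of silently returning a default such as 0.
        if not self.has_stack_offset():
            raise ValueError(f"{self.name}: stack offset was never assigned")
        return self._stack_offset

    @stack_offset.setter
    def stack_offset(self, value):
        self._stack_offset = value


def root_of(var):
    while var.linked_to is not None:
        var = var.linked_to
    return var


def rotate_roots(variables):
    """If a variable needs a slot but its root does not, make it the root."""
    for var in variables:
        root = root_of(var)
        if var.needs_slot and not root.needs_slot:
            var.linked_to = None   # var becomes the new root
            root.linked_to = var   # the old root now links to it


def assign_offsets(variables, slot_size=4):
    """Give each slot-holding root an offset, then share it down the chain."""
    next_offset = 0
    for var in variables:
        if var.linked_to is None and var.needs_slot:
            var.stack_offset = next_offset
            next_offset += slot_size
    for var in variables:
        if var.linked_to is not None and var.needs_slot:
            var.stack_offset = root_of(var).stack_offset


# Chain A<--B<--C<--D where only B and D need a stack slot.
a, b, c, d = Variable("A"), Variable("B", True), Variable("C"), Variable("D", True)
b.linked_to, c.linked_to, d.linked_to = a, b, c
chain = [a, b, c, d]
rotate_roots(chain)
assign_offsets(chain)
print(root_of(a).name, b.stack_offset, d.stack_offset)
```

After the rotation B is the root, so B and D resolve to the same slot, and reading the offset of A (which has none) raises an error rather than yielding 0.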
  18. 07 Jul, 2016 2 commits
  19. 06 Jul, 2016 3 commits
  20. 30 Jun, 2016 1 commit