1. 01 Dec, 2014 1 commit
    • Subzero: Fix a bug in postLower(). · 5d2fa0cf
      Jim Stichnoth authored
      In -O2 mode, postLower() is supposed to iterate over just the
      instructions that were most recently added.  Instead, it was iterating
      all the way to the end of the block, also post-lowering high-level ICE
      instructions that hadn't yet been lowered.  This was basically
      harmless, given that the spec2k asm code is identical after this
      patch, but the fix improves translation performance.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/721333004
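
A minimal sketch of the idea behind this fix, not the actual Subzero code: post-lowering should cover only the half-open range of instructions that the current lowering step just inserted. The `Inst`, `InstList`, `postLower`, and `lowerAndPostLower` names here are hypothetical stand-ins.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical instruction record; PostLowered marks what postLower() touched.
struct Inst {
  std::string Name;
  bool PostLowered;
};
using InstList = std::list<Inst>;

// Post-lowers exactly the instructions in [Begin, End) and returns the count.
int postLower(InstList::iterator Begin, InstList::iterator End) {
  int Count = 0;
  for (InstList::iterator I = Begin; I != End; ++I) {
    I->PostLowered = true;
    ++Count;
  }
  return Count;
}

// Lowers the high-level instruction at Cur by inserting two target
// instructions in front of it, then post-lowers only those two.
int lowerAndPostLower(InstList &Insts, InstList::iterator Cur) {
  InstList::iterator First = Insts.insert(Cur, Inst{"mov", false});
  Insts.insert(Cur, Inst{"add", false});
  // Passing Insts.end() instead of Cur would also visit the high-level
  // instructions that have not been lowered yet.
  return postLower(First, Cur);
}
```

With a two-instruction block, `lowerAndPostLower(Insts, Insts.begin())` visits the two inserted instructions and leaves the remaining high-level ones untouched.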
  2. 26 Nov, 2014 1 commit
    • Subzero: Improve malloc/free behavior. · 9d801a01
      Jim Stichnoth authored
      Use a bigger block size in the bump-pointer allocators, since we
      basically know up front that we'll need lots of memory.  The 1MB value
      (versus the default of 4KB) was chosen somewhat arbitrarily, and
      succeeds in pretty much removing bump-pointer related mallocs from the
      profile.
      
      Pre-reserve the a priori known number of edges in getTerminatorEdges()
      to avoid vector resizing.
      
      BUG= none
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/760973002
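
To illustrate why the block size matters, here is a toy bump-pointer arena (a sketch only; `BumpArena`, `slabsFor`, and `copyEdges` are hypothetical names, and this is not the allocator Subzero uses). Each slab is one heap allocation, so a larger slab means fewer trips to malloc. The last function shows the same idea for vectors: reserving a known size up front avoids repeated resizing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy arena: hands out memory by bumping an offset within the current slab.
class BumpArena {
public:
  explicit BumpArena(std::size_t Bytes) : SlabBytes(Bytes) {}

  // Align must be a power of two no larger than alignof(std::max_align_t).
  void *allocate(std::size_t Size,
                 std::size_t Align = alignof(std::max_align_t)) {
    std::size_t Offset = (Used + Align - 1) & ~(Align - 1);
    if (Slabs.empty() || Offset + Size > Capacity) {
      // Current slab is missing or full: this is the only heap allocation.
      Capacity = Size > SlabBytes ? Size : SlabBytes;
      Slabs.push_back(std::unique_ptr<char[]>(new char[Capacity]));
      Offset = 0;
    }
    Used = Offset + Size;
    return Slabs.back().get() + Offset;
  }

  std::size_t slabCount() const { return Slabs.size(); }

private:
  std::size_t SlabBytes;
  std::size_t Capacity = 0;
  std::size_t Used = 0;
  std::vector<std::unique_ptr<char[]>> Slabs;
};

// Counts how many slabs are needed for Count allocations of Size bytes.
std::size_t slabsFor(std::size_t SlabBytes, std::size_t Count,
                     std::size_t Size) {
  BumpArena Arena(SlabBytes);
  for (std::size_t I = 0; I < Count; ++I)
    Arena.allocate(Size);
  return Arena.slabCount();
}

// Copies a list of edge targets, reserving the known size before appending.
std::vector<int> copyEdges(const std::vector<int> &Targets) {
  std::vector<int> Out;
  Out.reserve(Targets.size()); // one allocation instead of repeated growth
  for (int T : Targets)
    Out.push_back(T);
  return Out;
}
```

In this toy model, 1000 allocations of 64 bytes need 16 slabs at 4KB but a single slab at 1MB.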
  3. 24 Nov, 2014 1 commit
  4. 21 Nov, 2014 1 commit
  5. 20 Nov, 2014 1 commit
    • Subzero: Simplify the constant pools. · d2cb4361
      Jim Stichnoth authored
      Internally, create a separate constant pool for each integer type, instead of a single i64 pool that uses the Ice::Type value as part of the key.  This means each constant pool key can be a simple primitive value, rather than a tuple.
      
      Represent the pools using std::unordered_map instead of std::map since we're using C++11 now.
      
      Use signed integers instead of unsigned integers for the integer constant pools, to benefit from sign extension and to be more consistent.
      
      Remove the SuppressMangling field from hash and comparison functions on RelocatableTuple, since we'll never have two symbols with the same name but different values of SuppressMangling.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/737513008
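
The per-type pool layout described above could look roughly like the following sketch. `PooledConstant`, `TypePool`, and `ConstantPools` are hypothetical names chosen for illustration; the real Subzero classes may differ. The point is that each pool's key is the primitive value itself, so no (type, value) tuple is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for a pooled integer constant.
template <typename T> struct PooledConstant {
  T Value;
};

// One pool per primitive type, keyed directly by the (signed) value.
template <typename T> class TypePool {
public:
  // Returns the unique constant for Value, creating it on first use.
  const PooledConstant<T> *getOrAdd(T Value) {
    std::unique_ptr<PooledConstant<T>> &Slot = Pool[Value];
    if (!Slot)
      Slot.reset(new PooledConstant<T>{Value});
    return Slot.get();
  }
  std::size_t size() const { return Pool.size(); }

private:
  std::unordered_map<T, std::unique_ptr<PooledConstant<T>>> Pool;
};

// Separate pools replace a single i64 pool keyed by (type, value).
struct ConstantPools {
  TypePool<int8_t> Int8s;
  TypePool<int16_t> Int16s;
  TypePool<int32_t> Int32s;
  TypePool<int64_t> Int64s;
};
```

Requesting the same value twice from one pool returns the same pointer, while the same numeric value in a different pool is a distinct constant.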
  6. 18 Nov, 2014 1 commit
  7. 17 Nov, 2014 1 commit
  8. 14 Nov, 2014 4 commits
    • Subzero: Use the linear-scan register allocator for Om1 as well. · 70d0a054
      Jim Stichnoth authored
      This removes the need for Om1's postLower() code which did its own ad-hoc register allocation.  And it actually speeds up Om1 translation significantly.
      
      This mode of register allocation only allocates for infinite-weight Variables, while respecting live ranges of pre-colored Variables.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/733643005
    • Add irt_random to szbuild link line. · edc115ec
      Jan Voung authored
      Seems to be part of the non-sfi link now:
      https://codereview.chromium.org/686723003/diff/180001/pnacl/driver/pnacl-translate.py
      
      Otherwise I get:
      x86-32-linux/lib/unsandboxed_irt.o:(.rodata+0x68): undefined reference to `nacl_secure_random'
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/726093002
    • Subzero: Simplify the FakeKill instruction. · 87ff3a18
      Jim Stichnoth authored
      Even after earlier simplifications, FakeKill was still handled somewhat inefficiently for the register allocator.  For x86-32, any function containing call instructions would result in about 11 pre-colored Variables, each with an identical and relatively complex live range consisting of points.  They would start out on the UnhandledPrecolored list, then all move to the Inactive list, where they would be repeatedly compared against each register allocation candidate via overlapsRange().
      
      We improve this by keeping around a single copy of that live range and directly masking out the Free[] register set when that live range overlaps the current candidate's live range.  This saves ~10 overlaps() calculations per candidate while FakeKills are still pending.
      
      Also, slightly rearrange the initialization of the Unhandled etc. sets into a separate init routine, which will make it easier to reuse the register allocator in other situations such as Om1 post-lowering.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/720343003
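
A rough sketch of the masking idea, under assumed data structures rather than Subzero's actual ones: a live range is modeled as a sorted list of half-open segments, the kill range is a single shared list of points (one per call), and the hypothetical `freeRegsFor` helper clears the killed registers from the free set with one overlap test per candidate.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Segment = std::pair<int, int>;    // half-open [first, second)
using LiveRange = std::vector<Segment>; // sorted, non-overlapping segments

// Returns true if any segment of A intersects any segment of B.
bool overlaps(const LiveRange &A, const LiveRange &B) {
  std::size_t I = 0, J = 0;
  while (I < A.size() && J < B.size()) {
    if (A[I].second <= B[J].first)
      ++I; // A's segment ends before B's begins
    else if (B[J].second <= A[I].first)
      ++J; // B's segment ends before A's begins
    else
      return true;
  }
  return false;
}

constexpr std::size_t NumRegs = 8; // assumed register count for the sketch
using RegMask = std::bitset<NumRegs>;

// One overlap test against the shared kill range replaces a separate test
// against each pre-colored variable that carried an identical range.
RegMask freeRegsFor(const LiveRange &Candidate, RegMask Free,
                    const LiveRange &KillRange, const RegMask &KilledRegs) {
  if (overlaps(Candidate, KillRange))
    Free &= ~KilledRegs;
  return Free;
}
```

A candidate whose live range spans a call loses the killed registers from its free set; a candidate that lives entirely between calls keeps all of them.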
    • Subzero: Auto-set -build-on-read=0 for .ll input files. · 51596d43
      Jim Stichnoth authored
      This is purely for convenience of personal testing/debugging.
      
      To demonstrate its correctness in this CL, -build-on-read=0 is removed
      from the two .ll lit tests that explicitly use it, and also from the
      crosstest.py script.  The lit test wrapper run-llvm2ice.py is left
      unchanged to be safe.
      
      BUG= none
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/732583002
  9. 11 Nov, 2014 1 commit
    • Subzero: Remove Variable::NeedsStackSlot. · 33c80641
      Jim Stichnoth authored
      Instead, separately compute it during prolog generation via another
      pass over the Cfg.
      
      This may slow down translation by ~1%, but it greatly simplifies the
      management of this flag/property.
      
      The higher motivation is to pull this management out of register
      allocation to make it easier to extend register allocation for other
      uses.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/692633004
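
As a sketch of what such a separate pass might look like: walk every live instruction and mark each referenced variable that has no register. The criterion used here (referenced by a non-deleted instruction and not register-allocated) is an assumption made for illustration, and all types and names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int NoRegister = -1;

struct Variable {
  int RegNum; // NoRegister if the variable was not assigned a register
};

struct Inst {
  std::vector<std::size_t> VarIndexes; // variables this instruction references
  bool Deleted;
};

using Node = std::vector<Inst>;
using Cfg = std::vector<Node>;

// Computes the stack-slot property in one pass over the Cfg instead of
// maintaining a flag on each Variable throughout translation.
std::vector<bool> computeNeedsStackSlot(const Cfg &Func,
                                        const std::vector<Variable> &Vars) {
  std::vector<bool> Needs(Vars.size(), false);
  for (const Node &N : Func) {
    for (const Inst &I : N) {
      if (I.Deleted)
        continue;
      for (std::size_t Index : I.VarIndexes)
        if (Vars[Index].RegNum == NoRegister)
          Needs[Index] = true;
    }
  }
  return Needs;
}
```

The extra walk over the instructions is where a small translation-time cost would come from, in exchange for not having to keep a per-variable flag up to date.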
  10. 06 Nov, 2014 4 commits
  11. 05 Nov, 2014 1 commit
  12. 04 Nov, 2014 4 commits
  13. 03 Nov, 2014 3 commits
  14. 02 Nov, 2014 1 commit
  15. 01 Nov, 2014 3 commits
    • Subzero: Switch to AT&T asm syntax. I give up. · bca2f655
      Jim Stichnoth authored
      The main motivation is that -build-on-read introduces Intel-style asm output like:
        mov al, byte ptr [flags]
      and llvm-mc misinterprets the global symbol "flags" as the flags register.  Further workarounds will likely cost more effort than switching over to AT&T syntax.
      
      Most of the lit tests don't need changing, since the asm text is generated by assembling and disassembling the llvm2ice asm output.
      
      There are some LEAHACK TODOs that can be fixed, but that would change some of the instructions, so that can be a separate CL.
      
      The Operand emit() routines really ought to be moved entirely into the target-specific source files.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/695993004
    • Subzero: Decorate the text asm output with register availability info. · 3d44fe8c
      Jim Stichnoth authored
      The -asm-verbose flag adds comments to the text asm output about register availability.  Specifically, it prints the registers in use at the beginning and end of each block, and it prints which registers' live ranges end at each instruction.
      
      This is extremely helpful when studying the output to find opportunities to improve the code quality.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/682983004
    • Subzero: Remove a TODO comment about shld/shrd. · 8835576b
      Jan Voung authored
      The 32-bit validator is now consistent with the 64-bit
      validator w.r.t. 16-bit shld/shrd and accepts it. We didn't
      really use the 16-bit form in Subzero though, only the 32-bit
      one for 64-bit ops, I think.
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/696753003
  16. 30 Oct, 2014 3 commits
    • Remove -Werror from pnacl build, due to default switch error in subzero. · 6c4fde96
      Karl Schimpf authored
      When compiling using toolchain_build_pnacl.py, we get errors of the form:
      
        Don't use default labels in fully covered switches over enumerations
      
      I tried different combinations of -Wno-covered-switch-default and
      -Wno-error=covered-switch-default, but was not able to stop this
      error from being generated. Hence, taking the simpler route of
      removing -Werror from Makefile.
      
      (see www.llvm.org/docs/CodingStandards.html for more details)
      
      BUG=None
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/686103006
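
For context, the diagnostic fires on a switch that lists every enumerator and still has a default label. A small illustration of the form the LLVM coding standards prefer, using a hypothetical `Width` enum and `bitWidth` function that are not taken from Subzero:

```cpp
#include <cassert>

enum class Width { W8, W16, W32 };

// Every enumerator has its own case and there is no default label, so the
// covered-switch-default diagnostic has nothing to flag. If an enumerator is
// added later, the compiler can warn that the switch no longer covers it.
int bitWidth(Width W) {
  switch (W) {
  case Width::W8:
    return 8;
  case Width::W16:
    return 16;
  case Width::W32:
    return 32;
  }
  return 0; // only reached if W holds a value outside the enumeration
}
```

Adding `default: return 0;` inside the switch above is the pattern that triggers the error when the build uses -Werror.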
    • Subzero: Implementation of "advanced Phi lowering". · 336f6c4a
      Jim Stichnoth authored
      Delays Phi lowering until after register allocation.  This lets the Phi assignment order take register allocation into account and avoid creating false dependencies.
      
      All edges that lead to Phi instructions are split, and the new node gets mov instructions in the correct topological order, using available physical registers as needed.
      
      This lowering style is controllable under -O2 using -phi-edge-split (enabled by default).
      
      The result is faster translation time (due to fewer temporaries leading to faster liveness analysis and register allocation) as well as better code quality (due to better register allocation and fewer phi-based assignments).
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/680733002
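
The ordering problem for the mov instructions on a split edge can be sketched as sequentializing a set of parallel copies. This is a generic illustration, not the code from this CL: `Move` and `sequentialize` are hypothetical names, every destination is assumed to appear at most once, and `Temp` stands for an available physical register that is not otherwise involved in the moves.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Move = std::pair<std::string, std::string>; // (Dest, Src)

// Orders parallel moves so that no source is overwritten before it is read.
// Cycles (such as a swap) are broken by saving one value in Temp.
std::vector<Move> sequentialize(std::vector<Move> Pending,
                                const std::string &Temp) {
  std::vector<Move> Out;
  // Moves of a location to itself are no-ops.
  Pending.erase(std::remove_if(Pending.begin(), Pending.end(),
                               [](const Move &M) {
                                 return M.first == M.second;
                               }),
                Pending.end());
  while (!Pending.empty()) {
    // A move is ready when no pending move still reads its destination.
    auto Ready = std::find_if(
        Pending.begin(), Pending.end(), [&](const Move &M) {
          return std::none_of(Pending.begin(), Pending.end(),
                              [&](const Move &Other) {
                                return Other.second == M.first;
                              });
        });
    if (Ready == Pending.end()) {
      // Only cycles remain: save one destination's old value in Temp and
      // redirect its readers, which turns the cycle into a chain.
      const std::string Clobbered = Pending.front().first;
      Out.emplace_back(Temp, Clobbered);
      for (Move &M : Pending)
        if (M.second == Clobbered)
          M.second = Temp;
      continue;
    }
    Out.push_back(*Ready);
    Pending.erase(Ready);
  }
  return Out;
}
```

For the parallel moves eax←ecx, ecx←eax, edx←eax, the sketch emits edx←eax first, then breaks the eax/ecx swap through the temporary.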
    • Subzero: Fix broken lit tests. · 0506fc72
      Jim Stichnoth authored
      The file ifatts.py no longer exists.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/687403002
  17. 29 Oct, 2014 3 commits
  18. 27 Oct, 2014 2 commits
    • Allow conditional lit tests in Subzero, based on build flags. · b262c5e0
      Karl Schimpf authored
      Adds conditionality to lit tests in two ways:
      
      1) Allows the use of "; REQUIRES: XXX" lines in lit tests. In this
      case, the tests defined by the file are only run if all REQUIRES are
      met.
      
      2) Allows the conditional running of RUN commands, based on build
      flags. This comes in two subforms. There are predefined %ifX commands
      that run the command defined by remaining arguments, if the
      corresponding %X2i command is applicable. Alternatively, one can use
      %if with explicit '--att' arguments to define what conditions should
      be checked.
      
      In any case, unlike REQUIRES, the %if commands RUN all the time, but
      simply generate empty output, rather than output defined by the
      following command, if the condition is not met. These latter tests are
      useful when the same input is to be tested under different conditions,
      since the REQUIRES form does not allow this.
      
      Note that m2i, p2i, l2i, and lc2i are also conditionally controlled,
      so that they do nothing if the build did not construct the appropriate
      Subzero translator.
      
      This CL replaces https://codereview.chromium.org/644143002
      
      BUG=None
      R=jvoung@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/659513005
    • Subzero: Refactor newline emission for Inst::emit(). · 120b4121
      Jim Stichnoth authored
      The (final) newline is emitted by the caller of emit(), instead of
      by all the emit() implementations.  This sets the stage for being
      able to add useful comments to the textual asm, such as annotating
      which registers became free after the instruction.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/681783002
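
A small sketch of the refactoring pattern, with hypothetical `Inst` and `emitAll` definitions rather than the real Inst::emit() signature: because emit() no longer writes the newline, the caller can append a comment on the same line before ending it. The `# END=` comment format is made up for this example.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical instruction: emit() writes the asm text with no newline.
struct Inst {
  std::string Text;      // e.g. "movl %eax, %ecx"
  std::string FreedRegs; // registers freed by this instruction, may be empty
  void emit(std::ostream &Str) const { Str << Text; }
};

// The caller owns the line ending, so it can decorate the line first.
std::string emitAll(const std::vector<Inst> &Insts, bool Annotate) {
  std::ostringstream Str;
  for (const Inst &I : Insts) {
    Str << "\t";
    I.emit(Str);
    if (Annotate && !I.FreedRegs.empty())
      Str << "\t# END=" << I.FreedRegs;
    Str << "\n";
  }
  return Str.str();
}
```

If each emit() implementation still wrote its own newline, the annotation would land on the following line instead of next to the instruction it describes.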
  19. 24 Oct, 2014 3 commits
  20. 23 Oct, 2014 1 commit
    • Subzero: Improve debugging controls, plus minor refactoring. · 088b2be2
      Jim Stichnoth authored
      1. Decorate the list of live-in and live-out variables with register assignments in the dump() output.  This helps one to assess register pressure.
      
      2. Fix a bug where the DisableInternal flag wasn't being honored for function definitions.
      
      3. Add a -translate-only=<symbol> option to limit translation to a single function or global variable.  This makes it easier to focus on debugging a single function.
      
      4. Change the -no-phi-edge-split option to -phi-edge-split and invert the meaning, to better not avoid the non double negatives.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/673783002