1. 06 Nov, 2014 1 commit
    • Subzero: Improve the use of NodeList objects. · bfb410dd
      Jim Stichnoth authored
      Currently NodeList is defined as std::vector<CfgNode*>, but in the future it may be desirable to change it to something like std::list<CfgNode*> so that it is easier to split edges and insert the new nodes at the right locations, rather than re-sorting them in a separate pass.
      
      This gets us closer by using foo.front() instead of foo[0].  There are still a couple more places using the [] operator, but the changes would be more intrusive.
      
      Also, a few instances of ".size()==0" are changed to the possibly more efficient ".empty()".
      
      BUG= none
      R=jvoung@chromium.org, kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/704753007
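
The container-neutral style described in this commit can be sketched as follows. This is a minimal illustration, not Subzero's actual code: `CfgNode` is reduced to a hypothetical stand-in and `getEntryNode` is an invented helper name. The point is only that `front()` and `empty()` compile unchanged for both `std::vector` and `std::list`, whereas `[0]` works only for the vector.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical stand-in for Subzero's CfgNode; only an index is kept.
struct CfgNode {
  int Index;
};

// Templated on the container so the same body serves either NodeList choice.
// getEntryNode is an illustrative name, not a Subzero function.
template <typename NodeListT>
CfgNode *getEntryNode(const NodeListT &Nodes) {
  // empty() is constant time for every standard container, while size()
  // was allowed to be linear for std::list before C++11.
  if (Nodes.empty())
    return nullptr;
  // front() exists on both std::vector and std::list; Nodes[0] would only
  // compile for the vector.
  return Nodes.front();
}
```

With this shape, changing the `NodeList` typedef from a vector to a list would not require touching the call sites that only need the first node.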
  2. 05 Nov, 2014 1 commit
  3. 04 Nov, 2014 4 commits
  4. 03 Nov, 2014 3 commits
  5. 02 Nov, 2014 1 commit
  6. 01 Nov, 2014 3 commits
    • Subzero: Switch to AT&T asm syntax. I give up. · bca2f655
      Jim Stichnoth authored
      The main motivation is that -build-on-read introduces Intel-style asm output like:
        mov al, byte ptr [flags]
      and llvm-mc misinterprets the global symbol "flags" as the flags register.  Further workarounds will likely cost more effort than switching over to AT&T syntax.
      
      Most of the lit tests don't need changing, since the asm text is generated by assembling and disassembling the llvm2ice asm output.
      
      There are some LEAHACK TODOs that can be fixed, but that would change some of the instructions, so that can be a separate CL.
      
      The Operand emit() routines really ought to be moved entirely into the target-specific source files.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/695993004
    • Subzero: Decorate the text asm output with register availability info. · 3d44fe8c
      Jim Stichnoth authored
      The -asm-verbose flag adds comments to the text asm output about register availability.  Specifically, it prints the registers in use at the beginning and end of each block, and it prints which registers' live ranges end at each instruction.
      
      This is extremely helpful when studying the output to find opportunities to improve the code quality.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/682983004
    • Subzero: Remove a TODO comment about shld/shrd. · 8835576b
      Jan Voung authored
      The 32-bit validator is now consistent with the 64-bit
      validator w.r.t. 16-bit shld/shrd and accepts it. We didn't
      really use the 16-bit form in Subzero though, only the 32-bit
      one for 64-bit ops, I think.
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/696753003
  7. 30 Oct, 2014 3 commits
    • Remove -Werror from pnacl build, due to default switch error in subzero. · 6c4fde96
      Karl Schimpf authored
      When compiling using toolchain_build_pnacl.py, we get errors of form:
      
        Don't use default labels in fully covered switches over enumerations
      
      I tried different combinations of -Wno-covered-switch-default and
      -Wno-error=covered-switch-default, but was not able to stop this
      error from being generated. Hence, taking the simpler route of
      removing -Werror from Makefile.
      
      (see www.llvm.org/docs/CodingStandards.html for more details)
      
      BUG=None
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/686103006
    • Subzero: Implementation of "advanced Phi lowering". · 336f6c4a
      Jim Stichnoth authored
      Delays Phi lowering until after register allocation.  This lets the Phi assignment order take register allocation into account and avoid creating false dependencies.
      
      All edges that lead to Phi instructions are split, and the new node gets mov instructions in the correct topological order, using available physical registers as needed.
      
      This lowering style is controllable under -O2 using -phi-edge-split (enabled by default).
      
      The result is faster translation time (due to fewer temporaries leading to faster liveness analysis and register allocation) as well as better code quality (due to better register allocation and fewer phi-based assignments).
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/680733002
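
The ordering problem mentioned above (emitting the movs on a split edge so that no source is overwritten before it is read) can be sketched as below. This is a simplified illustration under stated assumptions, not the code from this CL: registers are plain integers, `Move` and `sequentializeMoves` are hypothetical names, all destinations are distinct, and a single scratch register `Temp` (not used by any move) is assumed to be available for breaking cycles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One pending assignment Dest <- Src; registers are modeled as integers.
struct Move {
  int Dest;
  int Src;
};

// Orders a set of conceptually parallel moves so that running them one by
// one gives the same result. Assumes distinct destinations and that Temp
// appears in no move (both are assumptions of this sketch).
std::vector<Move> sequentializeMoves(std::vector<Move> Pending, int Temp) {
  for (const Move &M : Pending)
    assert(M.Dest != Temp && M.Src != Temp);
  // Self-moves are no-ops and can be dropped up front.
  Pending.erase(std::remove_if(Pending.begin(), Pending.end(),
                               [](const Move &M) { return M.Dest == M.Src; }),
                Pending.end());
  std::vector<Move> Out;
  while (!Pending.empty()) {
    bool Progress = false;
    for (std::size_t I = 0; I < Pending.size(); ++I) {
      // A move is safe once no other pending move still reads its Dest.
      bool StillRead = false;
      for (std::size_t J = 0; J < Pending.size(); ++J)
        if (J != I && Pending[J].Src == Pending[I].Dest)
          StillRead = true;
      if (!StillRead) {
        Out.push_back(Pending[I]);
        Pending.erase(Pending.begin() + static_cast<std::ptrdiff_t>(I));
        Progress = true;
        break;
      }
    }
    if (!Progress) {
      // Only cycles remain: save one source in Temp and redirect its readers.
      int Saved = Pending.front().Src;
      Out.push_back(Move{Temp, Saved});
      for (Move &M : Pending)
        if (M.Src == Saved)
          M.Src = Temp;
    }
  }
  return Out;
}
```

For example, the parallel set `r0 <- r1, r1 <- r0, r2 <- r0` comes out as `r2 <- r0`, `Temp <- r1`, `r1 <- r0`, `r0 <- Temp`.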
    • Subzero: Fix broken lit tests. · 0506fc72
      Jim Stichnoth authored
      The file ifatts.py no longer exists.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/687403002
  8. 29 Oct, 2014 3 commits
  9. 27 Oct, 2014 2 commits
    • Allow conditional lit tests in Subzero, based on build flags. · b262c5e0
      Karl Schimpf authored
      Adds conditionality to lit tests in two ways:
      
      1) Allows the use of "; REQUIRES: XXX" lines in lit tests. In this
      case, the tests defined by the file are only run if all REQUIRES are
      met.
      
      2) Allows the conditional running of RUN commands, based on build
      flags. This comes in two subforms. There are predefined %ifX commands
      that run the command defined by remaining arguments, if the
      corresponding %X2i command is applicable. Alternatively, one can use
      %if with explicit '--att' arguments to define what conditions should
      be checked.
      
      In any case, unlike REQUIRES, the %if commands RUN all the time, but
      simply generate empty output, rather than output defined by the
      following command, if the condition is not met. These latter tests are
      useful when the same input is to be tested under different conditions,
      since the REQUIRES form does not allow this.
      
      Note that m2i, p2i, l2i, and lc2i are also conditionally controlled,
      so that they do nothing if the build did not construct the appropriate
      Subzero translator.
      
      This CL replaces https://codereview.chromium.org/644143002
      
      BUG=None
      R=jvoung@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/659513005
    • Subzero: Refactor newline emission for Inst::emit(). · 120b4121
      Jim Stichnoth authored
      The (final) newline is emitted by the caller of emit(), instead of
      by all the emit() implementations.  This sets the stage for being
      able to add useful comments to the textual asm, such as annotating
      which registers became free after the instruction.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/681783002
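
A rough sketch of why moving the newline to the caller helps: once `emit()` stops short of the line ending, the caller can append a trailing comment before terminating the line. The `Inst` struct, the `emitLine` helper, and the `# END=` comment format below are all hypothetical simplifications, not Subzero's real classes or output format.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, simplified instruction: emit() writes the asm text only,
// without a trailing newline.
struct Inst {
  std::string Text;
  std::string FreedRegs; // registers whose live range ends here (illustrative)
  void emit(std::ostream &Str) const { Str << "\t" << Text; }
};

// The caller owns the end of the line, so it can add a comment first.
void emitLine(const Inst &I, std::ostream &Str, bool Verbose) {
  I.emit(Str);
  if (Verbose && !I.FreedRegs.empty())
    Str << "\t# END=" << I.FreedRegs; // made-up comment format
  Str << "\n";
}
```

If each `emit()` implementation wrote its own newline, the comment would land on the following line instead.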
  10. 24 Oct, 2014 3 commits
  11. 23 Oct, 2014 1 commit
    • Subzero: Improve debugging controls, plus minor refactoring. · 088b2be2
      Jim Stichnoth authored
      1. Decorate the list of live-in and live-out variables with register assignments in the dump() output.  This helps one to assess register pressure.
      
      2. Fix a bug where the DisableInternal flag wasn't being honored for function definitions.
      
      3. Add a -translate-only=<symbol> to limit translation to a single function or global variable.  This makes it easier to focus on debugging a single function.
      
      4. Change the -no-phi-edge-split option to -phi-edge-split and invert the meaning, to better not avoid the non double negatives.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/673783002
  12. 21 Oct, 2014 1 commit
  13. 20 Oct, 2014 2 commits
  14. 16 Oct, 2014 2 commits
  15. 15 Oct, 2014 5 commits
    • emitIAS for movsx and movzx. · 39d4aca3
      Jan Voung authored
      Force dest to be the full 32-bit reg instead of sometimes being
      a 16-bit reg. This is to save on an operand size prefix (and
      avoid passing the DestTy down to the dispatchers).
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/647223004
    • Subzero: Speed up VariablesMetadata initialization. · 877b04e4
      Jim Stichnoth authored
      Currently, O2 calls VariablesMetadata::init() 4 times:
      
      - Twice for liveness analysis, where only multi-block use information is needed for dealing with sparse bit vectors.
      
      - Once for address mode inference, where single-definition information is needed.
      
      - Once for register allocation, where all information is needed, including the set of all definitions which is needed for determining AllowOverlap.
      
      So we limit the amount of data we gather based on the actual need.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/650613003
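
The idea of gathering only as much metadata as each pass needs might look roughly like this. All names here (`TrackingKind`, `VarInfo`, `VarMetadata`, `noteUse`, `noteDef`) are hypothetical and chosen for illustration; the sketch only mirrors the three levels of need listed in the commit message and makes no claim about the real `VariablesMetadata` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracking levels, one per need listed in the commit message.
enum class TrackingKind { Uses, SingleDefs, All };

struct VarInfo {
  int FirstBlock = -1;       // first block in which the variable appears
  bool MultiBlock = false;   // appears in more than one block
  int NumDefs = 0;           // maintained for SingleDefs and All
  std::vector<int> DefInsts; // maintained only for All
};

class VarMetadata {
public:
  VarMetadata(TrackingKind Kind, std::size_t NumVars)
      : Kind(Kind), Vars(NumVars) {}

  // Multi-block information is cheap and always recorded.
  void noteUse(std::size_t Var, int Block) {
    VarInfo &Info = Vars.at(Var);
    if (Info.FirstBlock == -1)
      Info.FirstBlock = Block;
    else if (Info.FirstBlock != Block)
      Info.MultiBlock = true;
  }

  // Definition details are recorded only when the tracking level asks for it.
  void noteDef(std::size_t Var, int Block, int InstNum) {
    noteUse(Var, Block);
    if (Kind == TrackingKind::Uses)
      return;
    VarInfo &Info = Vars.at(Var);
    ++Info.NumDefs;
    if (Kind == TrackingKind::All)
      Info.DefInsts.push_back(InstNum);
  }

  const VarInfo &get(std::size_t Var) const { return Vars.at(Var); }

private:
  TrackingKind Kind;
  std::vector<VarInfo> Vars;
};
```

A liveness pass would construct the cheapest level, while register allocation would ask for everything.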
    • Subzero: Class definition cleanup. · 7b451a92
      Jim Stichnoth authored
      For consistency, put deleted ctors at the beginning of the class
      definition.
      
      If the default copy ctor or assignment operator is not deleted,
      and the default implementation is used, leave it commented out to
      indicate it is intentional.
      
      Also, fixed one C++11 related TODO.
      
      BUG= none
      R=jvoung@chromium.org, kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/656123003
    • Subzero: Register allocator performance improvements and simplifications. · 5ce0abb8
      Jim Stichnoth authored
      This removes the redundancy between live ranges stored in the Variable and those stored in Liveness, by removing the Liveness copy.  After liveness analysis, live ranges are constructed directly into the Variable.
      
      Also, the LiveRangeWrapper is removed and Variable * is directly used instead.  The original thought behind LiveRangeWrapper was that it could be extended to include live range splitting.  However, when/if live range splitting is implemented, it will probably involve creating a new variable with its own live range, and carrying around some extra bookkeeping until the split is committed, so such a wrapper probably won't be needed.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/656023002
    • emitIAS for Shld and Shrd and the ternary and three-address ops. · 962befa4
      Jan Voung authored
      Give a different name to the crosstest .s and .o files depending on the
      CPU features as well. That way the SSE2 and SSE4.1 .s and .o are separate.
      
      The encodings for Pextrw and Pextrb/d... make me sad.
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/656983002
  16. 14 Oct, 2014 2 commits
    • Subzero: Enhance the timer dump format. · abce6e56
      Jim Stichnoth authored
      This adds update counts to the output, e.g.:
      
      Total across all functions - Flat times:
          0.262297 (13.0%): [ 1287] linearScan
          0.243965 (12.1%): [ 1287] emit
      ...
      
      This is useful to know when some passes are called once per function and others are called several times per function.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/655563005
    • Subzero: Improve performance of liveness analysis and live range construction. · 4775255d
      Jim Stichnoth authored
      The key performance problem was that the per-block LiveBegin and LiveEnd vectors were dense with respect to the multi-block "global" variables, even though very few of the global variables are ever live within the block.  This led to large vectors needlessly initialized and iterated over.
      
      The new approach is to accumulate two small vectors of <variable,instruction_number> tuples (LiveBegin and LiveEnd) as each block is processed, then sort the vectors and iterate over them in parallel to construct the live ranges.
      
      Some of the anomalies in the original liveness analysis code have been straightened out:
      
      1. Variables have an IgnoreLiveness attribute to suppress analysis.  This is currently used only on the esp register.
      
      2. Instructions have a DestNonKillable attribute which causes the Dest variable not to be marked as starting a new live range at that instruction.  This is used when a variable is non-SSA and has more than one assignment within a block, but we want to treat it as a single live range.  This lets the variable have zero or one live range begins or ends within a block.  DestNonKillable is derived automatically for two-address instructions, and annotated manually in a few other cases.
      
      This is tested by comparing the O2 asm output in each Spec2K component.  In theory, the output should be the same except for some differences in pseudo-instructions output as comments.  However, some actual differences showed up, related to the i64 shl instruction followed by trunc to i32.  This turned out to be a liveness bug that was accidentally fixed.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/652633002
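
The sort-and-merge step described above can be sketched as follows. This is an illustrative reconstruction, not the code from this CL: `LiveEvent`, `Segment`, and `buildSegments` are hypothetical names, each variable is assumed to have at most one begin and one end per block (as the commit message states), and variables that are live through the whole block without any event are assumed to be handled elsewhere. The handling of a variable whose end precedes its begin (two pieces) is likewise an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using LiveEvent = std::pair<int, int>; // (variable index, instruction number)

struct Segment {
  int Var;
  int Start; // half-open range [Start, End)
  int End;
};

// Builds per-block live range segments from two small, unsorted event lists.
std::vector<Segment> buildSegments(std::vector<LiveEvent> LiveBegin,
                                   std::vector<LiveEvent> LiveEnd,
                                   int BlockStart, int BlockEnd) {
  // Sorting by variable index lets both lists be walked in lockstep.
  std::sort(LiveBegin.begin(), LiveBegin.end());
  std::sort(LiveEnd.begin(), LiveEnd.end());
  std::vector<Segment> Out;
  std::size_t B = 0, E = 0;
  while (B < LiveBegin.size() || E < LiveEnd.size()) {
    // Pick the smallest variable index present in either list.
    int Var;
    if (E == LiveEnd.size())
      Var = LiveBegin[B].first;
    else if (B == LiveBegin.size())
      Var = LiveEnd[E].first;
    else
      Var = std::min(LiveBegin[B].first, LiveEnd[E].first);
    bool HasBegin = B < LiveBegin.size() && LiveBegin[B].first == Var;
    bool HasEnd = E < LiveEnd.size() && LiveEnd[E].first == Var;
    // No begin means live-in (starts at block start); no end means live-out.
    int Start = HasBegin ? LiveBegin[B++].second : BlockStart;
    int End = HasEnd ? LiveEnd[E++].second : BlockEnd;
    if (Start <= End) {
      Out.push_back(Segment{Var, Start, End});
    } else {
      // Live-in value dies, then the variable is redefined and live-out.
      Out.push_back(Segment{Var, BlockStart, End});
      Out.push_back(Segment{Var, Start, BlockEnd});
    }
  }
  return Out;
}
```

The work is proportional to the number of events in the block rather than to the total number of multi-block variables, which is the saving the commit message describes.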
  17. 13 Oct, 2014 3 commits