1. 01 Jun, 2015 1 commit
  2. 27 May, 2015 3 commits
  3. 26 May, 2015 2 commits
  4. 22 May, 2015 2 commits
  5. 19 May, 2015 2 commits
    • Lower a few basic ARM binops for i{8,16,32,64}. · 2971997a
      Jan Voung authored
      Do basic lowering for add, sub, and, or, xor, mul.
      We don't yet take advantage of commuting immediate operands
      (e.g., use rsb to reverse subtract instead of sub) or
      inverting immediate operands (use bic to bit clear instead
      of using and).
      
      The binary operations can set the flags register (e.g., to
      have the carry bit for use with a subsequent adc
      instruction). That is optional for the "data processing"
      instructions.
      
      I'm not yet able to compile 8bit.pnacl.ll and
      64bit.pnacl.ll so 8-bit and 64-bit are not well tested yet.
      Only tests are in the arith.ll file (like arith-opt.ll, but
      assembled instead of testing the "verbose inst" output).
      
      Not doing divide yet. ARM divide by 0 does not trap, but
      PNaCl requires uniform behavior for such bad code. Thus,
      in LLVM we insert a 0 check and would have to do the same.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1127003003
    • Subzero: Use cmov to improve lowering for the select instruction. · 537b5ba0
      Jim Stichnoth authored
      This is instead of explicit control flow which may interfere with branch prediction.  However, explicit control flow is still needed for types other than i16 and i32, due to cmov limitations.
      
      The assembler for cmov is extended to allow the non-dest operand to be a memory operand.
      
      The select lowering is getting large enough that it was in our best interest to combine the default lowering with the bool-folding optimization.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1125323004
  6. 18 May, 2015 1 commit
  7. 17 May, 2015 1 commit
    • Subzero: Fold icmp into br/select lowering. · a59ae6ff
      Jim Stichnoth authored
      Originally there was a peephole-style optimization in lowerIcmp() that looked ahead to see if the next instruction was a conditional branch with the right properties, and if so, folded the icmp and br into a single lowering sequence.
      
      However, sometimes extra instructions come between the icmp and br instructions, disabling the folding even though it would still be possible.
      
      One thought is to do the folding inside lowerBr() instead of lowerIcmp(), by looking backward for a suitable icmp instruction.  The problem here is that the icmp lowering code may leave lowered instructions that can't easily be dead-code eliminated, e.g. instructions lacking a dest variable.
      
      Instead, before lowering a basic block, we do a prepass on the block to identify folding candidates.  For the icmp/br example, the prepass would tentatively delete the icmp instruction and then the br lowering would fold in the icmp.
      
      This folding can also be extended to several producers:
        icmp (i32 operands), icmp (i64 operands), fcmp, trunc .. to i1
      and several consumers:
        br, select, sext, zext
      
      This CL starts with 2 combinations: icmp32 paired with br & select.  Other combinations will be added in later CLs.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4162
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1141213004
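The prepass described above can be sketched roughly as follows. This is a minimal illustration and not Subzero's actual code: `Inst`, `Kind`, and `foldingPrepass` are hypothetical names, use counts are computed within a single block only, and the condition is assumed to be the first source operand of the consumer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction model (not Subzero's real classes).
enum class Kind { Icmp, Br, Select, Other };

struct Inst {
  Inst(Kind K, std::string Dest, std::vector<std::string> Srcs)
      : K(K), Dest(std::move(Dest)), Srcs(std::move(Srcs)) {}
  Kind K;
  std::string Dest;              // empty when the instruction has no dest
  std::vector<std::string> Srcs; // Srcs[0] is the condition for Br/Select
  bool Deleted = false;          // set when tentatively deleted by the prepass
};

// Scans one basic block before lowering. An icmp whose result has exactly
// one use, and whose single user is a later br or select in the same block,
// is tentatively deleted. Returns a map from consumer index to producer
// index so the consumer's lowering can fold the comparison in.
std::map<std::size_t, std::size_t> foldingPrepass(std::vector<Inst> &Block) {
  // Assumption: counting uses inside this block is sufficient.
  std::map<std::string, int> UseCount;
  for (const Inst &I : Block)
    for (const std::string &S : I.Srcs)
      ++UseCount[S];

  std::map<std::string, std::size_t> Producers; // icmp dest -> index
  std::map<std::size_t, std::size_t> Folded;    // consumer -> producer
  for (std::size_t Idx = 0; Idx < Block.size(); ++Idx) {
    Inst &I = Block[Idx];
    if (I.K == Kind::Icmp && !I.Dest.empty() && UseCount[I.Dest] == 1) {
      Producers[I.Dest] = Idx;
      continue;
    }
    if ((I.K != Kind::Br && I.K != Kind::Select) || I.Srcs.empty())
      continue;
    auto It = Producers.find(I.Srcs[0]);
    if (It == Producers.end())
      continue;
    Block[It->second].Deleted = true; // tentative deletion of the icmp
    Folded[Idx] = It->second;
    Producers.erase(It);
  }
  return Folded;
}
```

Because candidates are recorded by variable name rather than by adjacency, unrelated instructions between the icmp and the br do not prevent the fold in this sketch.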
  8. 16 May, 2015 1 commit
  9. 14 May, 2015 1 commit
    • Convert Constant->emit() definitions to allow multiple targets to define them. · 76bb0bec
      Jan Voung authored
      Wasn't sure how to allow TargetX8632 and TargetARM32
      to both define "ConstantInteger32::emit(GlobalContext *)",
      and define them differently if both targets happen to be
      ifdef'ed into the code. Rearranged things so that it's now
      "TargetFoo::emit(ConstantInteger32 *)", so that each
      TargetFoo can have a separate definition.
      
      Some targets may allow emitting some types of constants
      while other targets do not (64-bit int for x86-64?).
      Also they emit constants with a different style.
      E.g., the prefix for x86 is "$" while the prefix for ARM
      is "#" and there isn't a prefix for mips(?).
      Renamed emitWithoutDollar to emitWithoutPrefix.
      
      Did this sort of multi-method dispatch via a visitor
      pattern, which is a bit verbose though.
      
      We may be able to remove the emitWithoutDollar/Prefix for
      ConstantPrimitive by just inlining that into the few places
      that need it (only needed for ConstantInteger32). This
      undoes the unreachable methods added by: https://codereview.chromium.org/1017373002/diff/60001/src/IceTargetLoweringX8632.cpp
      The only place extra was for emitting calls to constants.
      There was already an inlined instance for OperandX8632Mem.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1129263005
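The visitor-style double dispatch mentioned above might look roughly like the following sketch. The class names echo the commit message, but the signatures are simplified assumptions (returning a string instead of writing to a stream, taking a target reference instead of a GlobalContext) and do not reproduce Subzero's real interfaces.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Simplified, hypothetical stand-ins for the classes named in the commit.
class ConstantInteger32;

class TargetLowering {
public:
  virtual ~TargetLowering() = default;
  // Each target defines its own formatting for a 32-bit integer constant.
  virtual std::string emit(const ConstantInteger32 &C) const = 0;
};

class Constant {
public:
  virtual ~Constant() = default;
  // First dispatch: on the dynamic type of the constant.
  virtual std::string emit(const TargetLowering &Target) const = 0;
};

class ConstantInteger32 : public Constant {
public:
  explicit ConstantInteger32(int32_t V) : Value(V) {}
  int32_t getValue() const { return Value; }
  // Second dispatch: on the dynamic type of the target.
  std::string emit(const TargetLowering &Target) const override {
    return Target.emit(*this);
  }

private:
  int32_t Value;
};

class TargetX8632 : public TargetLowering {
public:
  std::string emit(const ConstantInteger32 &C) const override {
    return "$" + std::to_string(C.getValue()); // x86 immediate prefix
  }
};

class TargetARM32 : public TargetLowering {
public:
  std::string emit(const ConstantInteger32 &C) const override {
    return "#" + std::to_string(C.getValue()); // ARM immediate prefix
  }
};
```

With this shape, both targets can be compiled into the same binary because each definition lives on its own TargetFoo class rather than on the shared constant class.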
  10. 12 May, 2015 2 commits
  11. 07 May, 2015 1 commit
  12. 04 May, 2015 1 commit
    • Subzero: Use a setcc sequence for better icmp lowering. · f48b320c
      Jim Stichnoth authored
      For an example like:
        %a = icmp eq i32 %b, %c
      
      The original icmp lowering sequence for i8/i16/i32 was something like:
      
        cmpl b, c
        movb 1, a
        je label
        movb 0, a
      label:
      
      The improved sequence is:
        cmpl b, c
        sete a
      
      In O2 mode, this doesn't help when successive compare/branch instructions are fused, but it does help when the boolean result needs to be saved and later used.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1118353005
  13. 30 Apr, 2015 3 commits
  14. 29 Apr, 2015 1 commit
    • Subzero: Produce actually correct code in --asm-verbose mode. · 76dcf1a8
      Jim Stichnoth authored
      The "pnacl-sz --asm-verbose=1" mode annotates the asm output with physical register liveness information, including which registers are live at the beginning and end of each basic block, and which registers' live ranges end at each instruction.  Computing this information requires a final liveness analysis pass.  One of the side effects of liveness analysis is to remove dead instructions, which happens when the instruction's dest variable is not live and the instruction lacks important side effects.
      
      In some cases, direct manipulation of physical registers was missing extra fakedef/fakeuse/etc., and as a result these instructions could be eliminated, leading to incorrect code.  Without --asm-verbose, these instructions were being created after the last run of liveness analysis, so they had no chance of being eliminated and everything was fine.  But with --asm-verbose, some instructions would be eliminated.
      
      This CL fixes the omissions so that the resulting code is runnable.
      
      An alternative would be to add a flag to liveness analysis directing it not to dead-code eliminate any more instructions.  However, it's better to get the liveness right in case future late-stage optimizations rely on it.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4135
      TEST= pydir/szbuild_spec2k.py --filetype=asm -v --sz=--asm-verbose=1 --force
      R=jvoung@chromium.org, kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1113683002
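The interaction between liveness analysis and dead-instruction removal can be illustrated with a small single-block sketch. Everything here is an assumption made for illustration: `Inst` and `eliminateDead` are hypothetical, and a "fake use" is modeled as an instruction with no dest that reads a register and is marked as having side effects.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified instruction model (not Subzero's real classes).
struct Inst {
  std::string Dest;              // empty when there is no dest
  std::vector<std::string> Srcs; // variables or registers read
  bool HasSideEffects;           // true for stores, calls, fake uses, ...
};

// Backward liveness walk over one block. An instruction is dropped when its
// dest is not live and it has no side effects; otherwise its dest is killed
// and its sources become live.
std::vector<Inst> eliminateDead(const std::vector<Inst> &Block,
                                std::set<std::string> LiveOut) {
  std::set<std::string> Live = std::move(LiveOut);
  std::vector<Inst> Kept;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    const bool DestLive = !It->Dest.empty() && Live.count(It->Dest) != 0;
    if (!DestLive && !It->HasSideEffects)
      continue; // dead: removed as a side effect of the analysis
    if (!It->Dest.empty())
      Live.erase(It->Dest);
    for (const std::string &S : It->Srcs)
      Live.insert(S);
    Kept.push_back(*It);
  }
  std::reverse(Kept.begin(), Kept.end());
  return Kept;
}
```

In this model, a write to a physical register with no recorded later use is removed, while the same write followed by a fake use of that register survives, which mirrors the kind of omission the CL describes fixing.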
  15. 28 Apr, 2015 1 commit
    • Subzero: Fix asm (non-ELF) output files. · 620ad732
      Jim Stichnoth authored
      In an earlier version of Subzero, the text output stream object was
      stack-allocated within main.  A later refactoring moved its allocation
      into a helper function, but it was still being stack-allocated, so the
      object was destroyed as soon as the helper function returned.
      
      This change allocates the object via "new", which fixes that problem,
      but reveals another problem: the raw_ostream object for some reason
      doesn't finish writing everything to disk, yielding a truncated
      output file.  This is solved in the style of the ELF streamer, by
      using raw_fd_ostream instead.
      
      BUG= none
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1111603003
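The lifetime problem described above can be shown with standard-library streams. This is only a sketch under stated assumptions: it uses std::ostringstream in place of LLVM's raw_ostream or raw_fd_ostream, and `makeStream` is a hypothetical helper name.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>

// Broken shape, shown only as a comment: the local stream is destroyed when
// the helper returns, so the caller is left with a dangling pointer.
//
//   std::ostream *makeStreamBroken() {
//     std::ostringstream Local;
//     return &Local;
//   }

// Heap-allocating the stream lets it outlive the helper. Returning a
// unique_ptr (an assumption of this sketch) also makes ownership explicit.
std::unique_ptr<std::ostringstream> makeStream() {
  return std::unique_ptr<std::ostringstream>(new std::ostringstream());
}
```

The caller keeps the returned pointer alive for as long as output is being written, for example `auto Out = makeStream(); *Out << "text";`.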
  16. 22 Apr, 2015 2 commits
  17. 21 Apr, 2015 2 commits
    • Subzero: Improve "make check-unit" execution. · e7e9b024
      Jim Stichnoth authored
      If you switch between "cmake" and "autoconf" toolchain builds, and
      neglect to clean out pnacl_newlib_raw/ in between, the wrong libgtest
      and libgtest_main may get pulled in for the autoconf build, leading to
      an assertion failure in "make check-unit".
      
      This tweak fixes that problem by rejiggering the lib search path.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1099093005
    • Subzero: Auto-detect cmake versus autoconf LLVM build. · 0a9e1261
      Jim Stichnoth authored
      The CMAKE=1 option is no longer needed.
      
      Pretty much all the tools we need are now in pnacl_newlib_raw/bin, so use PNACL_BIN_PATH set to that instead of using LLVM_BIN_PATH and BINUTILS_BIN_PATH.
      
      However, for the autoconf build, libgtest and libgtest_main and clang-format are only under the llvm_x86_64_linux_work directory, so they need special casing.  This also means that you have to actually do an LLVM build and not blow away the work directory in order to "make check-unit" or "make format".
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1085733002
  18. 16 Apr, 2015 5 commits
  19. 10 Apr, 2015 1 commit
  20. 09 Apr, 2015 2 commits
  21. 07 Apr, 2015 1 commit
  22. 06 Apr, 2015 1 commit
  23. 31 Mar, 2015 1 commit
    • Add argv[0] before parsing commandline flags. · 9c1d3869
      Jan Voung authored
      The \0 delimited string array that the browser sends doesn't have
      the program name and the IRT only tokenizes that and forwards
      it along. We need argv[0] to make the llvm CL parser happy
      (used for -help message, etc).
      
      Alternatively, we could have the IRT fill in a program name
      so that the argv is a real argv. That will involve less copying since
      the argv will be the right size to begin with, but prevents each app
      from customizing its argv[0] =/
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4091
      TEST= manual for now (construct the sel_universal script to only pass
      the "--build-atts" flag and see it exits without being swallowed,
      or pass "-Ofoo" and see an error + exit)
      
      R=mtrofin@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1041843003
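Prepending a program name to an argument array that lacks one can be sketched as below. The helper name `prependProgramName` and the trailing null terminator are assumptions for illustration; the actual change may build the array differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds a new argument vector whose first entry is the program name,
// followed by the original arguments and a conventional null terminator.
// The returned pointers alias the caller's strings, so those must outlive
// the result.
std::vector<const char *> prependProgramName(const char *ProgName, int Argc,
                                             const char *const *Argv) {
  std::vector<const char *> Out;
  Out.reserve(static_cast<std::size_t>(Argc) + 2);
  Out.push_back(ProgName);
  for (int I = 0; I < Argc; ++I)
    Out.push_back(Argv[I]);
  Out.push_back(nullptr);
  return Out;
}
```

The copy is the cost the commit message refers to: the incoming array is one slot too small, so a new array has to be built before it is handed to the command-line parser.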
  24. 30 Mar, 2015 1 commit
  25. 29 Mar, 2015 1 commit
    • Subzero: Fix dependency checking to avoid unnecessary rebuilds. · 8c7b0a2c
      Jim Stichnoth authored
      When trying to do bisection debugging, the pnacl-llc translation was happening every time even if the pexe didn't change.  This is because it was checking for a binary called 'llc' in the current directory, instead of the absolute path to pnacl-llc.  (This check is done so that updating pnacl-llc triggers a rebuild of the bisection binary, similar to the check for an update of pnacl-sz.)
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1044623003