1. 09 Apr, 2015 1 commit
  2. 07 Apr, 2015 1 commit
  3. 06 Apr, 2015 1 commit
  4. 31 Mar, 2015 1 commit
    • Add argv[0] before parsing commandline flags. · 9c1d3869
      Jan Voung authored
      The \0 delimited string array that the browser sends doesn't have
      the program name and the IRT only tokenizes that and forwards
      it along. We need argv[0] to make the llvm CL parser happy
      (used for -help message, etc).
      
      Alternatively, we could have the IRT fill in a program name
      so that the argv is a real argv. That will involve less copying since
      the argv will be the right size to begin with, but prevents each app
      from customizing its argv[0] =/
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4091
      TEST= manual for now (construct the sel_universal script to only pass
      the "--build-atts" flag and see it exits without being swallowed,
      or pass "-Ofoo" and see an error + exit)
      
      R=mtrofin@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1041843003
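
A minimal sketch of the idea in this commit, assuming a `\0`-delimited buffer as input. The helper names `buildArgv` and `toCArgv` and the program name passed in are hypothetical and are not taken from the Subzero sources; the sketch only shows how a program name can be placed in front of the tokenized flags before they reach a command-line parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a '\0'-delimited buffer into tokens and
// prepend a program name so the result looks like a conventional argv.
std::vector<std::string> buildArgv(const std::string &ProgName,
                                   const char *Buf, std::size_t Len) {
  assert(Buf != nullptr || Len == 0);
  std::vector<std::string> Args;
  Args.push_back(ProgName); // argv[0], which the incoming array lacks
  std::size_t Start = 0;
  for (std::size_t I = 0; I < Len; ++I) {
    if (Buf[I] == '\0') {
      Args.emplace_back(Buf + Start, I - Start);
      Start = I + 1;
    }
  }
  // Keep a trailing token that has no terminating '\0'.
  if (Start < Len)
    Args.emplace_back(Buf + Start, Len - Start);
  return Args;
}

// Build the char* view that a C-style parser expects. The pointers are
// valid only while Args is alive and unmodified.
std::vector<const char *> toCArgv(const std::vector<std::string> &Args) {
  std::vector<const char *> Ptrs;
  for (const std::string &A : Args)
    Ptrs.push_back(A.c_str());
  Ptrs.push_back(nullptr); // argv[argc] is conventionally null
  return Ptrs;
}
```

This version copies every token, which corresponds to the extra copying the commit message mentions as the cost of letting each app choose its own argv[0].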
  5. 30 Mar, 2015 1 commit
  6. 29 Mar, 2015 1 commit
    • Subzero: Fix dependency checking to avoid unnecessary rebuilds. · 8c7b0a2c
      Jim Stichnoth authored
      When trying to do bisection debugging, the pnacl-llc translation was happening every time even if the pexe didn't change.  This is because it was checking for a binary called 'llc' in the current directory, instead of an absolute path to pnacl-llc.  (This check is done so that updating pnacl-llc triggers a rebuild of the bisection binary, similar to the check for an update of pnacl-sz.)
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1044623003
  7. 27 Mar, 2015 1 commit
    • Refactor Subzero initialization and add a browser callback handler. · 44c3a804
      Jan Voung authored
      Handlers are represented as a "compile server" even though
      right now it can really only handle a single
      compile request.
      
      Then there can be a commandline-based server and a
      browser-based server. This server takes over the main
      thread. In the browser-based case the server can block,
      waiting on bytes to be pushed. This becomes a producer of
      bitcode bytes.
      
      The original main thread which did bitcode reading is now
      shifted to yet another worker thread, which is then the
      consumer of bitcode bytes.
      
      This uses an IRT interface for listening to messages
      from the browser:
      https://codereview.chromium.org/984713003/
      
      TEST=Build the IRT core nexe w/ the above patch and compile w/ something like:
      
      echo """
      readwrite_file objfile /tmp/temp.nexe---gcc.opt.stripped.pexe---.o
      rpc StreamInitWithSplit i(4) h(objfile) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) C(4,-O2\x00) * s()
      stream_file /usr/local/google/home/jvoung/pexe_tests/gcc.opt.stripped.pexe 65536 1000000000
      rpc StreamEnd * i() s() s() s()
      echo "pnacl-sz complete"
      """ | scons-out/opt-linux-x86-32/staging/sel_universal \
          -a -B scons-out/nacl_irt-x86-32/staging/irt_core.nexe \
          --abort_on_error \
          -- toolchain/linux_x86/pnacl_translator/translator/x86-32/bin/pnacl-sz.nexe
      
      echo """
      readwrite_file nexefile /tmp/temp.nexe.tmp
      readonly_file objfile0 /tmp/temp.nexe---gcc.opt.stripped.pexe---.o
      rpc RunWithSplit i(1) h(objfile0) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(invalid) h(nexefile) *
      echo "ld complete"
      """ | /usr/local/google/home/nacl3/native_client/scons-out/opt-linux-x86-32/staging/sel_universal \
          --abort_on_error \
          -a -B \
          scons-out/nacl_irt-x86-32/staging/irt_core.nexe \
          -E NACL_IRT_OPEN_RESOURCE_BASE=toolchain/linux_x86/pnacl_translator/translator/x86-32/lib/ \
          -E NACL_IRT_OPEN_RESOURCE_REMAP=libpnacl_irt_shim.a:libpnacl_irt_shim_dummy.a \
          -- toolchain/linux_x86/pnacl_translator/translator/x86-32/bin/ld.nexe
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4091
      R=kschimpf@google.com, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/997773002
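
The producer/consumer split described in this commit can be sketched roughly as below. This is only an illustration under assumptions: the class `ByteQueue` and its methods are hypothetical and do not reflect the actual Subzero or IRT interfaces. The server thread would call `push` as bytes arrive and `close` at end of stream, while the worker thread that parses bitcode would call `pop`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical blocking byte queue between a producer and a consumer.
class ByteQueue {
public:
  // Producer side: append a chunk of bytes and wake the consumer.
  void push(const std::string &Chunk) {
    {
      std::lock_guard<std::mutex> Lock(M);
      assert(!Done && "push after close");
      Data.insert(Data.end(), Chunk.begin(), Chunk.end());
    }
    CV.notify_one();
  }

  // Producer side: signal that no more bytes will arrive.
  void close() {
    {
      std::lock_guard<std::mutex> Lock(M);
      Done = true;
    }
    CV.notify_all();
  }

  // Consumer side: block until bytes are available or the stream has
  // ended. Returns the number of bytes copied; with Max > 0, a return
  // value of 0 means end of stream.
  std::size_t pop(char *Out, std::size_t Max) {
    std::unique_lock<std::mutex> Lock(M);
    CV.wait(Lock, [this] { return !Data.empty() || Done; });
    std::size_t N = 0;
    while (N < Max && !Data.empty()) {
      Out[N++] = Data.front();
      Data.pop_front();
    }
    return N;
  }

private:
  std::mutex M;
  std::condition_variable CV;
  std::deque<char> Data;
  bool Done = false;
};
```

A `std::deque<char>` keeps the sketch short; a real implementation would more likely hand over whole chunks to avoid per-byte work.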
  8. 24 Mar, 2015 2 commits
    • Subzero: Fix a lowering bug involving xchg and xadd instructions. · 0e432ac4
      Jim Stichnoth authored
      The x86-32 xchg and xadd instructions are modeled using two source operands, one of which is a memory operand and the other ultimately a physical register.  These instructions have a side effect of modifying both operands.
      
      During lowering, we need to specially express that the instruction modifies the Variable operand (since it doesn't appear as the instruction's Dest variable).  This makes the register allocator aware of the Variable being multi-def, and prevents it from sharing a register with an overlapping live range.
      
      This was being partially expressed by adding a FakeDef instruction.  However, FakeDef instructions are still allowed to be dead-code eliminated, and if this happens, the Variable may appear to be single-def, triggering the unsafe register sharing.
      
      The solution is to prevent the FakeDef instruction from being eliminated, via a FakeUse instruction.
      
      It turns out that the current register allocator isn't aggressive enough to manifest the bug with cmpxchg instructions, but the fix and tests are there just in case.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1020853011
    • Make compile without ICE_THREAD_LOCAL_HACK (avoid "Type *TLS = TLS;") · 3e5009f6
      Jan Voung authored
      Otherwise you get:
      
      In file included from src/IceGlobalContext.cpp:21:
      In file included from src/IceCfg.h:21:
      src/IceGlobalContext.h:257:44: error: variable 'TLS' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
          ThreadContext *TLS = ICE_TLS_GET_FIELD(TLS);
                         ~~~                     ^~~
      src/IceTLS.h:95:39: note: expanded from macro 'ICE_TLS_GET_FIELD'
                                            ^
      So rename the local var to Tls.
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1030793002
  9. 23 Mar, 2015 3 commits
    • Subzero: Don't use key SSE instructions on potentially unaligned loads. · f79d2cb6
      Jim Stichnoth authored
      The non-mov-like SSE instructions generally require 16-byte aligned memory operands.  The PNaCl bitcode ABI only guarantees 4-byte alignment or less on vector loads and stores.  Subzero maintains stack alignment so stack memory operands are fine.
      
      We handle this by legalizing memory operands into a register wherever there is doubt.
      
      This bug was first discovered on the vector_align scons test.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4083
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4133
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1024253003
    • Subzero: Prune unreachable nodes after constructing the Cfg. · 69d3f9c6
      Jim Stichnoth authored
      The gcc torture test suite has examples where there is a function call (to a routine that throws an exception or aborts or something), followed by an "unreachable" instruction, followed by more code that may e.g. return a value to the caller.  In these examples, the code following the unreachable is itself unreachable.
      
      Problems arise when the unreachable code references a variable defined in the reachable code.  This triggers a liveness consistency error because the use of the variable has no reaching definition.
      
      It's a bit surprising that LLVM actually allows this, but it does, so we need to deal with it.
      
      The solution is, after initial CFG construction, to do a traversal starting from the entry node and then delete any undiscovered nodes.
      
      There is code in Subzero that assumes Cfg::Nodes[i]->Number == i, so the nodes need to be renumbered after pruning.  The alternative was to set Nodes[i]=nullptr and not change the node number, but that would mean peppering the code base with CfgNode null checks.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1027933002
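
The pruning and renumbering described in this commit might look roughly like the following. The `Node` and `Graph` types are simplified stand-ins, not Subzero's `CfgNode` or `Cfg`, and the sketch assumes that `Nodes[0]` is the entry node and that `Nodes[i]->Number == i` holds on entry.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-in for a CFG node (hypothetical, not Subzero's CfgNode).
struct Node {
  std::size_t Number = 0;
  std::vector<Node *> Succs;
};

struct Graph {
  std::vector<std::unique_ptr<Node>> Nodes; // Nodes[0] is the entry node
};

// Delete nodes that cannot be reached from the entry node, then renumber
// the survivors so that Nodes[i]->Number == i holds again.
void pruneUnreachable(Graph &G) {
  if (G.Nodes.empty())
    return;
  // Worklist traversal from the entry node, indexed by the old numbers.
  std::vector<bool> Reachable(G.Nodes.size(), false);
  std::vector<Node *> Worklist{G.Nodes[0].get()};
  Reachable[0] = true;
  while (!Worklist.empty()) {
    Node *N = Worklist.back();
    Worklist.pop_back();
    for (Node *S : N->Succs) {
      if (!Reachable[S->Number]) {
        Reachable[S->Number] = true;
        Worklist.push_back(S);
      }
    }
  }
  // Compact the node list, assigning new numbers in the original order.
  std::vector<std::unique_ptr<Node>> Kept;
  for (std::size_t I = 0; I < G.Nodes.size(); ++I) {
    if (!Reachable[I])
      continue;
    G.Nodes[I]->Number = Kept.size();
    Kept.push_back(std::move(G.Nodes[I]));
  }
  G.Nodes = std::move(Kept);
  for (std::size_t I = 0; I < G.Nodes.size(); ++I)
    assert(G.Nodes[I]->Number == I);
}
```

A reachable node never has an unreachable successor, so no successor pointer dangles after the deletion. A real CFG would also need to clean up predecessor lists and phi operands, which this sketch leaves out.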
    • Subzero: Fix inappropriate use of nullptr. · 27c56bf6
      Jim Stichnoth authored
      When lowering a couple of atomic intrinsics down to a loop structure, a FakeUse on the memory address's base variable is created.  However, if the memory address is a global constant, there is no base variable.  So check for that and don't create a FakeUse if there is none.
      
      BUG= none
      TEST=synchronization_sync (scons test)
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1023673007
  10. 20 Mar, 2015 3 commits
  11. 19 Mar, 2015 3 commits
    • Subzero: Fix floating-point constant pooling. · 5bfe2157
      Jim Stichnoth authored
      This fixes a regression likely introduced in d2cb4361.
      
      The problem is that by using the default std::unordered_map comparison predicate std::equal_to, we get incorrect behavior when the key is float or double:
      
      1. 0.0 and -0.0 appear equal, so they share a constant pool entry even though the bit patterns are different.  This is a correctness bug.
      
      2. Each instance of NaN gets a separate constant pool entry, because NaN != NaN by C equality rules.  This is a performance bug.  (This problem doesn't show up with the native bitcode reader, because constants are already unique-ified in the PNaCl bitcode file.)
      
      The solution is to use memcmp for floating-point key types.
      
      Also, the abi-atomics.ll test is disabled for the MINIMAL build, to fix an oversight from a previous CL.
      
      BUG= none
      R=jfb@chromium.org
      
      Review URL: https://codereview.chromium.org/1019233002
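
A small sketch of the memcmp-based keying described in this commit. The names `BitwiseEqual`, `DoubleBitsHash`, `DoublePool`, and `getOrCreate` are hypothetical, and hashing the bit pattern is an assumption made here so that the hash agrees with the equality predicate; the commit message itself only mentions the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_map>

// Compare keys by bit pattern instead of operator==, so that 0.0 and -0.0
// are distinct and identical NaN bit patterns compare equal.
template <typename T> struct BitwiseEqual {
  bool operator()(const T &A, const T &B) const {
    return std::memcmp(&A, &B, sizeof(T)) == 0;
  }
};

// Hash the bit pattern so the hash is consistent with BitwiseEqual.
struct DoubleBitsHash {
  std::size_t operator()(double V) const {
    static_assert(sizeof(std::uint64_t) == sizeof(double),
                  "expected a 64-bit double");
    std::uint64_t Bits;
    std::memcpy(&Bits, &V, sizeof(Bits));
    return std::hash<std::uint64_t>()(Bits);
  }
};

using DoublePool =
    std::unordered_map<double, int, DoubleBitsHash, BitwiseEqual<double>>;

// Return the pool entry ID for Value, creating a new entry if needed.
int getOrCreate(DoublePool &Pool, double Value) {
  auto It = Pool.find(Value);
  if (It != Pool.end())
    return It->second;
  int Id = static_cast<int>(Pool.size());
  auto Result = Pool.emplace(Value, Id);
  assert(Result.second && "key was just checked to be absent");
  (void)Result;
  return Id;
}
```

With this predicate, 0.0 and -0.0 get separate entries, and repeated lookups of the same NaN bit pattern return the same entry.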
    • Subzero: Add fabs intrinsic support. · 8c980d0d
      Jim Stichnoth authored
      The intrinsic is lowered using the standard technique of masking off the FP sign bit, which is the high-order bit.
      
      To construct this mask, we use the existing trick of loading a vector register with all "1" bits, then logical-shift-right by one bit.
      
      In the future, we should add 128-bit vector values to the constant pool and force them to memory, and this could be used for the other routines that synthesize a vector constant.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4097
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1022573004
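
The masking technique can be illustrated on a single scalar float. This is only a sketch of the arithmetic, assuming a 32-bit IEEE-754 float; the actual lowering operates on vector registers, and the function name `fabsViaMask` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Scalar illustration: clear the sign bit (the high-order bit) by ANDing
// with a mask built from all-ones shifted right by one.
float fabsViaMask(float X) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "expected a 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  std::uint32_t Mask = ~std::uint32_t(0); // all "1" bits
  Mask >>= 1;                             // logical shift right: 0x7fffffff
  Bits &= Mask;                           // sign bit is now zero
  float Result;
  std::memcpy(&Result, &Bits, sizeof(Result));
  return Result;
}
```

The shift must be a logical one; an arithmetic shift of an all-ones value would leave the sign bit set and the mask would do nothing.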
    • Assemble calls to constant addresses. · f644a4b3
      Jan Voung authored
      Finally address this TODO in the assembler. This will help translate
      programs that do not use the IRT (no ABI stability). The default scons testing
      mode is non-IRT, so this helps with that. I haven't actually tested
      this against scons yet, but I'm filling in the tests based on how
      LLVM translates the same bitcode.
      
      The filetype=asm is adjusted to omit the "*" and the "$".
      
      The filetype=obj is adjusted to check for fixups with NullSymbols,
      and also fill the assembler buffer at the instruction's immediate
      field w/ the right constant.
      
      The filetype=iasm is still TODO (hits a new assert in the Fixup's emit() function).
      
      Reverts 7ad1bed9:
      "Allow stubbing of called constant addresses using command line argument."
      since this is now handled (except for iasm).
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4080
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1017373002
  12. 18 Mar, 2015 2 commits
  13. 13 Mar, 2015 1 commit
  14. 12 Mar, 2015 1 commit
  15. 10 Mar, 2015 2 commits
    • Subzero: Enable a cmake build. · cd912149
      Jim Stichnoth authored
      This just puts the CMakeLists.txt file in place.  A couple other
      changes are needed in other repos to make this take effect.
      
      BUG= none
      R=dschuff@chromium.org, mtrofin@chromium.org
      
      Review URL: https://codereview.chromium.org/998693003
    • Subzero: Run cross tests as a much more configurable python script. · dc7c597e
      Jim Stichnoth authored
      The runtests.sh script is removed and replaced with crosstest_generator.py.
      
      "make check" limits to a relevant subset of cross tests to control the combinatorial explosion.  We cut the native tests almost in half, and the sandboxed tests down to a quarter.
      
      The --include and --exclude logic is copied/adapted from szbuild.py.
      
      The script works by running through every possible test in the combinatorial explosion, and if the test is a match against the --include and --exclude arguments, the test is built and run.
      
      The script includes lit support, which is the most likely way it will be run.  When run with the --lit argument, it sprays the output directory with lit test files in the form of shell scripts, and "make check" runs lit on that directory.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4085
      R=jvoung@chromium.org, mtrofin@chromium.org
      
      Review URL: https://codereview.chromium.org/987503004
  16. 06 Mar, 2015 1 commit
  17. 05 Mar, 2015 1 commit
  18. 04 Mar, 2015 3 commits
  19. 03 Mar, 2015 2 commits
    • Ignore NaCl st_blksize of 0 and buffer writes to raw_fd_ostream. · 437ceff2
      Jan Voung authored
      The default LLVM raw_fd_ostream buffer size is based on stat'ing the
      FD and then checking st_blksize. Unfortunately, in the NaCl sandboxed
      build of pnacl-sz, NaCl's syscall returns st_blksize of 0 which makes
      the writes unbuffered. There is a comment in "src/trusted/service_runtime/include/bits/stat.h":
      
        nacl_abi_blksize_t nacl_abi_st_blksize;   /* not implemented */
      
      And the "src/trusted/desc/" implementation sets this to 0.
      
      This results in half a million write syscalls to translate the GCC pexe,
      which roughly doubles the translation time in sandboxed mode vs
      unsandboxed mode.
      
      Manually set a buffer size (Linux st_blksize seems to be about
      4KB for comparison). This drops the number of write syscalls
      to about 200 for translating the GCC pexe.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4091
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/969403003
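
The effect of the buffer size on the number of underlying writes can be modeled with a small standalone sketch. `BufferedSink` is a hypothetical class, not LLVM's `raw_fd_ostream`; the callback stands in for the write syscall, and a capacity of 0 models the unbuffered behavior that results from an `st_blksize` of 0.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical buffered sink: collects small writes and forwards them to
// the underlying writer in larger chunks.
class BufferedSink {
public:
  using WriteFn = std::function<void(const char *, std::size_t)>;

  BufferedSink(WriteFn Fn, std::size_t Cap)
      : Underlying(std::move(Fn)), Capacity(Cap) {
    assert(Underlying && "a writer callback is required");
    Buffer.reserve(Capacity);
  }
  ~BufferedSink() { flush(); }

  void write(const char *Data, std::size_t Size) {
    if (Size == 0)
      return;
    if (Capacity == 0) { // unbuffered: one underlying write per call
      Underlying(Data, Size);
      return;
    }
    if (Buffer.size() + Size > Capacity)
      flush();
    if (Size >= Capacity) { // too large to buffer, pass straight through
      Underlying(Data, Size);
      return;
    }
    Buffer.append(Data, Size);
  }

  void flush() {
    if (Buffer.empty())
      return;
    Underlying(Buffer.data(), Buffer.size());
    Buffer.clear();
  }

private:
  WriteFn Underlying;
  std::size_t Capacity;
  std::string Buffer;
};
```

With a capacity of 8 bytes, ten 2-byte writes reach the underlying writer as three calls; with a capacity of 0, the same sequence produces ten calls.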
    • Subzero: Fix a register allocation issue for "advanced phi lowering". · 5bc44313
      Jim Stichnoth authored
      When the advanced phi lowering handles a phi arg that is Undef, it lowers it to an assignment of a constant zero (or vector of zeroes).  The ad-hoc register allocation was missing the fact that a vector of zeroes is done with "pxor %reg, %reg".  This resulted in a pxor instruction with invalid addressing modes at emission time.
      
      The fix is to tell legalize() to use the dest physical register if dest has one; and if dest lacks a register, take the path where it actually does the ad-hoc register allocation as though the source operand were a memory operand.
      
      Tests are added for these vector undef phi args, and for scalar undef phi args as well for good measure.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/969703002
  20. 02 Mar, 2015 2 commits
  21. 26 Feb, 2015 3 commits
  22. 25 Feb, 2015 3 commits
  23. 24 Feb, 2015 1 commit