1. 07 Jul, 2014 1 commit
  2. 29 Jun, 2014 1 commit
    • Subzero: Partial implementation of global initializers. · de4ca71e
      Jim Stichnoth authored
      This is still missing a couple things:
      
      1. It only supports flat arrays and zeroinitializers.  Arrays of structs are not yet supported.
      
      2. Initializers can't yet contain relocatables, e.g. the address of another global.
      
      Some changes are made to work around an llvm-mc assembler bug.  When assembling using intel syntax, llvm-mc doesn't correctly parse symbolic constants or add relocation entries in some circumstances.  Call instructions work, and use in a memory operand works, e.g. mov eax, [ArrayBase+4*ecx].  To work around this, we adjust legalize() to not allow ConstantRelocatable by default, except for memory operands and when called from lowerCall(), so the relocatable ends up being the source operand of a mov instruction.  Then, the mov emit routine actually emits an lea instruction for such moves.
      
      A few lit tests needed to be adjusted to make szdiff work properly with respect to global initializers.
      
      In the new cross test, the driver calls test code that returns a pointer to an array with a global initializer, and the driver compares the arrays returned by llc and Subzero.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/358013003
  3. 27 Jun, 2014 1 commit
  4. 26 Jun, 2014 1 commit
  5. 25 Jun, 2014 1 commit
    • Add atomic load/store, fetch_add, fence, and is-lock-free lowering. · 5cd240df
      Jan Voung authored
      Loads/stores w/ type i8, i16, and i32 are converted to
      plain load/store instructions and lowered w/ the plain
      lowerLoad/lowerStore.  Atomic stores are followed by an mfence
      for sequential consistency.
      
      For 64-bit types, use movq to do 64-bit memory
      loads/stores (vs the usual load/store being broken into
      separate 32-bit load/stores). For a store, this means first
      bitcasting the i64 -> f64 (which splits the load of the value
      to be stored into two 32-bit ops) and then storing in a single
      op. For a load, load into f64 and then bitcast back to i64
      (which splits after the atomic load). This follows what GCC does for
      c++11 std::atomic<uint64_t> load/store methods (uses movq
      when -mfpmath=sse). This introduces some redundancy between
      movq and movsd, but the convention seems to be to use movq
      when working with integer quantities. Otherwise, movsd
      could work too. The difference seems to be in whether or
      not the XMM register's upper 64-bits are filled with 0 or
      not. Zero-extending could help avoid partial register
      stalls.
      
      Handle up to i32 fetch_add. TODO: add i64 via a cmpxchg loop.
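
      The cmpxchg-loop idea in the TODO above can be sketched at the C++ level as follows. This is only an illustration of the technique using std::atomic, not the lowering Subzero emits; fetchAddViaCmpxchg and exampleUse are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: emulates a 64-bit fetch_add with a compare-exchange
// loop, returning the value that was in Target before the add.
uint64_t fetchAddViaCmpxchg(std::atomic<uint64_t> &Target, uint64_t Delta) {
  uint64_t Old = Target.load();
  // On failure, compare_exchange_weak writes the current value into Old,
  // so the loop simply retries with the refreshed value.
  while (!Target.compare_exchange_weak(Old, Old + Delta)) {
  }
  return Old;
}

// Usage sketch: the add carries across the 32-bit boundary, which is the
// case two independent 32-bit adds could not handle atomically.
void exampleUse() {
  std::atomic<uint64_t> Counter(0xFFFFFFFFull);
  uint64_t Prev = fetchAddViaCmpxchg(Counter, 1);
  assert(Prev == 0xFFFFFFFFull);
  assert(Counter.load() == 0x100000000ull);
}
```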
      
      TODO: add some runnable crosstests to make sure that this
      doesn't do funny things to integer bit patterns that happen
      to look like signaling NaNs and quiet NaNs. However, the system
      clang would not know how to handle "llvm.nacl.*" if we choose to
      target that level directly via .ll files. Or, (a) we use old-school __sync
      methods (__sync_fetch_and_add w/ 0 to load) or (b) require buildbot's
      clang/gcc to support c++11...
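
      A rough sketch of what option (b) might look like, assuming a C++11 toolchain: store integer bit patterns that would be NaNs if viewed as f64, load them back atomically, and check that no bit changed. The constants and function names are made up for illustration and are not part of the existing crosstests.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Integer bit patterns that would be NaNs if reinterpreted as f64
// (exponent all ones, nonzero mantissa). Bit 51 is the "quiet" bit.
const uint64_t SignalingNaNBits = 0x7FF0000000000001ull; // quiet bit clear
const uint64_t QuietNaNBits = 0x7FF8000000000001ull;     // quiet bit set

// Stores Bits atomically and loads it back; the round trip should be exact.
uint64_t roundTrip(uint64_t Bits) {
  std::atomic<uint64_t> Cell(0);
  Cell.store(Bits, std::memory_order_seq_cst);
  return Cell.load(std::memory_order_seq_cst);
}

// Hypothetical check a crosstest driver could run for each backend.
void checkNaNPatterns() {
  assert(roundTrip(SignalingNaNBits) == SignalingNaNBits);
  assert(roundTrip(QuietNaNBits) == QuietNaNBits);
}
```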
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/342763004
  6. 24 Jun, 2014 1 commit
    • Bitcast of 64-bit immediates may need to split the immediate, not a var. · 1ee34165
      Jan Voung authored
      Currently, the integer immediate is legalized to a
      64-bit integer register first, and then the lower/upper
      parts of that register are used for the bitcast.
      However, mov(64_bit_reg, imm) done by the legalization
      isn't legal.
      
      Similarly, trunc of 64-bit immediates needs to take the
      lower half of the immediate, not legalize to a var first.
      
      This shifts the legalization code around.
      
      Other cases where immediates are illegal and legalized
      are idiv/div, but for those cases 64-bit operands are
      handled separately via a function call. The function
      call code properly splits up immediate arguments.
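
      To illustrate the splitting described above, here is a minimal sketch that takes the low and high halves of a 64-bit immediate directly rather than first materializing it in a 64-bit register. SplitImm, splitImm64, and truncImm64 are hypothetical names, not Subzero's API.

```cpp
#include <cassert>
#include <cstdint>

// The two 32-bit halves of a 64-bit immediate.
struct SplitImm {
  uint32_t Lo;
  uint32_t Hi;
};

// Hypothetical helper: split the immediate itself into halves.
SplitImm splitImm64(uint64_t Imm) {
  return {static_cast<uint32_t>(Imm), static_cast<uint32_t>(Imm >> 32)};
}

// trunc i64 -> i32 of an immediate is just the low half.
uint32_t truncImm64(uint64_t Imm) { return splitImm64(Imm).Lo; }

// Usage sketch: 0x3FF0000000000000 is the bit pattern of the f64 value 1.0,
// so a bitcast of this immediate would move these two halves.
void exampleUse() {
  SplitImm Halves = splitImm64(0x3FF0000000000000ull);
  assert(Halves.Lo == 0x00000000u);
  assert(Halves.Hi == 0x3FF00000u);
  assert(truncImm64(0x123456789ABCDEF0ull) == 0x9ABCDEF0u);
}
```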
      
      BUG=none
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/348373005
  7. 18 Jun, 2014 5 commits
  8. 17 Jun, 2014 2 commits
  9. 12 Jun, 2014 2 commits
  10. 06 Jun, 2014 1 commit
    • Make py import not assume dir is "pnacl-subzero". Avoid autovect in crosstest. · 1248a6d1
      Jan Voung authored
      Derek's CL to check out subzero calls the source directory
      "subzero", and the file header comments call the directory
      "subzero". Just make the python sys.path munging for
      importing pydir more generic.
      
      Also change crosstest to not run the raw LLVM "opt" with
      optimizations (only use it for ABI stabilization passes).
      Instead run pnacl-clang with -O2. Otherwise, newer NACL_SDK
      versions include a newer LLVM "opt" binary which
      autovectorizes and may generate vector IR that is not
      handled by Subzero yet.
      
      E.g.,
      LLVM ERROR: Invalid PNaCl instruction:   %1 = insertelement <4 x i32> undef, i32 %0, i32 0
      seen after updating pepper_canary to version 37, revision 274873
      
      BUG=none
      TEST=make -f Makefile.standalone check
      R=stichnot@chromium.org, wala@chromium.org
      
      Review URL: https://codereview.chromium.org/317963002
  11. 05 Jun, 2014 1 commit
    • Fix a C++ violation. · ab8242ca
      Jim Stichnoth authored
      Ice::Inst::NumberSentinel is defined within the Inst class definition:
      
      class Inst {
        ...
        static const InstNumberT NumberDeleted = -1;
        static const InstNumberT NumberSentinel = 0;
        ...
      };
      
      Under some compilers/options, this causes a link error when passing NumberSentinel as a const T& argument.
      
      (Another option would be to move the actual definitions into IceInst.cpp.)
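
      A standalone sketch of the problem and of the alternative mentioned in the parenthetical (providing out-of-class definitions) is below. It is not necessarily the fix this commit applied. The InstNumberT typedef and clampToSentinel are assumptions made so that the example compiles on its own.

```cpp
#include <algorithm>
#include <cassert>

typedef int InstNumberT; // assumption: the real typedef lives in the Ice headers

class Inst {
public:
  // In-class initializers make these declarations, not definitions.
  static const InstNumberT NumberDeleted = -1;
  static const InstNumberT NumberSentinel = 0;
};

// Out-of-class definitions (normally placed in exactly one .cpp file).
// The initializer stays in the class body.
const InstNumberT Inst::NumberDeleted;
const InstNumberT Inst::NumberSentinel;

// std::max takes its arguments as const T&, which binds a reference to
// NumberSentinel and therefore requires the definition above to exist.
InstNumberT clampToSentinel(InstNumberT N) {
  return std::max(N, Inst::NumberSentinel);
}

// Usage sketch.
void exampleUse() {
  assert(clampToSentinel(Inst::NumberDeleted) == 0);
  assert(clampToSentinel(7) == 7);
}
```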
      
      BUG= none
      R=jfb@chromium.org
      
      Review URL: https://codereview.chromium.org/311243006
  12. 04 Jun, 2014 1 commit
    • Subzero: Initial O2 lowering · d97c7df5
      Jim Stichnoth authored
      Includes the following:
      1. Liveness analysis.
      2. Linear-scan register allocation.
      3. Address mode optimization.
      4. Compare-branch fusing.
      
      All of these depend on liveness analysis.  There are three versions of liveness analysis (in order of increasing cost):
      1. Lightweight.  This computes last-uses for variables local to a single basic block.
      2. Full.  This computes last-uses for all variables based on global dataflow analysis.
      3. Full live ranges.  This computes all last-uses, plus calculates the live range intervals in terms of instruction numbers.  (The live ranges are needed for register allocation.)
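
      As a rough illustration of the lightweight version, the sketch below computes last-uses with one backward pass over a single basic block, under the assumption that every variable involved is local to that block. SimpleInst and computeLastUses are simplified stand-ins, not Subzero's Inst or Liveness classes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical simplified instruction: one destination variable and a list
// of source variables, each identified by an integer.
struct SimpleInst {
  int Dest;
  std::vector<int> Srcs;
};

// For each instruction, returns the set of source variables whose last use
// in the block is that instruction.
std::vector<std::set<int>>
computeLastUses(const std::vector<SimpleInst> &Block) {
  std::vector<std::set<int>> LastUses(Block.size());
  std::set<int> Live; // variables used by some later instruction
  for (std::size_t I = Block.size(); I-- > 0;) {
    const SimpleInst &Cur = Block[I];
    Live.erase(Cur.Dest); // walking backward, a definition ends liveness
    for (int Src : Cur.Srcs) {
      // First sighting in the backward walk means this is the last use.
      if (Live.insert(Src).second)
        LastUses[I].insert(Src);
    }
  }
  return LastUses;
}

// Usage sketch: v1 = v0; v2 = v1 + v0; v3 = v2 + v1.
void exampleUse() {
  std::vector<SimpleInst> Block = {{1, {0}}, {2, {1, 0}}, {3, {2, 1}}};
  std::vector<std::set<int>> L = computeLastUses(Block);
  assert(L[0].empty());
  assert(L[1] == std::set<int>({0}));
  assert(L[2] == std::set<int>({1, 2}));
}
```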
      
      For testing the full live range computation, Cfg::validateLiveness() checks every Variable of every Inst and verifies that the current Inst is contained within the Variable's live range.
      
      The cross tests are run with O2 in addition to Om1.
      
      Some of the lit tests (for what good they do) are updated with O2 code sequences.
      
      BUG= none
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/300563003
  13. 02 Jun, 2014 1 commit
  14. 23 May, 2014 2 commits
  15. 22 May, 2014 2 commits
    • Add Makefiles to support building along with LLVM · bc643135
      Derek Schuff authored
      This change now supports building subzero as part of the LLVM build (instead
      of in a separate build step). It is modeled on clang's Makefiles.
      
      The existing Makefile has been renamed and can still be used manually, e.g.
      make -f Makefile.standalone
      
      It does not yet support running tests, just building.
      
      R=stichnot@chromium.org, jvoung@chromium.org
      BUG=
      
      Review URL: https://codereview.chromium.org/293983007
    • Add Om1 lowering with no optimizations. · 5bc2b1d1
      Jim Stichnoth authored
      This adds infrastructure for low-level x86-32 instructions, and the target lowering patterns.
      
      Practically no optimizations are performed.  Optimizations to be introduced later include liveness analysis, dead-code elimination, global linear-scan register allocation, linear-scan based stack slot coalescing, and compare/branch fusing.  One optimization that is present is simple coalescing of stack slots for variables that are only live within a single basic block.
      
      There are also some fairly comprehensive cross tests.  This testing infrastructure translates bitcode using both Subzero and llc, and a testing harness calls both versions with a variety of "interesting" inputs and compares the results.  Specifically, Arithmetic, Icmp, Fcmp, and Cast instructions are tested this way, across all PNaCl primitive types.
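
      The cross-test idea described above can be sketched as a small harness that feeds the same inputs to two implementations and counts disagreements. In the real tests the two functions come from Subzero and llc translations of the same bitcode; here addReference and addUnderTest are plain C++ stand-ins, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint32_t (*BinaryFn)(uint32_t, uint32_t);

// Stand-ins for one function as translated by two different backends.
uint32_t addReference(uint32_t A, uint32_t B) { return A + B; }
uint32_t addUnderTest(uint32_t A, uint32_t B) { return A + B; }

// Runs both implementations over all pairs of boundary-value inputs and
// returns the number of pairs on which their results differ.
std::size_t countMismatches(BinaryFn Ref, BinaryFn Test) {
  const uint32_t Inputs[] = {0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  const std::size_t N = sizeof(Inputs) / sizeof(Inputs[0]);
  std::size_t Mismatches = 0;
  for (std::size_t I = 0; I < N; ++I)
    for (std::size_t J = 0; J < N; ++J)
      if (Ref(Inputs[I], Inputs[J]) != Test(Inputs[I], Inputs[J]))
        ++Mismatches;
  return Mismatches;
}

// Usage sketch: identical implementations should never disagree.
void exampleUse() {
  assert(countMismatches(addReference, addUnderTest) == 0);
}
```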
      
      BUG=
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/265703002
  16. 19 May, 2014 1 commit
  17. 29 Apr, 2014 1 commit
    • Initial skeleton of Subzero. · f7c9a141
      Jim Stichnoth authored
      This includes just enough code to build the high-level ICE IR and dump it back out again.  There is a script szdiff.py that does a fuzzy diff of the input and output for verification.  See the comment in szdiff.py for a description of the fuzziness.
      
      Building llvm2ice requires LLVM headers, libs, and tools (e.g. FileCheck) to be present.  These default to something like llvm_i686_linux_work/Release+Asserts/ based on the checked-out and built pnacl-llvm code; I'll try to figure out how to more automatically detect the build configuration.
      
      "make check" runs the lit tests.
      
      This CL has under 2000 lines of "interesting" Ice*.{h,cpp} code, plus 600 lines of llvm2ice.cpp driver code, and the rest is tests.
      
      Here is the high-level mapping of source files to functionality:
      
      IceDefs.h, IceTypes.h, IceTypes.cpp:
      Commonly used types and utilities.
      
      IceCfg.h, IceCfg.cpp:
      Operations at the function level.
      
      IceCfgNode.h, IceCfgNode.cpp:
      Operations on basic blocks (nodes).
      
      IceInst.h, IceInst.cpp:
      Operations on instructions.
      
      IceOperand.h, IceOperand.cpp:
      Operations on operands, such as stack locations, physical registers, and constants.
      
      BUG= none
      R=jfb@chromium.org
      
      Review URL: https://codereview.chromium.org/205613002
  18. 19 Mar, 2014 2 commits