1. 23 Jul, 2015 2 commits
  2. 21 Jul, 2015 5 commits
    • Make ARM RegNames[] static like X86 (no ARM syms in X86-only build). · 0dab0324
      Jan Voung authored
      The X86 code was switched over here:
      https://codereview.chromium.org/1216933015/diff/150001/src/IceTargetLoweringX86Base.h
      
      The important bit might be that it's static const char * instead of
      static IceString. This removes static ctor/dtor for that array,
      which LTO doesn't seem to be able to optimize out, leaving ARM
      and MIPS symbols in the X86-only build. After changing it to static
      const char *, LTO is able to optimize out the ARM and MIPS
      symbols in the x86-only build, saving about 3KB of .text and
      a few bytes of .rodata.
      
      BUG=none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/1246013004 .
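      The static-initializer point above can be illustrated with a small sketch. The register list and the getRegName helper are hypothetical stand-ins, not Subzero's actual tables. The only claim is the general C++ behavior: an array of string literals is constant-initialized, while an array of string objects needs constructors and destructors to run at load and exit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical register list, used only for illustration.
#define REG_TABLE(X) X(r0) X(r1) X(r2) X(sp)

enum RegNum {
#define X(name) Reg_##name,
  REG_TABLE(X)
#undef X
  Reg_NUM
};

// An array of string literals is constant-initialized: the compiler emits
// plain data and no static constructor or destructor for it.
static const char *const RegNames[] = {
#define X(name) #name,
  REG_TABLE(X)
#undef X
};

static_assert(sizeof(RegNames) / sizeof(RegNames[0]) ==
                  static_cast<std::size_t>(Reg_NUM),
              "one name per register");

// Hypothetical accessor for the table above.
const char *getRegName(RegNum Reg) {
  assert(Reg < Reg_NUM);
  return RegNames[Reg];
}
```

      With an array of string objects in place of RegNames, the same table would require dynamic initialization, which is the kind of code the message says LTO could not remove.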
    • Changes the TargetX8632 to inherit from TargetX86Base<TargetX8632>. · 5aeed955
      John Porto authored
      Previously, TargetX8632 was defined as
      
      class TargetX8632 : public TargetLowering;
      
      and its create method would do
      
      TargetX8632 *TargetX8632::create() {
        return TargetX86Base<TargetX8632>::create();
      }
      
      TargetX86Base<M> was defined as
      
      template <class M> class TargetX86Base : public M;
      
      which meant TargetX8632 had no way to access methods defined in
      TargetX86Base<M>. This used to not be a problem, but with the X8664
      backend around the corner it became obvious that the actual TargetX86
      targets (e.g., X8632, X8664SysV, X8664Win) would need access to some
      methods in TargetX86Base (e.g., _mov, _fld, _fstp etc.)
      
      This CL changes the class hierarchy to something like
      
      TargetLowering <-- TargetX86Base<X8632> <-- X8632
                     <-- TargetX86Base<X8664SysV> <-- X8664SysV (TODO)
                     <-- TargetX86Base<X8664Win> <-- X8664Win (TODO)
      
      One problem with this new design is that TargetX86Base<M> needs to be
      able to invoke methods in the actual backends. For example, each
      backend will have its own way of lowering llvm.nacl.read.tp. This
      creates a chicken/egg problem that is solved with (you guessed)
      template machinery (some would call it voodoo.)
      
      In this CL, as a proof of concept, we introduce the
      
         TargetX86Base::dispatchToConcrete
      
      template method. It is a very simple method: it downcasts "this" from
      the template base class (TargetX86Base<TargetX8632>) to the actual
      (concrete) class (TargetX8632), and then it invokes the requested
      method. It uses perfect forwarding for passing arguments to the method
      being invoked, and returns whatever that method returns.
      
      A simple proof-of-concept for using dispatchToConcrete is introduced
      with this CL: it is used to invoke createNaClReadTPSrcOperand on the
      concrete target class. In a way, dispatchToConcrete is a poor man's
      virtual method call, without the virtual method call overhead.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
      R=jvoung@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1217443024.
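      The dispatchToConcrete idea described above can be sketched as follows. This is a minimal illustration of the pattern (a base class template that downcasts to its own template argument), not the code from the CL: the lowerReadTP method, the string return types, and the "gs:[0]" operand text are assumptions made so the example is self-contained.

```cpp
#include <cassert>
#include <string>
#include <utility>

class TargetLowering {
public:
  virtual ~TargetLowering() = default;
};

template <class Machine> class TargetX86Base : public TargetLowering {
protected:
  // Downcasts "this" to the concrete target, invokes Method on it with
  // perfectly forwarded arguments, and returns whatever Method returns.
  template <typename Ret, typename... Params, typename... Args>
  Ret dispatchToConcrete(Ret (Machine::*Method)(Params...), Args &&...args) {
    return (static_cast<Machine *>(this)->*Method)(std::forward<Args>(args)...);
  }

public:
  // Hypothetical shared lowering that needs one target-specific piece.
  std::string lowerReadTP() {
    return "mov eax, " +
           dispatchToConcrete(&Machine::createNaClReadTPSrcOperand);
  }
};

class TargetX8632 : public TargetX86Base<TargetX8632> {
public:
  // Placeholder operand text; the real method builds an operand object.
  std::string createNaClReadTPSrcOperand() { return "gs:[0]"; }
};
```

      The call is resolved at compile time, so no virtual table lookup is involved; the cost is that the base template must know the concrete type.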
    • Only run adv-switch test when asm is allowed. · 8c8f3bc1
      Andrew Scull authored
      BUG=
      R=stichnot@chromium.org, jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1248823003.
    • Rename legalizeToVar to the more accurate legalizeToReg. · 97f460dc
      Andrew Scull authored
      BUG=
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1245063003.
    • Fix --filetype=iasm non-pc-rel fixup offsets (double counted). · b7db1a52
      Jan Voung authored
      For pc-rel fixups, we have a ConstantRelocatable referring
      to Foo+0, and the offset "-4" is encoded in the code
      buffer (but not the ConstantRelocatable object). Thus we
      need to load from the code buffer in order to
      get that "-4" instead of just taking the +0 from Foo+0.
      
      For non-pc-rel fixups, we have the ConstantRelocatable
      with a true offset, and we also write that offset into the
      code buffer (for ELF REL and not RELA, it expects the
      offset in the code buffer). In this case, we want to choose
      one and not double-count.
      
      BUG=none
      176.gcc seemed to be failing when compiled with --filetype=iasm...
      the load addresses for 64-bit pointers were +8 instead of +4
      
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1241313002 .
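      The two cases above can be sketched as a single selection function. The Fixup struct and both function names are hypothetical. The message only says one of the two copies must be chosen for non-pc-rel fixups; using the symbol offset rather than the buffer copy is an arbitrary choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixup record for a 32-bit field in the code buffer.
struct Fixup {
  std::size_t Position; // byte position of the field in the code buffer
  int32_t SymbolOffset; // the "+N" part of the relocatable "Foo+N"
  bool IsPCRel;
};

// Reads the 32-bit value currently stored in the code buffer.
int32_t loadInt32(const std::vector<uint8_t> &Buffer, std::size_t Position) {
  assert(Position + sizeof(int32_t) <= Buffer.size());
  int32_t Value;
  std::memcpy(&Value, Buffer.data() + Position, sizeof(Value));
  return Value;
}

// Returns the offset to print next to the symbol in textual output.
int32_t textualFixupOffset(const std::vector<uint8_t> &Buffer,
                           const Fixup &F) {
  if (F.IsPCRel) {
    // The adjustment (e.g. -4) lives only in the buffer, so read it there.
    return loadInt32(Buffer, F.Position);
  }
  // The buffer holds a copy of the same offset; use one source only.
  return F.SymbolOffset;
}
```

      Adding the buffer value to the symbol offset in the non-pc-rel case would produce +8 for a true offset of +4, which matches the symptom reported in the message.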
  3. 20 Jul, 2015 1 commit
    • Introduction of improved switch lowering. · 87f80c12
      Andrew Scull authored
      This includes the high level analysis of switches, the x86 lowering,
      the repointing of targets in jump tables and ASM emission of jump
      tables.
      
      The technique uses jump tables, range tests and binary search with
      worst case O(lg n), which improves on the previous worst case of O(n)
      from a sequential search.
      
      Use is hidden by the --adv-switch flag as the IAS emission still
      needs to be implemented.
      
      BUG=None
      R=jvoung@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1234803007.
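      A rough model of the three techniques named above, written as a lookup over case values rather than as emitted machine code. The SwitchPlan class, its density heuristic (use a table when at least half the slots are occupied), and the assumption that case values are distinct are all inventions of this sketch; the message does not say how the real analysis chooses between strategies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Case = std::pair<uint32_t, int>; // (case value, target label id)

class SwitchPlan {
public:
  SwitchPlan(std::vector<Case> Input, int DefaultTarget)
      : Cases(std::move(Input)), Default(DefaultTarget) {
    assert(!Cases.empty());
    std::sort(Cases.begin(), Cases.end());
    Min = Cases.front().first;
    Max = Cases.back().first;
    const uint64_t Span = uint64_t(Max) - Min + 1;
    // Hypothetical heuristic: a table pays off when the cases are dense.
    UseTable = Span <= 2 * uint64_t(Cases.size());
    if (UseTable) {
      Table.assign(static_cast<std::size_t>(Span), Default);
      for (const Case &C : Cases)
        Table[C.first - Min] = C.second;
    }
  }

  bool usesTable() const { return UseTable; }

  int lookup(uint32_t Value) const {
    if (Value < Min || Value > Max)
      return Default; // range test
    if (UseTable)
      return Table[Value - Min]; // jump table: one indexed load
    // Binary search over the sorted cases: O(lg n) comparisons.
    auto It = std::lower_bound(
        Cases.begin(), Cases.end(), Value,
        [](const Case &C, uint32_t V) { return C.first < V; });
    return (It != Cases.end() && It->first == Value) ? It->second : Default;
  }

private:
  std::vector<Case> Cases;
  int Default;
  uint32_t Min = 0;
  uint32_t Max = 0;
  bool UseTable = false;
  std::vector<int> Table;
};
```

      A sequential chain of compares would instead test each case in turn, which is the O(n) behavior the commit replaces.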
  4. 16 Jul, 2015 1 commit
    • Factor out prelowerPhi for 32-bit targets. Disable adv phi lowering for ARM. · 53483691
      Jan Voung authored
      This way, prelowerPhi can be shared between 32-bit targets (split 64-bit
      values into 32-bit ones, and legalize undef). Suggestions from template
      experts on how to share prelowerPhi welcome. I'm not particularly happy
      with the first pass in that legalizeUndef has to be made public (though
      other methods used are also public). Also the methods required from the
      template type TargetT aren't clear without looking through the code.
      
      The current advanced phi lowering code depends on lowerPhiAssignments.
      That is a special case of lowerAssign that does some adhoc register
      allocation. The current adhoc register allocation doesn't work as
      well when a target may need to spill more than one register.
      Disable that optimization for ARM for now, until we have a better
      way that works for ARM, and enable O2 cross testing on ARM.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1223133007 .
  5. 15 Jul, 2015 2 commits
    • Factor out legalization of undef, and handle more cases for ARM. · fbdd2440
      Jan Voung authored
      By factoring out legalizeUndef(), we can use the same
      logic in prelowerPhis which may help if we ever change the
      value used (though if we switch from zero-ing out regs to
      using uninitialized regs, it'll take more work -- e.g.,
      can't return a 64-bit reg).
      
      For x86, use legalizeUndef where it's clear that the value
      is immediately fed to loOperand/hiOperand then another
      legalize() call. Otherwise, leave the general
      X = legalize(X); alone where the code is counting on that
      being the sole legalization.
      
      For x86 legalize(const64) is a pass-through, which can then
      be passed to loOperand/hiOperand nicely. However, for ARM,
      legalize(const64) may end up trying to copy the const64 to
      a register, but we don't have 64-bit registers. Instead do
      legalizeUndef(X) where x86 would have just done
      legalize(X). This happens to work because legalizeUndef
      doesn't try to copy to reg, and we immediately pass the
      result to loOperand/hiOperand() which then passes the
      result to a real legalization call.
      
      Add a few more undef tests.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1233903002 .
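      The ordering described above (legalizeUndef first, then loOperand/hiOperand) can be modeled with plain integers. The Operand64 struct is a hypothetical stand-in for Subzero's operand classes, and the real loOperand/hiOperand return operands rather than integers. The zero replacement follows the message's mention of zeroing out undef values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit operand: either undef or a known constant.
struct Operand64 {
  bool IsUndef;
  uint64_t Value; // meaningful only when IsUndef is false
};

// Replaces undef with a zero constant. It never copies to a register,
// so it is safe on a target that has no 64-bit registers.
Operand64 legalizeUndef(Operand64 Op) {
  if (Op.IsUndef)
    return Operand64{false, 0};
  return Op;
}

// Split a legalized 64-bit operand into its 32-bit halves.
uint32_t loOperand(Operand64 Op) {
  assert(!Op.IsUndef);
  return static_cast<uint32_t>(Op.Value);
}

uint32_t hiOperand(Operand64 Op) {
  assert(!Op.IsUndef);
  return static_cast<uint32_t>(Op.Value >> 32);
}
```

      Each half is 32 bits wide, so a later full legalization call can place it in a register on a 32-bit target.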
    • Subzero: Fix register encodings. · 728c1d40
      Jim Stichnoth authored
      Specifically, we were ending up with Encoded_Reg_xmm0=0 yet Encoded_Reg_xmm1=10, Encoded_Reg_xmm2=11, etc.
      
      It's a mystery as to why this wasn't triggering any failures with filetype!=asm.
      
      BUG= none
      R=jpp@chromium.org
      
      Review URL: https://codereview.chromium.org/1231973003.
  6. 13 Jul, 2015 1 commit
    • Add a cross include path for ARM to work around clang bug 22937. · 112b6e89
      Jan Voung authored
      Clang appears to be missing an include path to find
      bits/c++config.h so we were unable to compile the
      unsandboxed c++ based cross tests and link against the
      subzero unsandboxed ARM object files.
      
      Work around this for now by finding and including the
      missing path.
      
      Turn on a few ARM cross tests that should be working
      (mem_intrin and test_strengthreduce -- though the
      strength-reduction isn't done for ARM). The test_bitmanip
      still fails, because under Om1 we overflow the stack offset
      and need to materialize that offset with a register first.
      
      Update a few other references that still say x8632.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076
      R=jpp@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1232183002 .
  7. 11 Jul, 2015 1 commit
  8. 10 Jul, 2015 1 commit
  9. 09 Jul, 2015 2 commits
  10. 08 Jul, 2015 1 commit
  11. 07 Jul, 2015 1 commit
    • X8632 Templatization completed. · 921856d4
      John Porto authored
      This CL introduces the X86Inst templates. The previous implementation relied on template specialization, which did not play well with the new design. This required a lot of other boilerplate code (i.e., tons of new named constructors, one for each X86Inst.)
      
      This CL also moves X8632 code out of the X86Base{Impl}?.h files so that they are **almost** target agnostic. As we move to adding other X86 targets more methods will be moved to the target-specific trait class (e.g., call/ret/argument lowering.)
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
      R=jvoung@chromium.org
      
      Review URL: https://codereview.chromium.org/1216933015.
  12. 06 Jul, 2015 3 commits
  13. 30 Jun, 2015 6 commits
  14. 29 Jun, 2015 3 commits
  15. 28 Jun, 2015 1 commit
  16. 27 Jun, 2015 1 commit
  17. 26 Jun, 2015 4 commits
    • Adds X8664 Condition codes. · a054f0ac
      John Porto authored
      Also fixes the X8664 Registers file.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077
      R=kschimpf@google.com
      
      Review URL: https://codereview.chromium.org/1212393005.
    • Adds the X8664 register definition. · 2b18687b
      John Porto authored
      BUG=
      R=stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1211103004.
    • Fixes bug on conditional branch where the targets are the same. · c070d6f7
      Karl Schimpf authored
      Fixes constructor InstBr when it is a conditional branch, and the
      true and false branches are the same.
      
      BUG= https://code.google.com/p/nativeclient/issues/detail?id=4212
      R=jpp@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1215443002.
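      The message does not say how the constructor was fixed. One plausible handling, shown here purely as an assumption, is to turn a conditional branch whose two targets coincide into an unconditional branch and drop its condition. BranchSketch, CfgNode and Operand below are hypothetical types, not the Subzero classes.

```cpp
#include <cassert>

struct CfgNode { int Index; }; // hypothetical basic block
struct Operand { int Id; };    // hypothetical condition operand

class BranchSketch {
public:
  BranchSketch(Operand *Cond, CfgNode *IfTrue, CfgNode *IfFalse)
      : Condition(Cond), TargetTrue(IfTrue), TargetFalse(IfFalse) {
    assert(IfFalse != nullptr);
    if (TargetTrue == TargetFalse) {
      // Both edges lead to the same block, so the condition is irrelevant:
      // keep a single target and forget the condition.
      TargetTrue = nullptr;
      Condition = nullptr;
    }
  }

  bool isUnconditional() const { return TargetTrue == nullptr; }
  Operand *getCondition() const { return Condition; }
  CfgNode *getTargetTrue() const { return TargetTrue; }
  CfgNode *getTargetFalse() const { return TargetFalse; }

private:
  Operand *Condition;
  CfgNode *TargetTrue;  // null for an unconditional branch
  CfgNode *TargetFalse; // doubles as the unconditional target
};
```

      Normalizing in the constructor means later passes never see a conditional branch with duplicate successors.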
    • Function Layout, Global Variable Layout and Pooled Constants Layout Reordering · 7cd5351c
      Qining Lu authored
      PURPOSE:
      Function layout reordering defends against code-reuse attacks, because the locations of code blocks vary from one binary to another. Layout reordering for global variables and pooled constants is a form of static data randomization: it is meant to stop memory corruption attacks by randomizing the locations of static data. Function layout reordering randomizes the order of functions in the TEXT section. Global variable reordering randomizes the order of global variables, and pooled constant reordering randomizes the order of pooled constants. Note that the order of the constant pools themselves won’t be affected, and every pooled constant will remain in its original constant pool.
      
      USAGE:
      -reorder-functions: bool type command line option, enables function layout shuffling in TEXT section. Note when -threads=0 is set, function reordering will be forced off.
      
      -reorder-functions-window-size: uint32 type command line option, specifies the length of the shuffling queue. Note -reorder-functions-window-size=0 or 1 means no shuffling is applied to functions.
      
      -reorder-global-variables: bool type command line option, enables global variables shuffling.
      
      -reorder-pooled-constants: bool type command line option, enables pooled constants shuffling.
      
      APPROACH:
      Randomization is introduced at code emission time. We use a shuffling method to randomize the emission of function code, global variables and pooled constants. For function code emission, we also introduce “window size” as a parameter to control the size of the function holding buffer for shuffling. Window sizes 1 and 0 mean no shuffling is applied, and a value higher than the number of translated functions means holding all the functions and shuffling them before emitting any of them.
      
      IMPLEMENTATION:
          Function reordering:
              GlobalContext::emitItems(): Call RandomShuffle() routine to shuffle a specific part of the Pending vector.
      
          Global variable reordering:
              GlobalContext::lowerGlobals(const IceString &SectionSuffix): Call RandomShuffle() routine upon declaration list: Globals.
      
          Pooled constant reordering:
              TargetDataX8632::emitConstantPool(GlobalContext *Ctx): Add call to RandomShuffle() to shuffle the constant pool to be emitted. This is for asm output.
      
              ELFObjectWriter::writeConstantPool(Type Tu): Add call to RandomShuffle() to shuffle the constant pool before emitting it. This is only for elf output.
      
      ISSUES:
          Global variable initializers are emitted along with function code, and all of them are treated as EmitterWorkItem objects. However, global variables must be emitted first to keep the block profiling workflow untouched. To achieve this, a “kind” check is added to the while loop of GlobalContext::emitItems(). The “if” statement at line 480 implements this workaround.
      BUG=
      R=jpp@chromium.org, jvoung@chromium.org, stichnot@chromium.org
      
      Review URL: https://codereview.chromium.org/1206723003.
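      The window-size behavior described under APPROACH can be sketched as below. The function name emitWithWindow is hypothetical, and std::shuffle with std::mt19937 stands in for the RandomShuffle() routine and random number generator that the commit refers to; the sketch only models the order in which items come out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns Items in emission order, shuffling within a bounded holding
// buffer of WindowSize entries.
template <typename T>
std::vector<T> emitWithWindow(const std::vector<T> &Items,
                              std::size_t WindowSize, std::mt19937 &Rng) {
  if (WindowSize <= 1)
    return Items; // window sizes 0 and 1 mean no shuffling

  std::vector<T> Emitted;
  Emitted.reserve(Items.size());
  std::vector<T> Pending;
  for (const T &Item : Items) {
    Pending.push_back(Item);
    if (Pending.size() == WindowSize) {
      // The holding buffer is full: shuffle it and emit everything in it.
      std::shuffle(Pending.begin(), Pending.end(), Rng);
      Emitted.insert(Emitted.end(), Pending.begin(), Pending.end());
      Pending.clear();
    }
  }
  // Flush a final, partially filled window.
  std::shuffle(Pending.begin(), Pending.end(), Rng);
  Emitted.insert(Emitted.end(), Pending.begin(), Pending.end());
  return Emitted;
}
```

      A window larger than the item count never fills, so every item is held until the final flush and the whole sequence is shuffled at once, matching the description above.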
  18. 25 Jun, 2015 2 commits
  19. 24 Jun, 2015 2 commits