- 30 Jul, 2014 6 commits
-
-
Jim Stichnoth authored
Also cleans up some unneeded table size const static variables. BUG= https://codereview.chromium.org/296053008/ R=jvoung@chromium.org Review URL: https://codereview.chromium.org/428353002
-
Jim Stichnoth authored
Quiet some unused-variable warnings when their only use is in an assert(). Forward-declare partial template specializations when the template method already has a default implementation, to avoid ODR violations and link errors. BUG= https://codereview.chromium.org/296053008/ R=wala@chromium.org Review URL: https://codereview.chromium.org/429993002
-
Jan Voung authored
Speculative fix for Mac GCC build. BUG=none R=dschuff@chromium.org Review URL: https://codereview.chromium.org/432523002
-
Matt Wala authored
* Add initial support for code generation with SSE4.1 instructions. The following operations are affected: - multiplication with v4i32 - select - insertelement - extractelement * Add appropriate lit checks for SSE4.1 instructions. Run the crosstests in both SSE2 and SSE4.1 mode. * Introduce the -mattr flag to llvm2ice to control which instruction set gets used. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/427843002
-
Jan Voung authored
Normally, the FakeUse for preserving the atomic load ends up on the load's Dest. However, for fused load+add, the load is deleted, and its Dest is no longer defined. This trips up the liveness analysis when it happens on a non-entry block. So the FakeUse should be for the add's dest instead, in that case. We have no access to the add, so introduce a getLastInserted() helper. A couple of ways to do that: - modify insert() to track explicitly - rewind from Next one step Either that, or we disable the fusing for atomic loads. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/417353003
-
Derek Schuff authored
The mac build treats this as an error. R=stichnot@chromium.org Review URL: https://codereview.chromium.org/429253002
-
- 29 Jul, 2014 1 commit
-
-
Jan Voung authored
The cmpxchg instruction already sets ZF for comparing the return value vs the expected value. So there is no need to compare eq again. Lots of pexes-in-the-wild have this pattern. Some compare against a constant, some compare against a variable. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/413903002
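The redundancy described above can be sketched in portable C++, assuming `std::atomic<int>::compare_exchange_strong` as a stand-in for the cmpxchg instruction. The function names are hypothetical and this is not Subzero's lowering code. The success flag returned by the compare-exchange plays the role of ZF, so comparing the returned old value against the expected value a second time adds nothing.

```cpp
#include <atomic>
#include <cassert>

// Redundant form: fetch the old value, then compare it with Expected again.
bool casThenCompare(std::atomic<int> &Target, int Expected, int Desired) {
  int Old = Expected;
  // On success Old is left equal to Expected; on failure it receives the
  // value that was actually stored in Target.
  Target.compare_exchange_strong(Old, Desired);
  return Old == Expected;
}

// Direct form: the success flag already encodes "old value == Expected".
bool casUsingFlag(std::atomic<int> &Target, int Expected, int Desired) {
  return Target.compare_exchange_strong(Expected, Desired);
}

// Hypothetical demo: both forms agree on success and on failure.
void demoCasForms() {
  std::atomic<int> A(5), B(5);
  assert(casThenCompare(A, 5, 7) == casUsingFlag(B, 5, 7)); // both succeed
  assert(casThenCompare(A, 5, 9) == casUsingFlag(B, 5, 9)); // both fail
  assert(A.load() == 7 && B.load() == 7);
}
```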
-
- 28 Jul, 2014 2 commits
-
-
Jan Voung authored
(*) PNaCl toolchain_build builds 64-bit libraries for LLVM on Mac. That won't link with subzero code if subzero is built with -m32, so add an option to override the -m32. (*) include locale header (*) Mark xMacroIntegrityCheck unused to avoid clang compiler warning. (*) virtual dtor, for inheritable class (*) Mark compare function const BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/428733003
-
Jim Stichnoth authored
Previously Ostream was a class that wrapped a raw_ostream pointer, structured that way in case we wanted to wrap an alternate stream type. Also, Ostream used to include a Cfg pointer, but that had to go away when the Ostream became associated with the GlobalContext which persists beyond the Cfg lifetime, so the Cfg pointer was removed leaving only the raw_ostream. Since llvm::raw_ostream is supposed to be very lightweight, we can just give up the abstraction and equate it to Ice::Ostream. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/413393005
-
- 25 Jul, 2014 1 commit
-
-
Matt Wala authored
This avoids using a pair of shufps instructions as the previous lowering was doing. Instead, we use movss to copy the element to be inserted into the lower 32 bits of the destination. Define InstX8632Movss as a Binop, the class to which it properly belongs. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/412353005
-
- 24 Jul, 2014 4 commits
-
-
Matt Wala authored
Most fcmp conditions map directly to single x86 instructions. For these, the lowering is table driven. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/413053002
-
Matt Wala authored
Select of vectors is implemented by appropriately masking and combining the inputs with sign extend / bitwise operations and without the use of branches. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/417653004
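As a rough illustration of the masking idea, the following sketch emulates a four-lane select in scalar C++. The type `V4` and the function name are hypothetical, and the loop merely stands in for the lanes of one vector register; the per-lane computation itself has no data-dependent branch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Per-lane select: R[i] = Cond[i] ? T[i] : F[i], computed with masks only.
V4 selectBranchless(const std::array<bool, 4> &Cond, const V4 &T, const V4 &F) {
  V4 R{};
  for (std::size_t I = 0; I < 4; ++I) {
    // "Sign-extend" the one-bit condition to a full lane: 1 -> 0xFFFFFFFF, 0 -> 0.
    uint32_t Mask = 0u - static_cast<uint32_t>(Cond[I]);
    assert(Mask == 0u || Mask == 0xFFFFFFFFu);
    // Keep T where the mask is set, F where it is clear, then combine.
    R[I] = (T[I] & Mask) | (F[I] & ~Mask);
  }
  return R;
}
```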
-
Matt Wala authored
Change TotalTests so that the test count matches up with the number of recorded passes and failures. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/415803004
-
Jim Stichnoth authored
We don't need/want to evict an inactive live range when it doesn't overlap with the live range currently being considered. This is especially important for Variables representing scratch registers that are killed by call instructions. These register assignments should obviously never be evicted. Note that the algorithm that computes the min-weight register to evict doesn't consider inactive and non-overlapping live ranges. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3903 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/417933004
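The eviction rule can be pictured with a deliberately simplified model. This sketch assumes a live range is a single half-open interval, and `LiveRange`, `overlaps`, and `isEvictionCandidate` are hypothetical names; the real allocator's data structures are not shown in this log and are likely richer.

```cpp
#include <cassert>

// Simplified live range: one half-open interval [Start, End).
struct LiveRange {
  int Start;
  int End;
};

// Two half-open intervals overlap when each starts before the other ends.
bool overlaps(const LiveRange &A, const LiveRange &B) {
  assert(A.Start <= A.End && B.Start <= B.End);
  return A.Start < B.End && B.Start < A.End;
}

// An inactive range is only worth evicting if it overlaps the range
// currently being allocated; otherwise it does not compete for the register.
bool isEvictionCandidate(bool IsActive, const LiveRange &Candidate,
                         const LiveRange &Current) {
  return IsActive || overlaps(Candidate, Current);
}
```

Under this model, an inactive range such as the scratch-register range killed by a call is skipped whenever it ends before the current range begins.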
-
- 23 Jul, 2014 3 commits
-
-
Matt Wala authored
SSE2 only has signed integer comparison. Unsigned compares are implemented by inverting the sign bits of the operands and doing a signed compare. A common pattern in clang generated IR is a vector compare which generates an i1 vector followed by a sign extension of the result of the compare. The x86 comparison instructions already generate sign extended values, so we can eliminate unnecessary sext operations that follow compares in the IR. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/412593002
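The sign-bit trick can be checked on a single 32-bit lane. This is a scalar sketch with hypothetical function names, where `signedGreater` stands in for a signed-only compare instruction; it assumes two's complement conversion from `uint32_t` to `int32_t`, which holds on the targets in question.

```cpp
#include <cassert>
#include <cstdint>

// Signed "greater than" on raw bit patterns, standing in for a
// signed-only vector compare applied to one lane.
static bool signedGreater(uint32_t A, uint32_t B) {
  return static_cast<int32_t>(A) > static_cast<int32_t>(B);
}

// Unsigned "greater than" built from the signed compare: flipping the sign
// bit of both operands maps unsigned order onto signed order.
bool unsignedGreaterViaSigned(uint32_t A, uint32_t B) {
  const uint32_t SignBit = 0x80000000u;
  return signedGreater(A ^ SignBit, B ^ SignBit);
}

// Hypothetical self-check against the native unsigned comparison.
void checkUnsignedCompare() {
  const uint32_t Samples[] = {0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t A : Samples)
    for (uint32_t B : Samples)
      assert(unsignedGreaterViaSigned(A, B) == (A > B));
}
```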
-
Jim Stichnoth authored
BUG= none R=wala@chromium.org Review URL: https://codereview.chromium.org/415583003
-
Matt Wala authored
llvm-mc. This fixes the failing validation of callindirect.pnacl.ll. The following tests fail to validate (some due to the addition of -filetype=obj): * convert.ll * globalinit.pnacl.ll * mangle.ll * nacl-atomic-fence-all.ll * shift.ll BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/410743005
-
- 22 Jul, 2014 3 commits
-
-
Matt Wala authored
The source operand to bsr and bsf must be in a register or memory. BUG=none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/407093014
-
Matt Wala authored
Add RUN lines to applicable lit tests to pipe the output of Subzero (in -Om1 and/or -O2 mode) to llvm-mc for validation. Note that the following unit tests fail the validation: * callindirect.pnacl.ll * mangle.ll * nacl-other-intrinsics.ll BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/411693003
-
Matt Wala authored
Add vectors.h and vector.def to hold vector type declarations and useful vector utilities. Change the existing tests to use this new header where applicable (arith, vector_ops). BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/407543003
-
- 21 Jul, 2014 1 commit
-
-
Jan Voung authored
Otherwise, there can be a movzx reg, 0, which is illegal, when the memset value is constant 0. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/402253002
-
- 18 Jul, 2014 4 commits
-
-
Matt Wala authored
Index() % NumElementsInType should be Index() % NumValues. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/404553007
-
Jan Voung authored
Just copies the current stack pointer to/from a variable. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/396993009
-
Jan Voung authored
Clump the negate instruction w/ the bswap instruction as an "inplace" operation. One difference is that bswap has stricter requirements on the operand type. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org, wala@chromium.org Review URL: https://codereview.chromium.org/401533002
-
Matt Wala authored
Use instructions that do the operations in registers and that are available in SSE2. Spill to memory to perform the operation in the absence of any other reasonable options (v16i8 and v16i1). Unfortunately there is no natural class of SSE2 instructions that insertelement / extractelement can get lowered to for all vector types (though pinsr[bwd] and pextr[bwd] are available in SSE4.1). There are in some cases a large number of choices available for lowering and I have not looked into which choices are the best yet, besides using LLVM output as a guide. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/401523003
-
- 17 Jul, 2014 1 commit
-
-
Matt Wala authored
The instructions emitted by the lowering operations require memory operands to be aligned to 16 bytes. Since there is no support for aligning memory operands in Subzero, do the arithmetic in registers for now. Add vector arithmetic to the arith crosstest. Pass the -mstackrealign parameter to the crosstest clang so that llc code called back from Subzero code (helper calls) doesn't assume that the stack is aligned at the entry to the call. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/397833002
-
- 16 Jul, 2014 2 commits
-
-
Matt Wala authored
Impacted instructions: bitcast {v4f32, v4i32, v8i16, v16i8} <-> {v4f32, v4i32, v8i16, v16i8} bitcast v8i1 <-> i8 bitcast v16i1 <-> i16 (There was already code present to handle trivial bitcasts like v16i1 <-> v16i1.) [sz]ext v4i1 -> v4i32 [sz]ext v8i1 -> v8i16 [sz]ext v16i1 -> v16i8 trunc v4i32 -> v4i1 trunc v8i16 -> v8i1 trunc v16i8 -> v16i1 [su]itofp v4i32 -> v4f32 fpto[su]i v4f32 -> v4i32 Where there is a relatively simple lowering to x86 instructions, it has been used. Otherwise a helper call is used. Some lowerings require a materialization of an integer vector with 1s in each entry. Since there is no support for vector constant pools, the constant is materialized purely through register operations. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/383303003
-
Jan Voung authored
We'll need the fallbacks in any case. However, once we've decided on how to specify the CPU features of the user machine we can use the nicer LZCNT/TZCNT/POPCNT as well. Adds cmov, bsf, and bsr instructions. Calls a popcount helper function for machines without SSE4.2. Not handling bswap yet (which can also take i16 params). BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org, wala@chromium.org Review URL: https://codereview.chromium.org/390443005
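The fallback semantics can be sketched in portable C++. The helpers `bitScanReverse` and `bitScanForward` are hypothetical software stand-ins for bsr and bsf, and the explicit zero check plays the part that a cmov could play in generated code, since the scan instructions leave their result undefined for a zero input. This is an illustration of the arithmetic only, not the emitted instruction sequence.

```cpp
#include <cassert>
#include <cstdint>

// Index of the highest set bit (what bsr computes); input must be nonzero.
static uint32_t bitScanReverse(uint32_t X) {
  assert(X != 0);
  uint32_t Index = 0;
  while (X >>= 1)
    ++Index;
  return Index;
}

// Index of the lowest set bit (what bsf computes); input must be nonzero.
static uint32_t bitScanForward(uint32_t X) {
  assert(X != 0);
  uint32_t Index = 0;
  while ((X & 1u) == 0) {
    X >>= 1;
    ++Index;
  }
  return Index;
}

// Count leading zeros: 31 minus the highest set bit index, with 32 for zero.
uint32_t ctlz32(uint32_t X) { return X == 0 ? 32u : 31u - bitScanReverse(X); }

// Count trailing zeros: the lowest set bit index, with 32 for zero.
uint32_t cttz32(uint32_t X) { return X == 0 ? 32u : bitScanForward(X); }

// Software population count, clearing the lowest set bit on each step.
uint32_t popcount32(uint32_t X) {
  uint32_t Count = 0;
  while (X != 0) {
    X &= X - 1;
    ++Count;
  }
  return Count;
}
```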
-
- 15 Jul, 2014 2 commits
-
-
Matt Wala authored
1) In makeHelperCall(), function pointers that are created should have type IceType_i32, not the functions' own return type. 2) In legalize(), change the name of WillHaveRegister to MustHaveRegister. Add a comment to clarify the condition being computed. 3) In legalize(), add an assert to make sure that vector "constants" don't get legalized (other than undef). There should be no constants of vector type. 4) In copyToReg(), replace an unnecessary use of Src->getType(). BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/385133006
-
Matt Wala authored
The frem operation takes two arguments. Pass both Src0 and Src1 to __frem_v4f32. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/387153002
-
- 14 Jul, 2014 2 commits
-
-
Jan Voung authored
Now that the name mangling is a bit smarter (from commit: 217dc082), we don't need to avoid having the same type twice in the function signature. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/389683003
-
Jan Voung authored
64-bit ops are expanded via a cmpxchg8b loop. 64/32-bit and/or/xor are also expanded into a cmpxchg / cmpxchg8b loop. Add a cross test for atomic RMW operations and compare and swap. Misc: Test that atomic.is.lock.free can be optimized out if result is ignored. TODO: * optimize compare and swap with compare+branch further down instruction stream. * optimize atomic RMW when the return value is ignored (adds a locked field to binary ops though). * We may want to do some actual target-dependent basic block splitting + expansion (the instructions inserted by the expansion must reference the pre-colored registers, etc.). Otherwise, we are currently getting by with modeling the extended liveness of the variables used in the loops using fake uses. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=jfb@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/362463002
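The shape of such an expansion can be sketched with `std::atomic`, assuming `compare_exchange_weak` as a portable stand-in for a cmpxchg8b retry loop. The function names are hypothetical and the sketch shows only the loop structure, not the register constraints the commit message discusses.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomic "and" built from a compare-exchange retry loop.
// Returns the value held before the update, as an atomic RMW does.
uint64_t atomicAndViaCmpxchg(std::atomic<uint64_t> &Target, uint64_t Operand) {
  uint64_t Expected = Target.load();
  // On failure, compare_exchange_weak refreshes Expected with the current
  // value, so the desired value is recomputed on the next iteration.
  while (!Target.compare_exchange_weak(Expected, Expected & Operand)) {
  }
  return Expected;
}

// Hypothetical demo of the old-value / new-value relationship.
void demoAtomicAnd() {
  std::atomic<uint64_t> X(0xFF0Full);
  uint64_t Old = atomicAndViaCmpxchg(X, 0x0FF0ull);
  assert(Old == 0xFF0Full);
  assert(X.load() == 0x0F00ull);
}
```

The same loop shape covers or and xor by swapping the operator used to compute the desired value.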
-
- 11 Jul, 2014 4 commits
-
-
Matt Wala authored
This adds lowering code for fadd, fsub, fmul, fdiv, and frem. frem, having no native x86 counterpart, is implemented by making a helper call. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/389653002
-
Jim Stichnoth authored
SZZZ_ was being incremented to S0000_ instead of S1000_. BUG= https://codereview.chromium.org/385273002/ R=wala@chromium.org Review URL: https://codereview.chromium.org/390533002
-
Jim Stichnoth authored
https://refspecs.linuxbase.org/cxxabi-1.75.html#mangling-compression describes the mechanism for compressing mangled strings by using substitutions of the form S[0-9A-Z]*_ to represent repeated components. When the prefix is handled as wrapping inside a namespace, the base-36 substitution numbers all have to be incremented. This is implemented in a very simple way by scanning the string only for instances of the substitution pattern. Unfortunately, false matches are possible because the S[0-9A-Z]*_ pattern can be a substring of the type name, or can span other components of the mangled name. Getting this completely right would essentially require a full demangling parser - see the ~4000 lines of code in cxa_demangle.cpp and ItaniumMangle.cpp. Since this is just for testing, any false matches will likely cause a linking error and the test can be rewritten to avoid false matches. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/385273002
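Incrementing one substitution number can be sketched as follows. In the Itanium scheme the first substitution is written `S_`, the next `S0_`, then `S1_` and so on in base 36 using digits 0-9 and A-Z. The function name `incrementSeqId` is hypothetical, and the sketch handles only the text between `S` and `_`; it does not attempt the pattern scanning that the message describes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Digits is the text between 'S' and '_' (empty for "S_").
// Returns the digits of the next substitution in sequence.
std::string incrementSeqId(std::string Digits) {
  if (Digits.empty())
    return "0"; // "S_" is followed by "S0_"
  for (std::size_t I = Digits.size(); I-- > 0;) {
    char &C = Digits[I];
    assert((C >= '0' && C <= '9') || (C >= 'A' && C <= 'Z'));
    if (C == 'Z') {
      C = '0'; // wrap this digit and carry into the next one to the left
      continue;
    }
    C = (C == '9') ? 'A' : static_cast<char>(C + 1);
    return Digits;
  }
  // Every digit was 'Z', so the carry adds a new leading digit.
  return "1" + Digits;
}
```

For example, `incrementSeqId("ZZZ")` yields `"1000"`, the all-Z carry case where the result gains a digit.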
-
Karl Schimpf authored
Makes IceTranslator.ExitStatus a boolean (rather than int), and changes code to check the flag when done. Fixes bug introduced in https://codereview.chromium.org/387023002. Also cleans up the (Ice) Converter class to handle globals processing, rather than doing it in llvm2ice.cpp. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3894 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/387023002
-
- 10 Jul, 2014 1 commit
-
-
Jim Stichnoth authored
See the BUG description for more details. In short, the register allocator was inappropriately honoring AllowRegisterOverlap even when the variable's live range overlaps with an Unhandled variable precolored to the preferred register. Also changes legalize() logic to recognize when a variable is guaranteed to ultimately have a physical register due to infinite weight, and not create a new temporary in those cases. Finally, dumps RegisterPreference and AllowRegisterOverlap info for Variables for improved diagnostics. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3897 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/380363002
-
- 09 Jul, 2014 3 commits
-
-
Jim Stichnoth authored
This invokes clang-format-diff.py so you can easily reformat just the code you touched. (Caution, this may not apply to new files.) BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/372133002
-
Matt Wala authored
- Add TargetLowering::lowerArguments() as a new stage in TargetLowering. - Add support for passing arguments/return values in XMM registers in the x86 target. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/372113005
-
Jan Voung authored
Re-used test_arith_main.cpp, mostly to share the set of interesting floating point constants. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org, wala@chromium.org Review URL: https://codereview.chromium.org/384443003
-