- 27 Aug, 2014 3 commits
-
-
Jim Stichnoth authored
Also adds much-needed logging of the decision process that goes into the address mode optimization. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/490333003
-
Jim Stichnoth authored
With the original link command, -lpthread comes before some other LLVM libraries, and this ends up causing undefined pthreads symbols. The new link command makes sure the -lpthread part comes last. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/514723004
-
Jim Stichnoth authored
Some lowering sequences were incorrectly allowing immediate operands in native instructions. This includes 32-bit icmp, 64-bit icmp, select, switch, and 64-bit mul. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/511543002
-
- 26 Aug, 2014 4 commits
-
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/507813002
-
Jim Stichnoth authored
Add the llvm2ice -sandbox option (false by default) to select between native and sandboxed code generation. Currently, it controls whether the llvm.nacl.read.tp intrinsic is lowered to gs:[0x0] or a call to __nacl_read_tp. Change the asm output slightly for -ffunction-sections so that objdump is more willing to provide a disassembly. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/504963002
-
Jim Stichnoth authored
This was committed as a test, not actually intended. This reverts commit 420e8bf2. BUG= R=dschuff@chromium.org Review URL: https://codereview.chromium.org/504073003
-
Jim Stichnoth authored
Patch from Jim Stichnoth <stichnot@chromium.org>.
-
- 18 Aug, 2014 1 commit
-
-
Jim Stichnoth authored
Background: After lowering each high-level ICE instruction, Om1 calls postLower() to do simple register allocation. It only assigns registers where absolutely necessary, specifically for infinite-weight variables, while honoring pre-coloring decisions.

The original Om1 register allocation never tried to reuse registers within a lowered sequence, which was generally OK except for very long lowering sequences, such as call instructions or some intrinsics. In these cases, when it ran out of physical registers, it would just reset the free list and hope for the best, but with no guarantee of correctness.

The fix involves keeping track of which instruction in the lowered sequence holds the last use of each variable, and releasing each register back to the free list after its last use. This makes much better use of registers. It's not necessarily optimal, at least with respect to pre-colored variables, since those registers are black-listed even if they don't interfere with an infinite-weight variable.
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/483453002
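The last-use bookkeeping described above can be illustrated with a small sketch. This is not the Subzero code: the function name, the (dest, sources) tuple encoding of a lowered sequence, and the register names are all assumptions made for illustration, and pre-colored variables are ignored entirely.

```python
def assign_registers(sequence, physical_regs):
    """Toy allocator for one lowered sequence (hypothetical encoding).

    sequence: list of (dest, sources) tuples of variable names.
    physical_regs: list of register names available to the sequence.
    """
    # First pass: remember the index of the last instruction that
    # mentions each variable, either as a dest or as a source.
    last_use = {}
    for index, (dest, sources) in enumerate(sequence):
        for var in (dest, *sources):
            last_use[var] = index

    free = list(physical_regs)
    live = {}        # variable -> register it currently holds
    assignment = {}  # variable -> register, returned to the caller
    for index, (dest, sources) in enumerate(sequence):
        if dest not in live:
            if not free:
                raise RuntimeError("ran out of registers")
            live[dest] = assignment[dest] = free.pop(0)
        # Release every register whose variable is used for the
        # last time by this instruction.
        for var in (dest, *sources):
            if last_use[var] == index and var in live:
                free.append(live.pop(var))
    return assignment


# A chain of four temporaries fits in two registers because each
# register is released after the last use of its variable.
chain = [("a", []), ("b", ["a"]), ("c", ["b"]), ("d", ["c"])]
print(assign_registers(chain, ["eax", "ecx"]))
```

The sketch allocates the destination before releasing the sources, so a source and a destination never share a register within one instruction. That is a conservative simplification.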
-
- 15 Aug, 2014 2 commits
-
-
Matt Wala authored
Adds command line options -nop-insertion, -nop-insertion-probability=X, and -max-nops-per-instruction=X. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/463563006
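As a rough illustration of how a probability and a per-instruction cap might interact, here is a minimal sketch. The function name and parameters are hypothetical, the probability is modeled as a fraction between 0 and 1 (the actual flag may be interpreted differently), and Python's random module stands in for whatever generator Subzero uses.

```python
import random


def insert_nops(instructions, probability, max_nops, seed=0):
    """Return a copy of instructions with nops randomly interleaved.

    Before each instruction, up to max_nops independent draws are made;
    each draw emits one nop with the given probability.
    """
    rng = random.Random(seed)  # seeded so that output is reproducible
    output = []
    for inst in instructions:
        for _ in range(max_nops):
            if rng.random() < probability:
                output.append("nop")
        output.append(inst)
    return output


print(insert_nops(["mov", "add", "ret"], probability=0.5, max_nops=2, seed=1))
```

Seeding the generator keeps the diversified output reproducible for a given seed.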
-
Matt Wala authored
BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/477773003
-
- 14 Aug, 2014 2 commits
-
-
Matt Wala authored
This requires sorting the spilled variables based on alignment and introducing additional padding around the spill location areas. These changes allow vector instructions to accept memory operands.

Old stack frame layout:      New stack frame layout:

+---------------------+      +---------------------+
| return address      |      | return address      |
+---------------------+      +---------------------+
| preserved registers |      | preserved registers |
+---------------------+      +---------------------+
| global spill area   |      | padding             |
+---------------------+      +---------------------+
| local spill area    |      | global spill area   |
+---------------------+      +---------------------+
| padding             |      | padding             |
+---------------------+      +---------------------+
| local variables     |      | local spill area    |
+---------------------+      +---------------------+
                             | padding             |
                             +---------------------+
                             | local variables     |
                             +---------------------+

BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/465413003
-
Jan Voung authored
Otherwise llvm-mc asserts. This is also the order that llc emits the directives. Change a couple of RUIN -> RUN in lit tests. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/469973002
-
- 13 Aug, 2014 1 commit
-
-
Jan Voung authored
Mostly to make them a bit more portable across OSes. Otherwise the OS assumed by llvm-mc is the build/host OS. So, on Mac llvm-mc will assume it's targeting darwin and only accepts macho assembler directives. Assembler directives like .rodata.cst8 are not accepted (I'm guessing it uses .cstring, .literal4, etc. instead?).

Force an OS (NaCl) so that ELF-related assembler macros make sense.

Also remove a now unused function typeIdentString to make clang happy.

Example errors:
Command 5 Stderr:
<stdin>:5:2: error: unknown directive
 .type fixed_400,@function
 ^
<stdin>:23:2: error: unknown directive
 .type variable_n,@function
 ^
<stdin>:40:11: error: mach-o section specifier uses an unknown section type
 .section .rodata.cst4,"aM",@progbits,4
          ^
<stdin>:42:11: error: mach-o section specifier uses an unknown section type
 .section .rodata.cst8,"aM",@progbits,8
          ^

BUG=none R=stichnot@chromium.org, wala@chromium.org Review URL: https://codereview.chromium.org/467103004
-
- 12 Aug, 2014 4 commits
-
-
Matt Wala authored
Introduce a base class for mov, movq, and movp instruction classes. BUG=none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/466733005
-
Matt Wala authored
Be compatible with the x86-32 calling convention by ensuring that the stack is aligned to 16 bytes at the point of the call instruction. Also ensure that vector arguments passed on the stack are 16 byte aligned. Also, make alloca instructions respect alignment. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/444443002
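The alignment rules above can be sketched as a small layout computation for the outgoing argument area. The helper names and the (name, size, is_vector) argument encoding are assumptions for illustration. The sketch ignores arguments passed in registers and does not reflect the actual Subzero lowering code.

```python
def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def layout_stack_args(args):
    """Assign offsets to stack arguments, in order.

    args: list of (name, size_in_bytes, is_vector) tuples.
    Returns (offsets, total_size), where vector arguments start on a
    16-byte boundary and total_size is rounded up to 16 bytes so the
    stack stays aligned at the call instruction.
    """
    offsets = {}
    offset = 0
    for name, size, is_vector in args:
        if is_vector:
            offset = align_up(offset, 16)  # pad before vector arguments
        offsets[name] = offset
        offset += size
    return offsets, align_up(offset, 16)


offsets, total = layout_stack_args(
    [("i", 4, False), ("v", 16, True), ("j", 4, False)]
)
print(offsets, total)  # {'i': 0, 'v': 16, 'j': 32} 48
```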
-
Matt Wala authored
Teach address mode optimization about Base=Base+Const, Base=Const+Base, and Base=Base-Const patterns. Change ConstantInteger::emit() to emit signed values. BUG=none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/459133002
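A minimal sketch of folding these three patterns into a base-plus-offset address follows. The defs dictionary and its (op, lhs, rhs) encoding are hypothetical stand-ins for the variable definitions the real optimization inspects, and the sketch skips the safety checks a real compiler would need (for example, that each variable has a single definition).

```python
def fold_base_offset(base, offset, defs):
    """Fold Base=Base+Const, Base=Const+Base, and Base=Base-Const chains.

    defs maps a variable name to (op, lhs, rhs); operands are either
    variable names (str) or integer constants (int).
    Returns the final (base, offset) pair.
    """
    while base in defs:
        op, lhs, rhs = defs[base]
        if op == "add" and isinstance(lhs, str) and isinstance(rhs, int):
            base, offset = lhs, offset + rhs      # Base = Base + Const
        elif op == "add" and isinstance(lhs, int) and isinstance(rhs, str):
            base, offset = rhs, offset + lhs      # Base = Const + Base
        elif op == "sub" and isinstance(lhs, str) and isinstance(rhs, int):
            base, offset = lhs, offset - rhs      # Base = Base - Const
        else:
            break  # not a pattern this sketch knows how to fold
    return base, offset


defs = {
    "t1": ("add", "p", 8),
    "t2": ("sub", "t1", 4),
    "t3": ("add", 12, "t2"),
}
print(fold_base_offset("t3", 0, defs))  # ('p', 16)
```

Because Base-Const can drive the offset negative, the folded constant has to be printed as a signed value, which is consistent with the ConstantInteger::emit() change mentioned in this commit.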
-
Matt Wala authored
STR(inst) should be STR(cmp). BUG=none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/466543002
-
- 08 Aug, 2014 3 commits
-
-
Matt Wala authored
This is initial work necessary for diversification support in Subzero. The random number generator implementation is temporary. It will eventually use a cryptographically secure pseudorandom number generator (perhaps from LLVM, if LLVM gets one). Add the -rng-seed= option to seed the random number generator from the command line. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/455593004
-
Jim Stichnoth authored
The purpose is to enable bisection debugging of Subzero-translated functions, using objcopy to selectively splice functions from llc and Subzero into the binary. Note that llvm-mc claims to take this argument, but actually does nothing with it, so we need to implement it in Subzero. Also moves the ClFlags object into the GlobalContext so everyone can access it. BUG= none R=wala@chromium.org Review URL: https://codereview.chromium.org/455633002
-
Matt Wala authored
After the changes in CL 443203003, InstX8632Cbwdq fits the template for a UnaryOp, so change it to be an instance of this class. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/452143003
-
- 07 Aug, 2014 2 commits
-
-
Matt Wala authored
Implement scalarizeArithmetic() which extracts the components of the input vectors, performs the operation with scalar instructions, and builds the output vector component by component.

Fix the lowering of sdiv and srem. These were previously emitting a wrong instruction (cdq) for i8 and i16 inputs (needing cbw, cwd).

In the test_arith crosstest, mask the inputs to vector shift operations to ensure that the shifts are in range. Otherwise the Subzero output is not identical to the llc output in some (undefined) cases.
BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/443203003
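The scalarization strategy can be pictured with a short sketch. It models vectors as Python lists and the scalar instruction as a callable. The snake_case function name is an illustrative stand-in and not the signature of the real scalarizeArithmetic().

```python
import operator


def scalarize_arithmetic(scalar_op, lhs, rhs):
    """Apply scalar_op lane by lane and rebuild the result vector."""
    if len(lhs) != len(rhs):
        raise ValueError("vector operands must have the same length")
    result = [0] * len(lhs)
    for lane in range(len(lhs)):
        a = lhs[lane]                   # models extractelement on lhs
        b = rhs[lane]                   # models extractelement on rhs
        result[lane] = scalar_op(a, b)  # models insertelement
    return result


print(scalarize_arithmetic(operator.add, [1, 2, 3, 4], [10, 20, 30, 40]))
# [11, 22, 33, 44]
```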
-
Jim Stichnoth authored
1. Add 'llvm2ice -disable-globals' to disable Subzero translation of global initializers, since full support isn't yet implemented.
2. Change the names of intra-block branch target labels to avoid collisions with basic block labels.
3. Fix lowering of "br i1 <constant>, label ...", which was producing invalid instructions like "cmp 1, 0".
4. Fix the "make format-diff" operation, which was diffing against the wrong target.
BUG= none R=wala@chromium.org Review URL: https://codereview.chromium.org/449093002
-
- 05 Aug, 2014 1 commit
-
-
Jim Stichnoth authored
1. It turns out that the crosstest scripts mix different versions of clang - build_pnacl_ir.py uses pnacl-clang from the NaCl SDK for the tests, while crosstest.py uses clang/clang++ from LLVM_BIN_PATH for the driver. The SDK has been updated to use a different version of the standard library, and now there is a mismatch as to whether int8_t is typedef'd to 'char' or 'signed char', leading to name mangling mismatches. (char, signed char, and unsigned char are distinct types.) We deal with this by using myint8_t which is explicitly defined as signed char.
2. Some ugly function pointer casting in test_arith_main.cpp is fixed/removed.
3. std::endl is replaced with "\n".
4. License text is added to tests that were touched by the above items.
BUG= none R=wala@chromium.org Review URL: https://codereview.chromium.org/435353002
-
- 31 Jul, 2014 1 commit
-
-
Matt Wala authored
1. Much of the lowering code for vector operations was not properly checking that the input operand was in a register or memory. This problem could be exhibited by passing undef values as inputs.
   => Change the vector legalization code to legalize input operands to register or memory before producing instructions that use the operands. Also, append a suffix to the variable names in the vector legalization code to clarify the legalization status of the values.
2. Undef values should never be emitted directly. Rather, they should have been appropriately legalized to a zero value.
   => To enforce this, make ConstantUndef::emit() issue an error message. Do this in the x86 backend, as other backends may decide to treat undef values differently.
3. The regalloc_evict_non_overlap test was loading from an undef pointer. Subzero was not handling this correctly (the undef pointer was being emitted without being legalized), but it does not have to handle this case since PNaCl IR disallows undef pointers.
   => Fix the regalloc_evict_non_overlap test to use an inttoptr instead of directly loading from the undef pointer. Also, add an assert in IceTargetLoweringX8632::FormMemoryOperand() to make sure that undef pointers are never encountered.
BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/432613002
-
- 30 Jul, 2014 6 commits
-
-
Jim Stichnoth authored
Also cleans up some unneeded table size const static variables. BUG= https://codereview.chromium.org/296053008/ R=jvoung@chromium.org Review URL: https://codereview.chromium.org/428353002
-
Jim Stichnoth authored
Quiet some unused-variable warnings when their only use is in an assert(). Forward-declare partial template specializations when the template method already has a default implementation, to avoid ODR violations and link errors. BUG= https://codereview.chromium.org/296053008/ R=wala@chromium.org Review URL: https://codereview.chromium.org/429993002
-
Jan Voung authored
Speculative fix for Mac GCC build. BUG=none R=dschuff@chromium.org Review URL: https://codereview.chromium.org/432523002
-
Matt Wala authored
* Add initial support for code generation with SSE4.1 instructions. The following operations are affected:
  - multiplication with v4i32
  - select
  - insertelement
  - extractelement
* Add appropriate lit checks for SSE4.1 instructions. Run the crosstests in both SSE2 and SSE4.1 mode.
* Introduce the -mattr flag to llvm2ice to control which instruction set gets used.
BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/427843002
-
Jan Voung authored
Normally, the FakeUse for preserving the atomic load ends up on the load's Dest. However, for fused load+add, the load is deleted, and its Dest is no longer defined. This trips up the liveness analysis when it happens on a non-entry block. So the FakeUse should be for the add's dest instead, in that case.

We have no access to the add, so introduce a getLastInserted() helper. A couple of ways to do that:
- modify insert() to track explicitly
- rewind from Next one step

Either that, or we disable the fusing for atomic loads.
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/417353003
-
Derek Schuff authored
The mac build treats this as an error. R=stichnot@chromium.org Review URL: https://codereview.chromium.org/429253002
-
- 29 Jul, 2014 1 commit
-
-
Jan Voung authored
The cmpxchg instruction already sets ZF for comparing the return value vs the expected value. So there is no need to compare eq again. Lots of pexes-in-the-wild have this pattern. Some compare against a constant, some compare against a variable. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3882 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/413903002
-
- 28 Jul, 2014 2 commits
-
-
Jan Voung authored
(*) PNaCl toolchain_build builds 64-bit libraries for LLVM on Mac. That won't link with subzero code if subzero is built with -m32, so add an option to override the -m32.
(*) include locale header
(*) Mark xMacroIntegrityCheck unused to avoid clang compiler warning.
(*) virtual dtor, for inheritable class
(*) Mark compare function const
BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/428733003
-
Jim Stichnoth authored
Previously Ostream was a class that wrapped a raw_ostream pointer, structured that way in case we wanted to wrap an alternate stream type. Ostream also used to include a Cfg pointer, but that was removed when the Ostream became associated with the GlobalContext, which persists beyond the Cfg lifetime, leaving only the raw_ostream. Since llvm::raw_ostream is supposed to be very lightweight, we can just give up the abstraction and equate it to Ice::Ostream. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/413393005
-
- 25 Jul, 2014 1 commit
-
-
Matt Wala authored
This avoids using a pair of shufps instructions as the previous lowering was doing. Instead, we use movss to copy the element to be inserted into the lower 32 bits of the destination. Define InstX8632Movss as a Binop, the class to which it properly belongs. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/412353005
-
- 24 Jul, 2014 4 commits
-
-
Matt Wala authored
Most fcmp conditions map directly to single x86 instructions. For these, the lowering is table driven. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/413053002
-
Matt Wala authored
Select of vectors is implemented by appropriately masking and combining the inputs with sign extend / bitwise operations and without the use of branches. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/417653004
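The branch-free masking idea can be sketched on 32-bit lanes as follows. The function name and the list-based vector model are assumptions for illustration. The real lowering operates on x86 vector registers and is not shown here.

```python
LANE_MASK = 0xFFFFFFFF  # one 32-bit lane


def select_lanes(condition, if_true, if_false):
    """Per-lane select using only sign extension and bitwise operations."""
    result = []
    for c, t, f in zip(condition, if_true, if_false):
        # Sign-extend the i1 condition: 1 -> 0xFFFFFFFF, 0 -> 0x00000000.
        mask = -int(bool(c)) & LANE_MASK
        # Keep if_true where the mask is set and if_false elsewhere.
        result.append((t & mask) | (f & ~mask & LANE_MASK))
    return result


print(select_lanes([1, 0, 1, 0], [1, 2, 3, 4], [10, 20, 30, 40]))
# [1, 20, 3, 40]
```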
-
Matt Wala authored
Change TotalTests so that the test count matches up with the number of recorded passes and failures. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/415803004
-
Jim Stichnoth authored
We don't need/want to evict an inactive live range when it doesn't overlap with the live range currently being considered. This is especially important for Variables representing scratch registers that are killed by call instructions. These register assignments should obviously never be evicted. Note that the algorithm that computes the min-weight register to evict doesn't consider inactive and non-overlapping live ranges. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3903 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/417933004
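The overlap rule can be illustrated with a toy model in which a live range is a list of half-open (start, end) segments. The function names, the dictionary layout, and the variable names in the example are hypothetical. Subzero's actual linear-scan data structures are not reproduced here.

```python
def overlaps(range_a, range_b):
    """True if any half-open segment of range_a intersects range_b."""
    return any(
        start_a < end_b and start_b < end_a
        for start_a, end_a in range_a
        for start_b, end_b in range_b
    )


def eviction_candidates(current, active, inactive):
    """Names of live ranges that may be evicted in favor of current.

    active and inactive map a variable name to its live range.
    Active ranges are always candidates; inactive ranges are candidates
    only when they overlap the range currently being considered.
    """
    candidates = list(active)
    candidates += [
        name for name, live_range in inactive.items()
        if overlaps(live_range, current)
    ]
    return candidates


current = [(10, 20)]
active = {"a": [(0, 15)]}
inactive = {"scratch": [(0, 5), (30, 35)], "b": [(0, 3), (12, 18)]}
print(eviction_candidates(current, active, inactive))  # ['a', 'b']
```

In the example, "scratch" is inactive and does not overlap the current range, so it is never offered for eviction.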
-
- 23 Jul, 2014 2 commits
-
-
Matt Wala authored
SSE2 only has signed integer comparison. Unsigned compares are implemented by inverting the sign bits of the operands and doing a signed compare. A common pattern in clang generated IR is a vector compare which generates an i1 vector followed by a sign extension of the result of the compare. The x86 comparison instructions already generate sign extended values, so we can eliminate unnecessary sext operations that follow compares in the IR. BUG=none R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/412593002
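The sign-bit trick for unsigned comparison can be checked with a scalar sketch on 32-bit values. The helper names are illustrative, and the vector lowering itself is not shown.

```python
SIGN_BIT = 0x80000000


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & SIGN_BIT else value


def unsigned_gt_via_signed(a, b):
    """Unsigned a > b computed with a signed compare.

    Flipping the sign bit of both operands maps the unsigned order onto
    the signed order, so a signed greater-than gives the unsigned result.
    """
    return to_signed32(a ^ SIGN_BIT) > to_signed32(b ^ SIGN_BIT)


samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
for a in samples:
    for b in samples:
        assert unsigned_gt_via_signed(a, b) == (a > b)
```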
-
Jim Stichnoth authored
BUG= none R=wala@chromium.org Review URL: https://codereview.chromium.org/415583003
-