- 11 Jun, 2015 4 commits
-
-
John Porto authored
Adjusts the expected unittest output. BUG= None R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1173353003.
-
Jim Stichnoth authored
BUG= https://code.google.com/p/nativeclient/issues/detail?id=4167 Move issue https://codereview.chromium.org/1159823004/ here so that it's under the proper email. Review URL: https://codereview.chromium.org/1169533003
-
Jim Stichnoth authored
1. The data symbol __Sz_block_profile_info should never be mangled (for cross tests), similar to runtime helper calls. Add a SuppressMangling override for such variable declarations. 2. When cross tests contain more than one translated object file, we end up with multiple definitions of __Sz_block_profile_info . Work around this by making that symbol weak. 3. Don't try to attach global inits to an EmitterWorkItem that represents a translation error. 4. Update one lit test to reflect the additional profiling value in the data section. 5. Update one lit test to reflect that global initializers are emitted at the end instead of the beginning. The check-unit test is still broken and will be fixed in a separate CL. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/1180883002
-
John Porto authored
Initializes IceAssembler::Allocator before IceAssembler::Buffer. BUG= None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1177843006.
-
- 10 Jun, 2015 2 commits
-
-
John Porto authored
Renames the assembler* files to IceAssembler*. Fixes whatever breaks. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4077 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1179563004.
-
John Porto authored
BUG= None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1147023007.
-
- 08 Jun, 2015 1 commit
-
-
Karl Schimpf authored
Simplify the munging unit tests to follow the new NaCl utilities for munging tests. Note that this CL takes advantage of changes added by CL https://codereview.chromium.org/1140153004 BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1149423011
-
- 05 Jun, 2015 4 commits
-
-
John Porto authored
Merge branch 'master' of https://chromium.googlesource.com/native_client/pnacl-subzero into subzero-ownership
-
John Porto authored
BUG= None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1149213006
-
John Porto authored
-
Jan Voung authored
Sext, etc. usually uses shifts (especially for i1 and i64) so implement shift, then implement those casts. Implement just enough of bitcast to handle accessing global addresses (used by some tests). Otherwise, most other bitcasts are from GPR to FP and FP regs aren't modeled yet. Generally following the GCC style for 64-bit shifts. This takes advantage of the flexible second operand in a "orr", and takes advantage of the shift-beyond bitwidth saturation. LLVM is almost the same, but only seems to take advantage on one side of the 32-bits, not the other side. Should really get some of the execution tests running to test this behavior! Fix InstARM32Str::dump(). Str doesn't have a Dest, so use Src. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1143323013
-
- 04 Jun, 2015 2 commits
-
-
Jim Stichnoth authored
Previously, the legalize() function would always force a floating point constant into an xmm register before it could be used in an instruction. This uses an extra register unnecessarily when the instruction allows a memory operand for that operand. We improve this by lowering the FP constant operand to an OperandX8632Mem that wraps a ConstantRelocatable representing the label for the constant pool entry, e.g. [.L$float$0]. (This may end up being copied into an xmm register if the instruction doesn't allow a memory operand for that operand.) BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1163943005
-
Jan Voung authored
The input object may be a QueueStreamer, which the compile server will still have a reference to (even though downstream the memory object API and parser API thinks it has a unique_ptr). Terminate the thread quickly on error, instead of free'ing and causing a use-after-free. Also set up a report_fatal_error handler which has access to the server's state. This allows the server to record the error and stop pushing bytes to the QueueStreamer. Otherwise the QueueStreamer can get full without a consumer still active to unblock. Unfortunately the fatal error handler only terminates the current thread, and not all worker threads. NaCl doesn't have support for signals or pthread_kill. E.g., with pthread_kill(std_thread.native_handle(), SIGABRT). So, other worker/emitter threads will have to hang waiting on more input or something. Random clang-format edits from 3.7. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4163 TEST= tbd: I manually ran the translator a dummy text file (invalid bitcode header), and observed that this no longer crashes. Instead the SRPC calls finish and I see: 3> [17812,4147750656:14:23:02.025382] Streaming file at 100000 bps [17812,4147750656:14:23:12.511574] RPC call failed: Rpc application returned an error. [17812,4147750656:14:23:12.511625] StreamChunk failed [17812,4147750656:14:23:12.511655] stream_file: SendDataChunk failed, but returning without failing. Expect call to StreamEnd.4> rpc call initiated StreamEnd::isss [17812,4147750656:14:23:12.511931] RPC call failed: Rpc application returned an error. rpc call complete StreamEnd::isss output 0: i(0) output 1: s("") output 2: s("") output 3: s("Invalid PNaCl bitcode header") [17812,4147750656:14:23:12.512102] Command [rpc] failed. R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1168543002
-
- 03 Jun, 2015 2 commits
-
-
Jim Stichnoth authored
This is turned into a separate (O2-only) pass that looks for opportunities: 1. A Load instruction, or an AtomicLoad intrinsic that would be lowered just like a Load instruction 2. Followed immediately by an instruction with a whitelisted kind that uses the Load dest variable as one of its operands 3. Where the whitelisted instruction ends the live range of the Load dest variable. In such cases, the original two instructions are deleted and a new instruction is added that folds the load into the whitelisted instruction. We also do some work to splice the liveness information (Inst::LiveRangesEnded and Inst::isLastUse()) into the new instruction, so that the target lowering pass might still take advantage. Currently this is used quite sparingly, but in the future we could use that along with operator commutativity to choose among different lowering sequences to reduce register pressure. The whitelisted instruction kinds are chosen based primarily on whether the main operation's native instruction can use a memory operand - e.g., arithmetic (add/sub/imul/etc), compare (cmp/ucomiss), cast (movsx/movzx/etc). Notably, call and ret are not included because arg passing is done through simple assignments which normal lowering is sufficient for. BUG= none R=jvoung@chromium.org, mtrofin@chromium.org Review URL: https://codereview.chromium.org/1169493002
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/1162903003
-
- 02 Jun, 2015 1 commit
-
-
Jan Voung authored
Thought leaving "mov" simple and not handle memory operands, but then we'd have to duplicate some of the lowerAssign code for lowerLoad =/ BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1152703006
-
- 01 Jun, 2015 4 commits
-
-
Jim Stichnoth authored
1. Change Makefile.standalone from 3.6 to 3.7. 2. Update to new load instruction .ll syntax. This includes changing InstLoad::dump() to match. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1161543005
-
Jim Stichnoth authored
BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/1161353002
-
Jan Voung authored
Split out some of the addProlog code from x86 and reuse that for ARM. Mainly, the code that doesn't concern preserved registers or stack arguments is split out. ARM push and pop take a whole list of registers (not necessarily consecutive, but should be in ascending order). There is also "vpush" for callee-saved float/vector registers but we do not handle that yet (the register numbers for that have to be consecutive). Enable some of the int-arg.ll tests, which relied on addPrologue's finishArgumentLowering to pull from the correct argument stack slot. Test some of the frame pointer usage (push/pop) when handling a variable sized alloca. Also change the classification of LR, and PC so that they are not "CalleeSave". We don't want to push LR if it isn't overwritten by another call. It will certainly be "used" by the return however. The prologue code only checks if a CalleeSave register is used somewhere before deciding to preserve it. We could make that stricter and check if the register is also written to, but there are some additional writes that are not visible till after the push/pop are generated (e.g., copy from argument stack slot to the argument register). Instead, keep checking use only, and handle LR as a special case (IsLeafFunction). BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1159013002
-
Jim Stichnoth authored
This is similar to the way a load instruction may be folded into the next arithmetic instruction. Usually the effect is to improve a sequence like: mov ax, WORD PTR [mem] movsx eax, ax into this: movsx eax, WORD PTR [mem] without actually improving register allocation, though other kinds of casts may have different improvements. Existing tests needed to be fixed when they "inadvertently" did a cast to i32 return type and triggered the optimization when it wasn't wanted. These were fixed by inserting a "dummy" instruction between the load and the cast. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1152783006
-
- 27 May, 2015 3 commits
-
-
Jan Voung authored
So far we've been using ldr/str (32-bit) to load/store the whole stack slot, independent of the variable type. Toggle on some tests that didn't have an Om1 variant previously. Didn't toggle everything since there are still some problems with liveness from code being unimplemented. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1144923008
-
Jim Stichnoth authored
It turns out that code deleted in 9a05aea8 actually had a legitimate purpose, so it is added back, this time with more extensive comments justifying it. Also, takes the instruction's IsDestNonKillable flag into account when updating the live register usage count (along with extra comments on why that is necessary). Furthermore, removes an unnecessary assert that otherwise fails when --asm-verbose is used with --filetype=iasm or --filetype-obj. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1158113002
-
Jan Voung authored
Might have gotten replaced by some other field, but don't quite remember. Spotted while looking for ways to share the addProlog() code between targets. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1158713005
-
- 26 May, 2015 2 commits
-
-
Jim Stichnoth authored
Fixes a bug where a num-uses counter wasn't being updated because of C operator && semantics. The code was something like "if (A && --B) ..." but we want --B to happen even when A is false. Sorts the LiveIn and LiveOut lists by regnum so that the lists always display the set of registers in a consistent/familiar order. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1152813003
-
Jan Voung authored
Lower alloca in a way similar to x86. Subtract the stack and align if needed, then copy that stack address to dest. Sometimes use "bic" for the mask, sometimes use "and", depending on what fits better. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1156713003
-
- 22 May, 2015 2 commits
-
-
Jan Voung authored
Allow instructions to be predicated and use that in lower icmp and branch. Tracking the predicate for almost every instruction is a bit overkill, but technically possible. Add that to most of the instruction constructors except ret and call for now. This doesn't yet do compare + branch fusing, but it does handle the branch fallthrough to avoid branching twice. I can't yet test 8bit and 16bit, since those come from "trunc" and "trunc" is not lowered yet (or load, which also isn't handled yet). Adds basic "call(void)" lowering, just to get the call markers showing up in tests. 64bit.pnacl.ll no longer explodes with liveness consistency errors, so risk running that and backfill some of the 64bit arith tests. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1151663004
-
Jim Stichnoth authored
For C/C++ semantics, this applies to all the FP comparisons except == and != which require two comparisons due to ordered/unordered requirements. For == and !=, two comparisons and control flow are still used. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 TEST= crosstest/test_fcmp R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1148023003
-
- 19 May, 2015 2 commits
-
-
Jan Voung authored
Do basic lowering for add, sub, and, or, xor, mul. We don't yet take advantage of commuting immediate operands (e.g., use rsb to reverse subtract instead of sub) or inverting immediate operands (use bic to bit clear instead of using and). The binary operations can set the flags register (e.g., to have the carry bit for use with a subsequent adc instruction). That is optional for the "data processing" instructions. I'm not yet able to compile 8bit.pnacl.ll and 64bit.pnacl.ll so 8-bit and 64-bit are not well tested yet. Only tests are in the arith.ll file (like arith-opt.ll, but assembled instead of testing the "verbose inst" output). Not doing divide yet. ARM divide by 0 does not trap, but PNaCl requires uniform behavior for such bad code. Thus, in LLVM we insert a 0 check and would have to do the same. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1127003003
-
Jim Stichnoth authored
This is instead of explicit control flow which may interfere with branch prediction. However, explicit control flow is still needed for types other than i16 and i32, due to cmov limitations. The assembler for cmov is extended to allow the non-dest operand to be a memory operand. The select lowering is getting large enough that it was in our best interest to combine the default lowering with the bool-folding optimization. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1125323004
-
- 18 May, 2015 1 commit
-
-
Jan Voung authored
Adds basic assignment instructions, mov, movn, movw, movt, ldr, etc. in order to copy around the first few integer (i32, i64) arguments out of r0 - r3, and then return then. The "mov" instruction is a bit special and can actually be a "str" when the dest is a stack slot. Model the Memory operand types, and the "flexible Operand2". Add a few tests demonstrating the flexibility of the immediate encoding. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1127963004
-
- 17 May, 2015 1 commit
-
-
Jim Stichnoth authored
Originally there was a peephole-style optimization in lowerIcmp() that looks ahead to see if the next instruction is a conditional branch with the right properties, and if so, folds the icmp and br into a single lowering sequence. However, sometimes extra instructions come between the icmp and br instructions, disabling the folding even though it would still be possible. One thought is to do the folding inside lowerBr() instead of lowerIcmp(), by looking backward for a suitable icmp instruction. The problem here is that the icmp lowering code may leave lowered instructions that can't easily be dead-code eliminated, e.g. instructions lacking a dest variable. Instead, before lowering a basic block, we do a prepass on the block to identify folding candidates. For the icmp/br example, the prepass would tentatively delete the icmp instruction and then the br lowering would fold in the icmp. This folding can also be extended to several producers: icmp (i32 operands), icmp (i64 operands), fcmp, trunc .. to i1 and several consumers: br, select, sext, zext This CL starts with 2 combinations: icmp32 paired with br & select. Other combinations will be added in later CLs. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4162 BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1141213004
-
- 16 May, 2015 1 commit
-
-
Jan Voung authored
Previously it would print both the targets compiled-in and the target requested on the commandline, but we really only care about what's compiled-in. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1132883003
-
- 14 May, 2015 1 commit
-
-
Jan Voung authored
Wasn't sure how to allow TargetX8632 and TargetARM32 to both define "ConstantInteger32::emit(GlobalContext *)", and define them differently if both targets happen to be ifdef'ed into the code. Rearranged things so that it's now "TargetFoo::emit(ConstantInteger32 *)", so that each TargetFoo can have a separate definition. Some targets may allow emitting some types of constants while other targets do not (64-bit int for x86-64?). Also they emit constants with a different style. E.g., the prefix for x86 is "$" while the prefix for ARM is "#" and there isn't a prefix for mips(?). Renamed emitWithoutDollar to emitWithoutPrefix. Did this sort of multi-method dispatch via a visitor pattern, which is a bit verbose though. We may be able to remove the emitWithoutDollar/Prefix for ConstantPrimitive by just inlining that into the few places that need it (only needed for ConstantInteger32). This undoes the unreachable methods added by: https://codereview.chromium.org/1017373002/diff/60001/src/IceTargetLoweringX8632.cpp The only place extra was for emitting calls to constants. There was already an inlined instance for OperandX8632Mem. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1129263005
-
- 12 May, 2015 2 commits
-
-
Jan Voung authored
Modify run-pnacl-sz to pass in the correct assembler/disasembler flags for ARM when not using the integrated assembler. Model the "ret" pseudo instruction (special form of "bx" inst). Separate from "bx" to allow epilogue insertion to find the terminator. Add a flag "--skip-unimplemented" to skip through all of the "Not yet implemented" assertions, and use that in the test. Set up a stack trace printer when ALLOW_DUMP so that the UnimplementedError prints out some useful information of *which* case is unimplemented. Change the .type ...,@function from @function to %function. ARM assembler seems to only like %function because "@" is a comment character. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1136793002
-
Karl Schimpf authored
Dependent on https://codereview.chromium.org/1122423005 being committed first. BUG=https://code.google.com/p/nativeclient/issues/detail?id=4164 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1130313002
-
- 07 May, 2015 1 commit
-
-
Jim Stichnoth authored
The original code for 64-bit icmp lowering had separate cases for eq/ne versus other conditions, mostly because eq/ne need two 32-bit comparisons while the others need three. However, with small changes, we can handle everything uniformly, simplifying the code. This gets thoroughly tested by the test_icmp cross test. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4162 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1130023002
-
- 04 May, 2015 1 commit
-
-
Jim Stichnoth authored
For an example like: %a = icmp eq i32 %b, %c The original icmp lowering sequence for i8/i16/i32 was something like: cmpl b, c movb 1, a je label movb 0, a label: The improved sequence is: cmpl b, c sete a In O2 mode, this doesn't help when successive compare/branch instructions are fused, but it does help when the boolean result needs to be saved and later used. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1118353005
-
- 30 Apr, 2015 3 commits
-
-
Karl Schimpf authored
Fixes code to follow new editing constants in class NaClMungedBitcode rather than obsolete editing constants in class NaClBitcodeMunger. BUG=None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1120853002
-
Jan Voung authored
This is to conditionally (ifdef) include only the enabled target assemblers. Also rename the assembler's "x86" namespace to "X8632" for similar reasons. The namespace was created to hide generic sounding classes like "Address" which are used all over the assembler. Plop the somewhat empty AssemblerARM32 in an ARM32 namespace for consistency. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1114223002
-
Jim Stichnoth authored
It's sometimes useful to know whether a use of a stack variable (as opposed to a physical register) is the last use of that variable. For example, in a code sequence like: movl %edx, 24(%esp) movl 24(%esp), %edx it would be nice to know whether the code sequence is merely bad (i.e., 24(%esp) will be used later), or horrible (i.e., this ends 24(%esp)'s live range). We add stack variables to the per-instruction live-range-end annotation, but not to the per-block live-in and live-out annotations, because the latter would clutter the output greatly while adding very little actionable information. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4135 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1113133002
-