- 04 Jun, 2015 1 commit
-
-
Jan Voung authored
The input object may be a QueueStreamer, which the compile server will still have a reference to (even though downstream the memory object API and parser API thinks it has a unique_ptr). Terminate the thread quickly on error, instead of free'ing and causing a use-after-free. Also set up a report_fatal_error handler which has access to the server's state. This allows the server to record the error and stop pushing bytes to the QueueStreamer. Otherwise the QueueStreamer can get full without a consumer still active to unblock. Unfortunately the fatal error handler only terminates the current thread, and not all worker threads. NaCl doesn't have support for signals or pthread_kill. E.g., with pthread_kill(std_thread.native_handle(), SIGABRT). So, other worker/emitter threads will have to hang waiting on more input or something. Random clang-format edits from 3.7. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4163 TEST= tbd: I manually ran the translator a dummy text file (invalid bitcode header), and observed that this no longer crashes. Instead the SRPC calls finish and I see: 3> [17812,4147750656:14:23:02.025382] Streaming file at 100000 bps [17812,4147750656:14:23:12.511574] RPC call failed: Rpc application returned an error. [17812,4147750656:14:23:12.511625] StreamChunk failed [17812,4147750656:14:23:12.511655] stream_file: SendDataChunk failed, but returning without failing. Expect call to StreamEnd.4> rpc call initiated StreamEnd::isss [17812,4147750656:14:23:12.511931] RPC call failed: Rpc application returned an error. rpc call complete StreamEnd::isss output 0: i(0) output 1: s("") output 2: s("") output 3: s("Invalid PNaCl bitcode header") [17812,4147750656:14:23:12.512102] Command [rpc] failed. R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1168543002
-
- 03 Jun, 2015 2 commits
-
-
Jim Stichnoth authored
This is turned into a separate (O2-only) pass that looks for opportunities: 1. A Load instruction, or an AtomicLoad intrinsic that would be lowered just like a Load instruction 2. Followed immediately by an instruction with a whitelisted kind that uses the Load dest variable as one of its operands 3. Where the whitelisted instruction ends the live range of the Load dest variable. In such cases, the original two instructions are deleted and a new instruction is added that folds the load into the whitelisted instruction. We also do some work to splice the liveness information (Inst::LiveRangesEnded and Inst::isLastUse()) into the new instruction, so that the target lowering pass might still take advantage. Currently this is used quite sparingly, but in the future we could use that along with operator commutativity to choose among different lowering sequences to reduce register pressure. The whitelisted instruction kinds are chosen based primarily on whether the main operation's native instruction can use a memory operand - e.g., arithmetic (add/sub/imul/etc), compare (cmp/ucomiss), cast (movsx/movzx/etc). Notably, call and ret are not included because arg passing is done through simple assignments which normal lowering is sufficient for. BUG= none R=jvoung@chromium.org, mtrofin@chromium.org Review URL: https://codereview.chromium.org/1169493002
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/1162903003
-
- 02 Jun, 2015 1 commit
-
-
Jan Voung authored
Thought leaving "mov" simple and not handle memory operands, but then we'd have to duplicate some of the lowerAssign code for lowerLoad =/ BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1152703006
-
- 01 Jun, 2015 4 commits
-
-
Jim Stichnoth authored
1. Change Makefile.standalone from 3.6 to 3.7. 2. Update to new load instruction .ll syntax. This includes changing InstLoad::dump() to match. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1161543005
-
Jim Stichnoth authored
BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/1161353002
-
Jan Voung authored
Split out some of the addProlog code from x86 and reuse that for ARM. Mainly, the code that doesn't concern preserved registers or stack arguments is split out. ARM push and pop take a whole list of registers (not necessarily consecutive, but should be in ascending order). There is also "vpush" for callee-saved float/vector registers but we do not handle that yet (the register numbers for that have to be consecutive). Enable some of the int-arg.ll tests, which relied on addPrologue's finishArgumentLowering to pull from the correct argument stack slot. Test some of the frame pointer usage (push/pop) when handling a variable sized alloca. Also change the classification of LR, and PC so that they are not "CalleeSave". We don't want to push LR if it isn't overwritten by another call. It will certainly be "used" by the return however. The prologue code only checks if a CalleeSave register is used somewhere before deciding to preserve it. We could make that stricter and check if the register is also written to, but there are some additional writes that are not visible till after the push/pop are generated (e.g., copy from argument stack slot to the argument register). Instead, keep checking use only, and handle LR as a special case (IsLeafFunction). BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1159013002
-
Jim Stichnoth authored
This is similar to the way a load instruction may be folded into the next arithmetic instruction. Usually the effect is to improve a sequence like: mov ax, WORD PTR [mem] movsx eax, ax into this: movsx eax, WORD PTR [mem] without actually improving register allocation, though other kinds of casts may have different improvements. Existing tests needed to be fixed when they "inadvertently" did a cast to i32 return type and triggered the optimization when it wasn't wanted. These were fixed by inserting a "dummy" instruction between the load and the cast. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1152783006
-
- 27 May, 2015 3 commits
-
-
Jan Voung authored
So far we've been using ldr/str (32-bit) to load/store the whole stack slot, independent of the variable type. Toggle on some tests that didn't have an Om1 variant previously. Didn't toggle everything since there are still some problems with liveness from code being unimplemented. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1144923008
-
Jim Stichnoth authored
It turns out that code deleted in 9a05aea8 actually had a legitimate purpose, so it is added back, this time with more extensive comments justifying it. Also, takes the instruction's IsDestNonKillable flag into account when updating the live register usage count (along with extra comments on why that is necessary). Furthermore, removes an unnecessary assert that otherwise fails when --asm-verbose is used with --filetype=iasm or --filetype-obj. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1158113002
-
Jan Voung authored
Might have gotten replaced by some other field, but don't quite remember. Spotted while looking for ways to share the addProlog() code between targets. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1158713005
-
- 26 May, 2015 2 commits
-
-
Jim Stichnoth authored
Fixes a bug where a num-uses counter wasn't being updated because of C operator && semantics. The code was something like "if (A && --B) ..." but we want --B to happen even when A is false. Sorts the LiveIn and LiveOut lists by regnum so that the lists always display the set of registers in a consistent/familiar order. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1152813003
-
Jan Voung authored
Lower alloca in a way similar to x86. Subtract the stack and align if needed, then copy that stack address to dest. Sometimes use "bic" for the mask, sometimes use "and", depending on what fits better. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1156713003
-
- 22 May, 2015 2 commits
-
-
Jan Voung authored
Allow instructions to be predicated and use that in lower icmp and branch. Tracking the predicate for almost every instruction is a bit overkill, but technically possible. Add that to most of the instruction constructors except ret and call for now. This doesn't yet do compare + branch fusing, but it does handle the branch fallthrough to avoid branching twice. I can't yet test 8bit and 16bit, since those come from "trunc" and "trunc" is not lowered yet (or load, which also isn't handled yet). Adds basic "call(void)" lowering, just to get the call markers showing up in tests. 64bit.pnacl.ll no longer explodes with liveness consistency errors, so risk running that and backfill some of the 64bit arith tests. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1151663004
-
Jim Stichnoth authored
For C/C++ semantics, this applies to all the FP comparisons except == and != which require two comparisons due to ordered/unordered requirements. For == and !=, two comparisons and control flow are still used. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 TEST= crosstest/test_fcmp R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1148023003
-
- 19 May, 2015 2 commits
-
-
Jan Voung authored
Do basic lowering for add, sub, and, or, xor, mul. We don't yet take advantage of commuting immediate operands (e.g., use rsb to reverse subtract instead of sub) or inverting immediate operands (use bic to bit clear instead of using and). The binary operations can set the flags register (e.g., to have the carry bit for use with a subsequent adc instruction). That is optional for the "data processing" instructions. I'm not yet able to compile 8bit.pnacl.ll and 64bit.pnacl.ll so 8-bit and 64-bit are not well tested yet. Only tests are in the arith.ll file (like arith-opt.ll, but assembled instead of testing the "verbose inst" output). Not doing divide yet. ARM divide by 0 does not trap, but PNaCl requires uniform behavior for such bad code. Thus, in LLVM we insert a 0 check and would have to do the same. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1127003003
-
Jim Stichnoth authored
This is instead of explicit control flow which may interfere with branch prediction. However, explicit control flow is still needed for types other than i16 and i32, due to cmov limitations. The assembler for cmov is extended to allow the non-dest operand to be a memory operand. The select lowering is getting large enough that it was in our best interest to combine the default lowering with the bool-folding optimization. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1125323004
-
- 18 May, 2015 1 commit
-
-
Jan Voung authored
Adds basic assignment instructions, mov, movn, movw, movt, ldr, etc. in order to copy around the first few integer (i32, i64) arguments out of r0 - r3, and then return then. The "mov" instruction is a bit special and can actually be a "str" when the dest is a stack slot. Model the Memory operand types, and the "flexible Operand2". Add a few tests demonstrating the flexibility of the immediate encoding. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1127963004
-
- 17 May, 2015 1 commit
-
-
Jim Stichnoth authored
Originally there was a peephole-style optimization in lowerIcmp() that looks ahead to see if the next instruction is a conditional branch with the right properties, and if so, folds the icmp and br into a single lowering sequence. However, sometimes extra instructions come between the icmp and br instructions, disabling the folding even though it would still be possible. One thought is to do the folding inside lowerBr() instead of lowerIcmp(), by looking backward for a suitable icmp instruction. The problem here is that the icmp lowering code may leave lowered instructions that can't easily be dead-code eliminated, e.g. instructions lacking a dest variable. Instead, before lowering a basic block, we do a prepass on the block to identify folding candidates. For the icmp/br example, the prepass would tentatively delete the icmp instruction and then the br lowering would fold in the icmp. This folding can also be extended to several producers: icmp (i32 operands), icmp (i64 operands), fcmp, trunc .. to i1 and several consumers: br, select, sext, zext This CL starts with 2 combinations: icmp32 paired with br & select. Other combinations will be added in later CLs. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4162 BUG= https://code.google.com/p/nativeclient/issues/detail?id=4095 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1141213004
-
- 16 May, 2015 1 commit
-
-
Jan Voung authored
Previously it would print both the targets compiled-in and the target requested on the commandline, but we really only care about what's compiled-in. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1132883003
-
- 14 May, 2015 1 commit
-
-
Jan Voung authored
Wasn't sure how to allow TargetX8632 and TargetARM32 to both define "ConstantInteger32::emit(GlobalContext *)", and define them differently if both targets happen to be ifdef'ed into the code. Rearranged things so that it's now "TargetFoo::emit(ConstantInteger32 *)", so that each TargetFoo can have a separate definition. Some targets may allow emitting some types of constants while other targets do not (64-bit int for x86-64?). Also they emit constants with a different style. E.g., the prefix for x86 is "$" while the prefix for ARM is "#" and there isn't a prefix for mips(?). Renamed emitWithoutDollar to emitWithoutPrefix. Did this sort of multi-method dispatch via a visitor pattern, which is a bit verbose though. We may be able to remove the emitWithoutDollar/Prefix for ConstantPrimitive by just inlining that into the few places that need it (only needed for ConstantInteger32). This undoes the unreachable methods added by: https://codereview.chromium.org/1017373002/diff/60001/src/IceTargetLoweringX8632.cpp The only place extra was for emitting calls to constants. There was already an inlined instance for OperandX8632Mem. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1129263005
-
- 12 May, 2015 2 commits
-
-
Jan Voung authored
Modify run-pnacl-sz to pass in the correct assembler/disasembler flags for ARM when not using the integrated assembler. Model the "ret" pseudo instruction (special form of "bx" inst). Separate from "bx" to allow epilogue insertion to find the terminator. Add a flag "--skip-unimplemented" to skip through all of the "Not yet implemented" assertions, and use that in the test. Set up a stack trace printer when ALLOW_DUMP so that the UnimplementedError prints out some useful information of *which* case is unimplemented. Change the .type ...,@function from @function to %function. ARM assembler seems to only like %function because "@" is a comment character. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1136793002
-
Karl Schimpf authored
Dependent on https://codereview.chromium.org/1122423005 being committed first. BUG=https://code.google.com/p/nativeclient/issues/detail?id=4164 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1130313002
-
- 07 May, 2015 1 commit
-
-
Jim Stichnoth authored
The original code for 64-bit icmp lowering had separate cases for eq/ne versus other conditions, mostly because eq/ne need two 32-bit comparisons while the others need three. However, with small changes, we can handle everything uniformly, simplifying the code. This gets thoroughly tested by the test_icmp cross test. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4162 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1130023002
-
- 04 May, 2015 1 commit
-
-
Jim Stichnoth authored
For an example like: %a = icmp eq i32 %b, %c The original icmp lowering sequence for i8/i16/i32 was something like: cmpl b, c movb 1, a je label movb 0, a label: The improved sequence is: cmpl b, c sete a In O2 mode, this doesn't help when successive compare/branch instructions are fused, but it does help when the boolean result needs to be saved and later used. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1118353005
-
- 30 Apr, 2015 3 commits
-
-
Karl Schimpf authored
Fixes code to follow new editing constants in class NaClMungedBitcode rather than obsolete editing constants in class NaClBitcodeMunger. BUG=None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1120853002
-
Jan Voung authored
This is to conditionally (ifdef) include only the enabled target assemblers. Also rename the assembler's "x86" namespace to "X8632" for similar reasons. The namespace was created to hide generic sounding classes like "Address" which are used all over the assembler. Plop the somewhat empty AssemblerARM32 in an ARM32 namespace for consistency. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1114223002
-
Jim Stichnoth authored
It's sometimes useful to know whether a use of a stack variable (as opposed to a physical register) is the last use of that variable. For example, in a code sequence like: movl %edx, 24(%esp) movl 24(%esp), %edx it would be nice to know whether the code sequence is merely bad (i.e., 24(%esp) will be used later), or horrible (i.e., this ends 24(%esp)'s live range). We add stack variables to the per-instruction live-range-end annotation, but not to the per-block live-in and live-out annotations, because the latter would clutter the output greatly while adding very little actionable information. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4135 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1113133002
-
- 29 Apr, 2015 1 commit
-
-
Jim Stichnoth authored
The "pnacl-sz --asm-verbose=1" mode annotates the asm output with physical register liveness information, including which registers are live at the beginning and end of each basic block, and which registers' live ranges end at each instruction. Computing this information requires a final liveness analysis pass. One of the side effects of liveness analysis is to remove dead instructions, which happens when the instruction's dest variable is not live and the instruction lacks important side effects. In some cases, direct manipulation of physical registers was missing extra fakedef/fakeuse/etc., and as as result these instructions could be eliminated, leading to incorrect code. Without --asm-verbose, these instructions were being created after the last run of liveness analysis, so they had no chance of being eliminated and everything was fine. But with --asm-verbose, some instructions would be eliminated. This CL fixes the omissions so that the resulting code is runnable. An alternative would be to add a flag to liveness analysis directing it not to dead-code eliminate any more instructions. However, it's better to get the liveness right in case future late-stage optimizations rely on it. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4135 TEST= pydir/szbuild_spec2k.py --filetype=asm -v --sz=--asm-verbose=1 --force R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/1113683002
-
- 28 Apr, 2015 1 commit
-
-
Jim Stichnoth authored
In an earlier version of Subzero, the text output stream object was stack-allocated within main. A later refactoring moved its allocation into a helper function, but it was still being stack-allocated, which was bad when the helper function returned. This change allocates the object via "new", which fixes that problem, but reveals another problem: the raw_ostream object for some reason doesn't finish writing everything to disk and yielding a truncated output file. This is solved in the style of the ELF streamer, by using raw_fd_ostream instead. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/1111603003
-
- 22 Apr, 2015 2 commits
-
-
Karl Schimpf authored
Adds a notion of an (optional) error stream to the existing log and emit streams. If not specified, the log stream is used. Error messages in parser/translation are sent to this new error stream. In the browser compiler server, a separate error (string) stream is created to capture errors. Method onEndCallBack returns the contents of the error stream (if non-empty) instead of a generic error message. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4138 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1052833003
-
Jan Voung authored
Later commits will add more information, but this tests the conditional compilation and build setup. One way to do conditional compilation: determine this early, at LLVM configure/CMake time. Configure will fill in the template of SZTargets.def.in to get a SZTargets.def file. LLVM change: https://codereview.chromium.org/1084753002/ NaCl change: https://codereview.chromium.org/1082953002/ I suppose an alternative is to fill in the .def file via -D flags in CXXFLAGS. For conditional lit testing, pnacl-sz dumps the attributes when given the --build-atts so we just build on top of that. We do that instead of go the LLVM way of filling in a lit.site.cfg.in -> lit.site.cfg at configure/CMake time. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4076 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1075363002
-
- 21 Apr, 2015 2 commits
-
-
Jim Stichnoth authored
If you switch between "cmake" and "autoconf" toolchain builds, and neglect to clean out pnacl_newlib_raw/ in between, the wrong libgtest and libgtest_main may get pulled in for the autoconf build, leading to an assertion failure in "make check-unit". This tweak fixes that problem by rejiggering the lib search path. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1099093005
-
Jim Stichnoth authored
The CMAKE=1 option is no longer needed. Pretty much all the tools we need are now in pnacl_newlib_raw/bin, so use PNACL_BIN_PATH set to that instead of using LLVM_BIN_PATH and BINUTILS_BIN_PATH. However, for the autoconf build, libgtest and libtest_main and clang-format are only under the llvm_x86_64_linux_work directory, so they need special casing. This also means that you have to actually do an LLVM build and not blow away the work directory in order to "make check-unit" or "make format". BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1085733002
-
- 16 Apr, 2015 5 commits
-
-
Karl Schimpf authored
Same as CL https://codereview.chromium.org/1071423003 (which has LGTM). BUG= https://code.google.com/p/nativeclient/issues/detail?id=4138 Review URL: https://codereview.chromium.org/1097563003
-
Karl Schimpf authored
This reverts commit 187b3dfa. A unit test fails when it shouldn't. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4138 Review URL: https://codereview.chromium.org/1071423003
-
Karl Schimpf authored
This reverts commit a7340883. I reverted the wrong patch! BUG= https://code.google.com/p/nativeclient/issues/detail?id=4138 Review URL: https://codereview.chromium.org/1089323005
-
Karl Schimpf authored
This reverts commit d8fb3d33. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4138 Review URL: https://codereview.chromium.org/1091123002
-
Karl Schimpf authored
The method TopLevelParser::ErrorAt applies a lock to print the error message. Unfortunately, it keeps the lock longer than necessary, resulting in deadlock (on following fatal message) if error recovery is not allowed. Fixed by limiting scope of lock to only apply to the printing of the error message. Modified ClFlags to allow a "reset", and made ClFlags modifiable by bitcode munge tests. This allowed us to test this problem as a unit test. BUG= https://code.google.com/p/nativeclient/issues/detail?id=4138 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/1091023002
-
- 10 Apr, 2015 1 commit
-
-
Jan Voung authored
This is to go with toolchain_build changes which make LLVM cmake also use libc++: https://codereview.chromium.org/978963002/ May help with the memory sanitizer build, which wants most code to be built with memory sanitizer (e.g., make a special build of libc++). BUG= https://code.google.com/p/nativeclient/issues/detail?id=4119 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1074253002
-