- 11 Dec, 2014 1 commit
-
-
Jim Stichnoth authored
This is toward the goal of pulling non-POD fields out of the CfgNode class so that CfgNode can be arena-allocated and not leak memory. For now, PhiList and AssignList are defined as InstList. Ideally, they would be ilist<> of InstPhi and InstAssign, but SFINAE happens. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/794923002
-
- 10 Dec, 2014 2 commits
-
-
Jim Stichnoth authored
Instead, non-empty node names are kept in a single vector in the Cfg object. This is toward the goal of pulling non-POD fields out of the CfgNode class so that CfgNode can be arena-allocated and not leak memory. Also, actual setting of the node name is now guarded by ALLOW_DUMP. BUG= none R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/787333005
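A minimal sketch of the idea of keeping node names in one vector owned by the function-level object, so that the node itself holds only plain data. The type and method names here (`NameIndex`, `NoName`, `addNodeName`, `getNodeName`, `storedNames`) are hypothetical and simplified; they are not taken from the Subzero sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical index type; NoName marks a node with an empty name.
using NameIndex = std::int32_t;
constexpr NameIndex NoName = -1;

// Simplified node: only plain-old-data fields, so it needs no destructor call
// and could live in an arena that is freed wholesale.
struct CfgNode {
  std::uint32_t Number;
  NameIndex Name;
};
static_assert(std::is_trivially_destructible<CfgNode>::value,
              "node must not own heap memory");

// Simplified function-level container that owns all non-empty node names.
class Cfg {
public:
  // Stores a non-empty name and returns its index; empty names use no storage.
  NameIndex addNodeName(const std::string &Name) {
    if (Name.empty())
      return NoName;
    NodeNames.push_back(Name);
    return static_cast<NameIndex>(NodeNames.size() - 1);
  }

  // Looks up the name for a node; unnamed nodes yield an empty string.
  std::string getNodeName(const CfgNode &Node) const {
    if (Node.Name == NoName)
      return std::string();
    return NodeNames[static_cast<std::size_t>(Node.Name)];
  }

  std::size_t storedNames() const { return NodeNames.size(); }

private:
  std::vector<std::string> NodeNames;
};
```

In this arrangement only the owning object has a non-trivial destructor, which is the property that makes arena allocation of the nodes safe.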
-
Karl Schimpf authored
Removes the need to call function llvm::DecodeBinaryOp. In turn, this removes the need for enum type llvm::Instruction::BinaryOps, llvm::Type.isFPOrFPVectorTy, and one call to llvm::convertToLLVMType. BUG= None R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/788283002
-
- 09 Dec, 2014 1 commit
-
-
Karl Schimpf authored
Also removes the need for DataLayout. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/789483003
-
- 08 Dec, 2014 1 commit
-
-
Jim Stichnoth authored
Specifically, don't bother to collect "-timing" and "-szstats" information, since it never gets printed out under the MINIMAL build anyway. This is done by using the ALLOW_DUMP flag to guard whether code and timing stats are collected. This reduces the native translator size by about 3%. ALLOW_DUMP is used as the guard since it already guards the output of the collected data; there is no sense collecting data that can never be printed. To minimize the number of ALLOW_DUMP tests, we push the tests into the timing/stats class methods. BUG= none R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/788713002
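A rough illustration of pushing the guard into the stats class methods, so that callers never test the flag themselves. This assumes ALLOW_DUMP is a compile-time 0/1 macro; the `CodeStats` class and its members are hypothetical names chosen for the sketch.

```cpp
#include <cstdint>

// Assumption: ALLOW_DUMP is normally supplied by the build as 0 or 1.
#ifndef ALLOW_DUMP
#define ALLOW_DUMP 1
#endif

// Hypothetical stats collector. The guard lives inside each method, so call
// sites stay unconditional and the compiler can drop the body when the flag
// is 0.
class CodeStats {
public:
  void recordInstruction() {
    if (!ALLOW_DUMP)
      return; // nothing would ever be printed, so collect nothing
    ++InstructionCount;
  }

  std::uint64_t instructionCount() const { return InstructionCount; }

private:
  std::uint64_t InstructionCount = 0;
};
```

Because the condition is a compile-time constant, a build with ALLOW_DUMP=0 can reduce `recordInstruction()` to an empty function.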
-
- 07 Dec, 2014 1 commit
-
-
Jim Stichnoth authored
This is consistent with how LLVM is built, and makes it easier to analyze the potential size of a translator build and what may be inappropriately brought into the build. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/783023002
-
- 06 Dec, 2014 1 commit
-
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/775953003
-
- 05 Dec, 2014 1 commit
-
-
Jim Stichnoth authored
BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/785583002
-
- 04 Dec, 2014 4 commits
-
-
Jim Stichnoth authored
BUG= https://code.google.com/p/nativeclient/issues/detail?id=4006 R=kschimpf@google.com Review URL: https://codereview.chromium.org/773583004
-
Jim Stichnoth authored
This generally uses less memory and fewer allocations. Also removes the commented-out alternative implementation using std::set<>, which would almost certainly never be a good idea here. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/734053006
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/780783003
-
Jim Stichnoth authored
Profiling indicated a noticeable amount of time spent on malloc/free related to the std::list<> implementation of UnorderedRanges. Therefore, we change the implementation to be std::vector<>, and up-front reserve a conservative amount of space to avoid expansion. The push_back() operation is always constant time with no allocation. Removing an element from the middle of the vector is done by swapping with the last element and then popping the last element, which is reasonable in principle because it is used as an unordered collection. Because of the swapping trick, the UnorderedRanges iterators are changed to iterate in reverse. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/781683002
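The swap-and-pop removal and the reverse iteration it requires can be sketched as follows. `UnorderedBag` is a hypothetical stand-in holding `int` values, not the actual UnorderedRanges type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical unordered collection backed by std::vector.
class UnorderedBag {
public:
  // Reserve up front so that push_back() does not reallocate as long as the
  // element count stays within ExpectedSize.
  explicit UnorderedBag(std::size_t ExpectedSize) {
    Items.reserve(ExpectedSize);
  }

  void add(int Value) { Items.push_back(Value); }

  // O(1) removal: swap the victim with the last element, then pop the back.
  // Element order is not preserved, which is fine for an unordered set.
  void removeAt(std::size_t Index) {
    assert(Index < Items.size());
    std::swap(Items[Index], Items.back());
    Items.pop_back();
  }

  // Walk from back to front. The element swapped into a removed slot comes
  // from the tail, which has already been visited, so nothing is skipped.
  template <typename PredT> void removeIf(PredT Pred) {
    for (std::size_t I = Items.size(); I > 0; --I) {
      if (Pred(Items[I - 1]))
        removeAt(I - 1);
    }
  }

  const std::vector<int> &items() const { return Items; }

private:
  std::vector<int> Items;
};
```

Iterating forward instead would require re-examining the current slot after each removal, since an unvisited element from the tail would land there.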
-
- 03 Dec, 2014 2 commits
-
-
Jan Voung authored
BUG=build failure R=kschimpf@google.com Review URL: https://codereview.chromium.org/773853005
-
Karl Schimpf authored
When LLVM 3.5 was merged, the handling of errors was broken. This is being fixed in the CL listed below. This CL fixes Subzero's call so that it will work with the CL listed below. Relevant LLVM CL: https://codereview.chromium.org/770853002 BUG= https://code.google.com/p/nativeclient/issues/detail?id=4006 R=jfb@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/775173002
-
- 02 Dec, 2014 1 commit
-
-
Jan Voung authored
Able to write out the ELF file header w/ a text section, a symbol table, and string table. Write text buffer directly to file after translating each CFG. This means that the header is written out early w/ fake data and then we seek back and write the real header at the very end. Does not yet handle relocations, data, rodata, constant pools, bss, or -ffunction-sections, more than 64K sections or more than 2^24 symbols. Numbers w/ current NOASSERT=1 build on 176.gcc: w/out -elf-writer: 0.233771 (21.1%): [ 1287] emit 28MB .s file w/ -elf-writer: 0.051056 ( 5.6%): [ 1287] emit 2.4MB .o file BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/678533005
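The "write a placeholder header, stream the body, then seek back" pattern can be sketched on any seekable output stream. `ToyHeader` is a made-up 8-byte header for illustration only; it is not the ELF header layout, and `emitObject` is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Made-up fixed-size header (NOT the real ELF layout).
struct ToyHeader {
  char Magic[4];
  std::uint32_t TextSize;
};

// Writes the header at the current stream position.
void writeHeader(std::ostream &Out, std::uint32_t TextSize) {
  ToyHeader H = {};
  std::memcpy(H.Magic, "TOY", 4); // 'T', 'O', 'Y', '\0'
  H.TextSize = TextSize;
  Out.write(reinterpret_cast<const char *>(&H), sizeof(H));
}

// Emits a placeholder header, streams each function's bytes as they become
// available, then seeks back and overwrites the header with the real size.
// Out must be seekable (a file or string stream, not a pipe).
std::uint32_t emitObject(std::ostream &Out,
                         const std::vector<std::string> &FunctionBytes) {
  const std::ostream::pos_type HeaderPos = Out.tellp();
  writeHeader(Out, 0); // fake data for now
  std::uint32_t TextSize = 0;
  for (const std::string &Bytes : FunctionBytes) {
    Out.write(Bytes.data(), static_cast<std::streamsize>(Bytes.size()));
    TextSize += static_cast<std::uint32_t>(Bytes.size());
  }
  const std::ostream::pos_type EndPos = Out.tellp();
  Out.seekp(HeaderPos);
  writeHeader(Out, TextSize); // real header, written last
  Out.seekp(EndPos);          // leave the stream positioned at the end
  return TextSize;
}
```

The benefit is that no function body has to be buffered in memory until the total size is known; the cost is that the output must support seeking.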
-
- 01 Dec, 2014 2 commits
-
-
Karl Schimpf authored
Using lit.local.cfg, don't allow reader tests unless dumping of IR is allowed. This was suggested by Jan in: https://codereview.chromium.org/686913005 BUG=None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/735513002
-
Jim Stichnoth authored
In -O2 mode, postLower() is supposed to iterate over just the instructions that were most recently added. Instead, it was iterating all the way to the end of the block, also post-lowering high-level ICE instructions that hadn't yet been lowered. This was basically harmless, given that the spec2k asm code is identical after this patch, but it improves performance. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/721333004
-
- 26 Nov, 2014 1 commit
-
-
Jim Stichnoth authored
Use a bigger block size in the bump-pointer allocators, since we basically know up front that we'll need lots of memory. The 1MB value (versus the default of 4KB) was chosen somewhat arbitrarily, and succeeds in pretty much removing bump-pointer related mallocs from the profile. Pre-reserve the a priori known number of edges in getTerminatorEdges() to avoid vector resizing. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/760973002
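To see why the block size matters, here is a toy bump-pointer arena that counts how many blocks it has to request from the system. `BumpArena` is a hypothetical, simplified class written for this note; it is not LLVM's allocator and makes no attempt to match its interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy bump-pointer arena: hands out memory by advancing an offset within the
// current block, and requests a new block only when the current one is full.
class BumpArena {
public:
  explicit BumpArena(std::size_t BlockSize) : BlockSize(BlockSize) {}

  void *allocate(std::size_t Size) {
    // Round every request up to the strictest fundamental alignment.
    constexpr std::size_t Align = alignof(std::max_align_t);
    Size = (Size + Align - 1) / Align * Align;
    assert(Size <= BlockSize && "request larger than one block");
    if (Blocks.empty() || Used + Size > BlockSize) {
      Blocks.push_back(std::unique_ptr<char[]>(new char[BlockSize]));
      Used = 0;
    }
    void *Result = Blocks.back().get() + Used;
    Used += Size;
    return Result;
  }

  // Number of underlying heap allocations made so far.
  std::size_t blockCount() const { return Blocks.size(); }

private:
  std::size_t BlockSize;
  std::size_t Used = 0;
  std::vector<std::unique_ptr<char[]>> Blocks;
};
```

With 1000 allocations of 64 bytes each, a 4KB block size needs 16 blocks, while a 1MB block size needs just one.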
-
- 24 Nov, 2014 1 commit
-
-
Jim Stichnoth authored
We need to link with -lpthread now. The CALLTARGETS workaround in our lit tests can be removed, since llvm-objdump has gotten more accurate than before with respect to symbols. The -stats and -rng-seed options need to be renamed to avoid conflicting with the LLVM options being brought in. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3930 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/756543002
-
- 21 Nov, 2014 1 commit
-
-
JF Bastien authored
R=stichnot@chromium.org BUG= https://code.google.com/p/nativeclient/issues/detail?id=3930 Review URL: https://codereview.chromium.org/752603003
-
- 20 Nov, 2014 1 commit
-
-
Jim Stichnoth authored
Internally, create a separate constant pool for each integer type, instead of a single i64 pool that uses the Ice::Type value as part of the key. This means each constant pool key can be a simple primitive value, rather than a tuple. Represent the pools using std::unordered_map instead of std::map since we're using C++11 now. Use signed integers instead of unsigned integers for the integer constant pools, to benefit from sign extension and to be more consistent. Remove the SuppressMangling field from hash and comparison functions on RelocatableTuple, since we'll never have two symbols with the same name but different values of SuppressMangling. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/737513008
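One way to picture the per-type pools is a small class template instantiated once per signed integer width, each keyed by a plain primitive. `ConstantInt`, `IntPool`, and `ConstantPools` are hypothetical, simplified names for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical pooled constant; the value is stored sign-extended to 64 bits.
struct ConstantInt {
  std::int64_t Value;
};

// One pool per integer type. The key is a simple primitive, so the default
// std::hash and operator== suffice and no (type, value) tuple is needed.
template <typename KeyT> class IntPool {
public:
  // Returns the unique constant for Key, creating it on first use.
  // Pointers to unordered_map elements stay valid across rehashing.
  const ConstantInt *getOrAdd(KeyT Key) {
    auto It = Pool.find(Key);
    if (It == Pool.end())
      It = Pool.emplace(Key, ConstantInt{static_cast<std::int64_t>(Key)})
               .first;
    return &It->second;
  }

  std::size_t size() const { return Pool.size(); }

private:
  std::unordered_map<KeyT, ConstantInt> Pool;
};

// Separate pools, so the i8 constant -1 and the i32 constant -1 never collide.
struct ConstantPools {
  IntPool<std::int8_t> I8;
  IntPool<std::int16_t> I16;
  IntPool<std::int32_t> I32;
  IntPool<std::int64_t> I64;
};
```

Using signed key types means that widening a key to 64 bits sign-extends it, so -1 is stored as -1 in every pool regardless of width.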
-
- 18 Nov, 2014 1 commit
-
-
Jim Stichnoth authored
It's not meant to be comprehensive, but rather to help someone new get started, assuming they already have PNaCl working. BUG= none R=jfb@chromium.org, jvoung@chromium.org Review URL: https://codereview.chromium.org/727583003
-
- 17 Nov, 2014 1 commit
-
-
Karl Schimpf authored
Remove the dump/emit routines when ALLOW_DUMP=0. Also fixes some verbosity messages to not print if ALLOW_DUMP=0. Note: emit routines needed for emitIAS are not turned off. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/686913005
-
- 14 Nov, 2014 4 commits
-
-
Jim Stichnoth authored
This removes the need for Om1's postLower() code which did its own ad-hoc register allocation. And it actually speeds up Om1 translation significantly. This mode of register allocation only allocates for infinite-weight Variables, while respecting live ranges of pre-colored Variables. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/733643005
-
Jan Voung authored
Seems to be part of the non-sfi link now: https://codereview.chromium.org/686723003/diff/180001/pnacl/driver/pnacl-translate.py Otherwise I get: x86-32-linux/lib/unsandboxed_irt.o:(.rodata+0x68): undefined reference to `nacl_secure_random' BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/726093002
-
Jim Stichnoth authored
Even after earlier simplifications, FakeKill was still handled somewhat inefficiently for the register allocator. For x86-32, any function containing call instructions would result in about 11 pre-colored Variables, each with an identical and relatively complex live range consisting of points. They would start out on the UnhandledPrecolored list, then all move to the Inactive list, where they would be repeatedly compared against each register allocation candidate via overlapsRange(). We improve this by keeping around a single copy of that live range and directly masking out the Free[] register set when that live range overlaps the current candidate's live range. This saves ~10 overlaps() calculations per candidate while FakeKills are still pending. Also, slightly rearrange the initialization of the Unhandled etc. sets into a separate init routine, which will make it easier to reuse the register allocator in other situations such as Om1 post-lowering. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/720343003
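The core of the optimization, one shared kill range checked once per candidate and then applied as a register mask, might look roughly like this. The `LiveRange` representation (sorted half-open segments), the 8-register `RegMask`, and the function names are all assumptions made for the sketch.

```cpp
#include <bitset>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed representation: sorted, non-overlapping half-open [Start, End)
// segments of instruction numbers.
struct LiveRange {
  std::vector<std::pair<int, int>> Segments;
};

// Assumed register file of 8 registers, one bit each.
using RegMask = std::bitset<8>;

// Linear two-pointer scan over both sorted segment lists.
bool overlaps(const LiveRange &A, const LiveRange &B) {
  std::size_t I = 0, J = 0;
  while (I < A.Segments.size() && J < B.Segments.size()) {
    if (A.Segments[I].second <= B.Segments[J].first)
      ++I; // A's segment ends before B's begins
    else if (B.Segments[J].second <= A.Segments[I].first)
      ++J; // B's segment ends before A's begins
    else
      return true;
  }
  return false;
}

// Instead of testing the candidate against each killed register's identical
// live range, test once against the shared range and mask all of them out.
RegMask availableRegs(const LiveRange &Candidate, const LiveRange &KillRange,
                      RegMask KilledRegs, RegMask Free) {
  if (overlaps(Candidate, KillRange))
    Free &= ~KilledRegs;
  return Free;
}
```

A single overlap test replaces one test per killed register, which is where the per-candidate savings come from.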
-
Jim Stichnoth authored
This is purely for convenience of personal testing/debugging. To demonstrate its correctness in this CL, -build-on-read=0 is removed from the two .ll lit tests that explicitly use it, and also from the crosstest.py script. The lit test wrapper run-llvm2ice.py is left unchanged to be safe. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/732583002
-
- 11 Nov, 2014 1 commit
-
-
Jim Stichnoth authored
Instead, separately compute it during prolog generation via another pass over the Cfg. This may slow down translation by ~1%, but it greatly simplifies the management of this flag/property. The higher motivation is to pull this management out of register allocation to make it easier to extend register allocation for other uses. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/692633004
-
- 06 Nov, 2014 4 commits
-
-
Jim Stichnoth authored
Use LLVM's intrusive list ADT template to implement instruction lists. This embeds prev/next pointers into the instruction, and as such, iterators essentially double as instruction pointers. This means stripping off one level of indirection when dereferencing, and also the range-based for loop can't be used. The performance difference in translation time seems to be 1-2%. I tried to also do this for the much less used PhiList and AssignList, but ran into SFINAE problems. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/709533002
-
Karl Schimpf authored
This CL allows one to time Subzero's bitcode parsing without IR generation (other than types, function declarations, and uninitialized global variables) for performance testing. BUG=None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/696383004
-
Jan Voung authored
Eventually, I wanted to have a flag "UseELFWriter" like: https://codereview.chromium.org/678533005/diff/120001/src/IceCfg.cpp Where the emit OStream would not have text, and only have binary. This refactor hopefully means fewer places to check for a flag to disable the text version of IAS, and be able to write binary. Otherwise, there are some text labels for branches that are still being dumped out. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/700263003
-
Jim Stichnoth authored
Currently NodeList is defined as std::vector<CfgNode*>, but in the future it may be desirable to change it to something like std::list<CfgNode*> so that it is easier to split edges and insert the new nodes at the right locations, rather than re-sorting them in a separate pass. This gets us closer by using foo.front() instead of foo[0]. There are still a couple more places using the [] operator, but the changes would be more intrusive. Also, a few instances of ".size()==0" are changed to the possibly more efficient ".empty()". BUG= none R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/704753007
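As a small illustration of why `front()` and `empty()` keep the container choice open, the helper below compiles unchanged for both `std::vector` and `std::list`, whereas a version using `Nodes[0]` would not compile for `std::list`. The `CfgNode` struct and `entryNode` helper are hypothetical.

```cpp
#include <list>
#include <vector>

// Hypothetical, simplified node type.
struct CfgNode {
  int Number;
};

// Works for any sequence container providing empty() and front().
template <typename NodeListT> CfgNode *entryNode(const NodeListT &Nodes) {
  if (Nodes.empty()) // constant time for every standard container
    return nullptr;
  return Nodes.front();
}
```

For example, `entryNode(std::vector<CfgNode *>{...})` and `entryNode(std::list<CfgNode *>{...})` both resolve without any change to the helper.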
-
- 05 Nov, 2014 1 commit
-
-
Jan Voung authored
There is one in IceDefs.h and one in IceGlobalInits.h. Can we just use one? BUG=none (mini cleanup) R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/695563006
-
- 04 Nov, 2014 4 commits
-
-
Jan Voung authored
More consistently use auto while doing llvm::dyn_cast/cast in the emit and emitIAS routines. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/699923002
-
Jim Stichnoth authored
Also deletes a few entirely trivial test files. These tests were useful in the early days of Subzero, but now mostly just clutter things up. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/705513002
-
Jim Stichnoth authored
The FakeKill instruction is always used for killing scratch registers, and a fair amount of effort is spent building new copies of the same scratch register list each time (i.e., for each lowered call instruction). As such, we can create one master list of scratch registers and share it among all FakeKill instructions. Also, in all situations where an instruction's Srcs[] were considered for liveness, we had to either explicitly ignore an InstFakeKill instruction, or treat it specially. Now that InstFakeKill lacks any Srcs[] (or Dest), it doesn't need to be specially ignored, and the code is simplified. In addition, the text asm emitter no longer clutters the output with FakeKill comments (and FakeUse as well). BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/691693003
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/701673002
-
- 03 Nov, 2014 3 commits
-
-
Jim Stichnoth authored
The integrated assembler assumed there is at most one fixup per instruction. For x86, there could actually be two fixups, for any instruction that allows memory and immediate operands at the same time. Using the now-default -build-on-read flag, it happens in spec2k; the smallest function I found where this happens is perm_tree_cons in 176.gcc. This changes the textual emission of integrated assembler code to allow for multiple consecutive fixups in a single instruction. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/693393002
-
Jim Stichnoth authored
Build a pure-Subzero binary when neither --include nor --exclude is specified. A pure-Subzero binary is built without flags: -externalize -ffunction-sections -fdata-sections which is good because that configuration is closer to what real usage will be, and will get more testing during development. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/693393003
-
Karl Schimpf authored
Adds timers to each bitcode block parser in Subzero, to get a reading on how much time is used by the bitcode parser. BUG=None R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/688543003
-