- 04 Apr, 2016 1 commit
-
-
Jim Stichnoth authored
1. Generate dummy FunctionXXX function names when either of those flags is given. 2. Remove the browser code that automatically sets F/G prefixes instead of Function/Global, since that performance tweak is no longer relevant. 3. Fix a presumably long-standing bug where -timing-focus would accumulate timings into the TLS copy of the timers, but would then try to print timing info based on the currently-empty GlobalContext copy of the timers. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/1855683002 .
-
- 02 Apr, 2016 3 commits
-
-
Karl Schimpf authored
Fixes error reporting in function blocks by adding a start address with parallel parses, and then using the getErrorBitNo() method of the bitstream cursor to correct error bit number, based on the start address. Dependent on CL https://codereview.chromium.org/1851163002. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4363 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1848313003 .
-
Karl Schimpf authored
This CL removes all indirect pointer chasing to get the values of command line flags. Since we are only using 1 copy of ClFlags, this CL introduces a static field Flags to hold the defined command line flags (it was previously a static field of GlobalContext). For those few contexts where one must change CL flags due to context (such as testing and running in the browser), use ClFlags::Flags. In the remainder of the cases, the code uses getFlags() which returns a constant reference to ClFlags::Flags, allowing access to the get accessors. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1848303003 .
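The access pattern described in this message can be sketched roughly as follows. This is only an illustration: the class name ClFlagsSketch, the two fields, and their default values are hypothetical, and the real ClFlags has many more fields.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for a flags class (not the real ClFlags).
class ClFlagsSketch {
public:
  // The single process-wide copy of the flags. Code that legitimately needs
  // to change flags (tests, browser setup) writes through this field.
  static ClFlagsSketch Flags;

  bool getParseParallel() const { return ParseParallel; }
  void setParseParallel(bool Value) { ParseParallel = Value; }
  std::size_t getNumThreads() const { return NumThreads; }
  void setNumThreads(std::size_t Value) { NumThreads = Value; }

private:
  bool ParseParallel = false; // assumed default, for illustration only
  std::size_t NumThreads = 2; // assumed default, for illustration only
};

ClFlagsSketch ClFlagsSketch::Flags;

// Read-only access for everything else: a const reference exposes only the
// get accessors, with no pointer chasing through a context object.
inline const ClFlagsSketch &getFlags() { return ClFlagsSketch::Flags; }
```

Most callers would write getFlags().getNumThreads(), while a test would call ClFlagsSketch::Flags.setNumThreads(0) before running.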
-
Jim Stichnoth authored
This corrects a "cleanup" mistake in line 414/417 of https://codereview.chromium.org/1838753002/diff/120001/src/IceTargetLoweringX8664.cpp . This wasn't caught before because "make presubmit" doesn't run any sandboxed x86-64 tests. So we add that to the presubmit script. In addition, the "make check-spec" is changed to run each spec component via the "--run" flag of szbuild_spec2k.py. Otherwise, the makefile was hard-coding running the native binary setup instead of the sandboxed or nonsfi setup. BUG= none R=jpp@chromium.org Review URL: https://codereview.chromium.org/1852713004 .
-
- 01 Apr, 2016 2 commits
-
-
John Porto authored
BUG= R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1850163003 .
-
John Porto authored
Creates a local arena allocator for holding liveness data structures. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4366 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1838973005 .
-
- 31 Mar, 2016 5 commits
-
-
Karl Schimpf authored
The code previously had to navigate through 1 (or more) indirect pointers to find the CL flags. Since CL flags are a static field in the global context, simplify these references. Note: I have added member functions to do this (where appropriate) so that fixing code could be easy if we choose to move where the command line flags are stored. BUG=None R=eholk@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/1845913003 .
-
Karl Schimpf authored
When threads=0, it doesn't pay to run a parallel parse, since each parallel parse slows down parsing by requiring a copy of bits in the function block. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1848873002 .
-
Jim Stichnoth authored
Liveness analysis uses a pair of bit vectors in each CFG node. The early bits correspond to "global" variables that are referenced in more than one block, and the latter bits correspond to "local" variables that are referenced in only that particular single block. Due to an oversight, variables that have no uses are conservatively classified as global, and consume space in every liveness bit vector. This CL improves memory usage by reducing liveness bit vector size: 1. Identify variables with no actual uses and exclude them from the bit vectors. 2. Don't do liveness analysis on rematerializable variables, because they have no need to be involved in register allocation or dead code elimination. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4366 R=jpp@chromium.org Review URL: https://codereview.chromium.org/1844713004 .
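A minimal sketch of the classification described above, assuming each variable is summarized by the set of block ids that reference it. The names VarClass, classify, and bitVectorSize are hypothetical and do not come from the Subzero sources.

```cpp
#include <cstddef>
#include <set>
#include <vector>

enum class VarClass { Unused, Local, Global };

// Classify a variable from the set of blocks that reference it.
VarClass classify(const std::set<int> &UseBlocks) {
  if (UseBlocks.empty())
    return VarClass::Unused; // excluded from every bit vector
  return UseBlocks.size() == 1 ? VarClass::Local : VarClass::Global;
}

// Number of bits the liveness vector of one block needs: every global
// variable, plus the variables local to that block. Unused variables
// contribute nothing.
std::size_t bitVectorSize(const std::vector<std::set<int>> &Uses, int Block) {
  std::size_t Bits = 0;
  for (const std::set<int> &U : Uses) {
    const VarClass C = classify(U);
    if (C == VarClass::Global)
      ++Bits;
    else if (C == VarClass::Local && U.count(Block) != 0)
      ++Bits;
  }
  return Bits;
}
```

If unused variables were treated as global instead, each of them would add one bit to the vector of every block, which is the overhead this change removes.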
-
Jim Stichnoth authored
The problem is that the memory usage that comes from the -track-memory option is not very well-defined (it's a hidden LLVM option after all). It gives an OK sense of memory growth over time, but sometimes we really want to know how much CFG-local arena memory was allocated for a particular function. To help with this, we add another row to the stats output, giving the MB size of the CFG arena at the end of translating the method. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4366 R=kschimpf@google.com Review URL: https://codereview.chromium.org/1848733002 .
-
Karl Schimpf authored
This CL modifies the code so that we can do sequential and parallel parsing of function blocks in bitcode files, based on a command line argument. The command line argument was added because during testing, I had one compilation failure (transient), and do not know the cause. Hence, I was reluctant to install this CL without a command-line flag. To test the new parallel parser, the easiest solution is to edit IceClFlags.def and set the default value of ParseParallel to true. This code also fixes up unit parsing tests, as well as one parsing test. The cause of these problems was the implicit assumption that function blocks are parsed sequentially, which no longer applies when function blocks are parsed in parallel. To fix this, the "threads=0" command line argument was added. It also added the starting up of worker threads, since parsing of function blocks will happen in the translation thread if parallel parsing is turned on. The OptQ queue was modified to contain OptWorkerItem instances with a single virtual to get the parsed code. This allows the IceConverter to continue to work, by simply passing the generated Cfg as a work item. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4363 R=jpp@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/1834473002 .
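The work-item idea can be sketched as below. Everything here is a hypothetical illustration: WorkItemSketch, ReadyCfgItem, DeferredParseItem, and CfgSketch are invented names, and the "parse" step is a placeholder that only copies bytes into a name.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a translated function's control flow graph.
struct CfgSketch {
  std::string Name;
};

// A queue entry with a single virtual method that yields the parsed code.
class WorkItemSketch {
public:
  virtual ~WorkItemSketch() = default;
  virtual std::unique_ptr<CfgSketch> getParsedCfg() = 0;
};

// Sequential path (or a converter): the Cfg already exists and is handed over.
class ReadyCfgItem final : public WorkItemSketch {
public:
  explicit ReadyCfgItem(std::unique_ptr<CfgSketch> Cfg) : Cfg(std::move(Cfg)) {}
  std::unique_ptr<CfgSketch> getParsedCfg() override { return std::move(Cfg); }

private:
  std::unique_ptr<CfgSketch> Cfg;
};

// Parallel path: the item owns a copy of the function block's bits and
// parses them only when a worker asks for the Cfg.
class DeferredParseItem final : public WorkItemSketch {
public:
  explicit DeferredParseItem(std::vector<char> Bits) : Bits(std::move(Bits)) {}
  std::unique_ptr<CfgSketch> getParsedCfg() override {
    std::unique_ptr<CfgSketch> Cfg(new CfgSketch());
    Cfg->Name.assign(Bits.begin(), Bits.end()); // placeholder for real parsing
    return Cfg;
  }

private:
  std::vector<char> Bits;
};
```

A worker that pops an item only calls getParsedCfg(), so it does not need to know whether parsing already happened on the producer side.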
-
- 30 Mar, 2016 1 commit
-
-
Jim Stichnoth authored
The motivating example (simplified) is this: %__698:eax = phi i32 [ %__646:ebx, %__188 ] // LIVEEND={%__646:ebx} %__617:bh = phi i8 [ %__618:ah, %__188 ] // LIVEEND={%__618:ah} %__615:bl = phi i8 [ %__616:al, %__188 ] // LIVEEND={%__616:al} By default, it first lowers the __698 assignment. However, that assignment has two "predecessors" because there are two other instructions whose dest variable aliases the __698 assignment's source operand. This triggers an assertion failure where we assume there is only one predecessor. The fix is two-pronged. First, we go ahead and generate as many temp assignments as needed to break the cycle, simply by changing an "if" to a "while". Second, when we need to break a cycle, we give preference to an instruction with only one predecessor so that only one temp assignment needs to be added. (It might be possible to prove that the second approach, i.e. preferring single-predecessor assignments, makes the first approach unnecessary, i.e. changing "if" to "while".) This change has no effect on the x86 output for spec2k. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4365 R=jpp@chromium.org Review URL: https://codereview.chromium.org/1839263003 .
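For readers unfamiliar with the underlying problem, here is a generic sketch of turning a set of parallel assignments into a sequence, breaking cycles with temporaries in a loop rather than at most once. It is not Subzero's implementation: it ignores register aliasing and the single-predecessor preference, assumes destinations are distinct, and assumes no existing variable is named tmpN. The names Move and sequentialize are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#include <utility>

using Move = std::pair<std::string, std::string>; // {Dest, Src}

std::vector<Move> sequentialize(std::vector<Move> Pending) {
  std::vector<Move> Out;
  int TempCount = 0;
  while (!Pending.empty()) {
    // Find a move whose destination no other pending move still reads.
    std::size_t Ready = Pending.size();
    for (std::size_t I = 0; I < Pending.size() && Ready == Pending.size(); ++I) {
      bool Blocked = false;
      for (std::size_t J = 0; J < Pending.size(); ++J)
        if (J != I && Pending[J].second == Pending[I].first)
          Blocked = true;
      if (!Blocked)
        Ready = I;
    }
    if (Ready != Pending.size()) {
      Out.push_back(Pending[Ready]);
      Pending.erase(Pending.begin() + static_cast<std::ptrdiff_t>(Ready));
      continue;
    }
    // Every remaining destination is still needed as a source: a cycle.
    // Save the value about to be clobbered in a temp and redirect its
    // readers. The outer loop repeats this as often as needed.
    const std::string Clobbered = Pending.front().first;
    const std::string Temp = "tmp" + std::to_string(TempCount++);
    Out.push_back(Move(Temp, Clobbered));
    for (Move &M : Pending)
      if (M.second == Clobbered)
        M.second = Temp;
  }
  return Out;
}
```

For the swap {a <- b, b <- a}, this emits tmp0 <- a, then a <- b, then b <- tmp0.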
-
- 29 Mar, 2016 1 commit
-
-
Jim Stichnoth authored
The purpose is to get control over excess string creation and deletion, especially when the strings are long enough to involve malloc/free. Strings that interface with the outside world, or used for dump/debug, are now explicitly std::string. Variable names and node names are represented as string IDs and pooled locally in the CFG. (In a non-DUMP build, this pool should always be empty.) Other strings that are used across functions are represented as string IDs and pooled globally in the GlobalContext. The --dump-strings flag allows these strings to be dumped for sanity checking, even in a MINIMAL build. In a MINIMAL build, the set of strings includes external symbol names, intrinsic names, helper function names, and label names of pooled constants to facilitate deterministic ELF string table output. For now, it also includes jump table entry names until that gets sorted out. Constants are fixed so that the name and pooled fields are properly immutable after the constant is fully initialized. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 R=jpp@chromium.org Review URL: https://codereview.chromium.org/1838753002 .
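The pooling idea can be illustrated with a small interning table that hands out integer IDs. This is a sketch under assumptions, not the Subzero data structure: StringPoolSketch and its methods are hypothetical names, and locking, which a pool shared across threads would need, is omitted.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string pool: each distinct string is stored once and is
// referred to elsewhere by a small integer ID.
class StringPoolSketch {
public:
  using ID = std::size_t;

  // Returns the existing ID for S, or assigns the next free one.
  ID intern(const std::string &S) {
    auto It = Ids.find(S);
    if (It != Ids.end())
      return It->second;
    const ID New = Strings.size();
    Strings.push_back(S);
    Ids.emplace(S, New);
    return New;
  }

  // Looks the text back up, e.g. for dumping or ELF string table output.
  const std::string &toString(ID Id) const { return Strings.at(Id); }

  std::size_t size() const { return Strings.size(); }

private:
  std::vector<std::string> Strings;
  std::unordered_map<std::string, ID> Ids;
};
```

Comparing or copying a name then costs an integer operation, and the string itself is allocated once per pool rather than once per use.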
-
- 25 Mar, 2016 1 commit
-
-
Karl Schimpf authored
This CL fixes pnacl-sz to handle smoothnacl.pexe. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4364 R=jpp@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/1831043002 .
-
- 24 Mar, 2016 1 commit
-
-
John Porto authored
Valgrind used to report errors about uninitialized variable access in Subzero when it was built with -O2. The problem was traced to: size_t Alignment = Var->getAlignment; Alignment = std::max(MinAlignment, Var->getAlignment()). Apparently, the compiler will not correctly zero-extend Var->getAlignment(), and thus Alignment's upper 32 bits would be garbage. BUG= R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1825363003 .
-
- 21 Mar, 2016 3 commits
-
-
Jim Stichnoth authored
BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/1819153002 .
-
Jim Stichnoth authored
The big reduction is in greatly reducing the set of non-native cross tests. Also, not so many copies of spec2k are run, and only one "representative" sandboxed target is built. BUG= none R=eholk@chromium.org Review URL: https://codereview.chromium.org/1824723002 .
-
John Porto authored
BUG= R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1803403002 .
-
- 17 Mar, 2016 2 commits
-
-
David Sehr authored
BUG= R=jpp@chromium.org, kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1781213002 .
-
Karl Schimpf authored
This code moves some constants used by the target lowering, so that they are defined (and cached) during static initialization of the target lowering, rather than looking them up every time they are used. This CL does this for the constant zero, and predefined helper functions. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4076 R=jpp@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/1775253003 .
-
- 16 Mar, 2016 1 commit
-
-
John Porto authored
Uses the compiler's frontend -MD -MP options to auto-generate make dependencies. BUG= R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1801273003 .
-
- 15 Mar, 2016 3 commits
-
-
Jim Stichnoth authored
Reverts part of 843142fe (https://codereview.chromium.org/1776343004). Makes global prefixes short in all non-DUMP builds, not just the browser build, so that the user doesn't need to remember to supply the override options in a command-line MINIMAL build. BUG= none R=jpp@chromium.org Review URL: https://codereview.chromium.org/1805593002 .
-
John Porto authored
Because explicit memory ownership is awesome! BUG= R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1804133002 .
-
John Porto authored
This allows Subzero to release the global initializers once they've been lowered. This CL also modifies the global initializer types to ensure they are trivially destructible -- therefore not requiring destructors to run. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1776473007 .
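One way to enforce the "trivially destructible" property at compile time is a static_assert on the initializer types. The sketch below uses hypothetical types (DataInitializerSketch, ZeroInitializerSketch, OwningInitializerSketch), not the real Subzero classes.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical initializer that points at bytes owned elsewhere (for
// example by an arena), so it has nothing to clean up itself.
struct DataInitializerSketch {
  const char *Contents;
  std::uint32_t Size;
};

// Hypothetical initializer for a zero-filled region.
struct ZeroInitializerSketch {
  std::uint32_t Size;
};

// Counter-example: owning a std::string requires a destructor call.
struct OwningInitializerSketch {
  std::string Contents;
};

// If these hold, the memory holding the initializers can be released in one
// step without running any destructors.
static_assert(std::is_trivially_destructible<DataInitializerSketch>::value,
              "initializer must not need a destructor");
static_assert(std::is_trivially_destructible<ZeroInitializerSketch>::value,
              "initializer must not need a destructor");
static_assert(!std::is_trivially_destructible<OwningInitializerSketch>::value,
              "an owning member makes the type non-trivially destructible");
```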
-
- 14 Mar, 2016 1 commit
-
-
Jim Stichnoth authored
Normally, deleted instructions are preserved in the Cfg, and printed as part of dump output. This helps debugging by partially explaining the provenance of new instructions that originated from the deleted instructions. However, these instructions slow down iteration over the instruction list, and checking their deleted status may pollute the cache. As such, in a non-DUMP enabled build, we repurpose the renumberInstructions() pass to also unlink deleted instructions as needed. A flag is provided to override this behavior, in case we have to debug a situation where a bug only manifests in a DUMP build and not a non-DUMP build, or vice versa. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 R=eholk@chromium.org, jpp@chromium.org Review URL: https://codereview.chromium.org/1790063003 .
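A rough sketch of combining renumbering with unlinking, assuming instructions live in a linked list and carry a deleted flag. InstSketch and renumber are hypothetical names. The message does not say whether deleted instructions receive numbers, so this sketch simply leaves them unnumbered when they are kept.

```cpp
#include <list>

// Hypothetical instruction record.
struct InstSketch {
  int Number = 0;
  bool Deleted = false;
};

// Assigns consecutive numbers to live instructions. When UnlinkDeleted is
// true (the non-DUMP behavior described above), deleted instructions are
// removed from the list during the same pass. Returns the last number used.
int renumber(std::list<InstSketch> &Insts, bool UnlinkDeleted) {
  int Next = 1;
  for (auto It = Insts.begin(); It != Insts.end();) {
    if (It->Deleted) {
      if (UnlinkDeleted)
        It = Insts.erase(It); // erase returns the following element
      else
        ++It; // keep it around for dump output
      continue;
    }
    It->Number = Next++;
    ++It;
  }
  return Next - 1;
}
```

Later passes that walk the list then no longer have to test the deleted flag on entries that will never be used again.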
-
- 12 Mar, 2016 2 commits
-
-
Jim Stichnoth authored
This makes it easier to focus on reducing the translator size. Enable with e.g.: make -f Makefile.standalone MINIMAL=1 SZTARGET=X8664 bloat-sb BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4362 R=jpp@chromium.org Review URL: https://codereview.chromium.org/1787143002 .
-
Jim Stichnoth authored
https://codereview.chromium.org/1784243006/ added the ALLOW_TIMERS define, and I forgot to add it to Makefile and CMakeLists.txt. BUG= none R=jpp@chromium.org Review URL: https://codereview.chromium.org/1788873004 .
-
- 11 Mar, 2016 4 commits
-
-
Jim Stichnoth authored
Several things are done here: 1. Move timer support to be guarded by the ALLOW_TIMERS define, or the BuildDefs::timers() constexpr method. 2. Add a NODUMP build configuration to control whether dump support is built in. So "make -f Makefile.standalone NODUMP=1 NOASSERT=1" is pretty close to a MINIMAL build with timer support. 3. Add some missing timers: alloca analysis, RMW analysis, helper call pre-lowering, load optimization analysis. These omitted pass timings were being rolled up into the "O2" bucket. 4. Add timers around push and pop operations on the translate queue and the emit queue. 5. Refactor the clumsy code to push/pop function timers (as opposed to pass timers), so that it fits into the nice RAII TimerMarker class like the pass timers. 6. It turns out that even with MINIMAL or NODUMP builds, we still construct a longish std::string every time Cfg::dump() is called, even though the string isn't used in MINIMAL/NODUMP mode. The dump() arg might as well be a const char * arg instead. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 R=kschimpf@google.com Review URL: https://codereview.chromium.org/1784243006 .
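Item 5 relies on the RAII idiom: a marker object starts timing in its constructor and records the elapsed time in its destructor, so every exit path from a scope is covered. The sketch below shows the idiom only. TimerTotalsSketch and TimerMarkerSketch are hypothetical names and do not reflect the real TimerMarker interface.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical accumulator of per-name timing totals.
class TimerTotalsSketch {
  struct Entry {
    double Seconds = 0.0;
    int Count = 0;
  };

public:
  void add(const std::string &Name, double Seconds) {
    Entry &E = Entries[Name];
    E.Seconds += Seconds;
    ++E.Count;
  }
  int count(const std::string &Name) const {
    auto It = Entries.find(Name);
    return It == Entries.end() ? 0 : It->second.Count;
  }
  double seconds(const std::string &Name) const {
    auto It = Entries.find(Name);
    return It == Entries.end() ? 0.0 : It->second.Seconds;
  }

private:
  std::map<std::string, Entry> Entries;
};

// RAII marker: construction starts the clock, destruction records the time.
class TimerMarkerSketch {
  using Clock = std::chrono::steady_clock;

public:
  TimerMarkerSketch(TimerTotalsSketch &Totals, std::string Name)
      : Totals(Totals), Name(std::move(Name)), Start(Clock::now()) {}
  ~TimerMarkerSketch() {
    const std::chrono::duration<double> Elapsed = Clock::now() - Start;
    Totals.add(Name, Elapsed.count());
  }
  TimerMarkerSketch(const TimerMarkerSketch &) = delete;
  TimerMarkerSketch &operator=(const TimerMarkerSketch &) = delete;

private:
  TimerTotalsSketch &Totals;
  std::string Name;
  Clock::time_point Start;
};
```

Declaring a marker at the top of a pass or function body is then enough to time it, with no explicit push and pop calls to keep balanced.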
-
Jim Stichnoth authored
In some cases, Subzero needs to insert into a std::vector at a particular index, resizing the vector as necessary. It appears that our vector implementation sets the capacity to exactly the size when growing the vector, without leaving any extra capacity. This causes lots of mallocs and recopies each time the vector size is increased. (Adding elements via push_back() or emplace_back() doesn't seem to have that behavior.) We help this by reserving some extra space before resizing - bump to the next power of 2 up to some point, then bump to the next multiple of a chunk size beyond that point. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 R=kschimpf@google.com Review URL: https://codereview.chromium.org/1783113002 .
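The reservation policy can be sketched as follows. The two thresholds are assumptions chosen for illustration, since the message only says "up to some point" and "a chunk size". The helper names roundedCapacity and setAt are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Assumed thresholds, for illustration only.
constexpr std::size_t PowerOfTwoLimit = 1024;
constexpr std::size_t ChunkSize = 1024;

// Capacity to reserve for at least Needed elements: the next power of 2
// while small, then the next multiple of ChunkSize.
std::size_t roundedCapacity(std::size_t Needed) {
  if (Needed <= PowerOfTwoLimit) {
    std::size_t Cap = 1;
    while (Cap < Needed)
      Cap *= 2;
    return Cap;
  }
  return ((Needed + ChunkSize - 1) / ChunkSize) * ChunkSize;
}

// Stores Value at Index, growing the vector if needed. Reserving before
// resizing leaves headroom so that a run of slightly larger indexes does
// not reallocate and copy on every call.
template <typename T>
void setAt(std::vector<T> &V, std::size_t Index, const T &Value) {
  if (Index >= V.size()) {
    V.reserve(roundedCapacity(Index + 1));
    V.resize(Index + 1);
  }
  V[Index] = Value;
}
```

With these thresholds, a request for 5 elements reserves 8, and a request for 3000 elements reserves 3072.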
-
Jim Stichnoth authored
A fresh checkout of native_client lacks some components that Subzero's "make -f Makefile.standalone presubmit" needs. Add explicit checks for these components, and when missing, print suggestions for how to create them. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4359 R=jpp@chromium.org, smklein@chromium.org Review URL: https://codereview.chromium.org/1782343003 .
-
Sean Klein authored
This CL updates "isPNaClABIExternalName" -- Subzero checked for the symbol "__pnacl_pso_root" as a function, but it is a declaration, and should be checked as one. Additionally, when the PNaClTranslator is verifying the linkage of declarations, allow "__pnacl_pso_root" to be flipped to external as a special case. Previously, translating a pso with --use-sz caused the warning: "cannot find entry symbol '__pnacl_pso_root'". That warning is removed with this CL. Fixes revert from https://codereview.chromium.org/1745783002/ TEST=external_declaration.ll BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4351 R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1774383002 .
-
- 10 Mar, 2016 2 commits
-
-
John Porto authored
BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4077 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1783893002 . Patch from John Porto <jpp@chromium.org>.
-
Jim Stichnoth authored
This reverts commit 5526c171 (https://codereview.chromium.org/1778663003) and implements it per jpp's suggestion. BUG= none R=jpp@chromium.org Review URL: https://codereview.chromium.org/1780773003 .
-
- 09 Mar, 2016 5 commits
-
-
Sean Klein authored
Additionally, refactor "GetObjdumpCmd" and "GetObjcopyCmd". BUG=https://bugs.chromium.org/p/nativeclient/issues/detail?id=4361 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/1777103002 .
-
Jim Stichnoth authored
1. Subzero constructs many strings based in part on function name. When function names are not present (as in properly finalized pexes), they are synthesized as something like "Function12345". We can shorten these strings to e.g. "F12345" by using --default-function-prefix=F . Similar for global variable names. Using short strings makes it much less likely to have to use malloc. As such, we force that to be the default in the browser translator build. For perf-testing the command-line version, the user can just add the option manually for now. Ultimately, we should avoid use of strings in this way. 2. The register allocator uses a few instances of llvm::SmallVector that are sized too small and therefore end up using malloc. This can be fixed in a clean way, and there is a TODO for it, but in the meantime we just bump the size. BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4360 R=jpp@chromium.org Review URL: https://codereview.chromium.org/1776343004 .
-
Sean Klein authored
le32-nacl-objdump has been deprecated, and should no longer be used. Instead, "arm-nacl-objdump" is being used. "le32-nacl-objdump" used to be a hard link to "arm-nacl-objdump", but has since been deleted in NaCl's "toolchain_build_pnacl.py" script. R=phosek@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/1776843002 .
-
Jim Stichnoth authored
BUG= none TBR=jpp@chromium.org Review URL: https://codereview.chromium.org/1778663003 .
-
Jim Stichnoth authored
The ConstantRelocatable objects for pushing local labels are allocated from the Assembler arena, and are no longer pooled, which restricts the memory growth from sandboxing x86-64 calls. Because the Assembler arena is destroyed while the fixups are still active, these fixups have to be fixed up by holding a pointer to the symbol rather than the constant. On the 10MB test pexe, the overall growth by the end is ~20MB, instead of ~130MB as before. This also partially fixes an existing bug with arm32/nonsfi/iasm, exposed by running cross tests and forcing iasm output. BUG= none R=jpp@chromium.org Review URL: https://codereview.chromium.org/1773503003 .
-
- 08 Mar, 2016 2 commits
-
-
Karl Schimpf authored
The previous implementation was charging about 24% more time than it should to the function parser. The cause was that the time to "queue" the parsed functions, and the time to emit the assembled code (again including "queue" time) was not accounted for. About 15% was going to queuing costs, and 7% to emitting the ELF file. This CL adds timing of function translateFunctions, which captures most of the queueing costs, and timing for each of the major ELF emission functions (emitELF). This allows the corresponding costs to be better bucketed, and not charged to the time it takes to parse functions in bitcode files. Bug=None R=jpp@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/1775603002 .
-
John Porto authored
BUG= https://bugs.chromium.org/p/nativeclient/issues/detail?id=4076 R=eholk@chromium.org, kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/1768823002 .
-