- 12 Feb, 2015 1 commit
-
-
Jim Stichnoth authored
(This is a continuation of https://codereview.chromium.org/876083007/ .) Emission is done in a separate thread when -threads=N with N>0 is specified. This includes both functions and global initializers. Emission is deterministic. The parser assigns sequence numbers, and the emitter thread reassembles work units into their original order, regardless of the number of threads. Dump output, however, is not intended to be in deterministic, reassembled order. As such, lit tests that test dump output (i.e., '-verbose inst') are explicitly run with -threads=0. For -elf-writer and -ias=1, the translator thread invokes Cfg::emitIAS() and the assembler buffer is passed to the emitter thread. For -ias=0, the translator thread passes the Cfg to the emitter thread which then invokes Cfg::emit() to produce the textual asm. Minor cleanup along the way: * Removed Flags from the Ice::Translator object and ctor, since it was redundant with Ctx->getFlags(). * Cfg::getAssembler<> is the same as Cfg::getAssembler<Assembler> and is useful for just passing the assembler around. * Removed the redundant Ctx argument from TargetDataLowering::lowerConstants() . BUG= https://code.google.com/p/nativeclient/issues/detail?id=4075 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/916653004
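The in-order reassembly described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `EmitItem` and `InOrderEmitter` are hypothetical names, not Subzero classes, and a real emitter thread would write to the output stream rather than return strings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical work unit: the parser stamps each one with a sequence number.
struct EmitItem {
  uint32_t Seq;
  std::string Text;
};

// Hypothetical reorder buffer: accepts items in any order and releases them
// strictly in sequence-number order, starting from 0.
class InOrderEmitter {
public:
  // Accepts one item and returns every item that is now ready to be emitted.
  std::vector<std::string> push(EmitItem Item) {
    assert(Item.Seq >= NextSeq && "sequence number was already emitted");
    Pending.emplace(Item.Seq, std::move(Item.Text));
    std::vector<std::string> Ready;
    // Drain the buffer for as long as the next expected item is present.
    for (auto It = Pending.find(NextSeq); It != Pending.end();
         It = Pending.find(NextSeq)) {
      Ready.push_back(std::move(It->second));
      Pending.erase(It);
      ++NextSeq;
    }
    return Ready;
  }

private:
  uint32_t NextSeq = 0;
  std::map<uint32_t, std::string> Pending;
};
```

Because items are held until every earlier sequence number has arrived, the output order does not depend on how many translator threads produced them.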
-
- 10 Feb, 2015 1 commit
-
-
Karl Schimpf authored
Fixes the PNaCl bitcode reader to maintain two lists of global variables. The first, VariableDeclarations, is the list of variable declarations to be lowered by the emitter. The second, ValueIDConstants, is the list of corresponding constant symbols to use when a global variable declaration is referenced while processing functions. BUG=None R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/883673005
-
- 09 Feb, 2015 1 commit
-
-
Karl Schimpf authored
Allows one to define explicit overrides in get accessors, based on compilation features. To show usage, modified SubConstantCalls to never be enabled if building a minimal llvm2ice. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/905463003
-
- 06 Feb, 2015 1 commit
-
-
Jan Voung authored
Followup to a previous code review. Saves 2KB from the minimal build =) BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/904783002
-
- 05 Feb, 2015 1 commit
-
-
Karl Schimpf authored
When specified (via command line), replaces all called constant addresses with a stubbed call to the first defined function in the bitcode file. This allows testing of Subzero without first having to fix downstream code (after parsing) that may not handle such addresses. BUG=None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/902713002
-
- 04 Feb, 2015 1 commit
-
-
Jan Voung authored
Also note to keep that up to date. See also Patch set 1 of https://codereview.chromium.org/574133002/, vs later patch sets. Some things that were changed: (*) Headers / constants use Ice version (RegX8632::Encoded_Reg_eax vs EAX), (KB / MB -> other...) (*) Use llvm/Subzero allocator instead of Dart one. (*) Class/Field/On-stack-replacement/Dart runtime stuff is removed (*) Relocation/Fixups are now POD -- rather than a class with a virtual method for fixup. For now, we write out an ELF relocation, but later we may do a target pass to handle function calls within the same section, etc. (*) ASSERT -> assert (*) uword -> uintptr_t (should check). (*) clang-format (*) ??? BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/901453002
-
- 03 Feb, 2015 4 commits
-
-
Jim Stichnoth authored
The unittest .o files also depend on some of the llvm2ice headers, as well as the unittest headers. To reproduce the problem, try this: make -f Makefile.standalone clean make -f Makefile.standalone check-unit 2>/dev/null | grep BitcodeMunge.cpp This will print a line containing BitcodeMunge.cpp. Now do: touch unittest/BitcodeMunge.h make -f Makefile.standalone check-unit 2>/dev/null | grep BitcodeMunge.cpp This should print a line, but it doesn't. Finally: touch src/PNaClTranslator.h make -f Makefile.standalone check-unit 2>/dev/null | grep BitcodeMunge.cpp This should also print a line, but it doesn't. With this patch, the unittest files get rebuilt after header file changes. BUG= none R=jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/895143003
-
Jan Voung authored
Also handle empty global variable lists -- and initialize ShAddralign to 1 instead of 0, just in case. Previously it would try to align by 0 when the variable list was empty. This should help the crosstests pass with --elf. BUG=none R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/899483002
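A minimal sketch of why starting the section alignment at 1 is safer than 0. The helper names `alignTo` and `sectionAlignment` are hypothetical and the ELF writer's actual code may differ; only the `ShAddralign` name and its initial value of 1 come from the message above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Rounds Offset up to the next multiple of Align (a nonzero power of two).
// An alignment of 0 is rejected: it would make the mask below meaningless.
uint64_t alignTo(uint64_t Offset, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a nonzero power of two");
  return (Offset + Align - 1) & ~(Align - 1);
}

// Hypothetical helper: the section alignment is the largest alignment of any
// variable in it. Starting at 1 means an empty list still yields a usable
// alignment instead of 0.
uint64_t sectionAlignment(const std::vector<uint64_t> &VarAligns) {
  uint64_t ShAddralign = 1;
  for (uint64_t A : VarAligns)
    if (A > ShAddralign)
      ShAddralign = A;
  return ShAddralign;
}
```

With an empty list, `alignTo(Offset, sectionAlignment({}))` leaves the offset unchanged, whereas an alignment of 0 would trip the assertion.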
-
Jan Voung authored
I'd like to bump the *trusted* clang compiler also, since the really old trusted clang compiler seems to crash if we pair old clang with new libcxx. (So the merge will probably have to bump the trusted clang compiler to a newer rev). BUG= https://code.google.com/p/nativeclient/issues/detail?id=4026 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/898693002
-
Jim Stichnoth authored
The Cfg::create() method now returns a unique_ptr. Once the parser fully builds the Cfg, it is released onto the work queue, and then acquired and ultimately deleted by the translator thread. BUG= none R=jfb@chromium.org Review URL: https://codereview.chromium.org/892063002
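The ownership handoff described above might look roughly like this. It is a sketch under assumptions: `FakeCfg` and `WorkQueue` are hypothetical stand-ins, the queue here is single-threaded, and the real code may pass a released raw pointer through the queue rather than moving the `std::unique_ptr` itself.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <utility>

// Hypothetical stand-in for a fully built function representation.
struct FakeCfg {
  int FunctionId;
};

// Hypothetical queue that owns each item while it waits to be translated.
class WorkQueue {
public:
  // The producer gives up ownership by moving the pointer into the queue.
  void put(std::unique_ptr<FakeCfg> Func) { Items.push_back(std::move(Func)); }

  // The consumer takes ownership; the item is deleted when the returned
  // pointer goes out of scope. Returns nullptr if the queue is empty.
  std::unique_ptr<FakeCfg> get() {
    if (Items.empty())
      return nullptr;
    std::unique_ptr<FakeCfg> Func = std::move(Items.front());
    Items.pop_front();
    return Func;
  }

private:
  std::deque<std::unique_ptr<FakeCfg>> Items;
};
```

At every point exactly one party owns the object, so no explicit `delete` is needed on either side.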
-
- 01 Feb, 2015 2 commits
-
-
Jan Voung authored
Preliminary linking tests seem to show that the linker and objcopy are happy to use 'em on spec2k, and the result runs! (Had to be careful to clobber the old .s and .o files to make sure it's testing the right copy). Haven't tried crosstests yet. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/889613004
-
Jim Stichnoth authored
This also implicitly applies to szbuild_spec2k.py. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/892803002
-
- 31 Jan, 2015 2 commits
-
-
Jim Stichnoth authored
Updates of current-function and cumulative stats are done entirely in TLS. At the end, cumulative stats are merged across all threads' TLS into the global cumulative stats. Printing of cumulative stats after every function is removed, since there's very little value from that. It was probably done in the first place just to give partial cumulative information in the face of crashes or assertion failures. BUG= none R=jfb@chromium.org Review URL: https://codereview.chromium.org/887213002
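The pattern of lock-free per-thread updates followed by a single merge can be sketched as below. All names (`Stats`, `LocalStats`, `StatsCollector`, `countOnThreads`) are hypothetical, and unlike the message above, each worker here merges its own counters just before exiting instead of the main thread walking every thread's TLS at the end.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stats record with a single counter.
struct Stats {
  uint64_t InstCount = 0;
  void add(const Stats &Other) { InstCount += Other.InstCount; }
};

// Each thread gets its own copy, so updates need no locking.
thread_local Stats LocalStats;

// Hypothetical holder of the global cumulative stats.
class StatsCollector {
public:
  // Called once per thread; this is the only place a lock is taken.
  void mergeFrom(const Stats &Local) {
    std::lock_guard<std::mutex> Guard(Lock);
    Cumulative.add(Local);
  }
  Stats cumulative() {
    std::lock_guard<std::mutex> Guard(Lock);
    return Cumulative;
  }

private:
  std::mutex Lock;
  Stats Cumulative;
};

// Runs NumThreads workers that each count InstsPerThread "instructions" in
// thread-local storage, then returns the merged total.
uint64_t countOnThreads(unsigned NumThreads, uint64_t InstsPerThread) {
  StatsCollector Collector;
  std::vector<std::thread> Threads;
  for (unsigned I = 0; I < NumThreads; ++I) {
    Threads.emplace_back([&Collector, InstsPerThread] {
      for (uint64_t J = 0; J < InstsPerThread; ++J)
        ++LocalStats.InstCount; // no lock: the data is private to this thread
      Collector.mergeFrom(LocalStats);
    });
  }
  for (std::thread &T : Threads)
    T.join();
  return Collector.cumulative().InstCount;
}
```

The lock is taken once per thread rather than once per update, which is the point of keeping the hot counters in TLS.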
-
JF Bastien authored
MinGW's GCC 4.8.1 was sad because SectionType was shadowing the other SectionType. Also, the enum's values are in the ELFObjectWriter namespace, not ELFObjectWriter::SectionType. R=stichnot@chromium.org, jvoung@chromium.org BUG= Windows build is sad Review URL: https://codereview.chromium.org/891953002
-
- 30 Jan, 2015 2 commits
-
-
Jim Stichnoth authored
Now that multithreaded parsing and translation is in place, timer operations have to be made thread-local. After the non-main threads end, their thread-local timer data needs to be merged into the global timer data, which resides in the GlobalContext object. The merge is a bit tricky because the internal timer stack structure is built up dynamically as items are pushed and popped. Two threads may have radically different timing data: 1. The parser thread profile is completely different from a translator thread. 2. For -timing-funcs, two translator threads hold data for entirely different sets of functions. A bit more tweaking will need to be done to make the timing output fully usable in a multithreaded run. Because of multiple threads, times may add up to >100%. Also, time spent blocked is being "unfairly" attributed to the caller of the blocking operation - we should either count the user time instead of wall-clock time, or add a special timer marker for blocking locking operations. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/878383004
-
Jim Stichnoth authored
The problem showed up after the link step failed, in which case $(OBJDIR)/llvm2ice was deleted but the ./llvm2ice symlink still existed. A subsequent "make check-lit" or "make check" would fail, so the basic "make" would have to be done first. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/887873002
-
- 29 Jan, 2015 1 commit
-
-
Jan Voung authored
The local symbol relocations are a bit different from llvm-mc, which are section-relative. E.g., instead of "bytes", it will be ".data + offsetof(bytes, .data)". So the contents of the text/data/rodata sections can also differ since the offsets written in place are different. Still need to fill the symbol table with undefined symbols (e.g., memset, and szrt lib functions) before trying to link. BUG=none R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/874353006
-
- 28 Jan, 2015 6 commits
-
-
JF Bastien authored
__attribute__((aligned(MaxCacheLineSize))) triggers a GCC bug because enum {MaxCacheLineSize = 64 }; isn't constant enough. Adding zero to it makes it that much more constant. R= stichnot@chromium.org BUG= https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55382 Review URL: https://codereview.chromium.org/867483004
-
JF Bastien authored
The period, it was missing R= stichnot@chromium.org BUG= none . Review URL: https://codereview.chromium.org/888473002
-
JF Bastien authored
<mutex> is already included from IceDefs.h (where GlobalLockType is defined) but <condition_variable> isn't included anywhere. It's probably included indirectly in some standard libraries and not others, causing build failures on Windows. TBR= stichnot@chromium.org BUG= none Review URL: https://codereview.chromium.org/884283002
-
Karl Schimpf authored
Cleans up code by removing unnecessary fields/data structures in top-level parser of Subzero. In particular: 1) Uses FunctionDeclarationList.size() instead of NumFunctionIds. 2) Removes the need for vector DefiningFunctionDeclarationList. Instead uses an (incremented) index NextDefiningFunctionID into vector FunctionDeclarationList. BUG=None R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/883493002
-
Jim Stichnoth authored
BUG= none R=jfb@chromium.org Review URL: https://codereview.chromium.org/865093003
-
Jim Stichnoth authored
This also requires modifying the ICE_CACHELINE_BOUNDARY macro to avoid a warning about anonymous structs: src/IceUtils.h:132:3: warning: anonymous structs are a GNU extension [-Wgnu-anonymous-struct] BUG= none R=jfb@chromium.org Review URL: https://codereview.chromium.org/883983002
-
- 27 Jan, 2015 3 commits
-
-
JF Bastien authored
GCC 4.8.1 is sad; There are extra semicolons in Subzero; It removes the semicolons or it gets the build warning hose again;^H R=stichnot@chromium.org BUG= none Review URL: https://codereview.chromium.org/882743003
-
Jim Stichnoth authored
There are two problems with "make format" and "make format-diff" in Makefile.standalone: 1. You have to make sure clang-format and clang-format-diff.py are available in $PATH. 2. Different users may have different versions installed (even for the same user on different machines), leading to whitespace wars. Can't we all just get along? Since the normal LLVM build that Subzero depends on also exposes and builds clang-format and friends, we might as well use it. The clang-format binary is found in $LLVM_BIN_PATH, and clang-format-diff.py is found relative to $LLVM_SRC_PATH. As long as the user's LLVM build is fairly up to date, whitespace wars are unlikely. Given this, there's a much higher incentive to use "make format" regularly instead of "make format-diff". In particular, inline comments on variable/field declaration lists can get lined up more nicely by looking at the entire context, rather than the small diff window. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/877003003
-
Jim Stichnoth authored
Provides a single-producer, multiple-consumer translation queue where the number of translation threads is given by the -threads=N argument. The producer (i.e., bitcode parser) blocks if the queue size is >=N, in order to control the memory footprint. If N=0 (which is the default), execution is purely single-threaded. If N=1, there is a single translation thread running in parallel with the parser thread. "make check" succeeds with the default changed to N=1. Currently emission is also done by the translation thread, which limits scalability since the emit stream has to be locked. Also, since the ELF writer stream is not locked, it won't be safe to use N>1 with the ELF writer. Furthermore, for N>1, emitted function ordering is nondeterministic and needs to be recombobulated. This will all be fixed in a follow-on CL. The -timing option is broken for N>0. This will be fixed in a follow-on CL. Verbose flags are now managed in the Cfg instead of (or in addition to) the GlobalContext, due to the -verbose-focus option which wants to temporarily change the verbose level for a particular function. TargetLowering::emitConstants() and related methods are changed to be static, so that a valid TargetLowering object isn't required. This is because the TargetLowering object wants to hold a valid Cfg, and none really exists after all functions are translated and the constant pool is ready for emission. The Makefile.standalone now has a TSAN=1 option to enable ThreadSanitizer. BUG= none R=jfb@chromium.org Review URL: https://codereview.chromium.org/870653002
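A bounded queue in which the producer blocks once the size limit is reached can be sketched as follows. `BoundedQueue` is a hypothetical name, not the class added by this change, and the sketch omits shutdown signalling (telling consumers that no more work will arrive), which a real translation queue would need.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded queue: put() blocks while the queue is full, which
// caps the memory held by untranslated work; get() blocks while it is empty.
template <typename T> class BoundedQueue {
public:
  explicit BoundedQueue(std::size_t MaxSize) : MaxSize(MaxSize) {
    assert(MaxSize > 0 && "a zero-sized queue could never accept work");
  }

  void put(T Item) {
    std::unique_lock<std::mutex> L(Lock);
    NotFull.wait(L, [this] { return Items.size() < MaxSize; });
    Items.push_back(std::move(Item));
    NotEmpty.notify_one();
  }

  T get() {
    std::unique_lock<std::mutex> L(Lock);
    NotEmpty.wait(L, [this] { return !Items.empty(); });
    T Item = std::move(Items.front());
    Items.pop_front();
    NotFull.notify_one();
    return Item;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> L(Lock);
    return Items.size();
  }

private:
  const std::size_t MaxSize;
  std::mutex Lock;
  std::condition_variable NotFull;
  std::condition_variable NotEmpty;
  std::deque<T> Items;
};
```

Any number of consumer threads can call `get()` concurrently. The N=0 case in the message above corresponds to not using a queue at all and translating on the parser thread.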
-
- 26 Jan, 2015 1 commit
-
-
Jim Stichnoth authored
Manages thread_local pointer fields through a set of macros. If ICE_THREAD_LOCAL_HACK is defined, the thread_local definitions and accesses are defined in terms of pthread operations. This assumes that the underlying std::thread library is based on pthread. BUG= none R=jfb@chromium.org, jvoung@chromium.org Review URL: https://codereview.chromium.org/872933002
-
- 25 Jan, 2015 1 commit
-
-
Jan Voung authored
This reduces the number of conditionals, and will more closely reflect the structure of the ELF writer's version of the same thing. Without fdata-sections, the ELF writer version will have to batch all initializers of a certain type so that they can be contiguous in the file and the overall alignment can be determined. A downside of this is that .s files will be different from llc's output. The spec .o and executables are identical before/after the change. BUG=none R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/870123003
-
- 23 Jan, 2015 1 commit
-
-
Jim Stichnoth authored
MacOS doesn't support the thread_local keyword until 10.7 or later, and our bots run 10.6. Who knows whether Visual Studio supports it yet. In the meantime, use the old-style syntax. BUG= https://codereview.chromium.org/873443004/ R=jfb@chromium.org Review URL: https://codereview.chromium.org/865973006
-
- 22 Jan, 2015 2 commits
-
-
JF Bastien authored
The following CL enables -Werror: https://codereview.chromium.org/863093002/ There were two warnings left in our subzero build: - Dead default cases because all of an enum's values were handled by the switch. - Use of C99 VLA. R=stichnot@chromium.org TEST= make check BUG= none Review URL: https://codereview.chromium.org/862853003
-
Karl Schimpf authored
The previous code did not do this. Also localizes the lock to when the error is actually printed. Note: requires https://codereview.chromium.org/865963002 BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/864383002
-
- 20 Jan, 2015 3 commits
-
-
Jim Stichnoth authored
Elements were added to this vector, but never inspected, so it is essentially a useless field. Plus, the removal allows us to remove a couple of friend declarations. BUG=none R=kschimpf@google.com Review URL: https://codereview.chromium.org/814163004
-
Jim Stichnoth authored
This just gets the locking in place. Actual multithreading will be added later. Mutexes are added for accessing the GlobalContext allocator, the constant pool, the stats data, and the profiling timers. These are managed via the LockedPtr<> helper. Finer grain locks on the constant pool may be added later, i.e. a separate lock for each data type. A vector of pointers to TLS objects is added to GlobalContext. Each new thread will get its own TLS object, whose address is added to the vector. (After threads complete, things like stats can be combined by iterating over the vector.) The dump/emit streams are guarded by a separate lock, to avoid fine-grain interleaving of output by multiple threads. E.g., lock the streams, emit an entire function, and unlock the streams. This works for dumping too, though dump output for different passes on the same function may be interleaved with that of another thread. There is an OstreamLocker helper class to keep this simple. CodeStats is made an inner class of GlobalContext (this was missed on a previous CL). BUG= none R=jfb@chromium.org, jvoung@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/848193003
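One plausible shape for a helper like LockedPtr<> is a smart pointer that holds the lock for as long as it is alive. The sketch below is an assumption about the general technique, not the actual Subzero definition; `Context` and `getCounter()` are hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Sketch of a lock-holding pointer: the mutex is acquired on construction
// and released when the object is destroyed.
template <typename T> class LockedPtr {
public:
  LockedPtr(T *Value, std::mutex *Lock) : Value(Value), Guard(*Lock) {}
  LockedPtr(LockedPtr &&) = default; // movable, so accessors can return it
  T *operator->() const { return Value; }
  T &operator*() const { return *Value; }

private:
  T *Value;
  std::unique_lock<std::mutex> Guard;
};

// Hypothetical owner of shared data: the only way to reach Counter is via an
// accessor that also takes the matching lock.
class Context {
public:
  LockedPtr<int> getCounter() { return LockedPtr<int>(&Counter, &CounterLock); }

private:
  int Counter = 0;
  std::mutex CounterLock;
};
```

A call such as `*Ctx.getCounter() += 1;` holds the lock only for that statement. Keeping two LockedPtr objects for the same mutex alive on one thread would deadlock, so callers should keep their scopes short.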
-
Karl Schimpf authored
BUG=None R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/848473002
-
- 15 Jan, 2015 1 commit
-
-
Jim Stichnoth authored
This hasn't been used in a very long time, and there's no intention of using it again. Originally there was the idea of a "fast" block-local register allocator for an O1-like configuration, which would allocate registers for infinite-weight temporaries during target lowering, using a "local register manager". This verbose option was for tracing execution of this register manager. However, by now it seems unlikely that this would do a better/faster job than the current Om1 register allocation approach, which reuses the linear-scan code quite effectively and does very well at separation of concerns. So adios IceV_RegManager! BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/831663008
-
- 13 Jan, 2015 1 commit
-
-
Jan Voung authored
Pass the full assembler pointer to the elf writer, so that it has access to both the text buffer and the fixups. Remove some child classes of AssemblerFixups. They didn't really do much, and were pretty much identical to the original AssemblerFixup class. Dart had a virtual method for fixups to do necessary patching, but we currently don't do the patching and just emit the relocations. TODO see if patching is more efficient than writing out relocations and letting the linker do the work. This CL also makes AssemblerFixups POD. Change the fixup kind to be a plain unsigned int, which the target can fill w/ target/container-specific values. Move the fwd declaration of Assembler to IceDefs and remove the others. Do a similar fwd declaration refactoring for ELFWriter. Make the createAssembler method return a std::unique_ptr. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/828873002
-
- 12 Jan, 2015 2 commits
-
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/848603002
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/846763002
-
- 09 Jan, 2015 2 commits
-
-
Jan Voung authored
This avoids hitting the global context's getConstantSym during emitIAS(), which may be desirable for multi-threading, since each function's emitIAS() should be able to happen on a separate thread. The stringification is moved till later, so it still happens, just without creating a constant relocatable w/ offset of 0. This ends up tickling an issue where -O0 on 252.eon now gets 2x as many page faults, and I'm not sure exactly why. This makes the overall time higher, though emit time is lower. When translating with -O2, the # of page faults is about the same before/after, so that oddness is restricted to O0. Before this change, tweaking the slab size at O0 didn't seem to cause swings as drastic as 2x either. To work around this, I turned the slab size of the assembler down to 32KB. === Move all the .L$type$poolid into a function (replacing getPoolEntryID). BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/837553009
-
Karl Schimpf authored
Extends the NaCl bitcode munger so that the PNaClTranslator parser can be applied to the defined sequence of record values. BUG=None R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/800883006
-