- 26 Sep, 2014 5 commits
-
-
Jim Stichnoth authored
Subzero translation is stable enough that szbuild.py should prefer Subzero-translated symbols by default. The exception is that if you explicitly use --include, the intuitive interpretation is that you only want Subzero to include those symbols (minus any given with --exclude). BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/605283002
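The selection rule described in this message can be sketched as follows. This is only an illustration: the function name `subzero_symbols` and its parameters are hypothetical and are not taken from szbuild.py.

```python
def subzero_symbols(all_symbols, include=None, exclude=()):
    """Return the set of symbols to take from the Subzero translation.

    Hypothetical model of the rule in the commit message:
    - with no --include, every symbol defaults to Subzero;
    - with --include, only the listed symbols are taken;
    - --exclude always removes symbols from the result.
    """
    selected = set(all_symbols) if include is None else set(include)
    return selected - set(exclude)
```

With no `include` argument the whole symbol list is returned, minus anything excluded, which matches the new default described above.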
-
Jim Stichnoth authored
Not necessary for the LLVM 3.5 merge, but nice to have anyway. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3930 R=jfb@chromium.org, jvoung@chromium.org Review URL: https://codereview.chromium.org/605123002
-
Karl Schimpf authored
szdiff is an approximate match tool used in early tests. Where Subzero's bitcode reader tests already cover the cases in which szdiff fails, remove the broken tests. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/609813003
-
Karl Schimpf authored
This test was previously failing because insertelement returned the wrong type. However, a previous CL fixed this problem and the test now works with Subzero's bitcode reader. BUG=None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/605273002
-
Jim Stichnoth authored
This just adds -std=c++11 to the compiler flags and fixes the resulting errors/warnings. Later CLs can fix things related to the LLVM 3.5 merge. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3930 R=jfb@chromium.org, kschimpf@google.com Review URL: https://codereview.chromium.org/607443003
-
- 25 Sep, 2014 3 commits
-
-
Karl Schimpf authored
Instruction insertelement was incorrectly generating a result corresponding to the element type, instead of the updated vector type. BUG= None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/604023003
-
Jim Stichnoth authored
Originally, for a given Variable, register preference and overlap were manually specified. That is, when choosing a free register for a Variable, it would be manually specified which (if any) related Variable would be a good choice for register selection, all things being equal. Also, it allowed the rather dangerous "AllowOverlap" specification which let the Variable use its preferred Variable's register, even if their live ranges overlap. Now, all this selection is automatic, and the machinery for manual specification is removed.
A few other changes in this CL:
  - Address mode inference leverages the more precise
  - Better regalloc dump messages to follow the logic
  - "-verbose most" enables all verbose options except regalloc and time
  - "-ias" is an alias for "-integrated-as"
  - Bug fix: prevent 8-bit register ah from being used in register allocation, unless it is pre-colored
  - Bug fix: the _mov helper where Dest is NULL wasn't always actually creating a new Variable
  - A few tests are updated based on slightly different O2 register allocation decisions
The static stats actually improve slightly across the board (around 1%), except that frame size improves by 6-10%. This is probably from smarter register allocation decisions, particularly involving phi lowering temporaries, where the manual hints weren't too good to start with. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/597003004
-
Karl Schimpf authored
Adds the python script run-llvm2ice.py (was llvm2iceinsts.py) that automatically handles conversion of LLVM source to a PEXE file, and then runs llvm2ice on the corresponding PEXE file. Also defines three paths in tests, based on the executable chosen:
  %lc2i - Directly reads from LLVM source, and converts to Subzero.
  %l2i - Parses a PEXE file into LLVM IR, and converts to Subzero.
  %p2i - Parses a PEXE directly into Subzero.
Note that for all three executables, the same arguments can be used, making it easy to change how the input is handled. Also moves tests to use %p2i whenever possible. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=jvoung@chromium.org Review URL: https://codereview.chromium.org/600043002
-
- 24 Sep, 2014 1 commit
-
-
Jan Voung authored
Extend the bswap test to have a case which will exhibit a bit of register pressure to test register encoding more (at first wasn't sure if it was 0xC8 + reg or 0xC8 | reg... but it should be the same since there's only 0-7 for regs). BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/595093002
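The equivalence mentioned in this message (0xC8 + reg versus 0xC8 | reg) can be checked with a short sketch. The helper name `encode_bswap` is hypothetical; the byte pattern 0F C8+rd is the standard x86 encoding of `bswap r32`.

```python
def encode_bswap(reg):
    """Encode `bswap r32` as the two bytes 0F C8+rd for a register number 0-7."""
    if not 0 <= reg <= 7:
        raise ValueError("register encoding must be in 0..7")
    return bytes([0x0F, 0xC8 + reg])


# 0xC8 is 0b11001000: its low three bits are clear, so adding a value
# in 0..7 never carries and gives the same result as a bitwise OR.
for reg in range(8):
    assert 0xC8 + reg == 0xC8 | reg
```

The two forms would only diverge for register numbers of 8 or more, which do not exist in 32-bit mode.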
-
- 23 Sep, 2014 2 commits
-
-
Jan Voung authored
BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/597643002
-
Jan Voung authored
Add a flag to use the integrated assembler. Handle simple XMM binary op instructions as an initial example of how instructions might be handled. This tests fixups in a very limited sense -- Track buffer locations of fixups for floating point immediates. Patchset one shows the original dart assembler code (revision 39313), so that it can be diffed. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/574133002
-
- 22 Sep, 2014 3 commits
-
-
Jim Stichnoth authored
This affects tracking of two kinds of Variable metadata: whether a Variable is block-local (i.e., all uses are in a single block) and if so, which CfgNode that is; and whether a Variable has a single defining instruction, and if so, which Inst that is. Originally, this metadata was constructed incrementally, which was quite fragile and most likely inaccurate under many circumstances. In the new approach, this metadata is reconstructed in a separate pass as needed. As a side benefit, the metadata fields are removed from each Variable and pulled into a separate structure, shrinking the size of Variable. There should be no functional changes, except that simple stack slot coalescing is turned off under Om1, since it takes a separate pass to calculate block-local variables, and passes are minimized under Om1. As a result, a couple of the lit tests needed to be changed. There are a few non-mechanical changes, generally to tighten up Variable tracking for liveness analysis. This is being done mainly to get precise Variable definition information so that register allocation can infer the best register preferences as well as when overlapping live ranges are allowable. BUG=none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/589003002
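The idea of rebuilding this metadata in a separate pass, rather than maintaining it incrementally, can be sketched as below. The toy CFG representation, the function name `compute_variable_metadata`, and the choice to count definitions as references when deciding block-locality are all assumptions of the sketch, not Subzero's actual data structures.

```python
def compute_variable_metadata(cfg):
    """Recompute per-variable metadata in one pass over the whole function.

    `cfg` maps a node name to a list of (dest, sources) instructions;
    this toy representation is an assumption of the sketch.
    Returns (single_def, block_local):
      single_def[v]  -> (node, index) of the only definition, or None
      block_local[v] -> the single node referencing v, or None
    """
    defs = {}   # variable -> list of (node, index) definitions
    nodes = {}  # variable -> set of nodes that define or use it
    for node, insts in cfg.items():
        for index, (dest, sources) in enumerate(insts):
            if dest is not None:
                defs.setdefault(dest, []).append((node, index))
                nodes.setdefault(dest, set()).add(node)
            for src in sources:
                nodes.setdefault(src, set()).add(node)
    single_def = {v: (d[0] if len(d) == 1 else None) for v, d in defs.items()}
    block_local = {v: (next(iter(n)) if len(n) == 1 else None)
                   for v, n in nodes.items()}
    return single_def, block_local
```

Because the result lives in separate dictionaries rather than on each variable, it can be discarded and recomputed whenever a pass needs it, which mirrors the motivation given in the message.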
-
Jan Voung authored
Should be fixed now. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3929 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/588893005
-
Karl Schimpf authored
Adds workaround that uses IceConverter's convertGlobals to generate global initializers. This should complete the initial implementation of Subzero's bitcode reader. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/587893003
-
- 20 Sep, 2014 1 commit
-
-
Jim Stichnoth authored
When doing a bitcast between int and FP types, the way lowering works is that a spill temporary is created, with regalloc weight of zero to inhibit register allocation, and this spill temporary is used for the cvt instruction. If the other variable does not get register-allocated, then addProlog() forces the spill temporary to share the same stack slot as the other variable. Currently, the lowering code passes this information to addProlog() by using the setPreferredRegister() mechanism. This is changed by creating a target-specific subclass of Variable, so that only the spill temporaries need to carry this extra information. Ultimately, many of the existing Variable fields will be refactored into a separate structure, and only generated/used as needed by various optimization passes. The spill temporary linkage is the one thing that is still needed with Om1 when no optimizations are enabled, motivating this change. A couple other minor cleanups are also done here. The key test is that the cast cross tests continue to work, specifically the bitcast tests. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/586943003
-
- 19 Sep, 2014 3 commits
-
-
Jan Voung authored
See: https://codereview.chromium.org/580983002 BUG=none R=kschimpf@google.com, stichnot@chromium.org Review URL: https://codereview.chromium.org/581293003
-
Karl Schimpf authored
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/577353003
-
Jan Voung authored
Lift the enums out of IceInstX8632.h and IceTargetLoweringX8632.h. This will later allow the assembler to share the enum values and use them as encodings where appropriate. E.g., to avoid having a separate enum in: https://codereview.chromium.org/476323004/diff/680001/src/assembler_constants_ia32.h The "all registers" enum is retained, but separate GPRRegister and XmmRegister enums are created with tags "Encoded_Reg_foo" to represent the encoded value of register "foo". Functions are added to convert from the "all registers" namespace to the encoded ones. Re-order the BrCond so that they match the encoding according to the "Instruction Subcode" in B.1 of the Intel Manuals. BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/582113003
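A rough sketch of the split between an "all registers" namespace and per-class encoded values follows. The ordering of the combined enum and the function names `encoded_gpr` and `encoded_xmm` are hypothetical; only the hardware encodings (eax=0 through edi=7, xmm0=0 through xmm7=7) are standard x86 facts.

```python
from enum import IntEnum

# Hypothetical "all registers" numbering: GPRs first, then XMM registers.
ALL_REGS = (["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
            + ["xmm%d" % i for i in range(8)])
Reg = IntEnum("Reg", ALL_REGS, start=0)


def encoded_gpr(reg):
    """Map an 'all registers' value to its 3-bit GPR encoding."""
    if not Reg.eax <= reg <= Reg.edi:
        raise ValueError("%s is not a general purpose register" % reg.name)
    return int(reg) - int(Reg.eax)


def encoded_xmm(reg):
    """Map an 'all registers' value to its 3-bit XMM encoding."""
    if not Reg.xmm0 <= reg <= Reg.xmm7:
        raise ValueError("%s is not an XMM register" % reg.name)
    return int(reg) - int(Reg.xmm0)
```

Keeping the conversion in one place lets an assembler use the encoded value directly in instruction bytes while the rest of the compiler keeps a single register namespace.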
-
- 18 Sep, 2014 2 commits
-
-
Jim Stichnoth authored
Use --llc to pass extra arguments to pnacl-translate. Use --sz to pass extra arguments to llvm2ice. The --stats argument is removed from the script because it is Subzero-only, and can now be done with --sz=--stats. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/582593002
-
Jim Stichnoth authored
  1. An unconditional branch to the next basic block is removed.
  2. For a conditional branch with a "false" edge to the next basic block, the unconditional branch to the fallthrough block is removed.
  3. For a conditional branch with a "true" edge to the next basic block, the condition is inverted and the branch is then handled as in #2.
This is enabled only for O2, particularly because inverting the branch condition is a marginally risky operation. This decreases the instruction count by about 5-6%. Also, --stats prints a final tally to make it easier to post-process the output. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/580903005
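The three cases in this message can be sketched as a small function that decides which branch instructions to emit for a block, given the label of the block laid out next. The tuple representation of terminators, the function name `emit_branches`, and the partial condition table are assumptions of the sketch.

```python
# Partial table of x86 condition inversions, enough for the sketch.
INVERT = {"e": "ne", "ne": "e", "l": "ge", "ge": "l", "le": "g", "g": "le"}


def emit_branches(term, next_label):
    """Return the branch instructions for a block terminator.

    `term` is ("br", target) or ("condbr", cond, true_target, false_target).
    `next_label` is the label of the block that follows in layout order.
    """
    if term[0] == "br":
        _, target = term
        # Case 1: a jump to the fallthrough block is unnecessary.
        return [] if target == next_label else ["jmp " + target]
    _, cond, true_target, false_target = term
    if false_target == next_label:
        # Case 2: the false edge falls through; keep only the conditional jump.
        return ["j%s %s" % (cond, true_target)]
    if true_target == next_label:
        # Case 3: invert the condition so the true edge falls through.
        return ["j%s %s" % (INVERT[cond], false_target)]
    # Neither edge falls through: both branches are needed.
    return ["j%s %s" % (cond, true_target), "jmp " + false_target]
```

For example, `emit_branches(("condbr", "e", "next", "far"), "next")` yields a single `jne far`, saving one instruction compared with the unoptimized pair.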
-
- 17 Sep, 2014 5 commits
-
-
Karl Schimpf authored
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/576243002
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/559723003
-
Karl Schimpf authored
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/576853002
-
Jim Stichnoth authored
This is needed since we are now using an absolute (and non-standard) path to clang++. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/567393007
-
Jim Stichnoth authored
The following are collected:
  - Number of machine instructions emitted
  - Number of registers saved/restored in prolog/epilog
  - Number of stack frame bytes (non-alloca) allocated
  - Number of "spills", or stores to stack slots
  - Number of "fills", or loads/operations from stack slots
  - Fill+Spill count (sum of above two)
These are somewhat reasonable approximations of code quality, and the primary intention is to compare before-and-after when trying out an optimization. The statistics are dumped after translating each function. Per-function and cumulative statistics are collected. The output lines have a prefix that is easy to filter. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/580633002
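One way to picture per-function and cumulative counters with an easily filtered output prefix is sketched below. The class name `CodeStats`, its field names, and the `|stats|` prefix are all hypothetical; the message does not give the actual output format.

```python
from dataclasses import dataclass, fields


@dataclass
class CodeStats:
    """Counters for one function, or a running total across functions."""
    instructions: int = 0
    regs_saved: int = 0
    frame_bytes: int = 0
    spills: int = 0
    fills: int = 0

    @property
    def fills_plus_spills(self):
        # Derived value: the sum of the two stack-traffic counters.
        return self.fills + self.spills

    def add(self, other):
        """Accumulate another function's counters into this one."""
        for field in fields(self):
            total = getattr(self, field.name) + getattr(other, field.name)
            setattr(self, field.name, total)

    def dump(self, name, prefix="|stats|"):
        """Return one prefixed line per counter (prefix is hypothetical)."""
        lines = ["%s %s %s=%d" % (prefix, name, f.name, getattr(self, f.name))
                 for f in fields(self)]
        lines.append("%s %s fills_plus_spills=%d"
                     % (prefix, name, self.fills_plus_spills))
        return lines
```

A fixed prefix on every line means the statistics can be pulled out of mixed translator output with a simple text filter before comparing two runs.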
-
- 16 Sep, 2014 5 commits
-
-
Jan Voung authored
In many cases, we expect a constant to be 32 bits or less. This simplifies range checking for x86 memory operand displacements (can only be 32-bit), or immediates in instructions (also 32-bit), since we only store 32 bits (so it trivially fits in 32 bits). Checks for whether a constant fits in 8 bits can be done on the 32-bit value instead of the 64-bit value. When TargetLowering sees a 64-bit immediate as an operand on a 64-bit instruction, it should have split the 64-bit immediate into a 32-bit loOperand(), and a 32-bit hiOperand(). So what's left for the Emit pass should be 32-bit constants.
Other places which work with constants:
  - intrinsic operands (the ABI only allows i32 params for atomic mem order, or atomic is lock free byte-size, or the longjmp param).
  - addressing mode optimization (gep expansion should be working with i32 constants).
  - insertelement, and extractelement constant indices (bitcode reader restricts the type of the index to be i32 also).
I guess now you may end up with multiple copies of what may be the "same" constant (i64 0 vs i32 0). BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/569033002
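The split of a 64-bit immediate into two 32-bit halves, and the 8-bit range check performed on a 32-bit value, can be sketched as follows. The helper names `split_i64`, `to_signed32`, and `fits_in_int8` are hypothetical stand-ins; only loOperand() and hiOperand() are named in the message.

```python
MASK32 = 0xFFFFFFFF


def split_i64(value):
    """Split a 64-bit immediate into (lo, hi) unsigned 32-bit halves."""
    value &= 0xFFFFFFFFFFFFFFFF  # normalize negatives to two's complement
    return value & MASK32, value >> 32


def to_signed32(value32):
    """Interpret an unsigned 32-bit pattern as a signed 32-bit integer."""
    return value32 - (1 << 32) if value32 & 0x80000000 else value32


def fits_in_int8(value32):
    """Check whether a 32-bit constant can be encoded as a signed 8-bit immediate."""
    return -128 <= to_signed32(value32 & MASK32) <= 127
```

After the split, every constant reaching emission is at most 32 bits wide, so the 8-bit check never needs to look at a 64-bit value.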
-
Karl Schimpf authored
Also fixes error messages on instruction operands, to print out the operand (rather than pointer to it), since we can now print out operands. BUG= https://code.google.com/p/nativeclient/issues/detail?id=389 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/577703002
-
Jim Stichnoth authored
BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/551723003
-
Jim Stichnoth authored
One wants to omit the --init option to make things go faster, but then there's the risk that the pexe or llvm2ice or llc has changed and --init is actually necessary. This change automatically compares the modification timestamps of the pexe and the llvm2ice and llc binaries against the modification timestamps of the object files to determine whether re-translation is needed. The --init option forces re-translation regardless of timestamps. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/551373008
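The timestamp comparison described here can be sketched with the standard library. The function name `needs_retranslation` and the `force` parameter (standing in for --init) are hypothetical; the actual script may structure this differently.

```python
import os


def needs_retranslation(inputs, outputs, force=False):
    """Decide whether object files must be regenerated.

    `inputs` are paths such as the pexe and the translator binaries;
    `outputs` are the object files produced from them.
    A missing input raises OSError, since nothing can be built from it.
    """
    if force or not outputs:
        return True
    try:
        oldest_output = min(os.path.getmtime(path) for path in outputs)
    except OSError:
        # An output file does not exist yet, so it has to be built.
        return True
    newest_input = max((os.path.getmtime(path) for path in inputs), default=0.0)
    # Re-translate when any input changed after the oldest output was written.
    return newest_input > oldest_output
```

Comparing the newest input against the oldest output is the conservative choice: a single stale object file is enough to trigger a rebuild.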
-
Jim Stichnoth authored
A "standalone" version of dump() is provided, taking just an Ostream argument and not requiring a Cfg or GlobalContext argument. BUG= none R=kschimpf@google.com Review URL: https://codereview.chromium.org/570713006
-
- 15 Sep, 2014 2 commits
-
-
Karl Schimpf authored
LLVM objects/libraries are now built using clang from chrome. This CL changes the compiler to chrome clang, and adds an appropriate dynamic library path to the linked executable so that llvm2ice can be run in any directory. BUG=None R=jvoung@chromium.org Review URL: https://codereview.chromium.org/571973004
-
Jim Stichnoth authored
Also, refactor the key part of the address mode inference into separate functions, since it's getting unwieldy. The main thing is that we mark phi temporaries as multi-definition, and disallow address mode inference transformations that involve such temporaries, because this is incorrect, particularly when there are backward branches involved. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/557953007
-
- 12 Sep, 2014 4 commits
-
-
Jim Stichnoth authored
In x86-32, floating point values are returned to the caller on the top of the x87 floating point stack. The caller is required to remove it from the x87 FP stack, e.g. via the fstp instruction. This must be done even when the return value is not actually used anywhere else in the function, in which case O2 is likely to want to dead-code eliminate the fstp instruction. We enforce this by adding a fake use of the fstp destination. BUG= none R=jvoung@chromium.org Review URL: https://codereview.chromium.org/563303003
-
Jan Voung authored
ffs() and findFirstSet() are slightly different: findFirstSet() indexing is 0-based instead of 1-based. Example mingw error: http://build.chromium.org/p/tryserver.nacl/builders/nacl-toolchain-win7-pnacl-x86_64/builds/1920/steps/llvm_i686_w64_mingw32%20%28build%29/logs/stdio BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/563303002
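The off-by-one difference between the two conventions can be illustrated with plain integers. These Python functions only model the indexing; returning `None` for a zero input in `find_first_set` is an assumption of the sketch and not a statement about LLVM's behavior.

```python
def ffs(x):
    """POSIX-style ffs(): 1-based index of the lowest set bit, 0 if x is 0."""
    return (x & -x).bit_length()


def find_first_set(x):
    """0-based index of the lowest set bit; None for 0 in this sketch."""
    return (x & -x).bit_length() - 1 if x else None
```

For any nonzero input the two results differ by exactly one, so a call site switching between them has to adjust the index accordingly.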
-
Karl Schimpf authored
This is a workaround for the issue that Subzero currently assumes all global addresses have a name, but finalized pexe files leave most global addresses unnamed. It does this by allowing two optional command-line flag name prefixes that are used to generate names for unnamed global addresses. BUG= None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/567703003
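Generating names from a prefix might look like the sketch below. The function name `name_unnamed_globals`, the example prefix, and the use of a running counter as the suffix are all hypothetical; the message does not say how the generated names are formed.

```python
def name_unnamed_globals(names, prefix):
    """Give every unnamed (empty) entry a generated name.

    The suffix is a running count of unnamed entries, which is an
    assumption of this sketch.
    """
    result = []
    counter = 0
    for name in names:
        if name:
            result.append(name)
        else:
            result.append("%s%d" % (prefix, counter))
            counter += 1
    return result
```

Entries that already have a name are passed through unchanged, so only the addresses left unnamed by finalization receive a generated one.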
-
Jan Voung authored
BUG=none R=stichnot@chromium.org Review URL: https://codereview.chromium.org/567553003
-
- 11 Sep, 2014 3 commits
-
-
Karl Schimpf authored
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=jvoung@chromium.org, stichnot@chromium.org Review URL: https://codereview.chromium.org/568473002
-
Jim Stichnoth authored
BUG= none R=jfb@chromium.org Review URL: https://codereview.chromium.org/565553002
-
Karl Schimpf authored
Also fixes minor issues with branches in instructions (i.e. defining entry node and computing predecessors). BUG= https://code.google.com/p/nativeclient/issues/detail?id=3892 R=stichnot@chromium.org Review URL: https://codereview.chromium.org/561823002
-
- 10 Sep, 2014 1 commit
-
-
Karl Schimpf authored
Fixes call to extractElement. BUG= None R=stichnot@chromium.org Review URL: https://codereview.chromium.org/562783002
-