llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00

Author	SHA1	Message	Date
Michael J. Spencer	c195b8a813	Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. llvm-svn: 182680	2013-05-24 22:23:49 +00:00
Ulrich Weigand	7b22c7a38a	[PowerPC] Use true offset value in "memrix" machine operands This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands. This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4. The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value). Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way. This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4. This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands. llvm-svn: 182032	2013-05-16 17:58:02 +00:00
Hal Finkel	91bd48d046	Implement PPC counter loops as a late IR-level pass The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). llvm-svn: 181927	2013-05-15 21:37:41 +00:00
Michael Liao	3b258b6b24	ArrayRefize getMachineNode(). No functionality change. llvm-svn: 179901	2013-04-19 22:22:57 +00:00
Ulrich Weigand	35b2470e07	PowerPC: Remove ADDIL patterns. The ADDI/ADDI8 patterns are currently duplicated into ADDIL/ADDI8L, which describe the same instruction, except that they accept a symbolLo[64] operand instead of a s16imm[64] operand. This duplication confuses the asm parser, and it actually not really needed, since symbolLo[64] already accepts immediate operands anyway. So this commit removes the duplicate patterns. No change in generated code. llvm-svn: 178004	2013-03-26 10:55:20 +00:00
Ulrich Weigand	c8c3c4dc22	Fix swapped BasePtr and Offset in pre-inc memory addresses. PPCTargetLowering::getPreIndexedAddressParts currently provides the base part of a memory address in the offset result, and the offset part in the base result. That swap is then undone again when an MI instruction is generated (in PPCDAGToDAGISel::Select for loads, and using .md Pat patterns for stores). This patch reverts this double swap, to make common code and back-end be in sync as to which part of the address is base and which is offset. To avoid performance regressions in certain cases, target code now checks whether the choice of base register would be rejected for pre-inc accesses by common code, and attempts to swap base and offset again in such cases. (Overall, this means that now pre-ice accesses are generated more frequently than before.) llvm-svn: 177733	2013-03-22 14:58:48 +00:00
Ulrich Weigand	4b7960b48d	Tighten iaddroff ComplexPattern. The iaddroff ComplexPattern is supposed to recognize displacement expressions that have been processed by a SelectAddressRegImm, which means it needs to accept TargetConstant and TargetGlobalAddress nodes. Currently, it erroneously also accepts some other nodes, in particular Constant and PPCISD::Lo. While this problem is currently latent, it would cause wrong-code bugs with a follow-on patch I'm about to commit, so this patch tightens the ComplexPattern. The equivalent change is made in PPCDAGToDAGISel::Select, where pre-inc load patterns are handled (as opposed to store patterns, the loads are handled in C++ code without making use of the .td ComplexPattern). llvm-svn: 177732	2013-03-22 14:58:17 +00:00
Ulrich Weigand	27c5e0b210	Remove the xaddroff ComplexPattern. The xaddroff pattern is currently (mistakenly) used to recognize the base register in pre-inc store patterns. This patch replaces those uses by ptr_rc_nor0 (as is elsewhere done to match the base register of an address), and removes the now unused ComplexPattern. llvm-svn: 177731	2013-03-22 14:57:48 +00:00
Hal Finkel	7e324aee83	Implement builtin_{setjmp/longjmp} on PPC This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. llvm-svn: 177666	2013-03-21 21:37:52 +00:00
Bill Schmidt	396d92f47f	Trivial cleanup llvm-svn: 175771	2013-02-21 17:26:05 +00:00
Bill Schmidt	049ba390f5	Large code model support for PowerPC. Large code model is identical to medium code model except that the addis/addi sequence for "local" accesses is never used. All accesses use the addis/ld sequence. The coding changes are straightforward; most of the patch is taken up with creating variants of the medium model tests for large model. llvm-svn: 175767	2013-02-21 17:12:27 +00:00
Bill Schmidt	4a815e468d	Code review cleanup for r175697 llvm-svn: 175739	2013-02-21 14:35:42 +00:00
Bill Schmidt	0e7935e723	PPCDAGToDAGISel::PostprocessISelDAG() This patch implements the PPCDAGToDAGISel::PostprocessISelDAG virtual method to perform post-selection peephole optimizations on the DAG representation. One optimization is implemented here: folds to clean up complex addressing expressions for thread-local storage and medium code model. It will also be useful for large code model sequences when those are added later. I originally thought about doing this on the MI representation prior to register assignment, but it's difficult to do effective global dead code elimination at that point. DCE is trivial on the DAG representation. A typical example of a candidate code sequence in assembly: addis 3, 2, globalvar@toc@ha addi 3, 3, globalvar@toc@l lwz 5, 0(3) When the final instruction is a load or store with an immediate offset of zero, the offset from the add-immediate can replace the zero, provided the relocation information is carried along: addis 3, 2, globalvar@toc@ha lwz 5, globalvar@toc@l(3) Since the addi can in general have multiple uses, we need to only delete the instruction when the last use is removed. llvm-svn: 175697	2013-02-21 00:38:25 +00:00
Bill Schmidt	bcb4fa48fa	Additional fixes for bug 15155. This handles the cases where the 6-bit splat element is odd, converting to a three-instruction sequence to add or subtract two splats. With this fix, the XFAIL in test/CodeGen/PowerPC/vec_constants.ll is removed. llvm-svn: 175663	2013-02-20 20:41:42 +00:00
Bill Schmidt	93b2fc9f50	Fix PR15155: lost vadd/vsplat optimization. During lowering of a BUILD_VECTOR, we look for opportunities to use a vector splat. When the splatted value fits in 5 signed bits, a single splat does the job. When it doesn't fit in 5 bits but does fit in 6, and is an even value, we can splat on half the value and add the result to itself. This last optimization hasn't been working recently because of improved constant folding. To circumvent this, create a pseudo VADD_SPLAT that can be expanded during instruction selection. llvm-svn: 175632	2013-02-20 15:50:31 +00:00
Krzysztof Parzyszek	b6d2a1c1ee	Add registration for PPC-specific passes to allow the IR to be dumped via -print-after-all. llvm-svn: 175058	2013-02-13 17:40:07 +00:00
Chandler Carruth	7c0d682f47	Sort all of the includes. Several files got checked in with mis-sorted includes. llvm-svn: 172891	2013-01-19 08:03:47 +00:00
Bill Schmidt	250d836387	This patch addresses bug 14678 by fixing two problems in medium code model code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. llvm-svn: 171778	2013-01-07 19:29:18 +00:00
Chandler Carruth	4c1f3c24db	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366	2013-01-02 11:36:10 +00:00
Bill Schmidt	07e3927917	This is another cleanup patch for 64-bit PowerPC TLS processing. I had some hackery in place that hid my poor use of TblGen, which I've now sorted out and cleaned up. No change in observable behavior, so no new test cases. llvm-svn: 170149	2012-12-13 20:57:10 +00:00
Bill Schmidt	7a93daad1a	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill llvm-svn: 170003	2012-12-12 19:29:35 +00:00
Bill Schmidt	45b56f7632	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill llvm-svn: 169910	2012-12-11 20:30:11 +00:00
Bill Schmidt	9d8cdcda41	This patch introduces initial-exec model support for thread-local storage on 64-bit PowerPC ELF. The patch includes code to handle external assembly and MC output with the integrated assembler. It intentionally does not support the "old" JIT. For the initial-exec TLS model, the ABI requires the following to calculate the address of external thread-local variable x: Code sequence Relocation Symbol ld 9,x@got@tprel(2) R_PPC64_GOT_TPREL16_DS x add 9,9,x@tls R_PPC64_TLS x The register 9 is arbitrary here. The linker will replace x@got@tprel with the offset relative to the thread pointer to the generated GOT entry for symbol x. It will replace x@tls with the thread-pointer register (13). The two test cases verify correct assembly output and relocation output as just described. PowerPC-specific selection node variants are added for the two instructions above: LD_GOT_TPREL and ADD_TLS. These are inserted when an initial-exec global variable is encountered by PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to machine instructions LDgotTPREL and ADD8TLS. LDgotTPREL is a pseudo that uses the same LDrs support added for medium code model's LDtocL, with a different relocation type. The rest of the processing is straightforward. llvm-svn: 169281	2012-12-04 16:18:08 +00:00
Chandler Carruth	a490793037	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Bill Schmidt	0975882ed4	This patch implements medium code model support for 64-bit PowerPC. The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer. With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section toc" for variables that are wholly internal to the compilation unit. Consider a load of an external 4-byte integer. With small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei With medium model, it instead generates: addis 3, 2, .LC1@toc@ha ld 3, .LC1@toc@l(3) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example. Consider next a load of a function-scope static integer. For small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc test_fn_static.si[TC],test_fn_static.si .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 For medium code model, the compiler generates: addis 3, 2, test_fn_static.si@toc@ha addi 3, 3, test_fn_static.si@toc@l lwz 4, 0(3) .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient. Note that it would be more efficient for the compiler to generate: addis 3, 2, test_fn_static.si@toc@ha lwz 4, test_fn_static.si@toc@l(3) The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch. For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will be linked and execute correctly. I've tested the regression suite and the application/benchmark test suite in two ways: Once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI compatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression." Here are a few comments on how the patch works, since the selection code can be difficult to follow: The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above. The addis/ld sequence is generated for the following cases: * Jump table addresses * Function addresses * External global variables * Tentative definitions of global variables (common linkage) The addis/addi sequence is generated for the following cases: * Constant pool entries * File-scope static global variables * Function-scope static variables Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling. The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this). I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions. Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations. The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand. The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler. llvm-svn: 168708	2012-11-27 17:35:46 +00:00
Adhemerval Zanella	ac3ba40bc2	PowerPC: More support for Altivec compare operations This patch adds more support for vector type comparisons using altivec. It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector types for comparison operators ==, !=, >, >=, <, and <=. llvm-svn: 167015	2012-10-30 13:50:19 +00:00
Bill Schmidt	5f0844eeb4	The PowerPC VRSAVE register has been somewhat of an odd beast since the Altivec extensions were introduced. Its use is optional, and allows the compiler to communicate to the operating system which vector registers should be saved and restored during a context switch. In practice, this information is ignored by the various operating systems using the SVR4 ABI; the kernel saves and restores the entire register state. Setting the VRSAVE register is no longer performed by the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux systems. It seems best to avoid this logic within LLVM as well. This patch avoids generating code to update and restore VRSAVE for the PowerPC SVR4 ABIs (32- and 64-bit). The code remains in place for the Darwin ABI. llvm-svn: 165656	2012-10-10 20:54:15 +00:00
Adhemerval Zanella	91fa3a3479	PR12716: PPC crashes on vector compare Vector compare using altivec 'vcmpxxx' instructions have as third argument a vector register instead of CR one, different from integer and float-point compares. This leads to a failure in code generation, where 'SelectSETCC' expects a DAG with a CR register and gets vector register instead. This patch changes the behavior by just returning a DAG with the vector compare instruction based on the type. The patch also adds a testcase for all vector types llvm defines. It also included a fix on signed 5-bits predicates printing, where signed values were not handled correctly as signed (char are unsigned by default for PowerPC). This generates 'vspltisw' (vector splat) instruction with SIM out of range. llvm-svn: 165419	2012-10-08 18:59:53 +00:00
Sylvestre Ledru	b77340e506	Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	1c5e7904de	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Hal Finkel	caa4701e37	Optimize zext on PPC64. The zeroextend IR instruction is lowered to an 'and' node with an immediate mask operand, which in turn gets legalised to a sequence of ori's & ands. This can be done more efficiently using the rldicl instruction. Patch by Tobias von Koch. llvm-svn: 162724	2012-08-28 02:10:15 +00:00
Hal Finkel	bc9be7c0e5	Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC. Thanks to Tobias von Koch for pointing out this problem. llvm-svn: 158932	2012-06-21 20:10:48 +00:00
Hal Finkel	a94da28a6d	Add support for generating reg+reg (indexed) pre-inc loads on PPC. llvm-svn: 158823	2012-06-20 15:43:03 +00:00
Hal Finkel	42b797225a	Add support for generating reg+reg preinc stores on PPC. PPC will now generate STWUX and friends. llvm-svn: 158698	2012-06-19 02:34:32 +00:00
Hal Finkel	fa52a34d78	Rename the PPC target feature gpul to mfocrf. The PPC target feature gpul (IsGigaProcessor) was only used for one thing: To enable the generation of the MFOCRF instruction. Furthermore, this instruction is available on other PPC cores outside of the G5 line. This feature now corresponds to the HasMFOCRF flag. No functionality change. llvm-svn: 158323	2012-06-11 19:57:01 +00:00
Craig Topper	a0bf6c3af3	Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent. llvm-svn: 155186	2012-04-20 06:31:50 +00:00
Rafael Espindola	88a1aeb123	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
David Blaikie	06ecc99a56	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
Hal Finkel	fc4a5889c2	MTCTR needs to be glued to BCTR so that CTR is not marked dead in MTCTR (another find by -verify-machineinstrs) llvm-svn: 146137	2011-12-08 04:36:44 +00:00
Evan Cheng	1acd685d87	Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if all of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026	2011-12-07 07:15:52 +00:00
Evan Cheng	2b239cbcf6	Sink codegen optimization level into MCCodeGenInfo along side relocation model and code model. This eliminates the need to pass OptLevel flag all over the place and makes it possible for any codegen pass to use this information. llvm-svn: 144788	2011-11-16 08:38:26 +00:00
Evan Cheng	2e96785311	Rename TargetAsmParser to MCTargetAsmParser and TargetAsmLexer to MCTargetAsmLexer; rename createAsmLexer to createMCAsmLexer and createAsmParser to createMCAsmParser. llvm-svn: 136027	2011-07-26 00:24:13 +00:00
Roman Divacky	79578394f5	Don't apply on PPC64 the 32bit ADDIC optimizations as there's no overflow with 32bit values. llvm-svn: 133439	2011-06-20 15:28:39 +00:00
Roman Divacky	3624922127	Fix wrong usages of CTR/MCTR where CTR8/MCTR8 was meant. - Check for MTCTR8 in addition to MTCTR when looking up a hazard. - When lowering an indirect call use CTR8 when targeting 64bit. - Introduce BCTR8 that uses CTR8 and use it on 64bit when expanding ISD::BRIND. The last change fixes PR8487. With those changes, we are able to compile a running "ls" and "sh" on FreeBSD/PowerPC64. llvm-svn: 132552	2011-06-03 15:47:49 +00:00
Cameron Zwarich	b5755a9dc2	Fix PR8828 by removing the explicit def in MovePCToLR as well as the pointless piclabel operand. The operand in the tablegen definition doesn't actually turn into an MI operand, so it just confuses anything checking the TargetInstrDesc for the number of operands. It suffices to just have an implicit def of LR. llvm-svn: 131626	2011-05-19 02:56:28 +00:00
Jakob Stoklund Olesen	e2e0850651	Fix the last virtual register enumerations. llvm-svn: 123102	2011-01-08 23:11:11 +00:00
Andrew Trick	134b2a5907	Various bits of framework needed for precise machine-level selection DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about it's SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541	2010-12-24 05:03:26 +00:00
Andrew Trick	53f4556c64	whitespace llvm-svn: 122539	2010-12-24 04:28:06 +00:00
Chris Lattner	65c5243bd6	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Chris Lattner	55043ef46a	fix a long standing wart: all the ComplexPattern's were being passed the root of the match, even though only a few patterns actually needed this (one in X86, several in ARM [which should be refactored anyway], and some in CellSPU that I don't feel like detangling). Instead of requiring all ComplexPatterns to take the dead root, have targets opt into getting the root by putting SDNPWantRoot on the ComplexPattern. llvm-svn: 114471	2010-09-21 20:31:19 +00:00
Chris Lattner	8df3ffd7ac	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Dale Johannesen	78714b5dc9	The PPC MFCR instruction implicitly uses all 8 of the CR registers. Currently it is not so marked, which leads to VCMPEQ instructions that feed into it getting deleted. If it is so marked, local RA complains about this sequence: vreg = MCRF CR0 MFCR <kill of whatever preg got assigned to vreg> All current uses of this instruction are only interested in one of the 8 CR registers, so redefine MFCR to be a normal unary instruction with a CR input (which is emitted only as a comment). That avoids all problems. 7739628. llvm-svn: 104238	2010-05-20 17:48:26 +00:00
Dan Gohman	a0f855157e	Use const qualifiers with TargetLowering. This eliminates several const_casts, and it reinforces the design of the Target classes being immutable. SelectionDAGISel::IsLegalToFold is now a static member function, because PIC16 uses it in an unconventional way. There is more room for API cleanup here. And PIC16's AsmPrinter no longer uses TargetLowering. llvm-svn: 101635	2010-04-17 15:26:15 +00:00
Chris Lattner	58b7cca257	use DebugLoc default ctor instead of DebugLoc::getUnknownLoc() llvm-svn: 100214	2010-04-02 20:16:16 +00:00
Benjamin Kramer	9bbdbd2dba	Make isInt?? and isUint?? template specializations of the generic versions. This makes calls a little bit more consistent and allows easy removal of the specializations in the future. Convert all callers to the templated functions. llvm-svn: 99838	2010-03-29 21:13:41 +00:00
Chris Lattner	1707a88a2c	Sink InstructionSelect() out of each target into SDISel, and rename it DoInstructionSelection. Inline "SelectRoot" into it from DAGISelHeader. Sink some other stuff out of DAGISelHeader into SDISel. Eliminate the various 'Indent' stuff from various targets, which dates to when isel was recursive. 17 files changed, 114 insertions(+), 430 deletions(-) llvm-svn: 97555	2010-03-02 06:34:30 +00:00
Dan Gohman	92b6122204	Fix "the the" and similar typos. llvm-svn: 95781	2010-02-10 16:03:48 +00:00
Dan Gohman	9bcfdf98f1	Change SelectCode's argument from SDValue to SDNode , to make it more clear what information these functions are actually using. This is also a micro-optimization, as passing a SDNode around is simpler than passing a { SDNode *, int } by value or reference. llvm-svn: 92564	2010-01-05 01:24:18 +00:00
Dale Johannesen	ee890316d5	Make capitalization of names starting "is" more consistent. No functional change. llvm-svn: 89724	2009-11-24 01:09:07 +00:00
Dale Johannesen	45f80d39f6	Remove an incorrect overaggressive optimization (PPC specific). llvm-svn: 89496	2009-11-20 22:16:40 +00:00
Dan Gohman	eec0f1c506	Remove uninteresting and confusing debug output. llvm-svn: 86149	2009-11-05 18:47:09 +00:00
Nick Lewycky	2b8400628d	Remove includes of Support/Compiler.h that are no longer needed after the VISIBILITY_HIDDEN removal. llvm-svn: 85043	2009-10-25 06:57:41 +00:00
Nick Lewycky	711c726c97	Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces. Chris claims we should never have visibility_hidden inside any .cpp file but that's still not true even after this commit. llvm-svn: 85042	2009-10-25 06:33:48 +00:00
Dan Gohman	0c4e55a2f8	Rename getTargetNode to getMachineNode, for consistency with the naming scheme used in SelectionDAG, where there are multiple kinds of "target" nodes, but "machine" nodes are nodes which represent a MachineInstr. llvm-svn: 82790	2009-09-25 18:54:59 +00:00
Devang Patel	c071d6c1b4	Record variable debug info at ISel time directly. llvm-svn: 79742	2009-08-22 17:12:53 +00:00
Dale Johannesen	ff6e66e502	PowerPC inline asm was emitting two output operands for a single "m" constraint; this is wrong because the opcode of a load or store would have to change in parallel. This patch makes it always compute addresses into a register, which is correct but not as efficient as possible. 7144566. llvm-svn: 79292	2009-08-18 00:18:39 +00:00
Dan Gohman	fcb5e33aa1	Simplify a few more things, eliminating a few more dependencies on "the current basic block". llvm-svn: 79069	2009-08-15 02:07:36 +00:00
Owen Anderson	48f2f0ae72	Split EVT into MVT and EVT, the former representing _just_ a primitive type, while the latter is capable of representing either a primitive or an extended type. llvm-svn: 78713	2009-08-11 20:47:22 +00:00
Owen Anderson	b4bce99769	Rename MVT to EVT, in preparation for splitting SimpleValueType out into its own struct type. llvm-svn: 78610	2009-08-10 22:56:29 +00:00
Dan Gohman	f28b3bb262	Reapply r77654 with a fix: MachineFunctionPass's getAnalysisUsage shouldn't do AU.setPreservesCFG(), because even though CodeGen passes don't modify the LLVM IR CFG, they may modify the MachineFunction CFG, and passes like MachineLoop are registered with isCFGOnly set to true. llvm-svn: 77691	2009-07-31 18:16:33 +00:00
Daniel Dunbar	60d71a790c	Revert r77654, it appears to be causing llvm-gcc bootstrap failures, and many failures when building assorted projects with clang. --- Reverse-merging r77654 into '.': U include/llvm/CodeGen/Passes.h U include/llvm/CodeGen/MachineFunctionPass.h U include/llvm/CodeGen/MachineFunction.h U include/llvm/CodeGen/LazyLiveness.h U include/llvm/CodeGen/SelectionDAGISel.h D include/llvm/CodeGen/MachineFunctionAnalysis.h U include/llvm/Function.h U lib/Target/CellSPU/SPUISelDAGToDAG.cpp U lib/Target/PowerPC/PPCISelDAGToDAG.cpp U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/MachineVerifier.cpp U lib/CodeGen/MachineFunction.cpp U lib/CodeGen/PrologEpilogInserter.cpp U lib/CodeGen/MachineLoopInfo.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp D lib/CodeGen/MachineFunctionAnalysis.cpp D lib/CodeGen/MachineFunctionPass.cpp U lib/CodeGen/LiveVariables.cpp llvm-svn: 77661	2009-07-31 03:02:41 +00:00
Dan Gohman	645f1122c0	Manage MachineFunctions with an analysis Pass instead of the Annotable mechanism. To support this, make MachineFunctionPass a little more complete. llvm-svn: 77654	2009-07-31 01:52:50 +00:00
Torok Edwin	f955a6ef49	llvm_unreachable->llvm_unreachable(0), LLVM_UNREACHABLE->llvm_unreachable. This adds location info for all llvm_unreachable calls (which is a macro now) in !NDEBUG builds. In NDEBUG builds location info and the message is off (it only prints "UREACHABLE executed"). llvm-svn: 75640	2009-07-14 16:55:14 +00:00
Torok Edwin	ae8a3ff177	assert(0) -> LLVM_UNREACHABLE. Make llvm_unreachable take an optional string, thus moving the cerr<< out of line. LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for NDEBUG builds. llvm-svn: 75379	2009-07-11 20:10:48 +00:00
Torok Edwin	358888da3a	Implement changes from Chris's feedback. Finish converting lib/Target. llvm-svn: 75043	2009-07-08 20:53:28 +00:00
Chris Lattner	d12470d85e	Fix some failures in targets on available_externally functions, this fixes a crash on CodeGen/Generic/externally_available.ll on ppc hosts. Thanks to Nicholas L for pointing this out. llvm-svn: 69333	2009-04-17 00:26:12 +00:00
Dale Johannesen	428a8fcc3d	Remove refs to non-DebugLoc version of BuildMI from PowerPC. llvm-svn: 64431	2009-02-13 02:27:39 +00:00
Chris Lattner	aeec838015	fix PR3538 for PPC llvm-svn: 64383	2009-02-12 17:37:15 +00:00
Dale Johannesen	7099fa18a4	Eliminate remaining non-DebugLoc version of getTargetNode. llvm-svn: 63951	2009-02-06 19:16:40 +00:00
Dale Johannesen	e95c76b65e	Get rid of one more non-DebugLoc getNode and its corresponding getTargetNode. Lots of caller changes. llvm-svn: 63904	2009-02-06 01:31:28 +00:00
Dale Johannesen	d27bc65e74	Remove non-DebugLoc forms of CopyToReg and CopyFromReg. Adjust callers. llvm-svn: 63789	2009-02-04 23:02:30 +00:00
Evan Cheng	3c00875658	Fix 80 col violations. llvm-svn: 62518	2009-01-19 18:57:29 +00:00
Evan Cheng	2f50b49f22	Handle ISD::DECLARE with PIC relocation model. llvm-svn: 62516	2009-01-19 18:31:51 +00:00
Evan Cheng	d7cc550900	Fix PPC ISD::Declare isel and eliminate the need for PPCTargetLowering::LowerGlobalAddress to check if isVerifiedDebugInfoDesc() is true. Given the recent changes, it would falsely return true for a lot of GlobalAddressSDNode's. llvm-svn: 62373	2009-01-16 22:57:32 +00:00
Dan Gohman	3e0dcbbd15	Generalize the HazardRecognizer interface so that it can be used to support MachineInstr-based scheduling in addition to SDNode-based scheduling. llvm-svn: 62284	2009-01-15 22:18:12 +00:00
Dan Gohman	6fcee67989	Move a few containers out of ScheduleDAGInstrs::BuildSchedGraph and into the ScheduleDAGInstrs class, so that they don't get destructed and re-constructed for each block. This fixes a compile-time hot spot in the post-pass scheduler. To help facilitate this, tidy and do some minor reorganization in the scheduler constructor functions. llvm-svn: 62275	2009-01-15 19:20:50 +00:00
Dale Johannesen	bc914a7cf9	Make FP tests requiring two compares work on PPC (PR 642). This is Chris' patch from the PR, modified to realize that SETUGT/SETULT occur legitimately with integers, plus two fixes in LegalizeDAG to pass a valid result type into LegalizeSetCC. The argument of TLI.getSetCCResultType is ignored on PPC, but I think I'm following usage elsewhere. llvm-svn: 58871	2008-11-07 22:54:33 +00:00
Dan Gohman	1db84e57c5	Reintroduce a comment that was removed with the AddToISelQueue changes. llvm-svn: 58760	2008-11-05 17:16:24 +00:00
Dan Gohman	cd4b68bee9	Eliminate the ISel priority queue, which used the topological order for a priority function. Instead, just iterate over the AllNodes list, which is already in topological order. This eliminates a fair amount of bookkeeping, and speeds up the isel phase by about 15% on many testcases. The impact on most targets is that AddToISelQueue calls can be simply removed. In the x86 target, there are two additional notable changes. The rule-bending AND+SHIFT optimization in MatchAddress that creates new pre-isel nodes during isel is now a little more verbose, but more robust. Instead of either creating an invalid DAG or creating an invalid topological sort, as it has historically done, it can now just insert the new nodes into the node list at a position where they will be consistent with the topological ordering. Also, the address-matching code has logic that checked to see if a node was "already selected". However, when a node is selected, it has all its uses taken away via ReplaceAllUsesWith or equivalent, so it won't recieve any further visits from MatchAddress. This code is now removed. llvm-svn: 58748	2008-11-05 04:14:16 +00:00
David Greene	93f9f0f718	Have TableGen emit setSubgraphColor calls under control of a -gen-debug flag. Then in a debugger developers can set breakpoints at these calls to see waht is about to be selected and what the resulting subgraph looks like. This really helps when debugging instruction selection. llvm-svn: 58278	2008-10-27 21:56:29 +00:00
Dan Gohman	90f776986d	Trim #includes. llvm-svn: 57649	2008-10-16 20:18:31 +00:00
Dan Gohman	00034b1416	Avoid creating two TargetLowering objects for each target. Instead, just create one, and make sure everything that needs it can access it. Previously most of the SelectionDAGISel subclasses all had their own TargetLowering object, which was redundant with the TargetLowering object in the TargetMachine subclasses, except on Sparc, where SparcTargetMachine didn't have a TargetLowering object. Change Sparc to work more like the other targets here. llvm-svn: 57016	2008-10-03 16:55:19 +00:00
Dan Gohman	89660301e3	Rename ConstantSDNode::getValue to getZExtValue, for consistency with ConstantInt. This led to fixing a bug in TargetLowering.cpp using getValue instead of getAPIntValue. llvm-svn: 56159	2008-09-12 16:56:44 +00:00
Dan Gohman	7ee14837e6	Clean up uses of TargetLowering::getTargetMachine. llvm-svn: 55769	2008-09-04 15:39:15 +00:00
Gabor Greif	7db742d8c2	fix a bunch of 80-col violations llvm-svn: 55588	2008-08-31 15:37:04 +00:00
Gabor Greif	86c795a8ca	erect abstraction boundaries for accessing SDValue members, rename Val -> Node to reflect semantics llvm-svn: 55504	2008-08-28 21:40:38 +00:00
Dan Gohman	a9d5f9b006	Move the point at which FastISel taps into the SelectionDAGISel process up to a higher level. This allows FastISel to leverage more of SelectionDAGISel's infastructure, such as updating Machine PHI nodes. Also, implement transitioning from SDISel back to FastISel in the middle of a block, so it's now possible to go back and forth. This allows FastISel to hand individual CallInsts and other complicated things off to SDISel to handle, while handling the rest of the block itself. To help support this, reorganize the SelectionDAG class so that it is allocated once and reused throughout a function, instead of being completely reallocated for each block. llvm-svn: 55219	2008-08-23 02:25:05 +00:00
Dan Gohman	4b801d38a1	Simplify SelectRoot's interface, and factor out some common code from all targets. llvm-svn: 55124	2008-08-21 16:36:34 +00:00
Dan Gohman	9742f7772d	Rename SDOperand to SDValue. llvm-svn: 54128	2008-07-27 21:46:04 +00:00
Dan Gohman	8981962672	Add a new function, ReplaceAllUsesOfValuesWith, which handles bulk replacement of multiple values. This is slightly more efficient than doing multiple ReplaceAllUsesOfValueWith calls, and theoretically could be optimized even further. However, an important property of this new function is that it handles the case where the source value set and destination value set overlap. This makes it feasible for isel to use SelectNodeTo in many very common cases, which is advantageous because SelectNodeTo avoids a temporary node and it doesn't require CSEMap updates for users of values that don't change position. Revamp MorphNodeTo, which is what does all the work of SelectNodeTo, to handle operand lists more efficiently, and to correctly handle a number of corner cases to which its new wider use exposes it. This commit also includes a change to the encoding of post-isel opcodes in SDNodes; now instead of being sandwiched between the target-independent pre-isel opcodes and the target-dependent pre-isel opcodes, post-isel opcodes are now represented as negative values. This makes it possible to test if an opcode is pre-isel or post-isel without having to know the size of the current target's post-isel instruction set. These changes speed up llc overall by 3% and reduce memory usage by 10% on the InstructionCombining.cpp testcase with -fast and -regalloc=local. llvm-svn: 53728	2008-07-17 19:10:17 +00:00

1 2 3 4 5 ...

398 Commits