llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Craig Topper	c8a0b9e381	Switch all uses of LLVM_FINAL to just use 'final', and remove the macro. llvm-svn: 202618	2014-03-02 08:08:51 +00:00
Hal Finkel	4937443651	Remove extra truncs/exts around i32 bit operations on PPC64 This generalizes the code to eliminate extra truncs/exts around i1 bit operations to also do the same on PPC64 for i32 bit operations. This eliminates a fairly prevalent code wart: int foo(int a) { return a == 5 ? 7 : 8; } On PPC64, because of the extension implied by the ABI, this would generate: cmplwi 0, 3, 5 li 12, 8 li 4, 7 isel 3, 4, 12, 2 rldicl 3, 3, 0, 32 blr where the 'rldicl 3, 3, 0, 32', the extension, is completely unnecessary. At least for the single-BB case (which is all that the DAG combine mechanism can handle), this unnecessary extension is no longer generated. llvm-svn: 202600	2014-03-01 21:36:57 +00:00
Hal Finkel	1970087008	Swap PPC isel operands to allow for 0-folding The PPC isel instruction can fold 0 into the first operand (thus eliminating the need to materialize a zero-containing register when the 'true' result of the isel is 0). When the isel is fed by a bit register operation that we can invert, do so as part of the bit-register-operation peephole routine. llvm-svn: 202469	2014-02-28 06:11:16 +00:00
Hal Finkel	3bd3f4e287	Trying to unbreak the darwin11 builder The CR bit tracking code broke PPC/Darwin; trying to get it working again... (the darwin11 builder, which defaults to the darwin ABI when running PPC tests, asserted when running test/CodeGen/PowerPC/inverted-bool-compares.ll) llvm-svn: 202459	2014-02-28 01:17:25 +00:00
Hal Finkel	94f3724df6	Try to unbreak the C++11 build Cannot use negative numbers in case statements without running afoul of -Wc++11-narrowing. llvm-svn: 202455	2014-02-28 00:45:27 +00:00
Hal Finkel	883c64377d	Add CR-bit tracking to the PowerPC backend for i1 values This change enables tracking i1 values in the PowerPC backend using the condition register bits. These bits can be treated on PowerPC as separate registers; individual bit operations (and, or, xor, etc.) are supported. Tracking booleans in CR bits has several advantages: - Reduction in register pressure (because we no longer need GPRs to store boolean values). - Logical operations on booleans can be handled more efficiently; we used to have to move all results from comparisons into GPRs, perform promoted logical operations in GPRs, and then move the result back into condition register bits to be used by conditional branches. This can be very inefficient, because the throughput of these CR <-> GPR moves have high latency and low throughput (especially when other associated instructions are accounted for). - On the POWER7 and similar cores, we can increase total throughput by using the CR bits. CR bit operations have a dedicated functional unit. Most of this is more-or-less mechanical: Adjustments were needed in the calling-convention code, support was added for spilling/restoring individual condition-register bits, and conditional branch instruction definitions taking specific CR bits were added (plus patterns and code for generating bit-level operations). This is enabled by default when running at -O2 and higher. For -O0 and -O1, where the ability to debug is more important, this feature is disabled by default. Individual CR bits do not have assigned DWARF register numbers, and storing values in CR bits makes them invisible to the debugger. It is critical, however, that we don't move i1 values that have been promoted to larger values (such as those passed as function arguments) into bit registers only to quickly turn around and move the values back into GPRs (such as happens when values are returned by functions). A pair of target-specific DAG combines are added to remove the trunc/extends in: trunc(binary-ops(binary-ops(zext(x), zext(y)), ...) and: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) In short, we only want to use CR bits where some of the i1 values come from comparisons or are used by conditional branches or selects. To put it another way, if we can do the entire i1 computation in GPRs, then we probably should (on the POWER7, the GPR-operation throughput is higher, and for all cores, the CR <-> GPR moves are expensive). POWER7 test-suite performance results (from 10 runs in each configuration): SingleSource/Benchmarks/Misc/mandel-2: 35% speedup MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown MultiSource/Applications/lemon/lemon: 8% slowdown llvm-svn: 202451	2014-02-28 00:27:01 +00:00
Aaron Ballman	9e5315239d	Silencing an MSVC signed comparison warning. llvm-svn: 202295	2014-02-26 20:22:20 +00:00
Hal Finkel	08c64addef	Account for 128-bit integer operations in PPCCTRLoops We need to abort the formation of counter-register-based loops where there are 128-bit integer operations that might become function calls. llvm-svn: 202192	2014-02-25 20:51:50 +00:00
Rafael Espindola	32da4bdd4b	Make DataLayout a plain object, not a pass. Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM don't don't handle passes to also use DataLayout. llvm-svn: 202168	2014-02-25 17:30:31 +00:00
Rafael Espindola	6c834371d9	Make some DataLayout pointers const. No functionality change. Just reduces the noise of an upcoming patch. llvm-svn: 202087	2014-02-24 23:12:18 +00:00
Rafael Espindola	1f7e9d4bed	Rename a few more DataLayout variables. llvm-svn: 201833	2014-02-21 01:53:35 +00:00
Rafael Espindola	60133f3afe	move getNameWithPrefix and getSymbol to TargetMachine. TargetLoweringBase is implemented in CodeGen, so before this patch we had a dependency fom Target to CodeGen. This would show up as a link failure of llvm-stress when building with -DBUILD_SHARED_LIBS=ON. This fixes pr18900. llvm-svn: 201711	2014-02-19 20:30:41 +00:00
Rafael Espindola	aea6192f20	Add back r201608, r201622, r201624 and r201625 r201608 made llvm corretly handle private globals with MachO. r201622 fixed a bug in it and r201624 and r201625 were changes for using private linkage, assuming that llvm would do the right thing. They all got reverted because r201608 introduced a crash in LTO. This patch includes a fix for that. The issue was that TargetLoweringObjectFile now has to be initialized before we can mangle names of private globals. This is trivially true during the normal codegen pipeline (the asm printer does it), but LTO has to do it manually. llvm-svn: 201700	2014-02-19 17:23:20 +00:00
Daniel Jasper	bf4e7d8ac3	Revert r201622 and r201608. This causes the LLVMgold plugin to segfault. More information on the replies to r201608. llvm-svn: 201669	2014-02-19 12:26:01 +00:00
Rafael Espindola	d39a573c72	Fix PR18743. The IR @foo = private constant i32 42 is valid, but before this patch we would produce an invalid MachO from it. It was invalid because it would use an L label in a section where the liker needs the labels in order to atomize it. One way of fixing it would be to just reject this IR in the backend, but that would not be very front end friendly. What this patch does is use an 'l' prefix in sections that we know the linker requires symbols for atomizing them. This allows frontends to just use private and not worry about which sections they go to or how the linker handles them. One small issue with this strategy is that now a symbol name depends on the section, which is not available before codegen. This is not a problem in practice. The reason is that it only happens with private linkage, which will be ignored by the non codegen users (llvm-nm and llvm-ar). llvm-svn: 201608	2014-02-18 22:24:57 +00:00
Rafael Espindola	c898de3245	Rename a DebugLoc variable to DbgLoc and a DataLayout to DL. This is quiet a bit less confusing now that TargetData was renamed DataLayout. llvm-svn: 201606	2014-02-18 22:05:46 +00:00
Daniel Sanders	7a3a160940	Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333	2014-02-13 14:44:26 +00:00
Daniel Sanders	656c4d360b	Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241	2014-02-12 15:39:20 +00:00
Daniel Sanders	e647d6441b	Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237	2014-02-12 14:44:54 +00:00
Rafael Espindola	8d47aa1e4e	Pass the Mangler by reference. It is never null and it is not used in casts, so there is no reason to use a pointer. This matches how we pass TM. llvm-svn: 201025	2014-02-08 14:53:28 +00:00
Rafael Espindola	1d50d1310d	Add LLVM_OVERRIDE to a few declarations. llvm-svn: 201022	2014-02-08 06:07:27 +00:00
Rafael Espindola	c27ddf11ba	Just returning false is the default. llvm-svn: 200890	2014-02-06 00:03:15 +00:00
Matt Arsenault	7b69102edb	Add address space argument to allowsUnalignedMemoryAccess. On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887	2014-02-05 23:15:53 +00:00
Rafael Espindola	98165a6a91	Remove support for not using .loc directives. Clang itself was not using this. The only way to access it was via llc. llvm-svn: 200862	2014-02-05 18:00:21 +00:00
Hal Finkel	c649b3ff57	Replace PPC instruction-size code with MCInstrDesc getSize As part of the cleanup done to enable the disassembler, the PPC instructions now have a valid Size description field. This can now be used to replace some custom logic in a few places to compute instruction sizes. Patch by David Wiberg! llvm-svn: 200623	2014-02-02 06:12:27 +00:00
David Woodhouse	6c8fefd999	Delete MCSubtargetInfo data members from target MCCodeEmitter classes The subtarget info is explicitly passed to the EncodeInstruction method and we should use that subtarget info to influence any encoding decisions. llvm-svn: 200350	2014-01-28 23:13:25 +00:00
David Woodhouse	a79a37b435	Propagate MCSubtargetInfo through TableGen's getBinaryCodeForInstr() llvm-svn: 200349	2014-01-28 23:13:18 +00:00
David Woodhouse	4a4c611e36	Explictly pass MCSubtargetInfo to MCCodeEmitter::EncodeInstruction() llvm-svn: 200348	2014-01-28 23:13:07 +00:00
David Woodhouse	5d0b529d58	Change MCStreamer EmitInstruction interface to take subtarget info llvm-svn: 200345	2014-01-28 23:12:42 +00:00
Iain Sandoe	f16c0c72d6	Provide a stub Target Streamer implementation for PPC MachO At present, this handles .tc (error) and needs to be expanded to deal properly with .machine llvm-svn: 200309	2014-01-28 11:03:17 +00:00
Hal Finkel	5c72f63cb7	Handle spilling the PPC GPRC_NOR0 register class GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288	2014-01-28 05:32:58 +00:00
Eric Christopher	2b6e161fce	Revert r199871 and replace it with a simple check in the debug info code to see if we're emitting a function into a non-default text section. This is still a less-than-ideal solution, but more contained than r199871 to determine whether or not we're emitting code into an array of comdat sections. llvm-svn: 200269	2014-01-28 00:49:26 +00:00
Rafael Espindola	bfdd58b802	Pass a MCSubtargetInfo down to the TargetStreamer creation. With this the target streamers will be able to know the target features that are in use. llvm-svn: 200135	2014-01-26 06:38:58 +00:00
Rafael Espindola	806f778fa0	Construct the MCStreamer before constructing the MCTargetStreamer. This has a few advantages: * Only targets that use a MCTargetStreamer have to worry about it. * There is never a MCTargetStreamer without a MCStreamer, so we can use a reference. * A MCTargetStreamer can talk to the MCStreamer in its constructor. llvm-svn: 200129	2014-01-26 06:06:37 +00:00
Rafael Espindola	bb8a9e36c5	Remove an easy use of EmitRawText from PPC. This makes lib/Target/PowerPC EmitRawText free. llvm-svn: 200065	2014-01-25 02:35:56 +00:00
Juergen Ributzka	e4d29eb495	Add final and owerride keywords to TargetTransformInfo's subclasses. llvm-svn: 200021	2014-01-24 18:22:59 +00:00
Alp Toker	1c4b33e8e5	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Eric Christopher	4de8f33ce6	Add a variable to track whether or not we've used a unique section, e.g. linkonce, to TargetMachine and set it when we've done so for ELF targets currently. This involved making TargetMachine non-const in a TLOF use and propagating that change around - I'm open to other ideas. This will be used in a future commit to handle emitting debug information with ranges. llvm-svn: 199871	2014-01-23 06:47:25 +00:00
Rafael Espindola	012bf3f3ac	Fix pr18515. My understanding (from reading just the llvm code) is that * most ppc cpus have a "sync n" instruction and an msync alias that is "sync 0". * "book e" cpus instead have a msync instruction and not the more general "sync n" This patch reflects that in the .td files, allowing a single codepath for asm ond obj streamer and incidentelly fixes a crash when EmitRawText was called on a obj streamer. llvm-svn: 199832	2014-01-22 20:20:52 +00:00
Hal Finkel	ca5b9beeb1	Fix pointer info on PPC byval stores For PPC64 SVR (and Darwin), the stores that take byval aggregate parameters from registers into the stack frame had MachinePointerInfo objects with incorrect offsets. These offsets are relative to the object itself, not to the stack frame base. This fixes self hosting on PPC64 when compiling with -enable-aa-sched-mi. llvm-svn: 199763	2014-01-21 20:15:58 +00:00
Rafael Espindola	ebbf4c6cd5	Make getTargetStreamer return a possibly null pointer. This will allow it to be called from target independent parts of the main streamer that don't know if there is a registered target streamer or not. This in turn will allow targets to perform extra actions at specified points in the interface: add extra flags for some labels, extra work during finalization, etc. llvm-svn: 199174	2014-01-14 01:21:46 +00:00
Chandler Carruth	98adff6224	[PM] Split DominatorTree into a concrete analysis result object which can be used by both the new pass manager and the old. This removes it from any of the virtual mess of the pass interfaces and lets it derive cleanly from the DominatorTreeBase<> template. In turn, tons of boilerplate interface can be nuked and it turns into a very straightforward extension of the base DominatorTree interface. The old analysis pass is now a simple wrapper. The names and style of this split should match the split between CallGraph and CallGraphWrapperPass. All of the users of DominatorTree have been updated to match using many of the same tricks as with CallGraph. The goal is that the common type remains the resulting DominatorTree rather than the pass. This will make subsequent work toward the new pass manager significantly easier. Also in numerous places things became cleaner because I switched from re-running the pass (!!! mid way through some other passes run!!!) to directly recomputing the domtree. llvm-svn: 199104	2014-01-13 13:07:17 +00:00
Chandler Carruth	ee051af6e2	[cleanup] Move the Dominators.h and Verifier.h headers into the IR directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that evn Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082	2014-01-13 09:26:24 +00:00
Saleem Abdulrasool	b90512a41c	correct target directive handling error handling The target specific parser should return `false' if the target AsmParser handles the directive, and `true' if the generic parser should handle the directive. Many of the target specific directive handlers would `return Error' which does not follow these semantics. This change simply changes the target specific routines to conform to the semantis of the ParseDirective correctly. Conformance to the semantics improves diagnostics emitted for the invalid directives. X86 is taken as a sample to ensure that multiple diagnostics are not presented for a single error. llvm-svn: 199068	2014-01-13 01:15:39 +00:00
Chandler Carruth	53468087f3	Put the functionality for printing a value to a raw_ostream as an operand into the Value interface just like the core print method is. That gives a more conistent organization to the IR printing interfaces -- they are all attached to the IR objects themselves. Also, update all the users. This removes the 'Writer.h' header which contained only a single function declaration. llvm-svn: 198836	2014-01-09 02:29:41 +00:00
Rafael Espindola	4dc5af8bc2	Move the llvm mangler to lib/IR. This makes it available to tools that don't link with target (like llvm-ar). llvm-svn: 198708	2014-01-07 21:19:40 +00:00
Chandler Carruth	7aa902a488	Move the LLVM IR asm writer header files into the IR directory, as they are part of the core IR library in order to support dumping and other basic functionality. Rename the 'Assembly' include directory to 'AsmParser' to match the library name and the only functionality left their -- printing has been in the core IR library for quite some time. Update all of the #includes to match. All of this started because I wanted to have the layering in good shape before I started adding support for printing LLVM IR using the new pass infrastructure, and commandline support for the new pass infrastructure. llvm-svn: 198688	2014-01-07 12:34:26 +00:00
Chandler Carruth	87f14b4eec	Re-sort all of the includes with ./utils/sort_includes.py so that subsequent changes are easier to review. About to fix some layering issues, and wanted to separate out the necessary churn. Also comment and sink the include of "Windows.h" in three .inc files to match the usage in Memory.inc. llvm-svn: 198685	2014-01-07 11:48:04 +00:00
Bill Wendling	e1a9065ca0	Remove unnecessary #includes. llvm-svn: 198585	2014-01-06 06:00:00 +00:00
Bill Wendling	c3b5643da4	Refactor function that checks that __builtin_returnaddress's argument is constant. This moves the check up into the parent class so that all targets can use it without having to copy (and keep in sync) the same error message. llvm-svn: 198579	2014-01-06 00:43:20 +00:00
Bill Wendling	be9af41475	Emit an error message if the value passed to __builtin_returnaddress isn't a constant __builtin_returnaddress requires that the value passed into is be a constant. However, at -O0 even a constant expression may not be converted to a constant. Emit an error message intead of crashing. llvm-svn: 198531	2014-01-05 01:47:20 +00:00
Rafael Espindola	eae6386a1e	Make the llvm mangler depend only on DataLayout. Before this patch any program that wanted to know the final symbol name of a GlobalValue had to link with Target. This patch implements a compromise solution where the mangler uses DataLayout. This way, any tool that already links with Target (llc, clang) gets the exact behavior as before and new IR files can be mangled without linking with Target. With this patch the mangler is constructed with just a DataLayout and DataLayout is extended to include the information the Mangler needs. llvm-svn: 198438	2014-01-03 19:21:54 +00:00
Hal Finkel	d0f8236add	[PPC] Fix comment to match function name llvm-svn: 198362	2014-01-02 22:09:39 +00:00
Hal Finkel	0e77f939b4	[PPC] Fix the scheduling of CR logicals on the P7 CR logicals (crand, crxor, etc.) on the P7 need to be in the first slot of each dispatch group. The old itinerary entry was just wrong (but has not mattered because we don't generate these instructions). This will matter when, in an upcoming commit, we start generating these instructions. llvm-svn: 198359	2014-01-02 21:38:26 +00:00
Hal Finkel	dc47cd8c25	[PPC] Use the correct immediate operands on 64-bit instructions Several of the 64-bit fixed-point instructions with immediate operands were using the 32-bit (i32) operand nodes instead of the corresponding 64-bit (i64) operand definitions (u16imm instead of u16imm64, for example). This error has had no effect so far, but would have caused type-checking violations with an upcoming change. llvm-svn: 198356	2014-01-02 21:26:59 +00:00
Roman Divacky	f81810bdfd	Use r2 when encoding tls on ppc32. Fixes PR18305. llvm-svn: 197878	2013-12-22 10:45:37 +00:00
Roman Divacky	7114adfec5	Add some comments. llvm-svn: 197875	2013-12-22 09:48:38 +00:00
Roman Divacky	513296cd04	Implement initial-exec TLS for PPC32. llvm-svn: 197824	2013-12-20 18:08:54 +00:00
Rafael Espindola	6cf8a98d5f	Long doubles are required to be aligned to 128 bits and svr4 32 bits. Clang was already getting this right. llvm-svn: 197694	2013-12-19 16:23:59 +00:00
Hal Finkel	ce61543897	Add a disassembler to the PowerPC backend The tests for the disassembler were adapted from the encoder tests, and for the most part, the output from the disassembler matches that encoder-test inputs. There are some places where more-informative mnemonics could be produced (notably for the branch instructions), and those cases are noted in the tests with FIXMEs. Future work includes: - Generating more-informative mnemonics when possible (this may also be done in the printer). - Remove the dependence on positional "numbered" operand-to-variable mapping (for both encoding and decoding). - Internally using 64-bit instruction variants in 64-bit mode (if this turns out to matter). llvm-svn: 197693	2013-12-19 16:13:01 +00:00
Rafael Espindola	9b3064d2fb	Fix f64 and f128 for ppc-darwin. This patch adds -f64:32:64 to 32 bit ppc darwin since a f64 inside a structure are only 32 bit aligned. The patch also drop -f128:64:128 from all ppc darwin, since f128 is 128 bit aligned. llvm-svn: 197574	2013-12-18 15:06:25 +00:00
Rafael Espindola	e1792e72e1	One ppc32-darwin, a i64 inside a structure can have 32 bit alignment. Thanks for Iain Sandoe for testing this with the original gcc. Clang was already getting this right. llvm-svn: 197572	2013-12-18 14:35:37 +00:00
Hal Finkel	0cc169d05b	Eliminate PPC instruction decoding ambiguities The instruction definitions in the PPC backend have a number of variants defined for the same instruction to represent differences between 64-bit and 32-bit semantics. In order to generate a disassembler for the PPC backend, we need to mark all but one of these as CodeGen only. No functionality change intended; this is prep work for PPC disassembly support. llvm-svn: 197535	2013-12-17 23:05:18 +00:00
Rafael Espindola	4a6f55789a	Fix the pointer size for the PS3 datalayout. This will be tested from clang. llvm-svn: 197501	2013-12-17 15:29:48 +00:00
Andrew Trick	a3aa2ba174	Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled: %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1 %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2 %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def> Test case: cse-add-with-overflow.ll. This exposed an existing bug in PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case: PowerPC/crash.ll. llvm-svn: 197465	2013-12-17 04:50:45 +00:00
Andrew Trick	935a379f21	whitespace llvm-svn: 197464	2013-12-17 04:50:40 +00:00
Rafael Espindola	559bceac20	The preferred alignment defaults to the abi alignment. Omit if it is the same. llvm-svn: 197400	2013-12-16 18:01:51 +00:00
Rafael Espindola	65c80ee4a2	On DataLayout, omit the default of p:64:64:64. llvm-svn: 197397	2013-12-16 17:15:29 +00:00
Hal Finkel	951040af31	Set has_asmparser in PowerPC/LLVMBuild.txt PowerPC now has an asm parser (and has for many months now); indicate this in PowerPC/LLVMBuild.txt. llvm-svn: 197393	2013-12-16 15:48:09 +00:00
Iain Sandoe	d78fe2e004	[Powerpc darwin] AsmParser Base implementation. This is a base implementation of the powerpc-apple-darwin asm parser dialect. * Enables infrastructure (essentially isDarwin()) and fixes up the parsing of asm directives to separate out ELF and MachO/Darwin additions. * Enables parsing of {r,f,v}XX as register identifiers. * Enables parsing of lo16() hi16() and ha16() as modifiers. The changes to the test case are from David Fang (fangism). llvm-svn: 197324	2013-12-14 13:34:02 +00:00
Rafael Espindola	2bd13393e0	Assume defaults to produce smaller datalayout strings. llvm-svn: 197249	2013-12-13 17:56:11 +00:00
Iain Sandoe	fc27a861a1	test commit. Amend a comment. llvm-svn: 197237	2013-12-13 15:46:48 +00:00
Gabor Greif	c84043be50	typo in comment llvm-svn: 197136	2013-12-12 08:00:34 +00:00
Hal Finkel	bb547872c5	Remove unused multiclass from PPCInstrInfo.td llvm-svn: 197100	2013-12-12 00:23:29 +00:00
Hal Finkel	e8c7fef754	Improve instruction scheduling for the PPC POWER7 Aside from a few minor latency corrections, the major change here is a new hazard recognizer which focuses on better dispatch-group formation on the POWER7. As with the PPC970's hazard recognizer, the most important thing it does is avoid load-after-store hazards within the same dispatch group. It uses the POWER7's special dispatch-group-terminating nop instruction (instead of inserting multiple regular nop instructions). This new hazard recognizer makes use of the scheduling dependency graph itself, built using AA information, to robustly detect the possibility of load-after-store hazards. significant test-suite performance changes (the error bars are 99.5% confidence intervals based on 5 test-suite runs both with and without the change -- speedups are negative): speedups: MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.55171% +/- 0.333168% MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -17.5576% +/- 14.598% MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl -29.5708% +/- 7.09058% MultiSource/Benchmarks/TSVC/Reductions-flt/Reductions-flt -34.9471% +/- 11.4391% SingleSource/Benchmarks/BenchmarkGame/puzzle -25.1347% +/- 11.0104% SingleSource/Benchmarks/Misc/flops-8 -17.7297% +/- 9.79061% SingleSource/Benchmarks/Shootout-C++/ary3 -35.5018% +/- 23.9458% SingleSource/Regression/C/uint64_to_float -56.3165% +/- 25.4234% SingleSource/UnitTests/Vectorizer/gcc-loops -18.5309% +/- 6.8496% regressions: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 18.351% +/- 12.156% SingleSource/Benchmarks/Shootout-C++/methcall 27.3086% +/- 14.4733% llvm-svn: 197099	2013-12-12 00:19:11 +00:00
Hal Finkel	4a76fae817	Fix the PPC subsumes-predicate check For one predicate to subsume another, they must both check the same condition register. Failure to check this prerequisite was causing miscompiles. Fixes PR18003. llvm-svn: 197089	2013-12-11 23:12:25 +00:00
NAKAMURA Takumi	54fa39136d	Prune redundant dependencies in LLVMBuild.txt. llvm-svn: 196988	2013-12-11 00:30:57 +00:00
Rafael Espindola	db206ff5b6	Move PPC's getDataLayoutString out of line and document it better. llvm-svn: 196987	2013-12-11 00:09:06 +00:00
David Fang	7d43532c99	on darwin<10, fallback to .weak_definition (PPC,X86) .weak_def_can_be_hidden was not yet supported by the system assembler llvm-svn: 196970	2013-12-10 21:37:41 +00:00
NAKAMURA Takumi	3fda77b6c7	Add proper dependencies to LLVMBuild.txt in llvm/lib. I'll prune redundant deps in LLVMBuild.txt, later. llvm-svn: 196881	2013-12-10 05:39:34 +00:00
Rafael Espindola	2b4db8d379	Remove the isImplicitlyPrivate argument of getNameWithPrefix. getSymbolWithGlobalValueBase use is to create a name of a new symbol based on the name of an existing GV. Assert that and then remove the last call to pass true to isImplicitlyPrivate. This gives the mangler API a 1:1 mapping from GV to names, which is what we need to drop the mangler dependency on the target (and use an extended datalayout instead). llvm-svn: 196472	2013-12-05 05:53:12 +00:00
Alp Toker	e845f8af67	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Hal Finkel	2fbbf9299e	Remove PPCScoreboardHazardRecognizer PPCScoreboardHazardRecognizer was a subclass of ScoreboardHazardRecognizer which did only one thing: filtered out nodes in EmitInstruction for which DAG->getInstrDesc(SU) returned NULL. This used to be the case for PPC pseudo instructions. As far as I can tell, this is no longer true, and so we can use ScoreboardHazardRecognizer directly. llvm-svn: 196171	2013-12-02 23:52:46 +00:00
Rafael Espindola	5f33387c40	Refactor the setting of PrivateGlobalPrefix. No functionality change. llvm-svn: 196170	2013-12-02 23:39:26 +00:00
Rafael Espindola	c36d63a948	Move getSymbolWithGlobalValueBase to TargetLoweringObjectFile. This allows it to be used in TargetLoweringObjectFileImpl.cpp. llvm-svn: 196117	2013-12-02 16:25:47 +00:00
Rafael Espindola	cee11d6f43	Remove dead code. MO_JumpTableIndex and MO_ExternalSymbol don't show up on inline asm. Keeping parts of the old asm printer just to print inline asm to a string that we then parse back looks like a hack. llvm-svn: 196111	2013-12-02 15:36:37 +00:00
Rafael Espindola	427ca8d886	Change the default of AsmWriterClassName and isMCAsmWriter. llvm-svn: 196065	2013-12-02 04:55:42 +00:00
Rafael Espindola	3192965fa3	Refactor for clarity and efficiency. The PPC GetSymbolFromOperand already prefixed stubs of MO_ExternalSymbol, so this should be a nop. llvm-svn: 196059	2013-12-02 03:26:43 +00:00
Hal Finkel	725757ccc8	Add a scheduling model (with itinerary) for the PPC POWER7 This adds a scheduling model for the POWER7 (P7) core, and enables the machine-instruction scheduler when targeting the P7. Scheduling for the P7, like earlier ooo PPC cores, requires considering both dispatch group hazards, and functional unit resources and latencies. These are both modeled in a combined itinerary. Dispatch group formation is still handled by the post-RA scheduler (which still needs to be updated for the P7, but nevertheless does a pretty good job). One interesting aspect of this change is that I've also enabled to use of AA duing CodeGen for the P7 (just as it is for the embedded cores). The benchmark results seem to support this decision (see below), and while this is normally useful for in-order cores, and not for ooo cores like the P7, I think that the dispatch slot hazards are enough like in-order resources to make the AA useful. Test suite significant performance differences (where negative is a speedup, and positive is a regression) vs. the current situation: MultiSource/Benchmarks/BitBench/drop3/drop3 with AA: N/A without AA: -28.7614% +/- 19.8356% (significantly against AA) MultiSource/Benchmarks/FreeBench/neural/neural with AA: -17.7406% +/- 11.2712% without AA: N/A (significantly in favor of AA) MultiSource/Benchmarks/SciMark2-C/scimark2 with AA: -11.2079% +/- 1.80543% without AA: -11.3263% +/- 2.79651% MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt with AA: -41.8649% +/- 17.0053% without AA: -34.5256% +/- 23.7072% MultiSource/Benchmarks/mafft/pairlocalalign with AA: 25.3016% +/- 17.8614% without AA: 38.6629% +/- 14.9391% (significantly in favor of AA) MultiSource/Benchmarks/sim/sim with AA: N/A without AA: 13.4844% +/- 7.18195% (significantly in favor of AA) SingleSource/Benchmarks/BenchmarkGame/Large/fasta with AA: 15.0664% +/- 6.70216% without AA: 12.7747% +/- 8.43043% SingleSource/Benchmarks/BenchmarkGame/puzzle with AA: 82.2713% +/- 26.3567% without AA: 75.7525% +/- 41.1842% SingleSource/Benchmarks/Misc/flops-2 with AA: -37.1621% +/- 20.7964% without AA: -35.2342% +/- 20.2999% (significantly in favor of AA) These are 99.5% confidence intervals from 5 runs per configuration. Regarding the choice to turn on AA during CodeGen, of these results, four seem significantly in favor of using AA, and one seems significantly against. I'm not making this decision based on these numbers alone, but these results seem consistent with results I have from other tests, and so I think that, on balance, using AA is a win. llvm-svn: 195981	2013-11-30 20:55:12 +00:00
Hal Finkel	fa2b249f38	Split some PPC itinerary classes In preparation for adding scheduling definitions for the POWER7, split some PPC itinerary classes so that the P7's latencies and hazards can be better described. For the most part, this means differentiating indexed from non-index pre-increment loads and stores. Also, differentiate single from double-precision sqrt. No functionality change intended (except for a more-specific latency for single-precision sqrt on the A2). llvm-svn: 195980	2013-11-30 20:41:13 +00:00
Hal Finkel	2d7cc00415	Adjust PPC A2 input operand latencies On the PPC A2, instructions are only issued after their input operands are ready. Model this by specifying that input operands are read at dispatch (0 cycles after issue). This changes all input operand latencies from 1 to 0. Significant test-suite performance changes (these are 99.5% confidence intervals on 6 runs for both before and after): speedups: MultiSource/Benchmarks/sim/sim -1.21915% +/- 0.175063% MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -1.23946% +/- 1.05133% SingleSource/Benchmarks/Misc/flops-2 -1.24237% +/- 0.681362% MultiSource/Applications/JM/lencod/lencod -1.33992% +/- 0.757498% MultiSource/Benchmarks/TSVC/InductionVariable-flt/InductionVariable-flt -1.51802% +/- 1.21468% MultiSource/Benchmarks/TSVC/GlobalDataFlow-flt/GlobalDataFlow-flt -2.18818% +/- 1.28605% MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -2.21977% +/- 1.19499% SingleSource/Benchmarks/BenchmarkGame/spectral-norm -2.29822% +/- 0.671871% MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl -2.40975% +/- 0.355931% SingleSource/Benchmarks/Misc/fp-convert -2.41899% +/- 1.04751% MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -2.50349% +/- 0.126765% SingleSource/Benchmarks/Misc/flops-3 -3.00214% +/- 0.700795% MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt -3.56995% +/- 3.2929% MultiSource/Applications/sgefa/sgefa -4.24908% +/- 2.00413% MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk -18.1294% +/- 3.96489% regressions: MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl 1.03249% +/- 0.178547% MultiSource/Applications/hexxagon/hexxagon 1.16597% +/- 0.285235% MultiSource/Benchmarks/TSVC/IndirectAddressing-flt/IndirectAddressing-flt 1.39576% +/- 1.07855% SingleSource/Benchmarks/Misc-C++/stepanov_v1p2 1.71539% +/- 0.173182% MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1 1.90013% +/- 0.866472% MultiSource/Benchmarks/TSVC/Recurrences-dbl/Recurrences-dbl 2.39854% +/- 1.05914% MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl 2.4402% +/- 0.817904% MultiSource/Benchmarks/TSVC/LoopRestructuring-dbl/LoopRestructuring-dbl 5.87997% +/- 3.3172% MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc 9.02643% +/- 5.79591% MultiSource/Benchmarks/VersaBench/bmm/bmm 10.3517% +/- 1.227% Obviously, there are data points on both sides of this; but I think, overall, this supports making the change. llvm-svn: 195951	2013-11-29 07:04:59 +00:00
Hal Finkel	69f21285ed	Create a PPC440 SchedMachineModel Some of the older PPC processor definitions don't have associated SchedMachineModels; correct this for the PPC440. llvm-svn: 195949	2013-11-29 06:32:17 +00:00
Hal Finkel	c5a38fd3e6	Fixup PPC440 load/store operand latencies The operand latencies for loads and stores in the PPC440 itinerary were wrong (the store operands are all inputs, and the "with update" (pre-increment) instructions need a latency for the additional output). llvm-svn: 195948	2013-11-29 06:19:43 +00:00
Hal Finkel	a9d93b1740	Adjust PPC440 operand latencies The operand latencies for the PPC440 should be specified relative to dispatch, not relative to the initial fetch-and-decode stages. Because most instructions (ignoring bypass) wait in dispatch until their operands are ready, this is modeled as reading input operands "at dispatch" (0 cycles after issue), and so every input and output operand has 4 cycles subtracted from it. This could alter scheduling slightly, but I don't expect a large effect. llvm-svn: 195947	2013-11-29 05:59:00 +00:00
Hal Finkel	f086fc01ab	Don't model the fetch and decode units for the PPC440 Modeling the fetch and decode units in the PPC440 itinerary does not add anything to the hazard detection capability (and so modeling them just wastes compile time). No functionality change intended. llvm-svn: 195946	2013-11-29 05:58:38 +00:00
NAKAMURA Takumi	9b851e876f	[CMake] Let add_public_tablegen_target() provide intrinsics_gen, too. I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen. Almost all source files depend on both CommonTaleGen and intrinsics_gen. Explicit add_dependencies() have been pruned under lib/Target. llvm-svn: 195929	2013-11-28 17:04:31 +00:00
NAKAMURA Takumi	99f544b37e	[CMake] Let add_public_tablegen_target responsible to provide dependency to CommonTableGen. add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927	2013-11-28 17:04:04 +00:00
NAKAMURA Takumi	5dbd3bcf3d	[CMake] Prune include_directories() in llvm/lib/Target. add_llvm_target() sets them. llvm-svn: 195921	2013-11-28 14:53:30 +00:00
Rafael Espindola	05d05c4e8a	Use the mangler consistently instead of using getGlobalPrefix directly. llvm-svn: 195911	2013-11-28 08:59:52 +00:00
Hal Finkel	e49bc01fba	Don't share functional units among the PPC itineraries Instead of sharing functional unit names between the various PPC itineraries, give each core its own unit names prefixed with the core name. This follows the convention used by other backends (such as ARM), and removes a non-obvious ordering dependency between the various PPCSchedule*.td files. No functionality change intended. llvm-svn: 195908	2013-11-28 06:05:59 +00:00
Hal Finkel	40fc5609c6	Add IIC_ prefix to PPC instruction-class names This adds the IIC_ prefix to the instruction itinerary class names, giving the PPC backend a naming convention for itinerary classes that is more consistent with that used by the X86 and ARM backends. Instruction scheduling in the PPC backend needs a bunch of cleanup and improvement (especially for the ooo cores). This is just a preliminary step. No functionality change intended. llvm-svn: 195890	2013-11-27 23:26:09 +00:00
Rafael Espindola	916039a184	Don't set GlobalPrefix to the default value. llvm-svn: 195884	2013-11-27 21:57:54 +00:00
Hal Finkel	4dc7aae3a3	Fix comment in PPCA2Model llvm-svn: 195807	2013-11-27 03:12:56 +00:00
Hal Finkel	fb82ed6bb5	PPC popcnt[dw] do not have record forms The instruction definitions incorrectly specified that popcntd and popcntw have record forms; they do not. This mistake was causing invalid code generation. llvm-svn: 195272	2013-11-20 20:54:55 +00:00
Hal Finkel	d1fc028d62	PPC: Optimize rldicl generation for masked shifts Masking operations (where only some number of the low bits are being kept) are selected to rldicl(x, 0, mb). If x is a logical right shift (which would become rldicl(y, 64-n, n)), we might be able to fold the two instructions together: rldicl(rldicl(x, 64-n, n), 0, mb) -> rldicl(x, 64-n, mb) for n <= mb The right shift is really a left rotate followed by a mask, and if the explicit mask is a more-restrictive sub-mask of the mask implied by the shift, only one rldicl is needed. llvm-svn: 195185	2013-11-20 01:10:15 +00:00
Juergen Ributzka	5357a6d64b	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064	2013-11-19 00:57:56 +00:00
Alexey Samsonov	3bfef6bdb6	Revert r194865 and r194874. This change is incorrect. If you delete virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not cause the destructor for members of Child class. As a result, I observe plently of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997	2013-11-18 09:31:53 +00:00
Juergen Ributzka	ee3af15269	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Bob Wilson	d433cf7463	Avoid illegal integer promotion in fastisel Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! llvm-svn: 194840	2013-11-15 19:09:27 +00:00
Hal Finkel	2d9d341e70	Add PPC option for full register names in asm On non-Darwin PPC systems, we currently strip off the register name prefix prior to instruction printing. So instead of something like this: mr r3, r4 we print this: mr 3, 4 The first form is the default on Darwin, and is understood by binutils, but not yet understood by our integrated assembler. Once our integrated-as understands full register names as well, this temporary option will be replaced by tying this functionality to the verbose-asm option. The numeric-only form is compatible with legacy assemblers and tools, and is also gcc's default on most PPC systems. On the other hand, it is harder to read, and there are some analysis tools that expect full register names. llvm-svn: 194384	2013-11-11 14:58:40 +00:00
Rui Ueyama	cac5d9f1f9	Use StringRef::startswith_lower. No functionality change. llvm-svn: 193796	2013-10-31 19:59:55 +00:00
Rafael Espindola	68ddc56344	Add a helper getSymbol to AsmPrinter. llvm-svn: 193627	2013-10-29 17:07:16 +00:00
Eric Christopher	25d167bd58	Add support for the VSX target attribute. No functional change as we don't actually use it to emit any code yet. llvm-svn: 192837	2013-10-16 20:38:58 +00:00
Rafael Espindola	90d8b36e1e	Add a MCAsmInfoELF class and factor some code into it. We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before. llvm-svn: 192760	2013-10-16 01:34:32 +00:00
Rafael Espindola	6267c79fdb	Add a MCTargetStreamer interface. This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181	2013-10-08 13:08:17 +00:00
Rafael Espindola	e60c3625e3	Remove getEHExceptionRegister and getEHHandlerRegister. They haven't been used for a long time. Patch by MathOnNapkins. llvm-svn: 192099	2013-10-07 13:39:22 +00:00
Bill Schmidt	b5aca928c2	[PowerPC] Fix PR17354: Generate nop after local calls for PIC code. When generating code for shared libraries, even local calls may be intercepted, so we need a nop after the call for the linker to fix up the TOC. Test case adapted from the one provided in PR17354. llvm-svn: 191440	2013-09-26 17:09:28 +00:00
David Majnemer	be4c5e7ce1	PPC: Allow partial fills in writeNopData() When asked to pad an irregular number of bytes, we should fill with zeros. This is consistent with the behavior specified in the AIX Assembler Language Reference as well as other LLVM and binutils assemblers. N.B. There is a small deviation from binutils' PPC assembler: when handling pads which are greater than 4 bytes but not mod 4, binutils will not emit any NOP sequences at all and only use zeros. This may or may not be a bug but there is no excellent rationale as to why that behavior is important to emulate. If that behavior is needed, we can change writeNopData() to behave in the same way. This fixes PR17352. llvm-svn: 191426	2013-09-26 09:18:48 +00:00
David Majnemer	9ed79f2d96	PPC: Do not introduce ISD nodes for fctid and fctiw llvm-svn: 191421	2013-09-26 05:22:11 +00:00
David Majnemer	2652c503c6	PPC: Add support for fctid and fctiw Encodings were checked against the Power ISA documents and double checked against binutils. This fixes PR17350. llvm-svn: 191419	2013-09-26 04:11:24 +00:00
David Majnemer	1ffe32b198	MC: Add support for treating $ as a reference to the PC The binutils assembler supports a mode called DOLLAR_DOT which treats the dollar sign token as a reference to the current program counter if the dollar sign doesn't precede a constant or identifier. This commit adds a new MCAsmInfo flag stating whether or not a given target supports this interpretation of the dollar sign token; by default, this flag is not enabled. Further, enable this flag for PPC. The system assembler for AIX and binutils both support using the dollar sign in this manner. This fixes PR17353. llvm-svn: 191368	2013-09-25 10:47:21 +00:00
David Majnemer	30b6b79b54	MC: Remove vestigial PCSymbol field from AsmInfo llvm-svn: 191362	2013-09-25 09:36:11 +00:00
Tim Northover	c9a7e47164	ISelDAG: spot chain cycles involving MachineNodes Previously, the DAGISel function WalkChainUsers was spotting that it had entered already-selected territory by whether a node was a MachineNode (amongst other things). Since it's fairly common practice to insert MachineNodes during ISelLowering, this was not the correct check. Looking around, it seems that other nodes get their NodeId set to -1 upon selection, so this makes sure the same thing happens to all MachineNodes and uses that characteristic to determine whether we should stop looking for a loop during selection. This should fix PR15840. llvm-svn: 191165	2013-09-22 08:21:56 +00:00
Hal Finkel	7e8ba290f6	Correct the pre-increment load latencies in the PPC A2 itinerary Pre-increment loads are microcoded on the A2, and the address increment occurs only after the load completes. As a result, the latency of the GPR address update is an additional 2 cycles on top of the load latency. llvm-svn: 191156	2013-09-22 00:08:14 +00:00
Bill Schmidt	ae6c256723	[PowerPC] Add a FIXME. Documenting a design choice to generate only medium model sequences for TLS addresses at this time. Small and large code models could be supported if necessary. llvm-svn: 190883	2013-09-17 20:22:05 +00:00
Bill Schmidt	f26512486a	[PowerPC] Fix problems with large code model (PR17169). Large code model on PPC64 requires creating and referencing TOC entries when using the addis/ld form of addressing. This was not being done in all cases. The changes in this patch to PPCAsmPrinter::EmitInstruction() fix this. Two test cases are also modified to reflect this requirement. Fast-isel was not creating correct code for loading floating-point constants using large code model. This also requires the addis/ld form of addressing. Previously we were using the addis/lfd shortcut which is only applicable to medium code model. One test case is modified to reflect this requirement. llvm-svn: 190882	2013-09-17 20:03:25 +00:00
Bill Schmidt	b9dc991f55	[PowerPC] Fix PR17155 - Ignore COPY_TO_REGCLASS during emit. Fast-isel generates a COPY_TO_REGCLASS for widening f32 to f64, which is a nop on PPC64. This is needed to keep the register class system happy, but on the fast-isel path it is not removed before emit as it is for DAG select. Ignore this op when emitting instructions. llvm-svn: 190795	2013-09-16 17:25:12 +00:00
Hal Finkel	5bb449bca0	PPC: Don't restrict lvsl generation to after type legalization This is a re-commit of r190764, with an extra check to make sure that we're not performing the transformation on illegal types (a small test case has been added for this as well). Original commit message: The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190771	2013-09-15 22:09:58 +00:00
Hal Finkel	c45bfe85cc	Revert r190764: PPC: Don't restrict lvsl generation to after type legalization This is causing test-suite failures. Original commit message: The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190765	2013-09-15 15:41:11 +00:00
Hal Finkel	ae7feec56e	PPC: Don't restrict lvsl generation to after type legalization The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice. llvm-svn: 190764	2013-09-15 15:20:54 +00:00
Hal Finkel	ac7ef3f74f	Add missing break statement in PPCISelLowering As it turns out, not a problem in practice, but it should be there. llvm-svn: 190720	2013-09-13 20:09:02 +00:00
Chandler Carruth	e62d2f2be7	Remove an unused variable, fixing -Werror build with latest Clang. llvm-svn: 190640	2013-09-12 23:30:48 +00:00
Hal Finkel	4b3cfb4727	Fix PPC ABI for ByVal structs with vector members When a structure is passed by value, and that structure contains a vector member, according to the PPC ABI, the structure will receive enhanced alignment (so that the vector within the structure will always be aligned). This should resolve PR16641. llvm-svn: 190636	2013-09-12 23:20:06 +00:00
Hal Finkel	47bfa9a072	Make the PPC fast-math sqrt expansion safe at 0 In fast-math mode sqrt(x) is calculated using the fast expansion of the reciprocal of the reciprocal sqrt expansion. The reciprocal and reciprocal sqrt expansions use the associated estimate instructions along with some Newton iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN, which is not correct. Now we explicitly return a result of zero if the input is zero. llvm-svn: 190624	2013-09-12 19:04:12 +00:00
Roman Divacky	582df28090	Implement asm support for a few PowerPC bookIII that are needed for assembling FreeBSD kernel. llvm-svn: 190618	2013-09-12 17:50:54 +00:00
Hal Finkel	5ddc53b3ef	Mark PPC MFTB and DST (and friends) as deprecated Use the new instruction deprecation feature to mark mftb (now replaced with mfspr) and dst (along with the other Altivec cache control instructions) as deprecated when targeting cores supporting at least ISA v2.03. llvm-svn: 190605	2013-09-12 14:40:06 +00:00
Joey Gouly	fccb3bcae3	Add an instruction deprecation feature to TableGen. The 'Deprecated' class allows you to specify a SubtargetFeature that the instruction is deprecated on. The 'ComplexDeprecationPredicate' class allows you to define a custom predicate that is called to check for deprecation. For example: ComplexDeprecationPredicate<"MCR"> would mean you would have to define the following function: bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI, std::string &Info) Which returns 'false' for not deprecated, and 'true' for deprecated and store the warning message in 'Info'. The MCTargetAsmParser constructor was chaned to take an extra argument of the MCInstrInfo class, so out-of-tree targets will need to be changed. llvm-svn: 190598	2013-09-12 10:28:05 +00:00
Hal Finkel	6164109851	PPC: Enable aggressive anti-dependency breaking Aggressive anti-dependency breaking is enabled by default for all PPC cores. This provides a general speedup on the P7 and other platforms (among other factors, the instruction group formation for the non-embedded PPC cores is done during post-RA scheduling). In order to do this safely, the incompatibility between uses of the MFOCRF instruction and anti-dependency breaking are resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed FIXME, the problem was that MFOCRF's output is sensitive to the identify of the source register, and always paired with a shift to undo this effect. Because anti-dependency breaking is unaware of this hidden dependency of the shift amount on the source register of the MFOCRF instruction, changing that register must be inhibited. Two test cases were adjusted: The SjLj test was made more insensitive to register choices and scheduling; the saveCR test disabled anti-dependency breaking because part of what it is testing is proper register reuse. llvm-svn: 190587	2013-09-12 05:24:49 +00:00
Hal Finkel	2f2965c149	Greatly simplify the PPC A2 scheduling itinerary As Andy pointed out to me a long time ago, there are no structural hazards in the later pipeline stages of the A2, and so modeling them is useless. Also, modeling the top pre-dispatch stages is deceiving because, when multiple hardware threads are active, those resources are shared among the threads. The bypass definitions were mostly wrong, and so those have been removed. The resulting itinerary is much simpler, and more accurate. llvm-svn: 190562	2013-09-11 23:25:21 +00:00
Hal Finkel	bad4087f4a	Enable MI scheduling (and CodeGen AA) by default for embedded PPC cores For embedded PPC cores (especially the A2 core), using the MI scheduler with AA is far superior to the other scheduling options. llvm-svn: 190558	2013-09-11 23:05:25 +00:00
Hal Finkel	87afacb632	Implement TTI getUnrollingPreferences for PowerPC The PowerPC A2 core greatly benefits from aggressive concatenation unrolling; use the new getUnrollingPreferences to enable this by default when targeting the PPC A2 core. llvm-svn: 190549	2013-09-11 21:20:40 +00:00
Bill Wendling	2c532e9c9b	Generate compact unwind encoding from CFI directives. We used to generate the compact unwind encoding from the machine instructions. However, this had the problem that if the user used `-save-temps' or compiled their hand-written `.s' file (with CFI directives), we wouldn't generate the compact unwind encoding. Move the algorithm that generates the compact unwind encoding into the MCAsmBackend. This way we can generate the encoding whether the code is from a `.ll' or `.s' file. <rdar://problem/13623355> llvm-svn: 190290	2013-09-09 02:37:14 +00:00
Charles Davis	5191e0b0d0	Move everything depending on Object/MachOFormat.h over to Support/MachO.h. llvm-svn: 189728	2013-09-01 04:28:48 +00:00
Bill Schmidt	fbcebc7d75	[PowerPC] Fast-isel cleanup patch. Here are a few miscellaneous things to tidy up the PPC64 fast-isel implementation. I corrected a couple of commentary lapses, and added documentation of future opportunities. I also implemented TargetMaterializeAlloca, which I somehow forgot when I split up the original huge patch. Finally, I decided to delete SelectCmp. I hadn't previously hooked it in to TargetSelectInstruction(), and when I did I realized it wasn't serving any useful purpose. This is only useful for compares that don't feed a branch in the same block, and to handle that we would have to have logic to interpret i1 as a condition register. This could probably be done, but would require Unseemly Hackery, and honestly does not seem worth the hassle. This ends the current patch series. llvm-svn: 189715	2013-08-31 02:33:40 +00:00
Bill Schmidt	98963cd850	[PowerPC] Add integer truncation support to fast-isel. This is the last substantive patch I'm planning for fast-isel in the near future, adding fast selection of integer truncates. There are certainly more things that can be improved (many of which are called out in FIXMEs), but for now we are catching most of the important cases. I'll document some of the remaining work in a cleanup patch shortly. llvm-svn: 189706	2013-08-30 23:31:33 +00:00
Bill Schmidt	62a0b5c55b	Correct partially defined variable llvm-svn: 189705	2013-08-30 23:25:30 +00:00
Bill Schmidt	65bf01a470	[PowerPC] Call support for fast-isel. This patch adds fast-isel support for calls (but not intrinsic calls or varargs calls). It also removes a badly-formed assert. There are some new tests just for calls, and also for folding loads into arguments on calls to avoid extra extends. llvm-svn: 189701	2013-08-30 22:18:55 +00:00
Bill Schmidt	886231ba0f	[PowerPC] Add handling for conversions to fast-isel. Yet another chunk of fast-isel code. This one handles various conversions involving floating-point. (It also includes some miscellaneous handling throughout the back end for LWA_32 and LWAX_32 that should have been part of the load-store patch.) llvm-svn: 189677	2013-08-30 15:18:11 +00:00
Bill Schmidt	07bcdd6b9c	[PowerPC] Handle selection of compare instructions in fast-isel. Mostly trivial patch adding support for compares. The meat of the work was added with the branch support. llvm-svn: 189639	2013-08-30 03:16:48 +00:00
Bill Schmidt	3f2df6119d	Remove bogus debug statement. Sheesh. llvm-svn: 189638	2013-08-30 03:07:11 +00:00

1 2 3 4 5 ...

3874 Commits