Craig Topper
bdcc86a43c
Combine FMA4 PS/PD patterns with the instruction definitions.
...
llvm-svn: 147364
2011-12-30 03:17:15 +00:00
Craig Topper
33091db89a
Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.
...
llvm-svn: 147361
2011-12-30 02:18:36 +00:00
Craig Topper
e066262284
Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.
...
llvm-svn: 147360
2011-12-30 01:49:53 +00:00
Hal Finkel
4a09216dfb
Cleanup stack/frame register define/kill states. This fixes two bugs:
...
1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test).
2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this.
llvm-svn: 147359
2011-12-30 00:34:00 +00:00
Rafael Espindola
db7319d272
Implement cfi_restore. Patch by Brian Anderson!
...
llvm-svn: 147356
2011-12-29 21:43:03 +00:00
Rafael Espindola
bc8c3d0ca0
Rename Remember and Restore to RememberState and RestoreState for consistency.
...
llvm-svn: 147354
2011-12-29 21:09:08 +00:00
Craig Topper
97e84c23a1
Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 instructions.
...
llvm-svn: 147353
2011-12-29 20:43:40 +00:00
Rafael Espindola
27298c6f33
Implement .cfi_escape. Patch by Brian Anderson!
...
llvm-svn: 147352
2011-12-29 20:24:47 +00:00
Craig Topper
bcfd070378
Expose FMA3 instructions to the disassembler.
...
llvm-svn: 147351
2011-12-29 20:03:14 +00:00
Craig Topper
9d664349d6
Make FMA3 imply AVX needs to be enabled. Particularly because 256-bit types aren't valid unless AVX is enabled.
...
llvm-svn: 147349
2011-12-29 19:46:19 +00:00
Craig Topper
21029d1f81
Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit.
...
llvm-svn: 147348
2011-12-29 19:25:56 +00:00
Craig Topper
93d614dd3a
Add FeaturePOPCNT to all CPU types that lost it when it was removed from SSE42/SSE4A in r147339.
...
llvm-svn: 147347
2011-12-29 18:47:31 +00:00
Craig Topper
ba73cefabb
Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet.
...
llvm-svn: 147345
2011-12-29 18:08:36 +00:00
Craig Topper
63bb77ebe7
Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES, since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior, though, since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms.
...
llvm-svn: 147344
2011-12-29 18:00:08 +00:00
Craig Topper
04b3b369de
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.
...
llvm-svn: 147342
2011-12-29 17:41:56 +00:00
Craig Topper
3ff20898e9
Make SSE42 and SSE4A not imply POPCNT. POPCNT should be able to be disabled on its own without disabling SSE4.2 or SSE4A.
...
llvm-svn: 147339
2011-12-29 15:51:45 +00:00
Craig Topper
0ff84d82cc
Make LowerBUILD_VECTOR keep node vector types consistent when creating MOVL for v16i16 and v32i8.
...
llvm-svn: 147337
2011-12-29 03:34:54 +00:00
Craig Topper
cb884f28d9
Remove some elses after returns.
...
llvm-svn: 147336
2011-12-29 03:20:51 +00:00
Craig Topper
4cbe88ceba
Remove trailing spaces. Fix an assert to use && instead of || before string. Add same assert on similar code path.
...
llvm-svn: 147335
2011-12-29 03:09:33 +00:00
Rafael Espindola
1935193ab7
Fix grammar error noticed by Duncan.
...
llvm-svn: 147333
2011-12-29 02:15:06 +00:00
Nick Lewycky
7425820374
Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.
Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.
llvm-svn: 147327
2011-12-28 23:24:21 +00:00
Eli Friedman
db54b4b68f
Fix type-checking for load transformation which is not legal on floating-point types. PR11674.
...
llvm-svn: 147323
2011-12-28 21:24:44 +00:00
Bob Wilson
152a507523
Update OCaml bindings for the new half float type.
...
Patch by Jonathan Ragan-Kelley!
llvm-svn: 147314
2011-12-28 18:51:08 +00:00
Rafael Espindola
510f9d49fb
Add support for mipsel in configure. Fixes PR11669. Patch by Sylvestre Ledru.
...
llvm-svn: 147312
2011-12-28 17:08:00 +00:00
Nadav Rotem
d8c4880903
PR11662.
...
Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.
llvm-svn: 147309
2011-12-28 13:08:20 +00:00
Elena Demikhovsky
9b4613ff14
Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR.
...
Matching MOVLP mask for AVX (256-bit vectors) was wrong.
The failure was detected by conformance tests.
llvm-svn: 147308
2011-12-28 08:14:01 +00:00
Nick Lewycky
17d4e8dae6
Demystify this comment.
...
llvm-svn: 147307
2011-12-28 06:57:32 +00:00
Rafael Espindola
2ecba45e58
PR11642 has been fixed, enable -fvisibility-inlines-hidden everywhere.
...
llvm-svn: 147296
2011-12-27 21:37:11 +00:00
Benjamin Kramer
77f9c9f719
Switch StringMap from an array of structures to a structure of arrays.
...
- -25% memory usage of the main table on x86_64 (was wasted in struct padding).
- no significant performance change.
llvm-svn: 147294
2011-12-27 20:35:07 +00:00
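The padding saving described in this commit can be illustrated with a small C sketch. The bucket layout and helper names below (`bucket_aos`, `table_bytes_aos`, `table_bytes_soa`) are hypothetical and are not StringMap's actual definitions. On an LP64 target such as x86_64, a 4-byte hash stored next to an 8-byte pointer is padded out to 16 bytes per bucket, while two separate arrays need only 12 bytes per bucket, which is where a figure of about 25% would come from.

```c
#include <stddef.h>

/* Hypothetical array-of-structures bucket: one hash and one pointer. */
struct bucket_aos {
    unsigned hash;  /* 4 bytes, typically followed by padding on LP64 */
    void *item;     /* 8 bytes on x86_64 */
};

/* Bytes used by a table of n buckets stored as an array of structures. */
size_t table_bytes_aos(size_t n) {
    return n * sizeof(struct bucket_aos);
}

/* Bytes used when the same data is split into two parallel arrays. */
size_t table_bytes_soa(size_t n) {
    return n * sizeof(void *) + n * sizeof(unsigned);
}
```

On targets where pointers and `unsigned` have the same size, both layouts should come out equal, so the saving is specific to 64-bit pointer targets.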
Nick Lewycky
4c5662bae0
Use false not zero, as a bool.
...
llvm-svn: 147292
2011-12-27 18:27:22 +00:00
Nick Lewycky
f4c21901a3
Turn cos(-x) into cos(x). Patch by Alexander Malyshev!
...
llvm-svn: 147291
2011-12-27 18:25:50 +00:00
Benjamin Kramer
c64fcf4a95
Clean up some Release build warnings.
...
llvm-svn: 147289
2011-12-27 11:41:05 +00:00
Craig Topper
9c07745da9
Add handling of x86_avx2_pmovmskb to computeMaskedBitsForTargetNode for consistency. Add comments and an assert for BMI instructions to PerformXorCombine since the enabling of the combine is conditional on it, but the function itself isn't.
...
llvm-svn: 147287
2011-12-27 06:27:23 +00:00
Nick Lewycky
295e397220
Teach simplifycfg to recompute branch weights when merging some branches, and
to discard weights when appropriate. Still more to do (and a new TODO), but
it's a start!
llvm-svn: 147286
2011-12-27 04:31:52 +00:00
Nick Lewycky
b3f57c8028
Using Inst->setMetadata(..., NULL) should be safe to remove metadata even when
there is none of that type to remove. This fixes a crasher in the particular
case where the instruction has metadata but no metadata storage in the context
(this is only possible if the instruction has !dbg but no other metadata info).
llvm-svn: 147285
2011-12-27 01:17:40 +00:00
Rafael Espindola
d448dfaa25
Fix warning.
...
llvm-svn: 147284
2011-12-26 23:12:42 +00:00
Eli Friedman
064187912e
Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
...
llvm-svn: 147283
2011-12-26 22:49:32 +00:00
Nick Lewycky
56e04db381
Update the branch weight metadata when reversing the order of a branch.
...
llvm-svn: 147280
2011-12-26 20:54:14 +00:00
Nick Lewycky
80b1ea7ea3
Sort includes, canonicalize whitespace, fix typos. No functionality change.
...
llvm-svn: 147279
2011-12-26 20:37:40 +00:00
Nadav Rotem
7898dcf3bb
Update the LangRef documentation: the codegen does support this instruction.
...
llvm-svn: 147274
2011-12-25 21:32:35 +00:00
Nadav Rotem
d0f22ebb10
Fix a typo in the widening of vectors in PromoteIntRes. Patch by Shemer Anat.
...
llvm-svn: 147272
2011-12-25 20:01:38 +00:00
Venkatraman Govindaraju
d9c90c9fa5
Sparc: Implement emitFrameIndexDebugValue and getDebugValueLocation hooks.
...
llvm-svn: 147269
2011-12-25 18:50:24 +00:00
Bill Wendling
82ca4569b9
Add braces to remove silly warning.
...
llvm-svn: 147264
2011-12-25 06:56:22 +00:00
Rafael Espindola
b598485fb9
Remove unused variables.
...
llvm-svn: 147261
2011-12-25 01:20:19 +00:00
Chandler Carruth
a012c64ced
Add an explicit test that we now fold cttz.i32(..., true) >> 5 -> 0.
...
This is a result of Benjamin's work on ValueTracking.
llvm-svn: 147259
2011-12-24 22:34:15 +00:00
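The reason this fold is valid can be checked with a short C sketch. It assumes a GCC- or Clang-compatible compiler, a 32-bit `unsigned`, and uses `__builtin_ctz` as a stand-in for `cttz.i32(..., true)`, since both are undefined for a zero input. For any nonzero input the result lies in [0, 31], so shifting it right by 5 leaves nothing. The helper name `ctz_shift_is_zero` is hypothetical.

```c
/* Returns 1 if (ctz(x) >> 5) == 0 for every single-bit 32-bit value.
   Single-bit inputs cover every possible ctz result, 0 through 31. */
int ctz_shift_is_zero(void) {
    for (unsigned bit = 0; bit < 32; ++bit) {
        unsigned x = 1u << bit;           /* nonzero, so ctz is defined */
        if ((__builtin_ctz(x) >> 5) != 0)
            return 0;
    }
    return 1;
}
```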
Benjamin Kramer
94f07f8c2c
InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add.
...
This was intended to undo the sub canonicalization in cases where it's not profitable, but it also
finds some cases on its own.
llvm-svn: 147256
2011-12-24 17:31:53 +00:00
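Why the subtraction form can fuse with a following add is easy to see in C. This is a minimal sketch with n fixed at 8 for illustration; the function names are hypothetical and the code mirrors only the arithmetic, not InstCombine itself. When x fits in the low 8 bits, `(255 ^ x) + c` equals `(255 + c) - x`, so the xor and the add collapse into a single subtraction from a constant.

```c
#include <stdint.h>

/* The xor form followed by an add: two operations. */
uint32_t xor_then_add(uint32_t x, uint32_t c) {
    return (UINT32_C(255) ^ x) + c;
}

/* The fused form: the constant 255 + c folds, leaving one subtraction. */
uint32_t fused_sub(uint32_t x, uint32_t c) {
    return (UINT32_C(255) + c) - x;
}

/* Returns 1 if both forms agree for every x in [0, 255] and the given c. */
int fusion_holds(uint32_t c) {
    for (uint32_t x = 0; x <= 255; ++x)
        if (xor_then_add(x, c) != fused_sub(x, c))
            return 0;
    return 1;
}
```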
Benjamin Kramer
b5e584392b
ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero.
...
unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }
now compiles into a single "bsrl" instruction on x86.
llvm-svn: 147255
2011-12-24 17:31:46 +00:00
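The example from this commit message can be checked against a plain loop. A minimal sketch, assuming a GCC- or Clang-compatible compiler and a 32-bit `unsigned`; `foo_xor` and `highest_set_bit_ref` are hypothetical helpers added for comparison. For nonzero x, `__builtin_clz(x)` is in [0, 31], so `31 - clz(x)` is the index of the highest set bit and equals `31 ^ clz(x)`.

```c
/* The function from the commit message. Undefined for x == 0,
   because __builtin_clz(0) is undefined. */
unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }

/* Same value written with xor: 31 - k == 31 ^ k for k in [0, 31]. */
unsigned foo_xor(unsigned x) { return 31 ^ __builtin_clz(x); }

/* Hypothetical reference: count shifts until no higher bit remains. */
unsigned highest_set_bit_ref(unsigned x) {
    unsigned i = 0;
    while (x >>= 1)
        ++i;
    return i;
}
```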
Benjamin Kramer
0b4d2e3d2a
InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n.
...
This has the obvious advantage of being commutable and is always a win on x86 because
const - x wastes a register there. On less weird architectures this may lead to
a regression because other arithmetic doesn't fuse with it anymore. I'll address that
problem in a followup.
llvm-svn: 147254
2011-12-24 17:31:38 +00:00
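The equivalence behind this canonicalization can be verified exhaustively for small n. A minimal C sketch with hypothetical helper names: subtracting x from an all-ones mask never borrows when x fits inside the mask, so it just flips the bits of x, which is exactly what xor does. The second helper shows that the two forms differ once x exceeds the mask.

```c
#include <stdint.h>

/* Returns 1 if (2^n - 1) - x == (2^n - 1) ^ x for every x < 2^n.
   n is expected to be in [0, 16] to keep the loop short. */
int sub_equals_xor(unsigned n) {
    uint32_t mask = (UINT32_C(1) << n) - 1;
    for (uint32_t x = 0; x <= mask; ++x)
        if (mask - x != (mask ^ x))
            return 0;
    return 1;
}

/* Returns 1 if the two forms agree for this particular mask and x. */
int forms_agree(uint32_t mask, uint32_t x) {
    return (mask - x) == (mask ^ x);
}
```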
Rafael Espindola
504588a7a3
Section relative fixups are a coff concept, not a x86 one. Replace the
x86 specific reloc_coff_secrel32 with a generic FK_SecRel_4.
llvm-svn: 147252
2011-12-24 14:47:52 +00:00
Chandler Carruth
7a5c52fadf
Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the
LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding since the fix-ups necessary are just as complex for
either promoted type.
We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation, and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]
llvm-svn: 147251
2011-12-24 12:12:34 +00:00
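The difference between the two i8 CTLZ strategies described in this commit can be sketched in C. This only mirrors the arithmetic and is not the lowering code: `bsr32` is a hypothetical stand-in for the x86 BSR instruction built on `__builtin_clz`, and a GCC- or Clang-compatible compiler with a 32-bit `unsigned int` is assumed. The direct form uses the known 8-bit input width to xor with 7, while the promoted form computes a 32-bit CTLZ (xor with 31) and then subtracts the 24 extra leading zeros.

```c
#include <stdint.h>

/* Stand-in for BSR: index of the highest set bit. x must be nonzero. */
static unsigned bsr32(uint32_t x) {
    return 31u - (unsigned)__builtin_clz(x);
}

/* Custom logic: the input is known to fit in 8 bits, so xor with 7. */
unsigned ctlz8_direct(uint8_t x) {
    return bsr32(x) ^ 7u;
}

/* Generic promotion to i32: xor with 31, then subtract 24 to correct. */
unsigned ctlz8_promoted(uint8_t x) {
    return (bsr32(x) ^ 31u) - 24u;
}
```

Both functions return the same value for every nonzero 8-bit input, but the promoted form needs an extra subtraction after the xor.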