1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

119862 Commits

Author SHA1 Message Date
Sanjay Patel
8b07d115c8 ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141)
PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141
contains a test case where we have duplicate entries in a node's uses() list.

After r241826, we use CombineTo() to delete dead nodes when combining the uses into
reciprocal multiplies, but this fails if we encounter the just-deleted node again in
the list.

The solution in this patch is to not add duplicate entries to the list of users that
we will subsequently iterate over. For the test case, this avoids triggering the
combine divisors logic entirely because there really is only one user of the divisor.

Differential Revision: http://reviews.llvm.org/D11345

llvm-svn: 243500
2015-07-28 23:28:22 +00:00
Sanjay Patel
f59ca7d0e7 fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold
This fix was suggested as part of D11345 and is part of fixing PR24141.

With this change, we can avoid walking the uses of a divisor node if the target
doesn't want the combineRepeatedFPDivisors transform in the first place.

There is no NFC-intended other than that.

Differential Revision: http://reviews.llvm.org/D11531

llvm-svn: 243498
2015-07-28 23:05:48 +00:00
Alex Lorenz
13d7649f78 MIR Serialization: Serialize the target index machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243497
2015-07-28 23:02:45 +00:00
Akira Hatanaka
da45bae202 [ARM] Define subtarget feature strict-align.
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -arm-strict-align to decide whether strict alignment should be
forced. Also, remove the logic that was checking the OS and architecture
as clang is now responsible for setting strict-align based on the command
line options specified and the target architecute and OS.

rdar://problem/21529937

http://reviews.llvm.org/D11470

llvm-svn: 243493
2015-07-28 22:44:28 +00:00
Tim Northover
2a4f107a53 AArch64: be careful of large immediates when optimising cmps.
llvm-svn: 243492
2015-07-28 22:42:32 +00:00
Davide Italiano
8ca355af28 [tests] Use llvm-readobj instead of macho-dump.
llvm-svn: 243487
2015-07-28 21:58:08 +00:00
Bruno Cardoso Lopes
e5225dc00d [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply 243271 with more fixes; although we are not handling multiple
sources with coalescable copies, we were not properly skipping this
case.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 243486
2015-07-28 21:45:50 +00:00
Vasileios Kalintiris
cc192bfbf1 [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls.
Summary:
Currently, we support only the MIPS O32 ABI calling convention for call
lowering. With this change we avoid using the O32 calling convetion for
lowering calls marked as using the fast calling convention.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11515

llvm-svn: 243485
2015-07-28 21:43:31 +00:00
Lang Hames
787b94fac7 [RuntimeDyld] Remove a memory-leak that was introduced in r243456. Thanks to Ben
Kramer for catching this.

llvm-svn: 243476
2015-07-28 20:51:53 +00:00
Chih-Hung Hsieh
c40b4e8361 Fix typo.
llvm-svn: 243475
2015-07-28 20:38:29 +00:00
Chih-Hung Hsieh
be61f44997 Limit this test only on linux.
Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243474
2015-07-28 20:31:10 +00:00
Michael Zolotukhin
6f69b18d73 [Unroll] Add debug dumps to loop-unroll analyzer.
llvm-svn: 243471
2015-07-28 20:07:29 +00:00
Vasileios Kalintiris
21dafebde9 [mips][FastISel] Fix generated code for IR's select instruction.
Summary:
Generate correct code for the select instruction by zero-extending
it's boolean/condition operand to GPR-width. This is necessary because
the conditional-move instructions operate on the whole register.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11506

llvm-svn: 243469
2015-07-28 19:57:25 +00:00
Michael Zolotukhin
88a9ab7f5a [Unroll] Don't analyze blocks outside the loop.
llvm-svn: 243466
2015-07-28 19:21:21 +00:00
Matt Arsenault
51dce12776 AMDGPU: Don't try to use LDS/vector for private if pointer value stored
If the pointer is the store's value operand, this would produce
a broken module. Make sure the use is actually for the pointer operand.

llvm-svn: 243462
2015-07-28 18:47:00 +00:00
Matt Arsenault
16b3f40599 AMDGPU: Fix crash if called function is a bitcast
getCalledFunction() is null, so this would crash. Replace
crash with an error on unsupported call.

llvm-svn: 243461
2015-07-28 18:29:14 +00:00
Jingyue Wu
a6a8a2d2b1 [SCEV] Apply NSW and NUW flags via poison value analysis
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.

This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

  for (int i = 0; i < limit; ++i) {
    sum += ptr[i + offset];
  }

Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.


There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

Patch by Bjarke Roune

Reviewers: eliben, atrick, sanjoy

Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits

Differential Revision: http://reviews.llvm.org/D11212

llvm-svn: 243460
2015-07-28 18:22:40 +00:00
Matt Arsenault
27b16f9fd2 AMDGPU: Fix return type of getImplicitParameterOffset.
Patch by Zoltan Gilian <zoltan.gilian@gmail.com>

llvm-svn: 243459
2015-07-28 18:09:55 +00:00
Alex Lorenz
885c819f7b Add a test case for r242191 ([MMX] Use the appropriate instructions for
GR64 <-> VR64 copies).

This commit adds a MIR test case for the commit r242191, which was committed
without one. This test case verifies that the ExpandPostRA pass expands the
GR64 <-> VR64 copies into the appropriate MMX_MOV instructions.

llvm-svn: 243457
2015-07-28 17:52:59 +00:00
Lang Hames
8d59074fa2 [RuntimeDyld] Make LoadedObjectInfo::getLoadedSectionAddress take a SectionRef
rather than a string section name.

llvm-svn: 243456
2015-07-28 17:52:11 +00:00
Chih-Hung Hsieh
9951404987 Move unit tests to target specific directories.
Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243454
2015-07-28 17:32:49 +00:00
Alex Lorenz
c9aa9d8e5f MIR Serialization: Serialize the block address machine operands.
llvm-svn: 243453
2015-07-28 17:28:03 +00:00
JF Bastien
47b5b15818 WebAssembly: MCAsmInfo only has one syntax variant for now.
Summary: MCAsmInfo is set up with the default AssemblerDialect, which is zero.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11567

llvm-svn: 243452
2015-07-28 17:23:07 +00:00
Sanjay Patel
150d1818cf add tests to show broken current behavior of minsize attribute
llvm-svn: 243451
2015-07-28 17:18:25 +00:00
Alex Lorenz
072f1a6a6c MIR Parser: Extract the method 'parseGlobalValue'. NFC.
This commit extracts the code that parses a global value from the method
'parseGlobalAddressOperand' into a new method 'parseGlobalValue', so that this
code can be reused by the method which will parse the block address machine
operands.

llvm-svn: 243450
2015-07-28 17:09:52 +00:00
Alex Lorenz
29ce80d4de MIR Parser: Move the function 'lexName'. NFC.
This commit moves the function 'lexName' to the start of the file so it can
be reused by the function which will lex the named LLVM IR block references.

llvm-svn: 243449
2015-07-28 17:03:40 +00:00
Alex Lorenz
6c69f31f77 MIR Printer: Remove an outdated TODO comment and assertion. NFC.
This commit removes an outdated TODO comment and a corresponding assertion
which asserts that the mir printer can't the print machine basic blocks that
aren't sequentially numbered.

This comment and assertion were correct when I was working on the patch which
serialized the machine basic blocks, but then I decided to add an 'ID'
attribute to the machine basic block's YAML mapping based on the patch review.
This comment and assertion then became invalid as with the 'ID' attribute we
can serialize the non sequential machine basic blocks and their references
without any problems.

llvm-svn: 243447
2015-07-28 16:56:45 +00:00
Alex Lorenz
82fbf0efb2 MIR Parser: Remove redundant parameters. NFC.
This commit removes the redundant parameters from the two methods
'initializeRegisterInfo' and 'initializeFrameInfo'. The removed parameters are
redundant as we are already passing in the 'MachineFunction' to those methods,
and those parameters can be derived from the machine function parameter.

llvm-svn: 243445
2015-07-28 16:48:37 +00:00
Chih-Hung Hsieh
17e97fbf13 Implement target independent TLS compatible with glibc's emutls.c.
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.

clang and driver changes in http://reviews.llvm.org/D10524

  Added -femulated-tls flag to select the emulated TLS model,
  which will be used for old targets like Android that do not
  support ELF TLS models.

Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.

Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.

TODO: Add proper DIE for emulated TLS variables.
      Added new unit tests with emulated TLS.

Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243438
2015-07-28 16:24:05 +00:00
Martell Malone
785f5c9c50 Summary:
Object: add IMAGE_FILE_MACHINE_ARM64

The official specifications state that the value of IMAGE_FILE_MACHINE_ARM64
is 0xAA64 (as per the Microsoft Portable Executable and Common Object Format
Specification v8.3).

Reviewers: rnk

Subscribers: llvm-commits, compnerd, ruiu

Differential Revision: http://reviews.llvm.org/D11511

llvm-svn: 243434
2015-07-28 16:18:17 +00:00
Bruno Cardoso Lopes
cdfacd96b4 [LVI] Cleanup whitespaces. NFC
llvm-svn: 243430
2015-07-28 15:53:21 +00:00
Sanjay Patel
c11822e0e8 fix formatting; NFC
llvm-svn: 243424
2015-07-28 15:38:43 +00:00
Geoff Berry
72eac10c87 [AArch64] Match float round and convert to int instructions.
Summary:
Add patterns for doing floating point round with various rounding modes
followed by conversion to int as a single FCVT* instruction.

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11424

llvm-svn: 243422
2015-07-28 15:24:10 +00:00
Douglas Katzman
97a4d33479 Use a specified list of languages in cmake project() command.
This allows asm files and Cxx files to be compiled with different flags
rather than treating them identically. LLVM itself has no asm files
other than tests, but this setting is inherited by the compiler-rt
project (unless compiled standalone), which does have asm files.

Differential Revision: http://reviews.llvm.org/D10707

llvm-svn: 243419
2015-07-28 14:43:53 +00:00
Silviu Baranga
97f890ebca [LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC
llvm-svn: 243416
2015-07-28 13:44:08 +00:00
Adhemerval Zanella
b3b937fabe Implement __builtin_thread_pointer
This path add the aarch64 lowering of __builtin_thread_pointer.  It uses
the already implemented AArch64ISD::THREAD_POINTER used in TLS generation.

llvm-svn: 243412
2015-07-28 13:03:31 +00:00
Martell Malone
69709f20b8 docs: update arcanist links
Summary:
I need a test commit for using arc.
This seems like an appropriate commit to use as a test

We may want to port this commit back to 3.7 also

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11527

llvm-svn: 243408
2015-07-28 11:43:37 +00:00
Chandler Carruth
d51b8c5480 [GMR] Teach GlobalsModRef to distinguish an important and safe case of
no-alias with non-addr-taken globals: they cannot alias a captured
pointer.

If the non-global underlying object would have been a capture were it to
alias the global, we can firmly conclude no-alias. It isn't reasonable
for a transformation to introduce a capture in a way observable by an
alias analysis. Consider, even if it were to temporarily capture one
globals address into another global and then restore the other global
afterward, there would be no way for the load in the alias query to
observe that capture event correctly. If it observes it then the
temporary capturing would have changed the meaning of the program,
making it an invalid transformation. Even instrumentation passes or
a pass which is synthesizing stores to global variables to expose race
conditions in programs could not trigger this unless it queried the
alias analysis infrastructure mid-transform, in which case it seems
reasonable to return results from before the transform started.

See the comments in the change for a more detailed outlining of the
theory here.

This should address the primary performance regression found when the
non-conservatively-correct path of the alias query was disabled.

Differential Revision: http://reviews.llvm.org/D11410

llvm-svn: 243405
2015-07-28 11:11:11 +00:00
Renato Golin
babd40ed93 Improving lli documentation
Too many people hope lli would act as an emulator when it's actually
just a tool to help prototype IR code and test the JIT compiler. This
commit makes that fact explicit in the documentation

It also migrates the old style bold/italic doc tags to the preferred
meta tags (.. option::, :program:, etc).

No errors when generating the documents, visual inspection in the HTML
result doesn't show any major difference, apart from the slight style
change.

llvm-svn: 243401
2015-07-28 10:24:11 +00:00
Michael Kuperstein
07f74e1a99 [X86] Remove mergeSPUpdatesUp()
X86FrameLowering has both a mergeSPUpdates() that accepts a direction, and an
mergeSPUpdatesUp(), which seem to do the same thing, except for a slightly 
different interface. Removed the less general function.
NFC.

Differential Revision: http://reviews.llvm.org/D11510

llvm-svn: 243396
2015-07-28 08:56:13 +00:00
Simon Pilgrim
8ef138051c [X86][SSE] Use bitmasks instead of shuffles where possible.
VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions.

Split off from D11518.

Differential Revision: http://reviews.llvm.org/D11541

llvm-svn: 243395
2015-07-28 08:54:41 +00:00
Igor Breger
b86cbff9ff AVX512: Add encoding tests to vptestnm instructions
Differential Revision: http://reviews.llvm.org/D11521

llvm-svn: 243391
2015-07-28 07:00:00 +00:00
Igor Breger
28223d1ba3 AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11528

llvm-svn: 243390
2015-07-28 06:53:28 +00:00
Puyan Lotfi
7dd7eac0b4 Changes for MachineBasicBlock to use SortedVector for LiveIns.
llvm-svn: 243389
2015-07-28 06:38:41 +00:00
Mehdi Amini
4fba24a39c Move the Target way of overriding DAG Scheduler to a target hook
Summary:
The previous way of overriding it was relying on calling "setDefault"
on the global registry, which implies global mutable state.

Reviewers: echristo, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11538

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243388
2015-07-28 06:18:04 +00:00
Puyan Lotfi
8cdd0f6b3e Adding ADT SortedVector; client patch will follow.
llvm-svn: 243386
2015-07-28 06:04:00 +00:00
Chandler Carruth
e831556fca [GMR] Fix a long-standing bug in GlobalsModRef where it failed to clear
out the per-function modref data structures when functions were deleted
or when globals were deleted.

I don't actually know how the global deletion side of this bug hasn't
been hit before, but for the other it just-so-happens that functions
aren't likely to be deleted in the particular part of the LTO pipeline
where we currently enable GMR, so we got lucky.

With this patch, I can self-host with GMR enabled in the normal pass
pipeline!

I was a bit concerned about the compile-time impact of this chang, which
is part of what motivated my prior string of patches to make the
per-function datastructure very dense and fast to walk. With those
changes in place, I can't measure a significant compile time difference
(the difference is around 0.1% which is *way* below the noise) before
and after this patch when building a linked bitcode for all of Clang.

Differential Revision: http://reviews.llvm.org/D11453

llvm-svn: 243385
2015-07-28 06:01:57 +00:00
Adam Nemet
cdf7537068 [LDist][LVer] Explicitly pass the set of memchecks to LoopVersioning, NFC
Before the patch, the checks were generated internally in
addRuntimeCheck.  Now, we use the new overloaded version of
addRuntimeCheck that takes the ready-made set of checks as a parameter.

The checks are now generated by the client (LoopDistribution) with the
new RuntimePointerChecking::generateChecks API.

Also the new printChecks API is used to print out the checks for
debugging.

This is to continue the transition over to the new model whereby clients
will get the full set of checks from LAA, filter it and then pass it to
LoopVersioning and in turn to addRuntimeCheck.

llvm-svn: 243382
2015-07-28 05:01:53 +00:00
Craig Topper
811269b76b Remove unnecessary const_casts. NFC
llvm-svn: 243380
2015-07-28 04:28:46 +00:00
Bob Wilson
58dd65ff96 Reserve some constant values for the Swift calling convention.
Swift has a custom calling convention that also requires some new flags
on arguments and one new attribute on alloca instructions. This patch
does not include the implementation of that calling convention - that
will be provided as part of the open-source release of Swift; this only
reserves the bitcode constant values so that they are not used for
other purposes.

llvm-svn: 243379
2015-07-28 04:05:45 +00:00