1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

172082 Commits

Author SHA1 Message Date
Alex Bradbury
78e18785d2 [RISCV] Implement codegen for cmpxchg on RV32IA
Utilise a similar ('late') lowering strategy to D47882. The changes to 
AtomicExpandPass allow this strategy to be utilised by other targets which 
implement shouldExpandAtomicCmpXchgInIR.

All cmpxchg are lowered as 'strong' currently and failure ordering is ignored. 
This is conservative but correct.

Differential Revision: https://reviews.llvm.org/D48131

llvm-svn: 347914
2018-11-29 20:43:42 +00:00
Craig Topper
513fc8bae1 [X86] Change the pre-type legalization DAG combine added in r347898 into a custom type legalization operation instead.
This seems to produce the same results on the tests we have.

llvm-svn: 347912
2018-11-29 20:18:58 +00:00
David Stuttard
c4074ba685 Revert r347871 "Fix: Add support for TFE/LWE in image intrinsic"
Also revert fix r347876

One of the buildbots was reporting a failure in some relevant tests that I can't
repro or explain at present, so reverting until I can isolate.

llvm-svn: 347911
2018-11-29 20:14:17 +00:00
Artur Pilipenko
24aa32ed4a Introduce MaxUsesToExplore argument to capture tracking
Currently CaptureTracker gives up if it encounters a value with more than 20 
uses. The motivation for this cap is to keep it relatively cheap for 
BasicAliasAnalysis use case, where the results can't be cached. Although, other 
clients of CaptureTracker might be ok with higher cost. This patch introduces an 
argument for PointerMayBeCaptured functions to specify the max number of uses to 
explore. The motivation for this change is a downstream user of CaptureTracker, 
but I believe upstream clients of CaptureTracker might also benefit from more 
fine grained cap.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D55042

llvm-svn: 347910
2018-11-29 20:08:12 +00:00
Francis Visoiu Mistrih
b1262b3845 [MachineScheduler] Order FI-based memops based on stack direction
It makes more sense to order FI-based memops in descending order when
the stack goes down. This allows offsets to stay "consecutive" and allow
easier pattern matching.

llvm-svn: 347906
2018-11-29 20:03:19 +00:00
Craig Topper
dfe7e315ea [SelectionDAG][AArch64][X86] Move legalization of vector MULHS/MULHU from LegalizeDAG to LegalizeVectorOps
I believe we should be legalizing these with the rest of vector binary operations. If any custom lowering is required for these nodes, this will give the DAG combine between LegalizeVectorOps and LegalizeDAG to run on the custom code before constant build_vectors are lowered in LegalizeDAG.

I've moved MULHU/MULHS handling in AArch64 from Lowering to isel. Moving the lowering earlier caused build_vector+extract_subvector simplifications to kick in which made the generated code worse.

Differential Revision: https://reviews.llvm.org/D54276

llvm-svn: 347902
2018-11-29 19:36:17 +00:00
Craig Topper
099c4dc2f4 [X86] Add a DAG combine pre type legalization to widen division by constant splat on narrow vectors to avoid scalarization
This is another patch for -x86-experimental-vector-widening. This pre widens narrow division by constants so that we can get pass the legal type check in the generic DAG combiner. Otherwise we end up scalarizing.

I've restricted this to splats for now because it was easy to just call DAG.getConstant. Not sure what we should do for non-splat? Increase the element size?Widen the constant vector by padding with 1?

Differential Revision: https://reviews.llvm.org/D54919

llvm-svn: 347898
2018-11-29 19:13:38 +00:00
Sanjay Patel
beab0f0428 [InstSimplify] fold select with implied condition
This is an almost direct move of the functionality from InstCombine to 
InstSimplify. There's no reason not to do this in InstSimplify because 
we never create a new value with this transform.

(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)

I've changed 1 of the conditions for the fold (1 of the blocks for the 
branch must be the block we started with) into an assert because I'm not 
sure how that could ever be false.

We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.

The 3-way compare changes show that InstCombine has some kind of 
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.

llvm-svn: 347896
2018-11-29 18:44:39 +00:00
Krzysztof Parzyszek
49652486ed [TableGen] Examine entire subreg compositions to detect ambiguity
When tablegen detects that there exist two subregister compositions that
result in the same value for some register, it will emit a warning. This
kind of an overlap in compositions should only happen when it is caused
by a user-defined composition. It can happen, however, that the user-
defined composition is not identically equal to another one, but it does
produce the same value for one or more registers. In such cases suppress
the warning.
This patch is to silence the warning when building the System Z backend
after D50725.

Differential Revision: https://reviews.llvm.org/D50977

llvm-svn: 347894
2018-11-29 18:20:08 +00:00
Volkan Keles
cc6119f1d3 [GlobalISel] LegalizationArtifactCombiner: Combine aext([asz]ext x) -> [asz]ext x
Summary:
Replace `aext([asz]ext x)` with `aext/sext/zext x` in order to
reduce the number of instructions generated to clean up some
legalization artifacts.

Reviewers: aditya_nandakumar, dsanders, aemerson, bogner

Reviewed By: aemerson

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D54174

llvm-svn: 347893
2018-11-29 18:19:24 +00:00
Fangrui Song
032ca6b491 [llvm-objcopy] Delete redundant !Config.xx.empty() when followed by positive is_contained() check
Summary: The original intention of !Config.xx.empty() was probably to emphasize the thing that is currently considered, but I feel the simplified form is actually easier to understand and it is also consistent with the call sites in other llvm components.

Reviewers: alexshap, rupprecht, jakehehrlich, jhenderson, espindola

Reviewed By: alexshap, rupprecht

Subscribers: emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D55040

llvm-svn: 347891
2018-11-29 17:32:51 +00:00
Serge Guelton
060a664bf5 Avoid redundant reference to isPodLike in SmallVect/Optional implementation
NFC, preparatory work for isPodLike cleaning.

Differential Revision: https://reviews.llvm.org/D55005

llvm-svn: 347890
2018-11-29 17:21:54 +00:00
John Brawn
433144a780 [LICM] Reapply r347776 "Make LICM able to hoist phis" with fix
This commit caused a large compile-time slowdown in some cases when NDEBUG is
off due to the dominator tree verification it added. Fix this by only doing
dominator tree and loop info verification when something has been hoisted.

Differential Revision: https://reviews.llvm.org/D52827

llvm-svn: 347889
2018-11-29 17:10:00 +00:00
Teresa Johnson
525ee88977 [ThinLTO] Import local variables from the same module as caller
Summary:
We can sometimes end up with multiple copies of a local variable that
have the same GUID in the index. This happens when there are local
variables with the same name that are in different source files having the
same name/path at compile time (but compiled into different bitcode objects).

In this case make sure we import the copy in the caller's module.
This enables importing both of the variables having the same GUID
(but which will have different promoted names since the module paths,
and therefore the module hashes, will be distinct).

Importing the wrong copy is particularly problematic for read only
variables, since we must import them as a local copy whenever
referenced. Otherwise we get undefs at link time.

Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed
for testing the distributed index case via clang, which will be sent as
a separate clang-side patch shortly. We were previously not doing the
dead code/read only computation before computing imports when testing
distributed index generation (like it was for testing importing and
other ThinLTO mechanisms alone).

Reviewers: evgeny777

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D55047

llvm-svn: 347886
2018-11-29 17:02:42 +00:00
James Y Knight
17ff4d329f git-llvm: Fix incremental population of svn tree.
"svn update --depth=..." is, annoyingly, not a specification of the
desired depth, but rather a _limit_ added on top of the "sticky" depth
in the working-directory. However, if the directory doesn't exist yet,
then it sets the sticky depth of the new directory entries.

Unfortunately, the svn command-line has no way of expanding the depth
of a directory from "empty" to "files", without also removing any
already-expanded subdirectories. The way you're supposed to increase
the depth of an existing directory is via --set-depth, but
--set-depth=files will also remove any subdirs which were already
requested.

This change avoids getting into the state of ever needing to increase
the depth of an existing directory from "empty" to "files" in the
first place, by:

1. Use svn update --depth=files, not --depth=immediates.

The latter has the effect of checking out the subdirectories and
marking them as depth=empty. The former excludes sub-directories from
the list of entries, which avoids the problem.

2. Explicitly populate missing parent directories.

Using --parents seemed nice and easy, but it marks the parent dirs as
depth=empty. Instead, check out parents explicitly if they're missing.

llvm-svn: 347883
2018-11-29 16:46:34 +00:00
Sanjay Patel
2111d83101 [SimplifyCFG] auto-generate complete checks; NFC
llvm-svn: 347882
2018-11-29 16:28:37 +00:00
Sanjay Patel
e9ed1e8813 [InstCombine] auto-generate complete checks; NFC
llvm-svn: 347881
2018-11-29 16:26:03 +00:00
Graham Sellers
95cb757dac [AMDGPU] Add and update scalar instructions
This patch adds support for S_ANDN2, S_ORN2 32-bit and 64-bit instructions and adds splits to move them to the vector unit (for which there is no equivalent instruction). It modifies the way that the more complex scalar instructions are lowered to vector instructions by first breaking them down to sequences of simpler scalar instructions which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply inversion to one input rather than the output of the XOR as the result is equivalent and may allow leaving the NOT instruction on the scalar unit.

A new tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests now hit the new instructions (and have been modified accordingly).

Differential: https://reviews.llvm.org/D54714
llvm-svn: 347877
2018-11-29 16:05:38 +00:00
David Stuttard
aae321d253 Fix: Add support for TFE/LWE in image intrinsic
My change svn-id: 347871 caused a buildbot failure due to an unused
variable def (used in an assert).

Change-Id: Ia882d18bb6fa79b4d7bbfda422b9ea5d23eab336
llvm-svn: 347876
2018-11-29 15:56:36 +00:00
Hans Wennborg
31b5fc8481 Revert r347823 "[TextAPI] Switch back to a custom Platform enum."
It broke the Windows buildbots, e.g.
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/21829/steps/test/logs/stdio

This also reverts the follow-ups: r347824, r347827, and r347836.

llvm-svn: 347874
2018-11-29 15:47:24 +00:00
Joseph Tremoulet
7c785c2fc4 [CallSiteSplitting] Report edge deletion to DomTreeUpdater
Summary:
When splitting musttail calls, the split blocks' original terminators
get removed; inform the DTU when this happens.

Also add a testcase that fails an assertion in the DTU without this fix.


Reviewers: fhahn, junbuml

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55027

llvm-svn: 347872
2018-11-29 15:27:04 +00:00
David Stuttard
c65201ef09 Add support for TFE/LWE in image intrinsics
TFE and LWE support requires extra result registers that are written in the
event of a failure in order to detect that failure case.
The specific use-case that initiated these changes is sparse texture support.

This means that if image intrinsics are used with either option turned on, the
programmer must ensure that the return type can contain all of the expected
results. This can result in redundant registers since the vector size must be a
power-of-2.

This change takes roughly 6 parts:
1. Modify the instruction defs in tablegen to add new instruction variants that
can accomodate the extra return values.
2. Updates to lowerImage in SIISelLowering.cpp to accomodate setting TFE or LWE
(where the bulk of the work for these instruction types is now done)
3. Extra verification code to catch cases where intrinsics have been used but
insufficient return registers are used.
4. Modification to the adjustWritemask optimisation to account for TFE/LWE being
enabled (requires extra registers to be maintained for error return value).
5. An extra pass to zero initialize the error value return - this is because if
the error does not occur, the register is not written and thus must be zeroed
before use. Also added a new (on by default) option to ensure ALL return values
are zero-initialized that is required for sparse texture support.
6. Disable the inst_combine optimization in the presence of tfe/lwe (later TODO
for this to re-enable and handle correctly).

There's an additional fix now to avoid a dmask=0

For an image intrinsic with tfe where all result channels except tfe
were unused, I was getting an image instruction with dmask=0 and only a
single vgpr result for tfe. That is incorrect because the hardware
assumes there is at least one vgpr result, plus the one for tfe.

Fixed by forcing dmask to 1, which gives the desired two vgpr result
with tfe in the second one.

The TFE or LWE result is returned from the intrinsics using an aggregate
type. Look in the test code provided to see how this works, but in essence IR
code to invoke the intrinsic looks as follows:

%v = call {<4 x float>,i32} @llvm.amdgcn.image.load.1d.v4f32i32.i32(i32 15,
                                      i32 %s, <8 x i32> %rsrc, i32 1, i32 0)
%v.vec = extractvalue {<4 x float>, i32} %v, 0
%v.err = extractvalue {<4 x float>, i32} %v, 1

Differential revision: https://reviews.llvm.org/D48826

Change-Id: If222bc03642e76cf98059a6bef5d5bffeda38dda
llvm-svn: 347871
2018-11-29 15:21:13 +00:00
Sanjay Patel
97abdc3136 [CVP] tidy processCmp(); NFC
1. The variables were confusing: 'C' typically refers to a constant, but here it was the Cmp.
2. Formatting violations.
3. Simplify code to return true/false constant.

llvm-svn: 347868
2018-11-29 14:41:21 +00:00
Martin Storsjo
ac9d97d900 Revert "[LICM] Enable control flow hoisting by default" and "[LICM] Reapply r347190 "Make LICM able to hoist phis" with fix"
This reverts commits r347776 and r347778.

The first one, r347776, caused significant compile time regressions
for certain input files, see PR39836 for details.

llvm-svn: 347867
2018-11-29 14:39:39 +00:00
Sanjay Patel
269a53d873 [CVP] auto-generate complete test checks; NFC
llvm-svn: 347866
2018-11-29 14:28:47 +00:00
Hans Wennborg
98a8db31eb Revert r347596 "Support for inserting profile-directed cache prefetches"
It causes asserts building BoringSSL. See https://crbug.com/91009#c3 for
repro.

This also reverts the follow-ups:
Revert r347724 "Do not insert prefetches with unsupported memory operands."
Revert r347606 "[X86] Add dependency from X86 to ProfileData after rL347596"
Revert r347607 "Add new passes to X86 pipeline tests"

llvm-svn: 347864
2018-11-29 13:58:02 +00:00
Petr Pavlu
4494d21210 [GlobalISel] Fix insertion of stack-protector epilogue
* Tell the StackProtector pass to generate the epilogue instrumentation
  when GlobalISel is enabled because GISel currently does not implement
  the same deferred epilogue insertion as SelectionDAG.
* Update StackProtector::InsertStackProtectors() to find a stack guard
  slot by searching for the llvm.stackprotector intrinsic when the
  prologue was not created by StackProtector itself but the pass still
  needs to generate the epilogue instrumentation. This fixes a problem
  when the pass would abort because the stack guard AllocInst pointer
  was null when generating the epilogue -- test
  CodeGen/AArch64/GlobalISel/arm64-irtranslator-stackprotect.ll.

Differential Revision: https://reviews.llvm.org/D54518

llvm-svn: 347862
2018-11-29 13:22:53 +00:00
Petr Pavlu
a73160fd34 [GlobalISel] Make EnableGlobalISel always set when GISel is enabled
Change meaning of TargetOptions::EnableGlobalISel. The flag was
previously set only when a target switched on GlobalISel but it is now
always set when the GlobalISel pipeline is enabled. This makes the flag
consistent with TargetOptions::EnableFastISel and allows its use in
other parts of the compiler to determine when GlobalISel is enabled.

The EnableGlobalISel flag had previouly only one use in
TargetPassConfig::isGlobalISelAbortEnabled(). The method used its value
to determine if GlobalISel was enabled by a target and returned false in
such a case. To preserve the current behaviour, a new flag
TargetOptions::GlobalISelAbort is introduced to separately record the
abort behaviour.

Differential Revision: https://reviews.llvm.org/D54518

llvm-svn: 347861
2018-11-29 12:56:32 +00:00
Martin Storsjo
d9380b2548 [llvm-rc] Support EXSTYLE statement.
Patch by Jacek Caban!

Differential Revision: https://reviews.llvm.org/D55020

llvm-svn: 347858
2018-11-29 12:17:39 +00:00
Andrea Di Biagio
a900121de4 [llvm-mca][MC] Add the ability to declare which processor resources model load/store queues (PR36666).
This patch adds the ability to specify via tablegen which processor resources
are load/store queue resources.

A new tablegen class named MemoryQueue can be optionally used to mark resources
that model load/store queues.  Information about the load/store queue is
collected at 'CodeGenSchedule' stage, and analyzed by the 'SubtargetEmitter' to
initialize two new fields in struct MCExtraProcessorInfo named `LoadQueueID` and
`StoreQueueID`.  Those two fields are identifiers for buffered resources used to
describe the load queue and the store queue.
Field `BufferSize` is interpreted as the number of entries in the queue, while
the number of units is a throughput indicator (i.e. number of available pickers
for loads/stores).

At construction time, LSUnit in llvm-mca checks for the presence of extra
processor information (i.e. MCExtraProcessorInfo) in the scheduling model.  If
that information is available, and fields LoadQueueID and StoreQueueID are set
to a value different than zero (i.e. the invalid processor resource index), then
LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value
declared by the two processor resources.

With this patch, we more accurately track dynamic dispatch stalls caused by the
lack of LS tokens (i.e. load/store queue full). This is also shown by the
differences in two BdVer2 tests. Stalls that were previously classified as
generic SCHEDULER FULL stalls, are not correctly classified either as "load
queue full" or "store queue full".

About the differences in the -scheduler-stats view: those differences are
expected, because entries in the load/store queue are not released at
instruction issue stage. Instead, those are released at instruction executed
stage.  This is the main reason why for the modified tests, the load/store
queues gets full before PdEx is full.

Differential Revision: https://reviews.llvm.org/D54957

llvm-svn: 347857
2018-11-29 12:15:56 +00:00
Nicolai Haehnle
fb70615d5e AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo
Summary:
MachineLoopInfo cannot be relied on for correctness, because it cannot
properly recognize loops in irreducible control flow which can be
introduced by late machine basic block optimization passes. See the new
test case for the reduced form of an example that occurred in practice.

Use a simple fixpoint iteration instead.

In order to facilitate this change, refactor WaitcntBrackets so that it
only tracks pending events and registers, rather than also maintaining
state that is relevant for the high-level algorithm. Various accessor
methods can be removed or made private as a consequence.

Affects (in radv):
- dEQP-VK.glsl.loops.special.{for,while}_uniform_iterations.select_iteration_count_{fragment,vertex}

Fixes: r345719 ("AMDGPU: Rewrite SILowerI1Copies to always stay on SALU")

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54231

llvm-svn: 347853
2018-11-29 11:06:26 +00:00
Nicolai Haehnle
bbaf7db77b AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points
Summary:
There is one obsolete reference to using -1 as an indication of "unknown",
but this isn't actually used anywhere.

Using unsigned makes robust wrapping checks easier.

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, llvm-commits, tpr, t-tye, hakzsam

Differential Revision: https://reviews.llvm.org/D54230

llvm-svn: 347852
2018-11-29 11:06:21 +00:00
Nicolai Haehnle
d0a944b0ec AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning
Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54229

llvm-svn: 347851
2018-11-29 11:06:18 +00:00
Nicolai Haehnle
008d5e44a3 AMDGPU/InsertWaitcnts: Simplify pending events tracking
Summary:
Instead of storing the "score" (last time point) of the various relevant
events, only store whether an event is pending or not.

This is sufficient, because whenever only one event of a count type is
pending, its last time point is naturally the upper bound of all time
points of this count type, and when multiple event types are pending,
the count type has gone out of order and an s_waitcnt to 0 is required
to clear any pending event type (and will then clear all pending event
types for that count type).

This also removes the special handling of GDS_GPR_LOCK and EXP_GPR_LOCK.
I do not understand what this special handling ever attempted to achieve.
It has existed ever since the original port from an internal code base,
so my best guess is that it solved a problem related to EXEC handling in
that internal code base.

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54228

llvm-svn: 347850
2018-11-29 11:06:14 +00:00
Nicolai Haehnle
1eb7b430b1 AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types
Summary:
It hides the type casting ugliness, and I happened to have to add a new
such loop (in a later patch).

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54227

llvm-svn: 347849
2018-11-29 11:06:11 +00:00
Nicolai Haehnle
72f01d1369 AMDGPU/InsertWaitcnts: Untangle some semi-global state
Summary:
Reduce the statefulness of the algorithm in two ways:

1. More clearly split generateWaitcntInstBefore into two phases: the
   first one which determines the required wait, if any, without changing
   the ScoreBrackets, and the second one which actually inserts the wait
   and updates the brackets.

2. Communicate pre-existing s_waitcnt instructions using an argument to
   generateWaitcntInstBefore instead of through the ScoreBrackets.

To simplify these changes, a Waitcnt structure is introduced which carries
the counts of an s_waitcnt instruction in decoded form.

There are some functional changes:

1. The FIXME for the VCCZ bug workaround was implemented: we only wait for
   SMEM instructions as required instead of waiting on all counters.

2. We now properly track pre-existing waitcnt's in all cases, which leads
   to less conservative waitcnts being emitted in some cases.

     s_load_dword ...
     s_waitcnt lgkmcnt(0)    <-- pre-existing wait count
     ds_read_b32 v0, ...
     ds_read_b32 v1, ...
     s_waitcnt lgkmcnt(0)    <-- this is too conservative
     use(v0)
     more code
     use(v1)

   This increases code size a bit, but the reduced latency should still be a
   win in basically all cases. The worst code size regressions in my shader-db
   are:

 WORST REGRESSIONS - Code Size
 Before After     Delta Percentage
   1724  1736        12    0.70 %   shaders/private/f1-2015/1334.shader_test [0]
   2276  2284         8    0.35 %   shaders/private/f1-2015/1306.shader_test [0]
   4632  4640         8    0.17 %   shaders/private/ue4_elemental/62.shader_test [0]
   2376  2384         8    0.34 %   shaders/private/f1-2015/1308.shader_test [0]
   3284  3292         8    0.24 %   shaders/private/talos_principle/1955.shader_test [0]

Reviewers: msearles, rampitec, scott.linder, kanarayan

Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam

Differential Revision: https://reviews.llvm.org/D54226

llvm-svn: 347848
2018-11-29 11:06:06 +00:00
Martin Storsjo
ce68f447ab [CODE_OWNERS] Add myself as code owner for MinGW
llvm-svn: 347847
2018-11-29 10:58:15 +00:00
Max Kazantsev
dbc60dbe6c [NFC] Add two XFAIL tests from PR39783
llvm-svn: 347845
2018-11-29 09:38:22 +00:00
Max Kazantsev
c2737f7041 Disable TermFolding in LoopSimplifyCFG until PR39783 is fixed
llvm-svn: 347844
2018-11-29 09:00:19 +00:00
Sam Parker
c0aac1b8cd [LoopStrengthReduce] ComplexityLimit as an option
Convert ComplexityLimit into a command line value.

Differential Revision: https://reviews.llvm.org/D54899

llvm-svn: 347843
2018-11-29 08:34:22 +00:00
Craig Topper
7b62f7d2d8 [Inliner] Modify the merging of min-legal-vector-width attribute to better handle when the caller or callee don't have the attribute.
Lack of an attribute means that the function hasn't been checked for what vector width it requires. So if the caller or the callee doesn't have the attribute we should make sure the combined function after inlining does not have the attribute.

If the caller already doesn't have the attribute we can just avoid adding it. Otherwise if the callee doesn't have the attribute just remove the caller's attribute.

llvm-svn: 347841
2018-11-29 07:27:38 +00:00
Craig Topper
53ea459736 [Inliner] Add test for merging of min-legal-vector-width function attribute.
This should have been added in r337844, but apparently was I failed to 'git add' the file.

llvm-svn: 347840
2018-11-29 07:02:47 +00:00
Serguei Katkov
8bd049cee9 [CGP] Improve compile time for complex addressing mode
This is a fix for PR39625 with improvement the compile time
by reducing the number of intermediate Phi nodes created.

Reviewers: john.brawn, reames
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54932

llvm-svn: 347839
2018-11-29 06:45:18 +00:00
Juergen Ributzka
f54d60342b Revert "[TextAPI] Fix a memory leak in the TBD reader."
llvm-svn: 347838
2018-11-29 06:32:49 +00:00
Juergen Ributzka
40548e4bb7 [TextAPI] Fix a memory leak in the TBD reader.
This fixes an issue where we were leaking the YAML document if there was a
parsing error.

llvm-svn: 347837
2018-11-29 06:16:33 +00:00
Juergen Ributzka
ab5d0f9da8 [TextAPI] Switch back to a custom Platform enum.
Moving to PlatformType from BinaryFormat had some UB fallout when handing
unknown platforms or malformed input files.

This should fix the sanitizer bots.

llvm-svn: 347836
2018-11-29 05:56:03 +00:00
Craig Topper
e8fbdccc16 [X86] Correct comment. NFC
llvm-svn: 347835
2018-11-29 05:56:03 +00:00
Kristina Brooks
284bf85c95 Add Hurd target to LLVMSupport (1/2)
Add the required target triples to LLVMSupport to support Hurd
in LLVM (formally `pc-hurd-gnu`).

Patch by sthibaul (Samuel Thibault)

Differential Revision: https://reviews.llvm.org/D54378

llvm-svn: 347832
2018-11-29 03:23:01 +00:00
Li Jia He
781b1e98f2 [PowerPC] Fix a conversion is not considered when the ISD::BR_CC node making the instruction selection
Summary:
 A signed comparison of i1 values produces the opposite result to an unsigned one if the condition code 
 includes less-than or greater-than. This is so because 1 is the most negative signed i1 number and the 
 most positive unsigned i1 number. The CR-logical operations used for such comparisons are non-commutative
 so for signed comparisons vs. unsigned ones, the input operands just need to be swapped.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D54825

llvm-svn: 347831
2018-11-29 03:04:39 +00:00
Li Jia He
557ec60edc [PowerPC] [NFC] Add test cases to the ISD::BR_CC node in the instruction selection
Add the following test case for the ISD::BR_CC node in the instruction selection
define i64 @testi64slt(i64 %c1, i64 %c2, i64 %c3, i64 %c4, i64 %a1, i64 %a2) #0 {
entry:
  %cmp1 = icmp eq i64 %c3, %c4
  %cmp3tmp = icmp eq i64 %c1, %c2
  %cmp3 = icmp slt i1 %cmp3tmp, %cmp1
  br i1 %cmp3, label %iftrue, label %iffalse
iftrue:
  ret i64 %a1
iffalse:
  ret i64 %a2
}
The data type i64 can be replaced by i32, i64, float, double

And condition codes can be replaced by: SETEQ, SETEN, SELT, SETLE, SETGT, SETGE,SETULT, SETULE, SSETGT, and SETUGE

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D54824

llvm-svn: 347828
2018-11-29 02:51:03 +00:00