1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00
Commit Graph

187 Commits

Author SHA1 Message Date
Valentin Clement
c463fa6cad [mlir][openacc] Initial translation for DataOp to LLVM IR
Add basic translation of acc.data to LLVM IR with runtime calls.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104301
2021-07-27 22:04:04 -04:00
Jose M Monsalve Diaz
5b7208da36 [OpenMP] Folding threadLimit and numThreads when single value in kernels
The device runtime contains several calls to `__kmpc_get_hardware_num_threads_in_block`
and `__kmpc_get_hardware_num_blocks`. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.

In this patch we use the already introduced `AAFoldRuntimeCall` and the `NumTeams` and
`NumThreads` kernel attributes (to be introduced in a different patch) to fold these functions.
The code checks all the kernels, and if their attributes match, the functions are folded.

In the future we will explore specializing for multiple values of NumThreads and NumTeams.

Depends on D106390

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D106033
2021-07-27 21:47:12 -04:00
Alexey Bataev
fd1d10a20f [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missed arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
functions calls.

Differential Revision: https://reviews.llvm.org/D106542
2021-07-22 08:44:37 -07:00
Alexey Bataev
6351ecd4dc Revert "[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments."
This reverts commit b455f7f22564a096c043b02fa159ab16669c121c to fix
buildbots.
2021-07-22 08:06:29 -07:00
Alexey Bataev
0261373c6d [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missed arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
functions calls.

Differential Revision: https://reviews.llvm.org/D106542
2021-07-22 07:53:37 -07:00
Joseph Huber
9d542314e4 [Libomptarget] Introduce new main thread ID runtime function
This patch introduces `__kmpc_is_generic_main_thread_id` which splits the old
comparison into its own runtime function. The purpose of this is so we can fold
this part independently, so when both this and `is_spmd_mode` are folded the
final function will be folded as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106437
2021-07-21 21:18:14 -04:00
Joseph Huber
472a223072 [OpenMP] Change __kmpc_free_shared to include the paired allocation size
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106496
2021-07-21 20:56:21 -04:00
Giorgis Georgakoudis
ca49ab772d [OpenMP] Expose libomptarget function to get HW thread id
The patch exposes the libomptarget runtime function that gets the hardware thread id through the kmpc API. This is to be used in SPMDization for checking the thread id to execute regions by a single thread in a block.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106323
2021-07-21 10:26:04 -07:00
Shilei Tian
0d5ca9b780 [OpenMP][deviceRTLs] Update return type of function __kmpc_parallel_level
In `deviceRTLs`, the parallel level is stored in a shared variable of type `uint8_t`.
`__kmpc_parallel_level` currently returns a 16-bit interger. This patch first
changes the return type of the function to `uint8_t`, same as the shared variable,
and then corrects function type which was updated in D105955.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106384
2021-07-20 15:45:43 -04:00
Johannes Doerfert
3839fcc5cf [OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.

The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D102307
2021-07-10 18:44:25 -05:00
Johannes Doerfert
51153424db [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 17:53:56 -05:00
Nico Weber
b314064dc7 Revert Attributor patch series
Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`
2021-07-10 16:15:55 -04:00
Johannes Doerfert
df82045809 [OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.

The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D102307
2021-07-10 12:32:51 -05:00
Johannes Doerfert
63e4735bba [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 12:32:50 -05:00
Joseph Huber
4df09b164e [OpenMP] Enable HeapToStack conversion in OpenMPOpt for new RTL globalization calls
Summary:
The changes to globalization introduced in D97680 introduce a large amount of overhead by default. The old globalization method would always ignore globalization code if executing in SPMD mode. This wasn't strictly correct as data sharing is still possible in SPMD mode. The new interface is correct but introduces globalization code even when unnecessary. This optimization will use the existing HeapToStack transformation in the attributor to allow for unneeded globalization to be replaced with thread-private stack memory. This is done using the newly introduced library instances for the RTL functions added in D102087.

Depends on D97818

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102197
2021-06-22 13:23:05 -04:00
Joseph Huber
3aea5cddbb [OpenMP] Simplify GPU memory globalization
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.

This patch replaces this functinality with a single allocation command, effectively mimicing an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D97680
2021-06-22 10:52:46 -04:00
Michael Kruse
4460a4c76f [OpenMP] Implement '#pragma omp unroll'.
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99459
2021-06-10 14:30:17 -05:00
Alexey Bataev
fe6e3a2893 [OPENMP]Fix PR50129: omp cancel parallel not working as expected.
Need to emit a call for __kmpc_cancel_barrier in the exit block for
__kmpc_cancel function call if cancellation of the parallel block is
requested.

Differential Revision: https://reviews.llvm.org/D103646
2021-06-04 08:24:55 -07:00
Mats Petersson
caa14ae743 [OpenMP]Add support for workshare loop modifier in lowering
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008
2021-05-27 15:33:05 +01:00
Mats Petersson
ffafbe5131 Revert "[OpenMP]Add support for workshare loop modifier in lowering"
This reverts commit ea4c5fb04c6d9618d451fb2d2c360dc95c6d9131.
2021-05-27 13:09:47 +01:00
Mats Petersson
ae07366301 [OpenMP]Add support for workshare loop modifier in lowering
When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008
2021-05-27 12:28:27 +01:00
Fady Ghanim
bb0b21b662 [OpenMP][OMPIRBuilder]Adding support for omp atomic
This patch adds support for generating `omp atomic` for all different
atomic clauses
2021-05-23 17:44:09 -04:00
Simon Pilgrim
26c81e4f24 Fix -Wdocumentation warnings. NFCI. 2021-05-11 09:51:49 +01:00
Mats Petersson
be8fae673d [OpenMP][MLIR]Add support for guided, auto and runtime scheduling
When using parallel loop construct, the OpenMP specification allows for
guided, auto and runtime as scheduling variants (as well as static and
dynamic which are already supported).

This adds the translation from MLIR to LLVM-IR for these scheduling
variants.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101435
2021-05-10 09:18:52 +00:00
Johannes Doerfert
031a47da5d [OpenMP] Overhaul declare target handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.

This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.

I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.

Highlights:
  - new diagnosis for restrictions specificed in the standard,
  - delayed emission of globals not mentioned in an explicit
    list of a declare target,
  - omission of `nohost` globals on the host and `host` globals on the
    device,
  - no explicit parsing of declarations in-between `omp [begin] declare
    variant` and the corresponding end anymore, regular parsing instead,
  - precedence for explicit mentions in `declare target` lists over
    implicit mentions in the declaration-definition-seq, and
  - `omp allocate` declarations will now replace an earlier emitted
    global, if necessary.

---

Notes:

The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.

After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.

---

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D101030
2021-05-06 02:10:41 -05:00
Valentin Clement
381126e1f8 [OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions
Add function to create the offload_maptypes and the offload_mapnames globals. These two functions
are used in clang. They will be used in the Flang/MLIR lowering as well.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101503
2021-05-03 15:42:32 -04:00
Chirag Khandelwal
135cfc58a6 [LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder
This patch adds section support in the OpenMP IRBuilder module, along with a test for the same.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D89671
2021-04-29 18:39:49 +05:30
Irina Dobrescu
53c3a79b92 [flang][openmp] Add General Semantic Checks for Allocate Directive
This patch adds semantic checks for the General Restrictions of the
Allocate Directive.

Since the requires directive is not yet implemented in Flang, the
restriction:
```
allocate directives that appear in a target region must
specify an allocator clause unless a requires directive with the
dynamic_allocators clause is present in the same compilation unit
```
will need to be updated at a later time.

A different patch will be made with the Fortran specific restrictions of
this directive.

I have used the code from https://reviews.llvm.org/D89395 for the
CheckObjectListStructure function.

Co-authored-by: Isaac Perry <isaac.perry@arm.com>

Reviewed By: clementval, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D91159
2021-04-22 16:15:06 +00:00
Giorgis Georgakoudis
fcb81a364e [OpenMP] Simplify offloading parallel call codegen
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions.

Reviewed By: jdoerfert, Meinersbur

Differential Revision: https://reviews.llvm.org/D95976
2021-04-21 18:46:07 -07:00
Mats Petersson
2dd5f52ad6 [OpenMP IRBuilder, MLIR] Add support for OpenMP do schedule dynamic
The implementation supports static schedule for Fortran do loops. This
implements the dynamic variant of the same concept.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D97393
2021-04-16 16:09:49 +01:00
cchen
7304924c0f [OpenMP] Added codegen for masked directive
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100514
2021-04-15 12:55:07 -05:00
cchen
098312eb0c [OpenMP51] Initial support for masked directive and filter clause
Adds basic parsing/sema/serialization support for the #pragma omp masked
directive.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99995
2021-04-09 14:00:36 -05:00
Jennifer Yu
925535c5ef [OPENMP51]Initial support for nocontext clause.
Added basic parsing/sema/serialization support for the 'nocontext' clause.

Differential Revision: https://reviews.llvm.org/D99848
2021-04-05 11:45:49 -07:00
Jennifer Yu
96cc3dc1e4 [OPENMP5.1]Initial support for novariants clause.
Added basic parsing/sema/serialization support for the 'novariants' clause.
2021-04-02 13:19:01 -07:00
cchen
4feb461822 [OpenMP51] Accept primary as proc bind affinity policy in Clang
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99622
2021-04-01 18:07:12 -05:00
Mike Rice
137540f63d [OPENMP51]Initial support for the dispatch directive.
Added basic parsing/sema/serialization support for dispatch directive.

Differential Revision: https://reviews.llvm.org/D99537
2021-03-30 14:12:53 -07:00
Joseph Huber
ccdd23f91f [OpenMP] Change OMPIRBuilder to append function attributes
Summary:
Currently the OMPIRBuilder overwrites the function's existing attributes
when it assigns the ones defined in OMPKinds.def. This changes the
behaviour to append the current function's attributes with them instead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D98740
2021-03-24 09:08:29 -04:00
Valentin Clement
49cf4f2457 [openacc][openmp] Reduce number of generated file and prefer inclusion of .inc
Follow up from D92955 and D83636. This patch makes the base cpp files
OMP.cpp and ACC.cpp normal files and they now include the XXX.inc file
generated by tablegen. This reduces the number of file generated by the
DirectiveEmitter backend and makes it closer to the proposal in D83636.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D93560
2021-03-23 09:16:53 -04:00
Mike Rice
80e89c7e5e [OPENMP51]Support for the 'destroy' clause with interop variable.
Added basic parsing/sema/serialization support to extend the
existing 'destroy' clause for use with the 'interop' directive.

Differential Revision: https://reviews.llvm.org/D98834
2021-03-18 09:12:56 -07:00
Mike Rice
1bc9df98e9 [OPENMP51]Initial support for the use clause.
Added basic parsing/sema/serialization support for the 'use' clause.

Differential Revision: https://reviews.llvm.org/D98815
2021-03-17 15:46:14 -07:00
Mike Rice
7fc0b10919 [OPENMP51]Initial support for the interop directive.
Added basic parsing/sema/serialization support for interop directive.
Support for the 'init' clause.

Differential Revision: https://reviews.llvm.org/D98558
2021-03-17 09:42:07 -07:00
Johannes Doerfert
0058a98cec Revert "[OpenMP] Introduce the disable_selector_propagation variant selector trait"
Need to revert ad9e98b8efa0138559eb640023695dab54967a8d which this
commit depends on.

This reverts commit f771ef7b5f0ed260d00931cd50e6fe462edbacaf.
2021-03-11 23:48:35 -06:00
Johannes Doerfert
e3c1fc67b1 [OpenMP] Introduce the disable_selector_propagation variant selector trait
Nested `omp [begin|end] declare variant` inherit the selectors from
surrounding `omp (begin|end) declare variant` constructs. To stop such
propagation the user can add the `disable_selector_propagation` to the
`extension` set in the `implementation` selector.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D95765
2021-03-11 23:31:25 -06:00
Joseph Huber
ff5095970f [OpenMP] Restore backwards compatibility for libomptarget
Summary:
The changes introduced in D87946 changed the API for libomptarget
functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x
but was not given a backward-compatible interface. This change will
require people using Clang 13.x or 12.x to recompile their offloading
programs.

Reviewed By: jdoerfert cchen

Differential Revision: https://reviews.llvm.org/D98358
2021-03-11 09:52:11 -05:00
Simon Pilgrim
6ecc32e8eb Fix Wdocumentation unknown parameter warning. NFCI. 2021-03-05 11:24:44 +00:00
Michael Kruse
9bf95cf501 [clang][OpenMP] Use OpenMPIRBuilder for workshare loops.
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
 * Only the worksharing-loop directive.
 * Recognizes only the nowait clause.
 * No loop nests with more than one loop.
 * Untested with templates, exceptions.
 * Semantic checking left to the existing infrastructure.

This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
 * The distance function: How many loop iterations there will be before entering the loop nest.
 * The loop variable function: Conversion from a logical iteration number to the loop variable.

These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.

The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.

For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.

Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D94973
2021-03-04 22:52:59 -06:00
Michael Kruse
2074156484 [OpenMP] Implement '#pragma omp tile', by Michael Kruse (@Meinersbur).
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.

This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.

A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.

I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).

Differential Revision: https://reviews.llvm.org/D76342
2021-02-16 09:45:07 -08:00
Kazu Hirata
c92524272b [llvm] Fix header guards (NFC)
Identified with llvm-header-guard.
2021-02-05 21:02:06 -08:00
Michael Kruse
930857b772 [OpenMPIRBuilder] Implement collapseLoops.
The collapseLoops method implements a transformations facilitating the implementation of the collapse-clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop.

This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D93268
2021-02-03 19:12:02 -06:00
Michael Kruse
d945273b52 [OpenMPIRBuilder] Implement tileLoops.
The  tileLoops method implements the code generation part of the tile directive introduced in OpenMP 5.1. It takes a list of loops forming a loop nest, tiles it, and returns the CanonicalLoopInfo representing the generated loops.

The implementation takes n CanonicalLoopInfos, n tile size Values and returns 2*n new CanonicalLoopInfos. The input CanonicalLoopInfos are invalidated and BBs not reused in the new loop nest removed from the function.

In a modified version of D76342, I was able to correctly compile and execute a tiled loop nest.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D92974
2021-01-23 19:39:29 -06:00