mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-11-25 12:12:47 +01:00
a571c5d5d3
The first problem to fix is to stop creating synthetic *Table_gen targets next to all of the LLVM libraries. These had no real effect as CMake specifies that add_custom_command(OUTPUT ...) directives (what the 'tablegen(...)' stuff expands to) are implicitly added as dependencies to all the rules in that CMakeLists.txt.

These synthetic rules started to cause problems as we started more and more heavily using tablegen files from *subdirectories* of the one where they were generated. Within those directories, the set of tablegen outputs was still available and so these synthetic rules added them as dependencies of those subdirectories. However, they were no longer properly associated with the custom command to generate them. Most of the time this "just worked" because something would get to the parent directory first, and run tablegen there. Once run, the files existed and the build proceeded happily. However, as more and more subdirectories have started using this, the probability of this failing to happen has increased. Recently with the MC refactorings, the failure became quite common for me when touching a large enough number of targets.

To add insult to injury, several of the backends *tried* to fix this by adding explicit dependencies back to the parent directory's tablegen rules, but those dependencies didn't work as expected -- they weren't forming a linear chain, they were adding another thread in the race.

This patch removes these synthetic rules completely, and adds a much simpler function to declare explicitly that a collection of tablegen'ed files are referenced by other libraries. From that, we can add explicit dependencies from the smaller libraries (such as every architecture's Desc library) on this and correctly form a linear sequence. All of the backends are updated to use it, sometimes replacing the existing attempt at adding a dependency, sometimes adding a previously missing dependency edge.

Please let me know if this causes any problems, but it fixes a rather persistent and problematic source of build flakiness on our end.

llvm-svn: 136023
MCTargetDesc
TargetInfo
Blackfin.h
Blackfin.td
BlackfinAsmPrinter.cpp
BlackfinCallingConv.td
BlackfinFrameLowering.cpp
BlackfinFrameLowering.h
BlackfinInstrFormats.td
BlackfinInstrInfo.cpp
BlackfinInstrInfo.h
BlackfinInstrInfo.td
BlackfinIntrinsicInfo.cpp
BlackfinIntrinsicInfo.h
BlackfinIntrinsics.td
BlackfinISelDAGToDAG.cpp
BlackfinISelLowering.cpp
BlackfinISelLowering.h
BlackfinRegisterInfo.cpp
BlackfinRegisterInfo.h
BlackfinRegisterInfo.td
BlackfinSelectionDAGInfo.cpp
BlackfinSelectionDAGInfo.h
BlackfinSubtarget.cpp
BlackfinSubtarget.h
BlackfinTargetMachine.cpp
BlackfinTargetMachine.h
CMakeLists.txt
Makefile
README.txt
//===-- README.txt - Notes for Blackfin Target ------------------*- org -*-===//

* Condition codes
** DONE Problem with asymmetric SETCC operations
The instruction

  CC = R0 < 2

is not symmetric - there is no R0 > 2 instruction. On the other hand, IF CC
JUMP can take both CC and !CC as a condition. We cannot pattern-match (brcond
(not cc), target) because the DAG optimizer removes that kind of thing.

This is handled by creating a pseudo-register NCC that aliases CC. Register
classes JustCC and NotCC are used to control the inversion of CC.

** DONE CC as an i32 register
The AnyCC register class pretends to hold i32 values. It can only represent the
values 0 and 1, but we can copy to and from the D class. This hack makes it
possible to represent the setcc instruction without having i1 as a legal type.

In most cases, the CC register is set by a "CC = .." or BITTST instruction, and
then used in a conditional branch or move. The code generator thinks it is
moving 32 bits, but the value stays in CC. In other cases, the result of a
comparison is actually used as an i32 number, and CC will be copied to a D
register.

* Stack frames
** TODO Use Push/Pop instructions
We should use the push/pop instructions when saving callee-saved
registers. They are smaller, and we may even use push multiple instructions.

** TODO requiresRegisterScavenging
We need more intelligence in determining when the scavenger is needed. We
should keep track of:
- Spilling D16 registers
- Spilling AnyCC registers

* Assembler
** TODO Implement PrintGlobalVariable
** TODO Remove LOAD32sym
It's a hack combining two instructions by concatenation.

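The asymmetric SETCC note under "Condition codes" above can be illustrated with a small sketch. It is not the backend's code: it only shows the idea of rewriting a missing comparison as the logical negation of an available one and branching on !CC instead of CC. The set of natively available comparisons (==, <, <=) is an assumption made for illustration; the README itself only states that a greater-than form does not exist. The function name lower_compare_branch is hypothetical.

```python
# Comparisons assumed to exist as "CC = a <op> b" instructions.
# This set is an assumption for illustration; the README only says
# that there is no ">" form.
NATIVE = {"==", "<", "<="}

# Each missing comparison is the logical negation of a native one.
NEGATION = {"!=": "==", ">=": "<", ">": "<="}


def lower_compare_branch(op):
    """Return (native_op, branch_condition) for 'if (a op b) goto target'.

    branch_condition is "CC" when the native comparison can be used
    directly and "!CC" when its result has to be inverted by the branch.
    """
    if op in NATIVE:
        return op, "CC"
    if op in NEGATION:
        return NEGATION[op], "!CC"
    raise ValueError("unsupported comparison: " + op)


if __name__ == "__main__":
    for op in ("<", ">", ">=", "!="):
        native, cond = lower_compare_branch(op)
        print("a %-2s b  ->  CC = a %s b; IF %s JUMP target" % (op, native, cond))
```

In the real backend the inversion is expressed through the NCC pseudo-register and the JustCC and NotCC register classes rather than through a lookup table like this one.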
* Inline Assembly

These are the GCC constraints from bfin/constraints.md:

| Code  | Register class                            | LLVM |
|-------+-------------------------------------------+------|
| a     | P                                         | C    |
| d     | D                                         | C    |
| z     | Call clobbered P (P0, P1, P2)             | X    |
| D     | EvenD                                     | X    |
| W     | OddD                                      | X    |
| e     | Accu                                      | C    |
| A     | A0                                        | S    |
| B     | A1                                        | S    |
| b     | I                                         | C    |
| v     | B                                         | C    |
| f     | M                                         | C    |
| c     | Circular I, B, L                          | X    |
| C     | JustCC                                    | S    |
| t     | LoopTop                                   | X    |
| u     | LoopBottom                                | X    |
| k     | LoopCount                                 | X    |
| x     | GR                                        | C    |
| y     | RET*, ASTAT, SEQSTAT, USP                 | X    |
| w     | ALL                                       | C    |
| Z     | The FD-PIC GOT pointer (P3)               | S    |
| Y     | The FD-PIC function pointer register (P1) | S    |
| q0-q7 | R0-R7 individually                        |      |
| qA    | P0                                        |      |
|-------+-------------------------------------------+------|
| Code  | Constant                                  |      |
|-------+-------------------------------------------+------|
| J     | 1<<N, N<32                                |      |
| Ks3   | imm3                                      |      |
| Ku3   | uimm3                                     |      |
| Ks4   | imm4                                      |      |
| Ku4   | uimm4                                     |      |
| Ks5   | imm5                                      |      |
| Ku5   | uimm5                                     |      |
| Ks7   | imm7                                      |      |
| KN7   | -imm7                                     |      |
| Ksh   | imm16                                     |      |
| Kuh   | uimm16                                    |      |
| L     | ~(1<<N)                                   |      |
| M1    | 0xff                                      |      |
| M2    | 0xffff                                    |      |
| P0-P4 | 0-4                                       |      |
| PA    | Macflag, not M                            |      |
| PB    | Macflag, only M                           |      |
| Q     | Symbol                                    |      |

** TODO Support all register classes

* DAG combiner
** Create test case for each Illegal SETCC case
The DAG combiner may sometimes produce illegal i16 SETCC instructions.

*** TODO SETCC (ctlz x), 5) == const
*** TODO SETCC (and load, const) == const
*** DONE SETCC (zext x) == const
*** TODO SETCC (sext x) == const

* Instruction selection
** TODO Better immediate constants
Like ARM, build constants as small imm + shift.

** TODO Implement cycle counter
We have CYCLES and CYCLES2 registers, but the readcyclecounter intrinsic wants
to return i64, and the code generator doesn't know how to legalize that.

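As a rough illustration of the "Better immediate constants" item above, the sketch below decomposes a 32-bit constant into a small signed immediate and a left shift. It is a minimal sketch under stated assumptions, not the planned implementation: the function name split_imm_shift is hypothetical, and the default 7-bit signed immediate width is chosen only because an imm7 constraint appears in the table above.

```python
def split_imm_shift(value, imm_bits=7, word_bits=32):
    """Return (imm, shift) such that (imm << shift) equals value modulo
    2**word_bits and imm fits in a signed imm_bits-bit field, or None
    if no such pair exists.

    imm_bits=7 is an illustrative assumption, not a statement about
    which Blackfin instruction would be used.
    """
    mask = (1 << word_bits) - 1
    value &= mask
    if value == 0:
        return (0, 0)

    # Strip all trailing zero bits; this gives the smallest possible imm.
    shift = (value & -value).bit_length() - 1
    imm = value >> shift

    # Interpret the remaining bits as a signed number of that width.
    width = word_bits - shift
    if imm >> (width - 1):
        imm -= 1 << width

    lo = -(1 << (imm_bits - 1))
    hi = (1 << (imm_bits - 1)) - 1
    if lo <= imm <= hi:
        return (imm, shift)
    return None


if __name__ == "__main__":
    for v in (0x30000, 0xFFFF0000, 5, 0x12345):
        print(hex(v), "->", split_imm_shift(v))
```

Stripping every trailing zero first means that if the resulting immediate does not fit, no smaller shift can help, so a single check is enough.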
** TODO Instruction alternatives
Some instructions come in different variants, for example:

  D = D + D
  P = P + P

Cross combinations are not allowed:

  P = D + D (bad)

Similarly for the subreg pseudo-instructions:

  D16L = EXTRACT_SUBREG D16, bfin_subreg_lo16
  P16L = EXTRACT_SUBREG P16, bfin_subreg_lo16

We want to take advantage of the alternative instructions. This could be done by
changing the DAG after instruction selection.

** Multipatterns for load/store
We should try to identify multipatterns for load and store instructions. The
available instruction matrix is a bit irregular.

Loads:

| Addr       | D | P | D 16z | D 16s | D16 | D 8z | D 8s |
|------------+---+---+-------+-------+-----+------+------|
| P          | * | * | *     | *     | *   | *    | *    |
| P++        | * | * | *     | *     |     | *    | *    |
| P--        | * | * | *     | *     |     | *    | *    |
| P+uimm5m2  |   |   | *     | *     |     |      |      |
| P+uimm6m4  | * | * |       |       |     |      |      |
| P+imm16    |   |   |       |       |     | *    | *    |
| P+imm17m2  |   |   | *     | *     |     |      |      |
| P+imm18m4  | * | * |       |       |     |      |      |
| P++P       | * |   | *     | *     | *   |      |      |
| FP-uimm7m4 | * | * |       |       |     |      |      |
| I          | * |   |       |       | *   |      |      |
| I++        | * |   |       |       | *   |      |      |
| I--        | * |   |       |       | *   |      |      |
| I++M       | * |   |       |       |     |      |      |

Stores:

| Addr       | D | P | D16H | D16L | D 8 |
|------------+---+---+------+------+-----|
| P          | * | * | *    | *    | *   |
| P++        | * | * |      | *    | *   |
| P--        | * | * |      | *    | *   |
| P+uimm5m2  |   |   |      | *    |     |
| P+uimm6m4  | * | * |      |      |     |
| P+imm16    |   |   |      |      | *   |
| P+imm17m2  |   |   |      | *    |     |
| P+imm18m4  | * | * |      |      |     |
| P++P       | * |   | *    | *    |     |
| FP-uimm7m4 | * | * |      |      |     |
| I          | * |   | *    | *    |     |
| I++        | * |   | *    | *    |     |
| I--        | * |   | *    | *    |     |
| I++M       | * |   |      |      |     |

* Workarounds and features
Blackfin CPUs have bugs. Each model comes in a number of silicon revisions with
different bugs. We learn about the CPU model from the -mcpu switch.

** Interpretation of -mcpu value
- -mcpu=bf527 refers to the latest known BF527 revision
- -mcpu=bf527-0.2 refers to silicon rev. 0.2
- -mcpu=bf527-any refers to all known revisions
- -mcpu=bf527-none disables all workarounds

The -mcpu setting affects the __SILICON_REVISION__ macro and enabled workarounds:

| -mcpu      | __SILICON_REVISION__ | Workarounds        |
|------------+----------------------+--------------------|
| bf527      | Def Latest           | Specific to latest |
| bf527-1.3  | Def 0x0103           | Specific to 1.3    |
| bf527-any  | Def 0xffff           | All bf527-x.y      |
| bf527-none | Undefined            | None               |

These are the known cores and revisions:

| Core        | Silicon            | Processors              |
|-------------+--------------------+-------------------------|
| Edinburgh   | 0.3, 0.4, 0.5, 0.6 | BF531 BF532 BF533       |
| Braemar     | 0.2, 0.3           | BF534 BF536 BF537       |
| Stirling    | 0.3, 0.4, 0.5      | BF538 BF539             |
| Moab        | 0.0, 0.1, 0.2      | BF542 BF544 BF548 BF549 |
| Teton       | 0.3, 0.5           | BF561                   |
| Kookaburra  | 0.0, 0.1, 0.2      | BF523 BF525 BF527       |
| Mockingbird | 0.0, 0.1           | BF522 BF524 BF526       |
| Brodie      | 0.0, 0.1           | BF512 BF514 BF516 BF518 |

** Compiler implemented workarounds
Most workarounds are implemented in header files and source code using the
__ADSPBF527__ macros. A few workarounds require compiler support.

| Anomaly  | Macro                          | GCC Switch       |
|----------+--------------------------------+------------------|
| Any      | __WORKAROUNDS_ENABLED          |                  |
| 05000074 | WA_05000074                    |                  |
| 05000244 | __WORKAROUND_SPECULATIVE_SYNCS | -mcsync-anomaly  |
| 05000245 | __WORKAROUND_SPECULATIVE_LOADS | -mspecld-anomaly |
| 05000257 | WA_05000257                    |                  |
| 05000283 | WA_05000283                    |                  |
| 05000312 | WA_LOAD_LCREGS                 |                  |
| 05000315 | WA_05000315                    |                  |
| 05000371 | __WORKAROUND_RETS              |                  |
| 05000426 | __WORKAROUND_INDIRECT_CALLS    | Not -micplb      |

** GCC feature switches
| Switch                    | Description                            |
|---------------------------+----------------------------------------|
| -msim                     | Use simulator runtime                  |
| -momit-leaf-frame-pointer | Omit frame pointer for leaf functions  |
| -mlow64k                  |                                        |
| -mcsync-anomaly           |                                        |
| -mspecld-anomaly          |                                        |
| -mid-shared-library       |                                        |
| -mleaf-id-shared-library  |                                        |
| -mshared-library-id=      |                                        |
| -msep-data                | Enable separate data segment           |
| -mlong-calls              | Use indirect calls                     |
| -mfast-fp                 |                                        |
| -mfdpic                   |                                        |
| -minline-plt              |                                        |
| -mstack-check-l1          | Do stack checking in L1 scratch memory |
| -mmulticore               | Enable multicore support               |
| -mcorea                   | Build for Core A                       |
| -mcoreb                   | Build for Core B                       |
| -msdram                   | Build for SDRAM                        |
| -micplb                   | Assume ICPLBs are enabled at runtime.  |
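
The rules in "Interpretation of -mcpu value" above can be sketched as a small parser. This is illustrative only and not taken from GCC or LLVM: the function names are hypothetical, the (major << 8) | minor encoding is inferred from the single 1.3 -> 0x0103 row in the table, and "latest" is assumed to mean the last entry of the Kookaburra revision list given for BF527.

```python
# Known revisions per model, copied from the Kookaburra row of the core
# table for BF527 only. Treating the last entry as "latest" is an
# assumption.
KNOWN_REVISIONS = {"bf527": ["0.0", "0.1", "0.2"]}


def encode_revision(rev):
    """Encode 'major.minor' as (major << 8) | minor.

    The encoding is inferred from the 1.3 -> 0x0103 example in the
    table and is therefore an assumption.
    """
    major, minor = rev.split(".")
    return (int(major) << 8) | int(minor)


def silicon_revision(mcpu):
    """Return the value __SILICON_REVISION__ would be defined to for an
    -mcpu value such as 'bf527-0.2', or None if it is left undefined."""
    model, _, suffix = mcpu.partition("-")
    if suffix == "none":
        return None            # macro undefined, no workarounds
    if suffix == "any":
        return 0xFFFF          # all known revisions
    if suffix == "":
        suffix = KNOWN_REVISIONS[model][-1]   # latest known revision
    return encode_revision(suffix)


if __name__ == "__main__":
    for cpu in ("bf527", "bf527-1.3", "bf527-any", "bf527-none"):
        print(cpu, "->", silicon_revision(cpu))
```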