llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00

Author	SHA1	Message	Date
Jinsong Ji	41284718d3	[AIX] Emit version string in .file directive AIX .file directive support including compiler version string. https://www.ibm.com/docs/en/aix/7.2?topic=ops-file-pseudo-op This patch adds the support so that it will be easier to identify build compiler in objects. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D105743	2021-07-12 17:03:52 +00:00
Saleem Abdulrasool	f56e4f6d3d	RISCV: adjust handling of relocation emission for RISCV This re-architects the RISCV relocation handling to bring the implementation closer in line with the implementation in binutils. We would previously aggressively resolve the relocation. With this restructuring, we always will emit a paired relocation for any symbolic difference of the type of S±T[±C] where S and T are labels and C is a constant. GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE` which indicates that a fixup may be expanded into multiple relocations. This is used by the RISCV backend to always emit a paired relocation - either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] + SUB[WIDTH] for a debug info relocation. Irrespective of whether linker relaxation support is enabled, symbolic difference is always emitted as a paired relocation. This change also sinks the target specific behaviour down into the target specific area rather than exposing it to the shared relocation handling. In the process, we also sink the "special" handling for debug information down into the RISCV target. Although this improves the path for the other targets, this is not necessarily entirely ideal either. The changes in the debug info emission could be done through another type of hook as this functionality would be required by any other target which wishes to do linker relaxation. However, as there are no other targets in LLVM which currently do this, this is a reasonable thing to do until such time as the code needs to be shared. Improve the handling of the relocation (and add a reduced test case from the Linux kernel) to ensure that we handle complex expressions for symbolic difference. This ensures that we correct relocate symbols with the adddends normalized and associated with the addition portion of the paired relocation. This change also addresses some review comments from Alex Bradbury about the relocations meant for use in the DWARF CFA being named incorrectly (using ADD6 instead of SET6) in the original change which introduced the relocation type. This resolves the issues with the symbolic difference emission sufficiently to enable building the Linux kernel with clang+IAS+lld (without linker relaxation). Resolves PR50153, PR50156! Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143 Reviewed By: nickdesaulniers, maskray Differential Revision: https://reviews.llvm.org/D103539	2021-06-17 08:20:02 -07:00
Chen Zheng	38673999a0	[XCOFF][DebugInfo] support DWARF for XCOFF for assembly output. Reviewed By: jasonliu Differential Revision: https://reviews.llvm.org/D95518	2021-03-04 21:07:52 -05:00
Fangrui Song	f8233aa8f3	[MC] Split MCContext::createTempSymbol, default AlwaysAddSuffix to true, and add comments CanBeUnnamed is rarely false. Splitting to a createNamedTempSymbol makes the intention clearer and matches the direction of reverted r240130 (to drop the unneeded parameters). No behavior change.	2020-12-21 14:04:13 -08:00
Hongtao Yu	85e4f6f241	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling* We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 17:29:28 -08:00
Mitch Phillips	7c847657fe	Revert "[CSSPGO] Pseudo probe encoding and emission." This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1. Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269	2020-12-10 15:53:39 -08:00
Hongtao Yu	91873af129	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling* We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 09:50:08 -08:00
Jian Cai	7f1bdb7b20	[X86] support .nops directive Add support of .nops on X86. This addresses llvm.org/PR45788. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D82826	2020-08-03 11:50:56 -07:00
Stefan Pintilie	74fce4a7b8	[PowerPC] Extend .reloc directive on PowerPC When the compiler generates a GOT indirect load it must generate two loads. One that loads the address of the element from the GOT and a second to load the actual element based on the address just loaded from the GOT. However, the linker can optimize these two loads into one load if it knows that it is safe to do so. The compiler can tell the linker that the optimization is safe by using the R_PPC64_PCREL_OPT relocation. This patch extends the .reloc directive to allow the following setup pld 3, vec@got@pcrel(0), 1 .Lpcrel1=.-8 ... More instructions possible here ... .reloc .Lpcrel1,R_PPC64_PCREL_OPT,.-.Lpcrel1 lwa 3, 4(3) Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach, MaskRay Reviewed By: nemanjai, MaskRay Differential Revision: https://reviews.llvm.org/D79625	2020-07-22 04:25:54 -05:00
Fangrui Song	5044eeb2f2	[MC] Support .reloc sym+constant, , For `.reloc offset, , `, currently offset can be a constant or symbol. This patch makes it support any expression which can be folded to sym+constant. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D83751	2020-07-14 13:44:00 -07:00
Shengchen Kan	d2aeec558e	[NFC] Clean up in MCObjectStreamer and X86AsmBackend	2020-05-09 12:50:44 +08:00
Alexandre Ganea	0437c703af	Re-land [MC] Fix quadratic behavior in addPendingLabel This was discovered when compiling large unity/blob/jumbo files. Differential Revision: https://reviews.llvm.org/D78775	2020-04-26 10:39:42 -04:00
Alexandre Ganea	73eaffd109	Revert "[MC] Fix quadratic behavior in addPendingLabel()" This reverts commit e98f73a629075ae3b9c4d5317bead5a122d69865.	2020-04-24 16:43:10 -04:00
Alexandre Ganea	1bcb4a0fed	[MC] Fix quadratic behavior in addPendingLabel() Differential Revision: https://reviews.llvm.org/D78775	2020-04-24 12:48:54 -04:00
Shengchen Kan	2b533e246f	[MC][NFC] Use camelCase style for functions in MCObjectStreamer	2020-04-20 20:09:20 -07:00
Shengchen Kan	88597bc560	[MC][Bugfix] Remove redundant parameter for relaxInstruction Summary: Before this patch, `relaxInstruction` takes three arguments, the first argument refers to the instruction before relaxation and the third argument is the output instruction after relaxation. There are two quite strange things: 1) The first argument's type is `const MCInst &`, the third argument's type is `MCInst &`, but they may be aliased to the same variable 2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third argument is a fresh uninitialized `MCInst` even if `relaxInstruction` may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a loop. In this patch, we drop the thrid argument, and let `relaxInstruction` directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon. Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain Reviewed By: Razer6, MaskRay, bcain Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78364	2020-04-21 11:06:55 +08:00
Shengchen Kan	453d61b323	[MC][NFC] Use camelCase style for function EmitInstToData	2020-04-20 18:53:39 -07:00
Fangrui Song	7a1a2170b7	[MC][COFF][ELF] Reject instructions in IMAGE_SCN_CNT_UNINITIALIZED_DATA/SHT_NOBITS sections For `.bss; nop`, MC inappropriately calls abort() (via report_fatal_error()) with a message `cannot have fixups in virtual section!` It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData(). Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage `non-zero initializer found in section '.bss'` (see D4199) It'd be nice to report the location but so many directives can call emitIntValue() and it is difficult to track every location. Note, COFF does not crash because MCAssembler::writeSectionData() is not called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section. Note, GNU as' arm64 backend reports ``Error: attempt to store non-zero value in section `.bss'`` for a non-zero .inst but fails to do so for other instructions. We simply reject all instructions, even if the encoding is all zeros. The Mach-O counterpart is D48517 (see `test/MC/MachO/zerofill-text.s`) Reviewed By: rnk, skan Differential Revision: https://reviews.llvm.org/D78138	2020-04-15 21:02:47 -07:00
Shengchen Kan	f8ad88c396	[X86][MC] Make -x86-pad-max-prefix-size compatible with --mc-relax-all Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it. Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`. Reviewers: LuoYuanke, reames, MaskRay Reviewed By: MaskRay Subscribers: MaskRay, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77851	2020-04-11 11:30:15 +08:00
Shengchen Kan	63b7da67c3	[X86][MC] Support enhanced relaxation for branch align Summary: Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization. The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable". Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames Subscribers: hiraditya, llvm-commits, annita.zhang Tags: #llvm Differential Revision: https://reviews.llvm.org/D76286	2020-04-08 19:08:19 +08:00
Shengchen Kan	5a9e18541c	[NFC][MC] Rename alignBranches* to emitInstruction* alignBranches is X86 specific, change the name in a more general one since other target can do some state chang before and after emitting the instruction.	2020-03-16 17:13:14 +08:00
Shengchen Kan	bd3955496d	[X86] Reduce the number of emitted fragments due to branch align Summary: Currently, a BoundaryAlign fragment may be inserted after the branch that needs to be aligned to truncate the current fragment, this fragment is unused at most of time. To avoid that, we can insert a new empty Data fragment instead. Non-relaxable instruction is usually emitted into Data fragment, so the inserted empty Data fragment will be reused at a high possibility. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames, LuoYuanke Subscribers: llvm-commits, dexonsmith, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D75438	2020-03-12 15:37:35 +08:00
Shengchen Kan	56f370d717	[X86] Move the function getOrCreateBoundaryAlignFragment MCObjectStreamer is more suitable to create fragments than X86AsmBackend, for example, the function getOrCreateDataFragment is defined in MCObjectStreamer. Differential Revision: https://reviews.llvm.org/D75351	2020-02-29 15:11:16 +08:00
Fangrui Song	181d5382b7	[MC] De-capitalize MCStreamer::Emit{Bundle,Addrsig}* etc So far, all non-COFF-related Emit* functions have been de-capitalized.	2020-02-15 09:11:48 -08:00
Fangrui Song	13339a0dea	[MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex}	2020-02-14 23:08:40 -08:00
Fangrui Song	e6d3f76ab0	[MC] De-capitalize another set of MCStreamer::Emit* functions Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc	2020-02-14 19:26:52 -08:00
Fangrui Song	343c2a2b44	[MC] De-capitalize some MCStreamer::Emit* functions	2020-02-14 19:11:53 -08:00
Fangrui Song	f2049f8c24	[AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI*	2020-02-13 22:08:55 -08:00
Fangrui Song	216ba73a16	[AsmPrinter] De-capitalize some AsmPrinter::Emit* functions Similar to rL328848.	2020-02-13 13:38:33 -08:00
Philip Reames	8de39d5e37	[BranchAlign] Compiler support for suppressing branch align As discussed heavily in the original review (D70157), there's a need for the compiler to be able to selective suppress padding (either nop or prefix) to respect assumptions about the meaning of labels and instructions in generated code. Rather than wait for syntax to be finalized - which appears to be a very slow process - this patch focuses on the compiler use case and only worries about the integrated assembler. To my knowledge, this covers all cases mentioned to date for clang/JIT support. For testing purposes, I wired it up so that if the integrated assembler was using autopadding for branch alignment (e.g. enabled at command line) then the textual assembly output would contain a comment for each location where padding was enabled or disabled. This seemed like the least painful choice overall. Note that the result of this patch effective disables the jcc errata mitigation for many constructs (statepoints, implicit null checks, xray, etc...) which is non ideal. It is at least correct and should allow us to enable the mitigation for the compiler. Once that's done, and a few other items are worked through, we probably want to come back to this an explore a bundling based approach instead so that we can pad instructions while keeping labels in the right place. Differential Revision: https://reviews.llvm.org/D72303	2020-01-08 10:03:30 -08:00
Fangrui Song	f4801ca87c	[MC][ARM] Delete MCSection::HasData and move SHF_ARM_PURECODE logic to ARMELFObjectWriter::addTargetSectionFlags This simplifies the generic interface and also makes SHF_ARM_PURECODE more robust (fixes a TODO). Inspecting MCDataFragment contents covers more cases than MCObjectStreamer::EmitBytes.	2020-01-05 14:20:34 -08:00
Philip Reames	e0244827fe	Align branches within 32-Byte boundary (NOP padding) WARNING: If you're looking at this patch because you're looking for a full performace mitigation of the Intel JCC Erratum, this is not it! This is a preliminary patch on the patch towards mitigating the performance regressions caused by Intel's microcode update for Jump Conditional Code Erratum. For context, see: https://www.intel.com/content/www/us/en/support/articles/000055650.html The patch adds the required assembler infrastructure and command line options needed to exercise the logic for INTERNAL TESTING. These are NOT public flags, and should not be used for anything other than LLVM's own testing/debugging purposes. They are likely to change both in spelling and meaning. WARNING: This patch is knowingly incorrect in some cornercases. We need, and do not yet provide, a mechanism to selective enable/disable the padding. Conversation on this will continue in parellel with work on extending this infrastructure to support prefix padding. The goal here is to have the assembler align specific instructions such that they neither cross or end at a 32 byte boundary. The impacted instructions are: a. Conditional jump. b. Fused conditional jump. c. Unconditional jump. d. Indirect jump. e. Ret. f. Call. The new options for llvm-mc are: -x86-align-branch-boundary=NUM aligns branches within NUM byte boundary. -x86-align-branch=TYPE[+TYPE...] specifies types of branches to align. A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit NOP to align the fused/unfused branch. alignBranchesBegin inserts MCBoundaryAlignFragment before instructions, alignBranchesEnd marks the end of the branch to be aligned, relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch. Nop padding is disabled when the instruction may be rewritten by the linker, such as TLS Call. Process Note: I am landing a patch by skan as it has been LGTMed, and continuing to iterate on the review is simply slowing us down at this point. We can and will continue to iterate in tree. Patch By: skan Differential Revision: https://reviews.llvm.org/D70157	2019-12-20 11:35:50 -08:00
Michael Trent	721829227e	[ MC ] Match labels to existing fragments even when switching sections. (This commit restores the original branch (4272372c571) and applies an additional change dropped from the original in a bad merge. This change should address the previous bot failures. Both changes reviewed by pete.) Summary: This commit builds upon Derek Schuff's 2014 commit for attaching labels to existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 ) When temporary labels appear ahead of a fragment, MCObjectStreamer will track the temporary label symbol in a "Pending Labels" list. Labels are associated with fragments when a real fragment arrives; otherwise, an empty data fragment will be created if the streamer's section changes or if the stream finishes. This commit moves the "Pending Labels" list into each MCStream, so that this label-fragment matching process is resilient to section changes. If the streamer emits a label in a new section, switches to another section to do other work, then switches back to the first section and emits a fragment, that initial label will be associated with this new fragment. Labels will only receive empty data fragments in the case where no other fragment exists for that section. The downstream effects of this can be seen in Mach-O relocations. The previous approach could produce local section relocations and external symbol relocations for the same data in an object file, and this mix of relocation types resulted in problems in the ld64 Mach-O linker. This commit ensures relocations triggered by temporary labels are consistent. Reviewers: pete, ab, dschuff Reviewed By: pete, dschuff Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71368	2019-12-18 09:55:54 -08:00
Mitch Phillips	1024f30fcf	Revert "[ MC ] Match labels to existing fragments even when switching sections." This reverts commit 4272372c571cd33edc77a8844b0a224ad7339138. Caused an MSan buildbot failure. More information available in the patch that introduced the bug: https://reviews.llvm.org/D71368	2019-12-17 15:04:26 -08:00
Michael Trent	b9b1c03468	[ MC ] Match labels to existing fragments even when switching sections. Summary: This commit builds upon Derek Schuff's 2014 commit for attaching labels to existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 ) When temporary labels appear ahead of a fragment, MCObjectStreamer will track the temporary label symbol in a "Pending Labels" list. Labels are associated with fragments when a real fragment arrives; otherwise, an empty data fragment will be created if the streamer's section changes or if the stream finishes. This commit moves the "Pending Labels" list into each MCStream, so that this label-fragment matching process is resilient to section changes. If the streamer emits a label in a new section, switches to another section to do other work, then switches back to the first section and emits a fragment, that initial label will be associated with this new fragment. Labels will only receive empty data fragments in the case where no other fragment exists for that section. The downstream effects of this can be seen in Mach-O relocations. The previous approach could produce local section relocations and external symbol relocations for the same data in an object file, and this mix of relocation types resulted in problems in the ld64 Mach-O linker. This commit ensures relocations triggered by temporary labels are consistent. Reviewers: pete, ab, dschuff Reviewed By: pete, dschuff Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71368	2019-12-17 08:49:25 -08:00
Fangrui Song	963c5fad53	[MC] Delete MCCodePadder D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106	2019-12-09 19:21:31 -08:00
James Y Knight	6648816752	MCObjectStreamer: assign MCSymbols in the dummy fragment to offset 0. In MCObjectStreamer, when there is no current fragment, initially symbols are created in a "pending" state and assigned to a dummy empty fragment. Previously, they were not being assigned an offset, and thus evaluateAbsolute would fail if trying to evaluate an expression 'a - b', where both 'a' and 'b' were in this pending state. Also slightly refactored the EmitLabel overload which takes an MCFragment for clarity. Fixes: https://llvm.org/PR41825 Differential Revision: https://reviews.llvm.org/D70062	2019-11-16 09:52:07 -05:00
Guillaume Chatelet	114e854bc6	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081	2019-09-27 12:54:21 +00:00
Guillaume Chatelet	78165500f1	[Alignment] Introduce llvm::Align to MCSection Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, JDevlieghere Subscribers: arsenm, sdardis, jvesely, nhaehnle, sbc100, hiraditya, aheejin, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67486 llvm-svn: 371831	2019-09-13 09:29:59 +00:00
Jonas Devlieghere	2c693415b7	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Andrea Di Biagio	2820258583	[AsmPrinter] Remove hidden flag -print-schedule. This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implmentation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed more optimally. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043	2019-02-04 12:51:26 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Reid Kleckner	575a6ae149	[codeview] Flush labels before S_DEFRANGE* fragments This was a pre-existing bug that could be triggered with assembly like this: .p2align 2 .LtmpN: .cv_def_range "..." I noticed this when attempting to change clang to emit aligned symbol records. llvm-svn: 349403	2018-12-17 21:49:35 +00:00
Vladimir Stefanovic	d431e5c490	[MC] Support labels as offsets in .reloc directive Currently, expressions like .reloc 1f, R_MIPS_JALR, foo 1: nop are not allowed, ie. an offset in .reloc can only be absolute value. This patch adds support for labels as offsets. If offset is a forward declared label, MCObjectStreamer keeps the fixup locally and adds it to the fixups vector after the label (and its offset) is defined. label+number is not supported yet. Differential revision: https://reviews.llvm.org/D53990 llvm-svn: 347397	2018-11-21 16:28:39 +00:00
Eli Friedman	5610197726	Revert BTF commit series. The initial patch was not reviewed, and does not have any tests; it should not have been merged. This reverts 344395, 344390, 344387, 344385, 344381, 344376, and 344366. llvm-svn: 344405	2018-10-12 19:41:05 +00:00
Yonghong Song	1577053cb3	[BPF] Add BTF generation for BPF target BTF is the debug format for BPF, a kernel virtual machine and widely used for tracing, networking and security, etc ([1]). Currently only instruction streams are passed to kernel, the kernel verifier verifies them before execution. In order to provide better visibility of bpf programs to user space tools, some debug information, e.g., function names and debug line information are desirable for kernel so tools can get such information with better annotation for jited instructions for performance or other reasons. The dwarf is too complicated in kernel and for BPF. Hence, BTF is designed to be the debug format for BPF ([2]). Right now, pahole supports BTF for types, which are generated based on dwarf sections in the ELF file. In order to annotate performance metrics for jited bpf insns, it is necessary to pass debug line info to the kernel. Furthermore, we want to pass the actual code to the kernel because of the following reasons: . bpf program typically is small so storage overhead should be small. . in bpf land, it is totally possible that an application loads the bpf program into the kernel and then that application quits, so holding debug info by the user space application is not practical. . having source codes directly kept by kernel would ease deployment since the original source code does not need ship on every hosts and kernel-devel package does not need to be deployed even if kernel headers are used. The only reliable time to get the source code is during compilation time. This will result in both more accurate information and easier deployment as stated in the above. Another consideration is for JIT. The project like bcc use MCJIT to compile a C program into bpf insns and load them to the kernel ([3]). The generated BTF sections will be readily available for such cases as well. This patch implemented generation of BTF info in llvm compiler. The BTF related sections will be generated when both -target bpf and -g are specified. Two sections are generated: .BTF contains all the type and string information, and .BTF.ext contains the func_info and line_info. The separation is related to how two sections are used differently in bpf loader, e.g., linux libbpf ([4]). The .BTF section can be loaded into the kernel directly while .BTF.ext needs loader manipulation before loading to the kernel. The format of the each section is roughly defined in llvm:include/llvm/MC/MCBTFContext.h and from the implementation in llvm:lib/MC/MCBTFContext.cpp. A later example also shows the contents in each section. The type and func_info are gathered during CodeGen/AsmPrinter by traversing dwarf debug_info. The line_info is gathered in MCObjectStreamer before writing to the object file. After all the information is gathered, the two sections are emitted in MCObjectStreamer::finishImpl. With cmake CMAKE_BUILD_TYPE=Debug, the compiler can dump out all the tables except insn offset, which will be resolved later as relocation records. The debug type "btf" is used for BTFContext dump. Dwarf tests the debug info generation with llvm-dwarfdump to decode the binary sections and check whether the result is expected. Currently we do not have such a tool yet. We will implement btf dump functionality in bpftool ([5]) as the bpftool is considered the recommended tool for bpf introspection. The implementation for type and func_info is tested with linux kernel test cases. The line_info is visually checked with dump from linux kernel libbpf ([4]) and checked with readelf dumping section raw data. Note that the .BTF and .BTF.ext information will not be emitted to assembly code and there is no assembler support for BTF either. In the below, with a clang/llvm built with CMAKE_BUILD_TYPE=Debug, Each table contents are shown for a simple C program. -bash-4.2$ cat -n test.c 1 struct A { 2 int a; 3 char b; 4 }; 5 6 int test(struct A t) { 7 return t->a; 8 } -bash-4.2$ clang -O2 -target bpf -g -mllvm -debug-only=btf -c test.c Type Table: [1] FUNC name_off=1 info=0x0c000001 size/type=2 param_type=3 [2] INT name_off=12 info=0x01000000 size/type=4 desc=0x01000020 [3] PTR name_off=0 info=0x02000000 size/type=4 [4] STRUCT name_off=16 info=0x04000002 size/type=8 name_off=18 type=2 bit_offset=0 name_off=20 type=5 bit_offset=32 [5] INT name_off=22 info=0x01000000 size/type=1 desc=0x02000008 String Table: 0 : 1 : test 6 : .text 12 : int 16 : A 18 : a 20 : b 22 : char 27 : test.c 34 : int test(struct A t) { 58 : return t->a; FuncInfo Table: sec_name_off=6 insn_offset=<Omitted> type_id=1 LineInfo Table: sec_name_off=6 insn_offset=<Omitted> file_name_off=27 line_off=34 line_num=6 column_num=0 insn_offset=<Omitted> file_name_off=27 line_off=58 line_num=7 column_num=3 -bash-4.2$ readelf -S test.o ...... [12] .BTF PROGBITS 0000000000000000 0000028d 00000000000000c1 0000000000000000 0 0 1 [13] .BTF.ext PROGBITS 0000000000000000 0000034e 0000000000000050 0000000000000000 0 0 1 [14] .rel.BTF.ext REL 0000000000000000 00000648 0000000000000030 0000000000000010 16 13 8 ...... -bash-4.2$ The latest linux kernel ([6]) can already support .BTF with type information. The [7] has the reference implementation in linux kernel side to support .BTF.ext func_info. The .BTF.ext line_info support is not implemented yet. If you have difficulty accessing [6], you can manually do the following to access the code: git clone https://github.com/yonghong-song/bpf-next-linux.git cd bpf-next-linux git checkout btf The change will push to linux kernel soon once this patch is landed. References: [1]. https://www.kernel.org/doc/Documentation/networking/filter.txt [2]. https://lwn.net/Articles/750695/ [3]. https://github.com/iovisor/bcc [4]. https://github.com/torvalds/linux/tree/master/tools/lib/bpf [5]. https://github.com/torvalds/linux/tree/master/tools/bpf/bpftool [6]. https://github.com/torvalds/linux [7]. https://github.com/yonghong-song/bpf-next-linux/tree/btf Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D52950 llvm-svn: 344366	2018-10-12 17:01:46 +00:00
Eric Christopher	c2b29240e0	The initial .text section generated in object files was missing the SHF_ARM_PURECODE flag when being built with the -mexecute-only flag. All code sections of an ELF must have the flag set for the final .text section to be execute-only, otherwise the flag gets removed. A HasData flag is added to MCSection to aid in the determination that the section is empty. A virtual setTargetSectionFlags is added to MCELFObjectTargetWriter to allow subclasses to set target specific section flags to be added to sections which we then use in the ARM backend to set SHF_ARM_PURECODE. Patch by Ivan Lozano! Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D48792 llvm-svn: 341593	2018-09-06 22:09:31 +00:00
Reid Kleckner	4b1d0ba67a	[codeview] Clean up machinery for deferring .cv_loc emission Now that we create the label at the point of the directive, we don't need to set the "current CV location", and then later when we emit the next instruction, create a label for it and emit it. DWARF still defers the labels used in .debug_loc until the next instruction or value, for reasons unknown. llvm-svn: 340883	2018-08-28 23:25:59 +00:00
Reid Kleckner	07d5e694df	[codeview] Emit labels for .cv_loc immediately Previously we followed the DWARF implementation, which waits until the next instruction or data to emit the label to use in the .debug_loc section. We might want to consider re-evaluating that design choice as well, since it means the .loc skips alignment padding, for better or worse. This was the most minimal fix I could come up with, but we should be able to do a lot of cleanups now that we don't need to save a pending CV location on the CodeViewContext. I plan to do those next, but this immediately fixes an assertion for some of our users. llvm-svn: 340878	2018-08-28 22:29:12 +00:00
Alex Bradbury	94c1bcf630	[RISCV][MC] Don't fold symbol differences if requiresDiffExpressionRelocations is true When emitting the difference between two symbols, the standard behavior is that the difference will be resolved to an absolute value if both of the symbols are offsets from the same data fragment. This is undesirable on architectures such as RISC-V where relaxation in the linker may cause the computed difference to become invalid. This caused an issue when compiling to object code, where the size of a function in the debug information was already calculated even though it could change as a consequence of relaxation in the subsequent linking stage. This patch inhibits the resolution of symbol differences to absolute values where the target's AsmBackend has declared that it does not want these to be folded. Differential Revision: https://reviews.llvm.org/D45773 Patch by Edward Jones. llvm-svn: 339864	2018-08-16 11:26:37 +00:00

1 2 3 4 5 ...

270 Commits