mirror of https://github.com/RPCS3/rpcs3.git synced 2024-11-25 04:02:42 +01:00

130 Commits

Author SHA1 Message Date
Nekotekina
a9437d69ab simd_builder: fixups
Fix resetting vmask in reduce() step.
Fix AVX-512 loads in vec_load_unaligned().
Fix bzhi reg size in build_look().
2022-09-08 18:12:15 +03:00
Nekotekina
5d91caebe9 Linux: delete /tmp/perf.map on exit 2022-09-08 16:56:06 +03:00
Nekotekina
82258915da BufferUtils: rewrite remaining intrinsic code with simd_builder 2022-09-07 17:59:07 +03:00
Nekotekina
11a1f090d3 BufferUtils: simd_builder refactoring
Some simplifications implemented.
2022-09-07 17:59:07 +03:00
Nekotekina
80f0741103 simd_builder: fix constant locations 2022-08-29 14:32:56 +03:00
Nekotekina
e28707055b Implement simd_builder for x86
ASMJIT-based tool for building vectorized loops (such as ones in BufferUtils.cpp)
2022-08-28 18:38:52 +03:00
Eladash
1dd1062be1 PPU LLVM: Fix HLE function injection 2022-08-21 15:02:01 +03:00
sguo35
d2614d01fd [ppu] fix a macOS arm64 regression
Always override the LLVM triple to prevent linking errors.
2022-06-20 15:08:27 +03:00
Ivan
c2190f71ca SPU/PPU LLVM: fix triple setup (regression fix) (#12228) 2022-06-14 18:13:43 +03:00
Jeff Guo
cefc37a553 PPU LLVM arm64+macOS port (#12115)
* BufferUtils: use naive function pointer on Apple arm64

Use naive function pointer on Apple arm64 because ASLR breaks asmjit.
See BufferUtils.cpp comment for explanation on why this happens and how
to fix if you want to use asmjit.

* build-macos: fix source maps for Mac

Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo
modes.

* LLVM PPU: fix aarch64 on macOS

Force MachO on macOS to fix LLVM being unable to patch relocations
during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by
PPUTranslator/Recompiler.

* virtual memory: use 16k pages on aarch64 macOS

Temporary hack to get things working by using 16k pages instead of 4k
pages in VM emulation.

* PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS

Fixes some intrinsics usage and patches usages of asmjit to properly
emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also
patches the SPU recompiler to properly work on arm64 by telling LLVM to
target arm64.

* virtual memory: fix W^X toggles on macOS aarch64

Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default
to RW mode. For both SPU and PPU execution threads, when initialization
finishes we toggle to RX mode. This exploits Apple's per-thread setting
for RW/RX to let us be technically compliant with the OS's W^X
enforcement while not needing to actually separate the memory
allocated for code/data.

* PPU: implement aarch64 specific functions

Implements ppu_gateway for arm64 and patches LLVM initialization to use
the correct triple. Adds some fixes for macOS W^X JIT restrictions when
entering/exiting JITed code.

* PPU: Mark rpcs3 calls as non-tail

Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you
call a function inside e.g. an L2 syscall, it will clobber LR on arm64
and subtly break returns in emulated code. Only JIT -> JIT "calls"
should be tail.

* macOS/arm64: compatibility fixes

* vm: patch virtual memory for arm64 macOS

Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to
existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix
memory unmapping on 16K page machines with a hack to mark "unmapped"
pages as RW.

* PPU: remove wrong comment

* PPU: fix a merge regression

* vm: remove 16k page hacks

* PPU: formatting fixes

* PPU: fix arm64 null function assembly

* ppu: clean up arch-specific instructions
2022-06-14 15:28:38 +03:00
Nekotekina
dba2baba9c Implement utils::memory_map_fd (partial)
Improve JIT profiling dump format (data + name, mmap)
Improve objdump interception util (better speed, fix bugs)
Rename spu_ubertrampoline to __ub+number
2022-01-26 15:46:16 +03:00
Nekotekina
11ee1f3eb2 Improve JIT profiling on Linux
Add JIT object dumping functionality.
Add source for objdump interception utility.
2022-01-25 03:16:37 +03:00
Nekotekina
580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
Nekotekina
cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina
122555fb66 Add an error check in JITAnnouncer event listener
This one is a bit strange.
2021-12-26 22:01:20 +03:00
Nekotekina
d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina
3cd8891ab8 Re-refactor copy_data_swap_u32 again
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to work around a weird conflict issue.
2021-12-26 14:40:21 +03:00
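
Note on 3cd8891ab8: the message says the AVX2 path was dropped in favor of automatic vectorization. A minimal sketch of the kind of plain loop a recent compiler can vectorize on its own is given here. The names `bswap32_sketch` and `copy_swap_u32_sketch` are hypothetical and this is not the actual `copy_data_swap_u32` code, whose signature is not shown in this log.

```cpp
#include <cstddef>
#include <cstdint>

// Portable 32-bit byte swap written with shifts and masks.
// Hypothetical helper, not taken from the RPCS3 sources.
constexpr std::uint32_t bswap32_sketch(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Copy `count` 32-bit words from src to dst, swapping the byte order of each word.
// The loop has no cross-iteration dependency, which is what lets an
// optimizing compiler vectorize it without hand-written intrinsics.
inline void copy_swap_u32_sketch(std::uint32_t* dst, const std::uint32_t* src, std::size_t count)
{
    for (std::size_t i = 0; i < count; i++)
    {
        dst[i] = bswap32_sketch(src[i]);
    }
}
```

Whether the loop is actually vectorized depends on the compiler and optimization flags, so this only illustrates the shape of the code, not a guaranteed result.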
Nekotekina
dcd011048d Implement "built_function" utility (runtime-generated assembly)
Similar to build_function_asm, but links without indirection.
Achieved by emitting code directly into a byte array.
2021-12-22 19:27:20 +03:00
Paul
4e12e70929 Add Intel's Rocket Lake 11th gen cpu. (#10205)
This does nothing but may be required later.
2021-05-13 11:34:37 +03:00
Megamouse
a16d8ba3ea More random changes 2021-04-11 14:01:51 +03:00
Nekotekina
95725bf7fc Add -Werror=missing-noreturn (GCC, clang)
May be useful to diagnose functions which fail assertions unconditionally.
2021-04-08 10:29:47 +03:00
Nekotekina
963d150e93 Fix some -Weffc++ warnings (part 2) 2021-04-03 21:54:15 +03:00
Nekotekina
2212a131ef Fix some -Weffc++ warnings (part 1) 2021-03-31 11:27:09 +03:00
Nekotekina
b3fb6d7d18 Add and fix -Wredundant-decls (GCC) 2021-03-23 22:48:57 +03:00
Nekotekina
a4fdbf0a88 Enable -Wstrict-aliasing=1 (GCC)
Fixed partially.
2021-03-09 03:10:15 +03:00
Nekotekina
87af905018 Enable -Wunused-parameter 2021-03-06 18:07:08 +03:00
Eladash
ff211a9508 LLVM: Do not crash on failure to create cache file 2021-03-02 16:07:51 +03:00
Nekotekina
980be9e0e8 JIT.cpp: fix overcommit bug (should have been Linux-specific)
Closes #9820

Co-authored-by: Eladash <elad3356p@gmail.com>
2021-02-22 13:35:01 +03:00
Nekotekina
aeeceb7d0b Minor fixups 2021-02-01 11:30:50 +03:00
Nekotekina
f9ee8978ff PPU LLVM: improve analyser
Compile possibly executable holes between detected functions.
Add unused "PPU LLVM Greedy Mode" option (for future updates).
Add "nounwind" attribute to compiled functions (reduces size).
2021-02-01 11:30:50 +03:00
Nekotekina
3567c43fb5 LLVM: generate trampolines for "null" functions
Embed name into the trampoline for easier debugging.
Only warn about it during the compilation phase.
2021-01-15 21:38:33 +03:00
Nekotekina
f14d47bfe6 LLVM: log certain null functions 2021-01-12 15:39:43 +03:00
Nekotekina
eec11bfba9 Move align helpers to util/asm.hpp
Also add some files:
GLTextureCache.cpp
VKTextureCache.cpp
2020-12-18 18:07:42 +03:00
Nekotekina
db9b7db531 Cleanup and move sysinfo.h -> util/sysinfo.hpp 2020-12-18 12:55:54 +03:00
Nekotekina
fb29933d3d Add usz alias for std::size_t 2020-12-18 12:23:53 +03:00
Nekotekina
a6a5292cd7 Use uptr (std::uintptr_t alias) 2020-12-12 16:29:55 +03:00
Nekotekina
b59f142d4e Move types.h to util/types.hpp 2020-12-12 15:12:01 +03:00
Nekotekina
36c8654fb8 Remove HERE macro
Some cleanup.
Add location to some functions.
2020-12-10 12:30:22 +03:00
Nekotekina
5d934c8759 Improve narrow() and size32() with src_loc detection 2020-12-09 16:26:20 +03:00
Nekotekina
e055d16b2c Replace verify() with ensure() with auto src location.
Expression ensure(x) returns x.
Using comma operator removed.
2020-12-09 15:43:38 +03:00
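
Note on e055d16b2c: the message describes an `ensure(x)` that returns `x` and records the call site automatically. A rough sketch of that idea follows, assuming the GCC/Clang builtins `__builtin_FILE()` and `__builtin_LINE()` for call-site capture and an exception on failure. The name `ensure_sketch` and the failure behavior are assumptions; the real implementation is not shown in this log.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for ensure(): checks the value, reports the caller's
// location on failure, and otherwise hands the value back unchanged.
// __builtin_FILE()/__builtin_LINE() are GCC/Clang builtins; when used as
// default arguments they are evaluated at the call site.
template <typename T>
T ensure_sketch(T&& value, const char* file = __builtin_FILE(), int line = __builtin_LINE())
{
    if (!value)
    {
        // Assumption: throw here; a real implementation may log and abort instead.
        throw std::runtime_error(std::string("ensure failed at ") + file + ":" + std::to_string(line));
    }

    // Lvalues come back as references (T = U&), rvalues are moved out by value.
    return std::forward<T>(value);
}
```

Because the checked value is returned, a call such as `auto* p = ensure_sketch(find_object());` (with `find_object` being any hypothetical lookup) needs no comma operator to both check and use the result.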
Nekotekina
ca9898e838 JIT: increase likeliness of allocating 2M large pages
On top of enabled transparent hugepages hint (Linux).
2020-11-25 10:41:17 +03:00
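
Note on ca9898e838: the message does not say how the chance of getting 2M pages was increased. One common approach on Linux, sketched here purely as an assumption, is to align the region to 2 MiB and pass the transparent hugepage hint. `align_up` and `reserve_huge_aligned_sketch` are hypothetical names.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round addr up to a power-of-two alignment.
inline std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: map `size` bytes (a multiple of 2 MiB) at a 2 MiB-aligned
// address and hint the kernel to back the range with transparent hugepages.
// Returns nullptr on failure.
inline void* reserve_huge_aligned_sketch(std::size_t size)
{
    constexpr std::size_t huge = 2u * 1024 * 1024;
    assert(size != 0 && size % huge == 0);

    // Over-reserve by one huge page so an aligned sub-range always exists.
    const std::size_t total = size + huge;
    void* raw = ::mmap(nullptr, total, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return nullptr;

    const auto base = reinterpret_cast<std::uintptr_t>(raw);
    const auto aligned = align_up(base, huge);
    const auto end = base + total;
    const auto aligned_end = aligned + size;

    // Trim the unused head and tail of the over-reservation.
    if (aligned > base)
        ::munmap(raw, aligned - base);
    if (end > aligned_end)
        ::munmap(reinterpret_cast<void*>(aligned_end), end - aligned_end);

#ifdef MADV_HUGEPAGE
    // Advisory only: the kernel may ignore or reject the hint, so the result is not checked.
    ::madvise(reinterpret_cast<void*>(aligned), size, MADV_HUGEPAGE);
#endif
    return reinterpret_cast<void*>(aligned);
}
```

The alignment matters because the kernel can only back a range with a 2M page when the range itself starts on a 2M boundary.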
Nekotekina
1c99a2e7fb vm: add map_self() method to utils::shm
Add complementary unmap_self() method.
Move VirtualMemory to util/vm.hpp
Minor associated include cleanup.
Move asm.h to util/asm.hpp
2020-11-08 16:43:15 +03:00
Nekotekina
7d56069243 Fix HAS_OVERCOMMIT usage in JIT.cpp 2020-11-05 18:50:19 +03:00
Nekotekina
b66628baca Improve low-level mmap utilities (Linux/BSD)
Add madvise (MADV_WILLNEED) on utils::memory_commit
Add madvise (MADV_FREE or MADV_DONTNEED) on utils::memory_decommit
Improve shm_open pseudo-random name (not used on Linux)
2020-11-04 14:59:26 +03:00
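
Note on b66628baca: the message mentions `madvise` hints on commit and decommit. A minimal sketch of that pairing on Linux/BSD follows. `commit_sketch` and `decommit_sketch` are hypothetical names, and the use of `mprotect` plus a runtime fallback from `MADV_FREE` to `MADV_DONTNEED` are assumptions, not the actual `utils::memory_commit` / `utils::memory_decommit` code.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Hypothetical commit: make the range accessible and tell the kernel
// it is about to be used. The hint is advisory, so its result is ignored.
inline bool commit_sketch(void* ptr, std::size_t size)
{
    if (::mprotect(ptr, size, PROT_READ | PROT_WRITE) != 0)
        return false;

    ::madvise(ptr, size, MADV_WILLNEED);
    return true;
}

// Hypothetical decommit: drop access, then let the kernel reclaim the pages.
// MADV_FREE reclaims lazily where supported; MADV_DONTNEED is the fallback.
inline bool decommit_sketch(void* ptr, std::size_t size)
{
    if (::mprotect(ptr, size, PROT_NONE) != 0)
        return false;

#ifdef MADV_FREE
    if (::madvise(ptr, size, MADV_FREE) == 0)
        return true;
#endif
    return ::madvise(ptr, size, MADV_DONTNEED) == 0;
}
```

With `MADV_FREE` the old contents may survive until the kernel actually needs the memory, so callers should not rely on a recommitted range reading as zero.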
Nekotekina
1b8bf081b5 Upgrade to LLVM 11 Stable 2020-11-02 21:23:25 +03:00
Nekotekina
8ce0819b42 SPU: add stx/ftx counters
Just count pure transaction successes and failures.
2020-10-29 18:57:57 +03:00
Nekotekina
dc8252bb9f Remove XABORT in PPU/SPU transactions.
It's expensive for an unknown reason. A plain XEND is usually much cheaper.
Add some minor improvements. Use g_sudo_addr.
2020-10-20 09:10:21 +03:00
Nekotekina
44c90c060a TSX: improve transaction repeat handling
Handle status 0 as fatal.
2020-10-19 19:41:28 +03:00
Nekotekina
3d980a9f66 Reimplement ASMJIT runtime
Try to emplace generated code in lower address area.
Protect generated code from writing.
2020-10-17 21:25:43 +03:00
Nekotekina
f2d2a6b605 JIT cleanup for PPU LLVM
Remove MemoryManager3 as unnecessary.
Rewrite MemoryManager1 to use its own 512M reservations.
Disabled unwind info registration on all platforms.
Use 64-bit executable pointers under vm::g_exec_addr area.
Stop relying on deploying PPU LLVM objects in first 2G of address space.
Implement jit_module_manager, protect its data with mutex.
2020-10-11 17:22:28 +03:00