- Implies support for everything that Skylake-X supports as well as AVX512IFMA, AVX512VBMI, AVX512VBMI2, AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, AVX512VPCLMULQDQ, AVX512GFNI, AVX512VAES
* Replace ppu_module_manager Function Static with Class Static
Makes for a slightly 'cleaner' interface in my opinion, may also assist with adding thread read/write concurrency support in future if ever required (have left that out of this commit to match existing function).
Very slight performance improvements were seen in representative testing.
https://quick-bench.com/q/GMbgeNc-mZc21aqOKCofnbzPZvg
I didn't investigate whether static initialisation of the static_modules might actually be possible here, perhaps there's a way to do a constexpr / consteval of this.
* Fix up for old style cast syntax..
* Fixups from PR comments
Plus remove spurious type_traits include (from me) not picked up in previous PR
* Remove old code
* Update rpcs3/Emu/Cell/PPUModule.h
Co-authored-by: Eladash <elad3356p@gmail.com>
* Fix naming of static variable
Co-authored-by: Eladash <elad3356p@gmail.com>
Tidy up fake transfer iterator handling. erase invalidates all iterators including the current iterator (i.e. 'it'), given precedence ordering this was UB prior to C++17. Splitting out to use return iterator from erase seems cleaner.
Also added some additional info to usb debug message to potentially help with #8666, and used the atomic (dev_counter) less often
- Currently this conversion is being done on the CPU to reuse as much code as possible.
The expectation is that this almost never happens, so there is not point in increasing maintenance burden by adding compute paths
- Tracks which kind of raster was done (Z-ordered vs linear) throughout the application.
- This allows to identify if data is in the expected format or not.
- PSHUFB operates in reverse byte order from SHUFB, so we can take advantage of that to swap endianness without additional transformations in some situations
When on single worker mode (OpenGL or an hypothetical scenario of a
single core PC with Vulkan), load and compile would skip 10% of the
shaders on queue each stage.
Co-authored-by: kd-11 <karokidii@gmail.com>