Balancing The System With Other Hardware Features

The biggest technological advantage consoles have over PCs is that consoles are a fully-integrated fixed platform specified by a single manufacturer. In theory, the manufacturer can ensure that the system is properly balanced for the use case, something PC OEMs are notoriously bad at. Consoles generally don't have the problem of wasting a large chunk of the budget on a single high-end component that the rest of the system cannot keep up with, and consoles can more easily incorporate custom hardware when suitable off-the-shelf components aren't available. (This is why the outgoing console generation didn't use desktop-class CPU cores, but dedicated a huge amount of the silicon budget to the GPUs.)

By now, PC gaming has thoroughly demonstrated that increasing SSD speed has little or no impact on gaming performance. NVMe SSDs are several times faster than SATA SSDs on paper, but for almost all PC games that extra performance goes largely unused. In part, this is due to bottlenecks elsewhere in the system that are revealed when storage performance is fast enough to no longer be a serious limitation. The upcoming consoles will include a number of hardware features designed to make it easier for games to take advantage of fast storage, and to alleviate bottlenecks that would be troublesome on a standard PC platform. This is where the console storage tech gets actually interesting, since the SSDs themselves are relatively unremarkable.

Compression: Amplifying SSD Performance

The most important specialized hardware feature the consoles will include to complement storage performance is dedicated data decompression hardware. Game assets must be stored on disk in a compressed form to keep storage requirements somewhat reasonable. Games usually rely on multiple compression methods: some lossy methods specialized for certain types of data (e.g. audio and images), and some lossless general-purpose algorithm. Almost everything goes through at least one compression method that is fairly computationally complex. GPU architectures have long included hardware to handle decoding video streams and support simple but fast lossy texture compression methods like S3TC and its successors, but that leaves a lot of data to be decompressed by the CPU. Desktop CPUs don't have dedicated decompression engines or instructions, though many instructions in the various SIMD extensions are intended to help with tasks like this. Even so, decompressing a stream of data at several GB/s is not trivial, and special-purpose hardware can do it more efficiently while freeing up CPU time for other tasks. The decompression offload hardware in the upcoming consoles is implemented on the main SoC so that it can unpack data after it traverses the PCIe link from the SSD and resides in the main RAM pool shared by the GPU and CPU cores.
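To get a feel for why software decompression at several GB/s is demanding, the sketch below times single-threaded decompression on whatever CPU runs it. It uses Python's standard `zlib` module purely as a stand-in; it is not Kraken or BCPack, and the synthetic `sample` payload and the `measure_decompress_throughput` helper are made up for this illustration, so the number it prints says nothing about console hardware.

```python
import time
import zlib


def measure_decompress_throughput(payload: bytes, repeats: int = 5) -> float:
    """Return uncompressed bytes per second produced by zlib on one CPU core."""
    compressed = zlib.compress(payload, 6)
    start = time.perf_counter()
    for _ in range(repeats):
        out = zlib.decompress(compressed)
    elapsed = time.perf_counter() - start
    # Sanity check: the round trip must be lossless.
    assert out == payload
    return len(payload) * repeats / elapsed


# Synthetic, moderately redundant data (about 4.7 MB); real game assets differ.
sample = (b"texture-block-" * 64 + bytes(range(256))) * 4096
rate = measure_decompress_throughput(sample)
print(f"{rate / 1e9:.2f} GB/s of output on one core")
```

Comparing the printed figure with the multi-GB/s output rates quoted for the consoles gives a rough sense of how many cores a purely software approach might tie up.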

Decompression offload hardware like this isn't found on typical desktop PC platforms, but it's hardly a novel idea. Previous consoles have included decompression hardware, though nothing that would be able to keep pace with NVMe SSDs. Server platforms often include compression accelerators, usually paired with cryptography accelerators: Intel has done such accelerators both as discrete peripherals and integrated into some server chipsets, and IBM's POWER9 and later CPUs have similar accelerator units. These server accelerators are more comparable to what the new consoles need, with throughput of several GB/s.

Microsoft and Sony each have tuned their decompression units to fit the performance expected from their chosen SSD designs. They've chosen different proprietary compression algorithms to target: Sony is using RAD's Kraken, a general-purpose algorithm which was originally designed to be used on the current consoles with relatively weak CPUs but vastly lower throughput requirements. Microsoft focused specifically on texture compression, reasoning that textures account for the largest volume of data that games need to read and decompress. They developed a new texture compression algorithm and dubbed it BCPack in a slight departure from their existing DirectX naming conventions for texture compression methods already supported by GPUs.

Compression Offload Hardware

                              Microsoft Xbox Series X    Sony Playstation 5
Algorithm                     BCPack                     Kraken (and ZLib?)
Maximum Output Rate           6 GB/s                     22 GB/s
Typical Output Rate           4.8 GB/s                   8–9 GB/s
Equivalent Zen 2 CPU Cores    5                          9

Sony states that their Kraken-based decompression hardware can unpack the 5.5GB/s stream from the SSD into a typical 8-9 GB/s of uncompressed data, but that can theoretically reach up to 22 GB/s if the data was redundant enough to be highly compressible. Microsoft states their BCPack decompressor can output a typical 4.8 GB/s from the 2.4 GB/s input, but potentially up to 6 GB/s. So Microsoft is claiming slightly higher typical compression ratios, but still a slower output stream due to the much slower SSD, and Microsoft's hardware decompression is apparently only for texture data.
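The compression ratios implied by those figures follow from dividing the quoted output rate by the SSD's raw input rate. The short calculation below uses only the numbers stated above; the function and variable names are just for illustration.

```python
def compression_ratio(output_gbps: float, input_gbps: float) -> float:
    """Ratio of decompressed output rate to compressed input rate."""
    return output_gbps / input_gbps


PS5_INPUT = 5.5   # GB/s of compressed data from the SSD
XBOX_INPUT = 2.4  # GB/s of compressed data from the SSD

ps5_typical = (compression_ratio(8.0, PS5_INPUT), compression_ratio(9.0, PS5_INPUT))
ps5_peak = compression_ratio(22.0, PS5_INPUT)
xbox_typical = compression_ratio(4.8, XBOX_INPUT)
xbox_peak = compression_ratio(6.0, XBOX_INPUT)

print(f"PS5 typical: {ps5_typical[0]:.2f}x to {ps5_typical[1]:.2f}x, peak {ps5_peak:.2f}x")
print(f"Xbox typical: {xbox_typical:.2f}x, peak {xbox_peak:.2f}x")
```

This works out to roughly 1.45x to 1.64x typical for the PS5 (4x at the theoretical peak) against 2x typical for the Xbox Series X (2.5x peak), which is where the "slightly higher typical compression ratios" observation comes from.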

The CPU time saved by these decompression units sounds astounding: the equivalent of about 9 Zen 2 CPU cores for the PS5, and about 5 for the Xbox Series X. Keep in mind these are peak numbers that assume the SSD bandwidth is being fully utilized—real games won't be able to keep these SSDs 100% busy, so they wouldn't need quite so much CPU power for decompression.
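As a back-of-the-envelope illustration of that caveat, the sketch below derives the per-core decompression rate implied by the vendors' figures and scales the core count with SSD utilization. Two assumptions are mine rather than anything the vendors have stated: that the core equivalents are measured against compressed input bandwidth, and that CPU cost scales linearly with how busy the SSD is.

```python
def implied_input_per_core(ssd_gbps: float, core_equivalents: float) -> float:
    """Compressed GB/s that one Zen 2 core would handle, per the quoted figures."""
    return ssd_gbps / core_equivalents


def cores_needed(core_equivalents: float, ssd_utilization: float) -> float:
    """Scale the peak core count by the fraction of SSD bandwidth in use.

    Linear scaling is an assumption made for illustration only.
    """
    if not 0.0 <= ssd_utilization <= 1.0:
        raise ValueError("ssd_utilization must be between 0 and 1")
    return core_equivalents * ssd_utilization


ps5_per_core = implied_input_per_core(5.5, 9)   # about 0.61 GB/s per core
xbox_per_core = implied_input_per_core(2.4, 5)  # about 0.48 GB/s per core

# Hypothetical game that keeps the SSD busy a quarter of the time.
ps5_cores_at_quarter = cores_needed(9, 0.25)
xbox_cores_at_quarter = cores_needed(5, 0.25)

print(f"PS5: {ps5_per_core:.2f} GB/s per core, {ps5_cores_at_quarter:.2f} cores at 25% load")
print(f"Xbox: {xbox_per_core:.2f} GB/s per core, {xbox_cores_at_quarter:.2f} cores at 25% load")
```

Even under these rough assumptions, a game that only occasionally saturates the SSD would still be giving up a noticeable share of an eight-core CPU without the offload hardware.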

The storage acceleration features on the console SoCs aren't limited to just compression offload, and Sony in particular has described quite a few features, but this is where the information released so far is really vague, unsatisfying and open to interpretation. Most of this functionality seems to be intended to reduce overhead, handling some of the more mundane aspects of moving data around without having to get the CPU involved as often, and making sure the hardware decompression process is invisible to the game software.

DMA Engines

Direct Memory Access (DMA) refers to the ability for a peripheral device to read and write to the CPU's RAM without the CPU being involved. All modern high-speed peripherals use DMA for most of their communication with the CPU, but that's not the only use for DMA. A DMA Engine is a peripheral device that exists solely to move data around; it usually doesn't do anything to that data. The CPU can instruct the DMA engine to perform a copy from one region of RAM to another, and the DMA engine does the rote work of copying potentially gigabytes of data without the CPU having to do a mov (or SIMD equivalent) instruction for every piece, and without polluting CPU caches. DMA engines can also often do more than just offload simple copy operations: they commonly support scatter/gather operations to rearrange data somewhat in the process of moving it around. NVMe already has features like scatter/gather lists that can remove the need for a separate DMA engine to provide that feature, but the NVMe commands in these consoles are acting mostly on compressed data.
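The scatter/gather idea can be modeled in a few lines. The sketch below is a software toy, not how any real DMA engine is programmed: the `Segment` descriptor layout and the `dma_scatter_gather` function are hypothetical, and a real engine would perform the copies in hardware without the CPU touching the data.

```python
from typing import List, Tuple

# Hypothetical descriptor: (source offset, destination offset, length in bytes)
Segment = Tuple[int, int, int]


def dma_scatter_gather(src: bytes, dst: bytearray, segments: List[Segment]) -> None:
    """Copy each described segment from src into dst, rearranging as it goes."""
    src_view = memoryview(src)
    dst_view = memoryview(dst)
    for src_off, dst_off, length in segments:
        if min(src_off, dst_off, length) < 0:
            raise ValueError("negative offset or length")
        if src_off + length > len(src) or dst_off + length > len(dst):
            raise ValueError("segment out of bounds")
        # One descriptor is one contiguous copy.
        dst_view[dst_off:dst_off + length] = src_view[src_off:src_off + length]


source = b"AAAABBBBCCCC"
target = bytearray(len(source))
dma_scatter_gather(source, target, [(0, 8, 4), (4, 0, 4), (8, 4, 4)])
print(bytes(target))  # b'BBBBCCCCAAAA'
```

The CPU's only job in this model is to build the descriptor list; the data movement and reordering happen elsewhere.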

Even though a DMA engine is a peripheral device, you usually won't find one as a standalone PCIe card. It makes the most sense for it to be as close to the memory controller as possible, which means on the chipset or on the CPU die itself. The PS5 SoC includes a DMA engine to handle copying around data coming out of the compression unit. As with the compression engines, this isn't a novel invention so much as a feature missing from standard desktop PCs, which means it's something custom that Sony has to add to what would otherwise be a fairly straightforward AMD APU configuration.

IO Coprocessor

The IO complex in the PS5's SoC also includes a dual-core processor with its own pool of SRAM. Sony has said almost nothing about the internals of this: Mark Cerny describes one core as dedicated to SSD IO, allowing games to "bypass traditional file IO", while the other core is described simply as helping with "memory mapping". For more detail, we have to turn to a patent Sony filed years ago, and hope it reflects what's actually in the PS5.

The IO coprocessor described in Sony's patent offloads portions of what would normally be the operating system's storage drivers. One of its most important duties is to translate between various address spaces. When the game requests a certain range of bytes from one of its files, the game is looking for the uncompressed data. The IO coprocessor figures out which chunks of compressed data are needed and sends NVMe read commands to the SSD. Once the SSD has returned the data, the IO coprocessor sets up the decompression unit to process that data, and the DMA engine to deliver it to the requested locations in the game's memory.
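The address translation step described above can be sketched as a lookup from an uncompressed byte range to the compressed chunks that cover it. Everything in this example is hypothetical: the patent's actual table format is not reproduced here, and the fixed 64 KiB `CHUNK_SIZE`, the `ChunkEntry` and `ReadPlan` types, and `plan_read` are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

CHUNK_SIZE = 64 * 1024  # hypothetical size of each chunk before compression


@dataclass
class ChunkEntry:
    ssd_offset: int      # where the compressed chunk starts on the SSD
    compressed_len: int  # how many compressed bytes to read


@dataclass
class ReadPlan:
    chunk_index: int
    ssd_offset: int
    compressed_len: int
    skip: int  # decompressed bytes to discard at the start of the chunk
    take: int  # decompressed bytes to deliver to the game's buffer


def plan_read(table: List[ChunkEntry], offset: int, length: int) -> List[ReadPlan]:
    """Translate an uncompressed (offset, length) request into chunk reads."""
    if offset < 0 or length < 0 or offset + length > len(table) * CHUNK_SIZE:
        raise ValueError("requested range is outside the file")
    plans = []
    pos, end = offset, offset + length
    while pos < end:
        index, skip = divmod(pos, CHUNK_SIZE)
        take = min(CHUNK_SIZE - skip, end - pos)
        entry = table[index]
        plans.append(ReadPlan(index, entry.ssd_offset, entry.compressed_len, skip, take))
        pos += take
    return plans


# A three-chunk file whose chunks compressed to different sizes.
table = [ChunkEntry(0, 30000), ChunkEntry(30000, 41000), ChunkEntry(71000, 25000)]
for step in plan_read(table, offset=60000, length=80000):
    print(step)
```

In this model each `ReadPlan` corresponds to one NVMe read followed by one decompression job, with `skip` and `take` telling the DMA step which part of the unpacked chunk the game actually asked for.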

Since the IO coprocessor's two cores are each much less powerful than a Zen 2 CPU core, they cannot be in charge of all interaction with the SSD. The coprocessor handles the most common cases of reading data, and the system falls back to the OS running on the Zen 2 cores for the rest. The coprocessor's SRAM isn't used to buffer the vast amounts of game data flowing through the IO complex; instead this memory holds the various lookup tables used by the IO coprocessor. In this respect, it is similar to an SSD controller with a pool of RAM for its mapping tables, but the job of the IO coprocessor is completely different from what an SSD controller does. This is why it will be useful even with aftermarket third-party SSDs.

Cache Coherency

The last somewhat storage-related hardware feature Sony has disclosed is a set of cache coherency engines. The CPU and GPU on the PS5 SoC share the same 16 GB of RAM, which eliminates the step of copying assets from main RAM to VRAM after they're loaded from the SSD and decompressed. But to get the most benefit from the shared pool of memory, the hardware has to ensure cache coherency not just between the several CPU cores, but also with the GPU's various caches. That's all normal for an APU, but what's novel with the PS5 is that the IO complex also participates. When new graphics assets are loaded into memory through the IO complex and overwrite older assets, it sends cache invalidation signals to any relevant caches—to discard only the stale data, rather than flush the entire GPU caches.
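The difference between a targeted invalidation and a full flush can be shown with a toy cache model. This is only a conceptual sketch: the `ToyCache` class, its methods and the 64-byte `LINE_SIZE` are assumptions for illustration and do not describe how the PS5's coherency engines are actually built.

```python
LINE_SIZE = 64  # bytes per cache line; a common size, assumed here


class ToyCache:
    def __init__(self) -> None:
        self.lines = {}  # line base address -> cached bytes

    def fill(self, address: int, data: bytes = b"") -> None:
        """Cache the line containing the given address."""
        self.lines[address - address % LINE_SIZE] = data

    def flush_all(self) -> int:
        """Drop every line; returns how many were discarded."""
        dropped = len(self.lines)
        self.lines.clear()
        return dropped

    def invalidate_range(self, start: int, length: int) -> int:
        """Drop only the lines overlapping [start, start + length)."""
        if length <= 0:
            return 0
        first = start - start % LINE_SIZE
        end = start + length
        stale = [base for base in self.lines if first <= base < end]
        for base in stale:
            del self.lines[base]
        return len(stale)


cache = ToyCache()
for addr in (0, 64, 128, 4096, 4160):
    cache.fill(addr)

# New asset data overwrites 100 bytes starting at address 4096.
dropped = cache.invalidate_range(4096, 100)
print(dropped, sorted(cache.lines))  # 2 [0, 64, 128]
```

Only the two lines covering the overwritten region are discarded, while the three unrelated lines stay warm; `flush_all` would have thrown away all five.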

What about the Xbox Series X?

There's a lot of information above about the Playstation 5's custom IO complex, and it's natural to wonder whether the Xbox Series X will have similar capabilities or if it's limited to just the decompression hardware. Microsoft has lumped the storage-related technologies in the new Xbox under the heading of "Xbox Velocity Architecture".

Microsoft defines this as having four components: the SSD itself, the compression engine, a new software API for accessing storage (more on this later), and a hardware feature called Sampler Feedback Streaming. That last one is only distantly related to storage; it's a GPU feature that makes partially resident textures more useful by allowing shader programs to keep a record of which portions of a texture are actually being used. This information can be used to decide what data to evict from RAM and what to load next—such as a higher-resolution version of the texture regions that are actually visible at the moment.
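The way usage feedback could drive streaming decisions can be sketched as a comparison between what is resident and what was sampled. This is a conceptual toy rather than the DirectX API: `plan_streaming`, the tile coordinates and the mip-level dictionaries are all hypothetical, with mip level 0 taken to be the highest resolution.

```python
from typing import Dict, List, Tuple

Tile = Tuple[int, int]  # (column, row) of a texture tile


def plan_streaming(
    resident: Dict[Tile, int], feedback: Dict[Tile, int]
) -> Tuple[Dict[Tile, int], List[Tile]]:
    """Decide what to load and what to evict.

    resident: tile -> finest mip level currently in RAM (0 is finest)
    feedback: tile -> finest mip level the shaders actually sampled
    """
    to_load = {}
    for tile, wanted in feedback.items():
        have = resident.get(tile)
        # Load if the tile is missing or only present at a coarser level.
        if have is None or wanted < have:
            to_load[tile] = wanted
    # Tiles nobody sampled are candidates for eviction.
    to_evict = [tile for tile in resident if tile not in feedback]
    return to_load, to_evict


resident = {(0, 0): 2, (0, 1): 0, (1, 0): 1}
feedback = {(0, 0): 0, (0, 1): 0, (1, 1): 3}
to_load, to_evict = plan_streaming(resident, feedback)
print(to_load)   # {(0, 0): 0, (1, 1): 3}
print(to_evict)  # [(1, 0)]
```

The point of the hardware feature is that the `feedback` side of this comparison is recorded by the GPU itself instead of being guessed by the game engine.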

Since Microsoft doesn't mention anything like the other PS5 IO complex features, it's reasonable to assume the Xbox Series X doesn't have those capabilities and its IO is largely managed by the CPU cores. But I wouldn't be too surprised to find out the Series X has a comparable DMA engine, because that kind of feature has historically shown up in many console architectures.

200 Comments

  • Oxford Guy - Monday, June 15, 2020 - link

    "The other point is that PCs, while more complicated"

    False.

    "Consoles" of today, except for the Switch, ARE PCs.

    They are merely PCs with different walled gardens and, unlike the "PC" platform, they can't be used outside of those walled gardens. On the "PC", though, there is Linux, which offers freedom from the Microsoft and Sony taxes.
  • close - Tuesday, June 16, 2020 - link

    @hecksagon, this "oh it's just 30FPS" routine is pretty old. Whatever you think is a good resolution and framerate, someone thinks it should be higher. Current gen consoles run games at 4K@30-60FPS. *I* personally find that more than decent and certainly better than the hassle of PC gaming. And I say this as someone who plays on the console just slightly more than on the PC. Graphics are the bonus in a game, I still enjoy an "8 bit" game even without the Ks and the FPSs, I still play oldies.

    So if you say rock solid 60FPS is good, someone can just reply "144FPS or bust, anything else is for micropeenuses" (got you there ;)). 4K is good? You need at least 4 x 5K monitors, anything else..." well you get the point. Maybe.

    It's nice that you went on to list all the (dis)advantages that were already discussed previously by myself and others just before reaching the conclusion that for myself and many, many others the tradeoffs are worth it. But leave it to a 16 year old to think there's only room in this world for what they like "na-d'uh". You'd like to think that you're some sort of genius surrounded by millions of idiots who for some reason impossible to understand chose differently. Some day you might still want to play games but not feel like wasting your free time tinkering away.

    So in conclusion yes, no matter how you measure it, mine is bigger ;).
  • FreckledTrout - Saturday, June 13, 2020 - link

    While I am a PC gamer I do get the appeal of "it just works". I buy iPhones because I don't want to deal with tweaking tons of settings and iPhones are fairly well configured right out of the box.
  • Oxford Guy - Monday, June 15, 2020 - link

    "While I am a PC gamer I do get the appeal of 'it just works'."

    Marketing magic. In reality, there is nothing a "console" walled garden offers for consumers in added value. It's all smoke and mirrors.

    Every feature can be done with Linux + Vulkan + OpenGL and done better (lower cost, less inefficiency of having THREE walled gardens).
  • close - Tuesday, June 16, 2020 - link

    @Oxford Guy "Every feature can be done with Linux + Vulkan + OpenGL"

    Ah... all of the things nobody ever wanted to deal with when playing games. In reality you can't buy a console equivalent new PC for less money (plenty of people|bloggers tried and ended up comparing new consoles with second hand PC to even get close).

    Yes, doing it yourself is many times cheaper, including (especially?) that thing that you mostly do by yourself. But people also want convenience. Which is why the lowly console still sells tens of millions of units every year. People even play on phones. And just as a hint that people don't care that much for your opinion... look around ;).
  • close - Saturday, June 13, 2020 - link

    @Retycint, nothing strange. Indeed, a game that is optimized by design to run as well as it can on the given hardware with no tinkering involved is for me far preferable than having to waste an afternoon for every game and maybe get the desired result. And just because you have dozens of settings and powerful hardware it doesn't change the fact that the performance will vary with every driver version. I've had a far more inconsistent gaming experience on my PC than on the console and again, I have a much faster PC than console.

    Yes, there is the risk of a developer getting a game's performance to dip here and there. But then when this happens they usually fix it or the game stays on the shelves. I gave you a concrete example. For me and many others the tradeoff is worth it.
  • whatthe123 - Saturday, June 13, 2020 - link

    except you gave a horrible example. Assassins Creed Odyssey dips in framerate on every platform, especially consoles, where neither the ps4 pro nor the xbox one x can maintain even 30fps as shown by digital foundry. Neither can maintain 4K either and instead use dynamic resolution, which is also available on PC. Basically your example of "it just works" is a game that constantly stutters from frame dips and can't maintain its output resolution. Not entirely sure how that's different from running a PC with random settings and just ignoring frame dips.
  • close - Sunday, June 14, 2020 - link

    Except my example wasn't "dips framerate", it was "freezes". I guess you can call it "dipping framerate all the way down to 0fps for at least 1s". And I guess you could argue that something is wrong on my PC, probably the AV doing something weird, when I installed the latest GPU drivers I didn't properly clean up the old ones, I didn't "defragment" the SSD, etc. But that would just reinforce my point.

    I can give you dozens of examples but I'm sure you'll just find weird ways to "prove" that they are all exceptions that don't count somehow, "everyone knows that" . My GPU alone cost more than any latest gen console at launch, the rest of it is leaps ahead also, and yet the experience on PC was always more inconsistent and full of hassle.

    And if you want to "play that game", every gamer worth their salt agrees that a playable experience starts at 144FPS but should really go higher. Which means your PC also spends 100% of its time in a "framerate dip". So if I'm going to stare at an "eyesore" I'd rather do it with a $400 box that mostly does 4K at 30-60FPS than a $2000 one where there's always a tweak to be done to get it right. I just don't have the time or the patience for that kind of crap. If the game review says "adequate performance" on the console, I know I'll get adequate performance. On the PC it's never that simple.

    You clearly have no consistent first hand experience using both, so that kind of shoots your opinion in both feet and once between the eyes for good measure.
  • Zagor Te Nay - Sunday, June 14, 2020 - link

    I'm with you.

    Also PC and PS4 gamer. Not all games work perfectly on consoles, but if you do some research before you buy a console game - wait for a reliable review or two, for example - and the conclusion is that the game works fine on any given console, you know exactly what performance you will get out of it on your console. With PC, even with minimum/recommended specs, one never knows if the game will behave exactly the same. There might be some legacy hardware, older drivers...

    Back in the days - I think it was AMD64 days - I have built new rig with GeForce 7850 GPU etc. My PC was crashing in 2 out of 3 games. Game would freeze and audio would lock in a loop... hard reset was the only way out. Luckily I was working for IT company so I had access to spare parts I could borrow to troubleshoot.

    After replacing everything meaningful, reinstalling OS and drivers a few times... and starting to get a bit desperate, I have replaced everything that I haven't tried before, just because. Solution to my problem turned out to be replacing Microsoft Internet Explorer keyboard. Keyboard was working perfectly in Windows, and computer never froze on desktop, so I never suspected keyboard. It probably wasn't even faulty - I brought it to office and it worked perfectly fine for years on my work PC, until I decided I don't want beige keyboard any more.

    I know it is extreme one off, but I did have other share of smaller compatibility issues, since I started gaming on PC in early '90. Consoles are more straight forward.
  • SirPerro - Sunday, June 14, 2020 - link

    Well but that's exactly it. "It just works" means that. That's what Apple users expect from their computer or phone. And you may not prefer consoles, but "They just work".

    I also like PC gaming, but I'm aware there's A LOT of knowledge involved which is not necessary with consoles.

    We are a small minority of people who understand really technical concepts. Power users if you want. But people out there don't know what frame rate is. They don't have a choice because it's overwhelming for them. That's EXACTLY what consoles do best. They create a layer of abstraction that works for too many people.
