Microsoft: DirectStorage 1.1 with GPU Decompression Finally on Its Way
by Ryan Smith on October 14, 2022 10:45 AM EST
Posted in: DirectX 12, Windows 11
As part of this week’s Microsoft Ignite developers conference, Microsoft’s DirectX team has published a few blog posts offering updates on the state of various game development-related projects. The biggest and most interesting of these is an update on DirectStorage, Microsoft’s API for enabling faster game asset loading. In short, the long-awaited 1.1 update, which adds support for GPU asset decompression, is finally on its way, with Microsoft intending to release the API to developers by the end of this year.
As a quick refresher, DirectStorage is Microsoft’s next-generation game asset loading API, and is designed to take advantage of the modern capabilities of both GPUs and storage hardware to allow game assets to be transferred more efficiently to the GPU. On the I/O side of matters, DirectStorage offers new batched I/O operations that are designed to cut down on the number of individual I/O operations, reducing the overall I/O overhead. But even more notable than that, DirectStorage also enables (or rather, will enable) GPU asset decompression, allowing modern compressed assets to bypass the CPU and be decompressed on the GPU instead.
The significance of DirectStorage is that Microsoft wants PCs (and consoles) to be able to better leverage the low random access times and high transfer rates of modern SSDs, enabling games to quickly stream in new assets rather than having to pre-load everything or suffer noticeably slow asset loading, as can be the case today. Under current game development paradigms, the CPU can be a bottleneck in scaling up I/O rates to meet what SSDs can provide, as there are significant CPU costs both in tracking so many I/O operations and in decompressing game assets before passing them on to the GPU. DirectStorage, in turn, is designed to minimize both of these loads and, ultimately, to remove the CPU as much as possible from game asset streaming.
DirectStorage technology was already implemented on Microsoft’s Xbox Series X/S consoles for their launch in 2020, so more recent efforts have focused on porting DirectStorage to Windows and accounting for the non-homogeneous PC hardware ecosystem. Earlier this year Microsoft rolled out DirectStorage 1.0, which implemented the I/O batching improvements, but not the GPU decompression capabilities. This is where DirectStorage 1.1 will come in, as it will finally be enabling the second (and most important) aspect of DirectStorage for PCs.
By allowing GPUs to do game asset decompression, that entire process is offloaded from the CPU. This not only frees the CPU up for other tasks, but it removes a potentially critical bottleneck in game asset streaming. Because modern SSDs are so fast – on the order of hundreds of thousands of IOPS and data transfer rates hitting 7GB/second – the CPU is the weakest link between speedy SSDs and massively parallel GPUs. So under DirectStorage, the CPU is getting cut out almost entirely.
As far as the performance benefits of DirectStorage 1.1 go, the full gains will depend on both the hardware used and how much data a game or other application is attempting to push. Games moving large amounts of data on very fast systems are expected to see the largest gains from the full DirectStorage 1.1 stack, though even lighter games can benefit from the fast access times to NVMe SSDs.
As part of Microsoft’s blog post, the company posted a screenshot from their Bulk Loading sample program for game developers, which offers a simple demonstration and benchmark of DirectStorage 1.1 in action. In Microsoft’s case, they were able to load 5.65GB of assets in 0.8 seconds using GPU decompression on an undisclosed PC, versus 2.36 seconds on the same system with CPU decompression, with the CPU-based run also maxing out the load on the CPU in the process. Like most SDK sample programs, this is a simple test case focused on just one feature, so the real-world gains aren’t likely to be quite so extreme, but it underscores the performance benefits of moving asset decompression from the CPU to the GPU when you have a large amount of asset data.
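To put those two timings in perspective, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch using the figures quoted above; it assumes the 5.65GB figure is the amount of asset data delivered in both runs, and the function name is made up for illustration.

```python
# Figures quoted from Microsoft's Bulk Loading sample screenshot.
ASSET_SIZE_GB = 5.65   # assumed to be the asset data delivered in both runs
GPU_SECONDS = 0.8      # load time with GPU decompression
CPU_SECONDS = 2.36     # load time with CPU decompression


def throughput_gb_per_s(size_gb: float, seconds: float) -> float:
    """Effective asset throughput: data delivered divided by wall time."""
    return size_gb / seconds


gpu_rate = throughput_gb_per_s(ASSET_SIZE_GB, GPU_SECONDS)
cpu_rate = throughput_gb_per_s(ASSET_SIZE_GB, CPU_SECONDS)
speedup = CPU_SECONDS / GPU_SECONDS

print(f"GPU decompression: {gpu_rate:.2f} GB/s")
print(f"CPU decompression: {cpu_rate:.2f} GB/s")
print(f"Speedup: {speedup:.2f}x")
```

In other words, the sample works out to roughly 7.1 GB/second of assets with GPU decompression versus about 2.4 GB/second with CPU decompression, or a little under a 3x difference in load time.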
Moving under the hood, DirectStorage GPU decompression is being enabled via the introduction of GDeflate, a general-purpose compression algorithm that was originally developed by NVIDIA. GDeflate is a GPU-optimized variation on Deflate, which has been designed to better mesh with the massively parallel (and not-very-serial) nature of GPUs.
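The article does not go into GDeflate's actual bitstream layout, so the following is only an analogy for why a parallel-friendly layout matters. It uses ordinary Deflate via Python's standard `zlib` module (not GDeflate) and a chunk size picked arbitrarily for this sketch: when the data is compressed as independent chunks, no chunk has to wait on another, so many workers can decompress at once.

```python
from __future__ import annotations

import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size chosen for this sketch only.
CHUNK_SIZE = 64 * 1024


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk as its own independent Deflate stream."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def decompress_chunks(chunks: list[bytes], workers: int = 4) -> bytes:
    """Decompress chunks concurrently; no chunk depends on any other."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so a plain join is safe.
        return b"".join(pool.map(zlib.decompress, chunks))


payload = bytes(range(256)) * 4096  # 1 MiB of sample data
chunks = compress_chunks(payload)
restored = decompress_chunks(chunks)
assert restored == payload
print(f"{len(chunks)} independent chunks, round trip OK")
```

A single monolithic Deflate stream, by contrast, has to be decoded from start to finish in order, which is a poor fit for hardware built around thousands of threads working side by side.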
DirectStorage, in turn, will be implementing GDeflate support in two different manners. The first (and preferred) manner is to pass things off to the GPU drivers and have the GPU vendor take care of it as they see fit. This will allow hardware vendors to optimize for the specific hardware/architecture used, and to leverage any special hardware processing blocks if they’re available. All three major PC GPU vendors (AMD, Intel, and NVIDIA) are eager to get the show on the road, and it’s likely some (if not all) of them will have DirectStorage 1.1-capable drivers ready before the API even ships to game developers.
Failing that, Microsoft is also providing a generic (but optimized) DirectCompute GDeflate decompressor, which can be run on any DirectX 12 GPU that supports Shader Model 6.0. This means that, in some form or another, GDeflate will be available with virtually any PC GPU made in the last 10 years – though more recent GPUs are expected to offer much better performance.
Otherwise, the only other requirements for taking advantage of GPU decompression – and DirectStorage 1.1 in general – will be Windows 10 1909 (or later) or Windows 11, as well as a fast storage device. Technically, DirectStorage works with any storage device, including SATA SSDs, but it is explicitly being optimized for (and will deliver the best results on) systems using NVMe SSDs.
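The two-tier arrangement described above boils down to a simple preference order, which can be sketched as follows. Everything here is hypothetical and for illustration only: the `GpuInfo` record, its fields, and `choose_path` are not part of the DirectStorage API, and the final CPU fallback is an assumption of this sketch, not something the announcement spells out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class DecompressionPath(Enum):
    DRIVER = "vendor driver implementation"
    GENERIC_COMPUTE = "generic DirectCompute decompressor"
    CPU = "CPU decompression"  # assumed last resort in this sketch


@dataclass
class GpuInfo:
    """Hypothetical capability record, not a real DirectStorage type."""
    driver_supports_gdeflate: bool
    shader_model: Tuple[int, int]  # (major, minor)


def choose_path(gpu: GpuInfo) -> DecompressionPath:
    # Preferred: let the GPU vendor's driver handle GDeflate as it sees fit.
    if gpu.driver_supports_gdeflate:
        return DecompressionPath.DRIVER
    # Otherwise: the generic compute decompressor needs Shader Model 6.0.
    if gpu.shader_model >= (6, 0):
        return DecompressionPath.GENERIC_COMPUTE
    # Assumption: anything older decompresses on the CPU.
    return DecompressionPath.CPU


print(choose_path(GpuInfo(driver_supports_gdeflate=False, shader_model=(6, 5))).value)
```

The point of ordering things this way is that a vendor-tuned path is used whenever one exists, while the generic path keeps the feature available on GPUs whose drivers never gain a dedicated implementation.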
Do note, however, that it will be up to individual games to implement DirectStorage to see the benefits of the API. That means not only using the necessary API hooks, but also shipping games with assets packed using the new GDeflate algorithm. The vast backwards compatibility of GDeflate means that game devs can essentially hit the ground running here on DX12 games – anything worth running a new game on is going to support DirectStorage and GDeflate – but the fact that it involves game assets means that full DirectStorage 1.1 support cannot be trivially added to existing games. Developers would need to redistribute (or otherwise recompress) game assets for GDeflate, which is certainly do-able, but would require gamers to re-download a large part of a game. So gamers should plan on seeing DirectStorage 1.1 arrive as a feature in future games, rather than backported into existing games.
Finally, as for Microsoft’s audience at hand (developers), this week’s announcement from Microsoft is meant to prod them into getting ready for the updated API ahead of its release later this year. Microsoft isn’t releasing the API documentation or tools at this time, but they are encouraging developers to get started with DirectStorage 1.0, so that they can take the next step and add GPU decompression once 1.1 is available later this year.
Source: Microsoft DirectX Dev Blog
Small Bison - Sunday, October 16, 2022
These are all assets the GPU needs (textures, geometry, etc) so time spent sending it to the GPU isn’t overhead. Given that, it’s probably still worth decompressing on the GPU, even if the CPU would be a little faster at it, just to save time transferring everything over the PCIe bus. (I’m assuming in your hypothetical that there’s some synergy between decrypting and decompressing that makes the latter faster when done simultaneously with the former)
0ldman79 - Friday, October 14, 2022
Still seems like adding it to older games would show gains even if they weren't compressed with it in mind, where it worked properly the gains should far offset anywhere they didn't.
It's not like they can't write the software to decode virtually any container.
wr3zzz - Saturday, October 15, 2022
When sites say DirectStorage 1.0 is already out do they mean out to developers or already in Windows?
With no DirectStorage games in sight I was hoping DirectStorage can at least improve I/O of small files in Windows file system. Moving/copying/deleting thousands of files such as decompressed video png even with NVMe is still as painful as using HDD.
Ryan Smith - Saturday, October 15, 2022
"When sites say DirectStorage 1.0 is already out do they mean out to developers or already in Windows?"
Both. That said, you'd have to check the API docs to see if there's even a provision for deleting things. It's primarily designed for loading game assets.
Aside from that, Windows 11 does have some storage stack optimizations that better tune things for NVMe drives. And, of course, deleting files via the CLI is a good deal faster than Windows Explorer.
GeoffreyA - Sunday, October 16, 2022
I think the problem is that, inevitably, the updating of thousands of MFT records takes up time. Perhaps the updates are written out to disk one at a time, in a synchronous fashion with the file being deleted. Checking that a file's handles are closed could be adding to it as well. And, as Ryan pointed out, the GUI itself eats up a big share. Of course, I have got no idea how it actually works, but am speculating.
GeoffreyA - Sunday, October 16, 2022
Incidentally, I deleted a folder now---an FFmpeg compilation of over 200,000 files---and it seems to corroborate this. While deleting, Task Manager showed that it wrote consistently to the SSD, at < 10 MB/sec, along with 30% CPU usage.
James5mith - Sunday, October 16, 2022
Sounds like you need an Optane drive. Too bad it was decided they weren't worth the money.
Zingam - Saturday, October 15, 2022
Who thinks we don't need no CPU no more!
JKJK - Saturday, October 15, 2022
We've heard about this on pc for 3 years now ... and still nothing.
I'll believe it when I see it. However, I really hope it kicks off asap. Because this is much needed.
LuxZg - Sunday, October 16, 2022
I am confused here... I thought most DX games used the DXT variations for compression of textures, and that GPUs already had that in hardware for past 10 years or so. So why Gdeflate?!? Ok, that's 'just' textures, but that's also 90% of bandwidth, isn't it? Seems so much work when they could've made a system that allowed most of existing games to use DX Storage and GPU decompression for 80-90% of assets just by leveraging OS, DX APIs and drivers.