Nvidia’s RTX 40-series graphics cards are coming in a few weeks, but among all the hardware improvements is what could be Nvidia’s golden egg: DLSS 3. It’s much more than just an update to Nvidia’s popular DLSS (Deep Learning Super Sampling) feature, and it could end up defining Nvidia’s next generation much more than the graphics cards themselves.
AMD has been working hard to bring its FidelityFX Super Resolution (FSR) on par with DLSS, and for the past few months, it has been successful. It looks like DLSS 3 will change that dynamic, and this time, FSR may not be able to catch up anytime soon.
How DLSS 3 works (and how it doesn’t)
You’d be forgiven for thinking that DLSS 3 is a completely new version of DLSS, but it’s not. Or at least, it’s not entirely new. The backbone of DLSS 3 is the same super-resolution technology that’s available in DLSS titles today, and presumably Nvidia will continue to improve it with new releases. Nvidia says that you will now see the super resolution portion of DLSS 3 as a separate option in the graphics settings.
The new part is frame generation. DLSS 3 will generate a completely unique frame every other frame, essentially generating seven out of every eight pixels you see. You can see an illustration of that in the flowchart below. In the case of 4K, your GPU only renders the pixels for 1080p and uses that information not only for the current frame but for the next frame as well.
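The seven-out-of-eight figure is simple arithmetic. Here is a minimal sketch of it, assuming 4K means 3840×2160, the internal render resolution is 1920×1080, and exactly one frame is generated for every rendered frame. The function name is made up for illustration.

```python
from fractions import Fraction

def generated_pixel_share(render_res, output_res, generated_per_rendered=1):
    """Share of displayed pixels that are generated rather than rendered."""
    rw, rh = render_res
    ow, oh = output_res
    # Within a rendered frame, only this share of the output pixels is rendered.
    rendered_per_frame = Fraction(rw * rh, ow * oh)
    # Only one out of (1 + generated) displayed frames is rendered at all.
    rendered_frames = Fraction(1, 1 + generated_per_rendered)
    return 1 - rendered_per_frame * rendered_frames

print(generated_pixel_share((1920, 1080), (3840, 2160)))  # 7/8
```

Upscaling alone accounts for three of every four pixels (pass `generated_per_rendered=0` to see that), and generating every other frame pushes the share to seven of eight.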
Frame generation, according to Nvidia, will be a separate setting from super resolution. This is because frame generation only works on RTX 40-series GPUs for now, while super resolution will continue to work on all RTX graphics cards, even in games that have been upgraded to DLSS 3. It should go without saying, but if half of your frames are generated entirely, that will increase your performance by a lot.
However, frame generation isn’t just some secret AI sauce. In DLSS 2 and tools like FSR, motion vectors are a key input for upscaling. They describe where objects move from one frame to the next, but motion vectors only apply to geometry in a scene. Elements that don’t have 3D geometry, such as shadows, reflections, and particles, have traditionally been masked out of the upscaling process to avoid visual artifacts.
Masking isn’t an option when an AI is generating a completely unique frame, which is where the Optical Flow Accelerator in RTX 40-series GPUs comes into play. It’s like a motion vector, except the graphics card tracks the movement of individual pixels from one frame to the next. This optical flow field, along with motion vectors, depth, and color, contributes to the AI-generated frame.
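To make the idea of a per-pixel flow field concrete, here is a toy sketch. It is not Nvidia’s algorithm. It assumes a tiny grayscale image and a hypothetical flow field that stores a (dx, dy) displacement for every pixel, and it builds an in-between frame by moving each pixel half of its displacement. DLSS 3 also feeds motion vectors, depth, and color into a trained model, none of which is modeled here.

```python
def midpoint_frame(frame, flow):
    """Forward-warp each pixel halfway along its flow vector.

    frame: list of rows of brightness values.
    flow:  same shape, each entry a (dx, dy) displacement to the next frame.
    """
    height, width = len(frame), len(frame[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            # Move halfway toward where the pixel lands in the next frame.
            nx, ny = x + round(dx / 2), y + round(dy / 2)
            if 0 <= nx < width and 0 <= ny < height:
                # Keep the brighter value if two pixels land on the same spot.
                out[ny][nx] = max(out[ny][nx], frame[y][x])
    return out

# One bright pixel at x=0 that moves two pixels to the right by the next frame.
frame = [[9, 0, 0, 0]]
flow = [[(2, 0), (0, 0), (0, 0), (0, 0)]]
print(midpoint_frame(frame, flow))  # [[0, 9, 0, 0]]
```

Even this toy version leaves a hole where the pixel used to be and needs a rule for collisions, which hints at why cleaning up a generated frame is the hard part.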
It sounds like all upside, but there’s a big problem with AI-generated frames: they increase latency. The AI-generated frame never passes through your PC; it’s a “fake” frame, so you won’t see it in traditional fps readings in games or tools like FRAPS. So latency doesn’t go down despite all the extra frames, and due to the computational overhead of optical flow, it actually goes up. That’s why DLSS 3 requires Nvidia Reflex to compensate for the higher latency.
Normally, your CPU stores a render queue for your graphics card to make sure your GPU is never waiting for work to be done (that would cause stutters and frame rate drops). Reflex removes the render queue and synchronizes your GPU and CPU so that as soon as your CPU can send instructions, the GPU starts rendering them. Nvidia says that with Reflex applied on top of DLSS 3, latency can sometimes even end up lower.
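A rough way to see why removing the render queue helps is a back-of-the-envelope model. It is a deliberate simplification, not a description of how Reflex is implemented: it assumes the GPU is the bottleneck, the queue is always full, and every frame takes the same time to render. The function name and the numbers are illustrative.

```python
def render_latency_ms(gpu_frame_ms, queued_frames):
    """Rough time from CPU submit to GPU finish when the queue stays full."""
    # A new frame waits for every queued frame ahead of it, then renders itself.
    return (queued_frames + 1) * gpu_frame_ms

print(render_latency_ms(10.0, 2))  # 30.0, two frames waiting in the queue
print(render_latency_ms(10.0, 0))  # 10.0, no queue
```

Under those assumptions, every queued frame adds one full frame time of delay, which is the slack Reflex can give back to offset the cost of frame generation.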
Where AI makes a difference
AMD’s FSR 2.0 doesn’t use AI, and as I wrote a while back, it shows that you can get the same quality as DLSS with algorithms instead of machine learning. DLSS 3 changes that with its unique frame generation capabilities, as well as the introduction of optical flow.
Optical flow isn’t a new idea: It’s been around for decades and has applications in everything from video editing apps to self-driving cars. Calculating optical flow with machine learning, however, is relatively new, made possible by an increase in datasets for training AI models. The reason you’d want to use AI is simple: with enough training, it produces fewer visual artifacts, and it doesn’t have as much overhead at runtime.
DLSS has to run in real time, while the game is running. It is possible to develop an algorithm, free of machine learning, to estimate how each pixel moves from one frame to the next, but it is computationally expensive, which defeats the point of supersampling in the first place. With an AI model that doesn’t require a lot of power and enough training data (and rest assured, Nvidia has plenty of training data to work with), you can achieve high-quality optical flow that can be computed at runtime.
That leads to an improvement in frame rate even in games that are CPU-limited. Supersampling only applies to your resolution, which depends almost entirely on your GPU. With a new frame that bypasses CPU processing, DLSS 3 can double the frame rate in games, even if you’re completely bottlenecked by your CPU. That is impressive and is currently only possible with AI.
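The CPU-limited case can be sketched with a simple bottleneck model. It is an idealization under stated assumptions: the rendered frame rate is whichever of the CPU or GPU is slower, upscaling only speeds up the GPU side, and frame generation adds exactly one frame per rendered frame with no overhead. The function name and figures are hypothetical.

```python
def displayed_fps(cpu_fps, gpu_fps, upscale_speedup=1.0, frame_generation=False):
    """Idealized displayed frame rate for a CPU/GPU pipeline."""
    # Upscaling only helps the GPU side of the pipeline.
    rendered = min(cpu_fps, gpu_fps * upscale_speedup)
    # Generated frames never touch the CPU, so they double the output regardless.
    return rendered * 2 if frame_generation else rendered

print(displayed_fps(60, 120))                                              # 60
print(displayed_fps(60, 120, upscale_speedup=2.0))                         # 60
print(displayed_fps(60, 120, upscale_speedup=2.0, frame_generation=True))  # 120
```

With the CPU capped at 60 fps, making the GPU faster changes nothing, while frame generation still doubles what reaches the screen.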
Why FSR 2.0 can’t catch up (for now)
AMD has really outdone itself with FSR 2.0. It looks fantastic, and the fact that it’s brand-agnostic is even better. I’ve been ready to ditch DLSS for FSR 2.0 ever since I first saw it in Deathloop. But as much as I enjoy FSR 2.0 and think it’s a great piece of work from AMD, it’s not going to catch up with DLSS 3 any time soon.
To begin with, developing an algorithm that can track each pixel between frames without artifacts is difficult enough, especially in a 3D environment with fine, dense detail (Cyberpunk 2077 is a good example). It is possible, but difficult. The bigger issue, however, is how bloated that algorithm would have to be. Tracking each pixel through 3D space, doing the optical flow calculation, generating a frame, and cleaning up any mishaps along the way: that’s a lot to ask.
Getting it to run while a game is running, and still deliver a frame rate improvement on the level of FSR 2.0 or DLSS, is even more to ask. Nvidia, even with dedicated processors and a trained model, still has to use Reflex to compensate for the higher latency imposed by optical flow. Without that hardware or software, FSR would likely trade too much latency to generate frames.
I have no doubt that AMD and other developers will get there eventually, or find another way around the problem, but that could be a few years away. It’s hard to say right now.
What is easy to say is that DLSS 3 looks very exciting. Of course, we’ll have to wait until it’s here to validate Nvidia’s performance claims and see how image quality holds up. So far, we’ve only got a short video from Digital Foundry showing DLSS 3 footage (above), which I would recommend watching until we see more third-party testing. However, from our current vantage point, DLSS 3 certainly looks promising.
This article is part of ReSpec, an ongoing biweekly column that includes in-depth discussions, tips, and reports on the technology behind PC gaming.