
How Nvidia DLSS 3 works and why FSR can’t catch up for now

Nvidia’s RTX 40-series graphics cards are coming in a few weeks, but among all the hardware improvements is what could be Nvidia’s golden egg: DLSS 3. It’s much more than just an update to Nvidia’s popular Deep Learning Super Sampling (DLSS) feature, and it could end up defining the next generation of Nvidia much more than the graphics cards themselves.

AMD has been working hard to bring its FidelityFX Super Resolution (FSR) up to par with DLSS, and for the past few months, it has succeeded. It looks like DLSS 3 will change that dynamic, and this time, FSR may not be able to catch up anytime soon.

How DLSS 3 works (and how it doesn’t)


You’d be forgiven for thinking that DLSS 3 is a completely new version of DLSS, but it’s not. Or at least, it’s not entirely new. The backbone of DLSS 3 is the same super-resolution technology that’s available in DLSS titles today, and presumably Nvidia will continue to improve it with new releases. Nvidia says that you will now see the super resolution portion of DLSS 3 as a separate option in the graphics settings.

The new part is frame generation. DLSS 3 generates an entirely new frame every other frame, essentially generating seven out of every eight pixels you see. You can see an illustration of that in the flowchart below. In the case of 4K, your GPU only renders the pixels for a 1080p frame and uses that information not only for the current frame but for the next frame as well.

A graph showing how DLSS 3 reconstructs frames. Image: Nvidia
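
To sanity check that seven-out-of-eight figure, here’s a rough back-of-the-envelope sketch in Python. The numbers assume DLSS’s Performance mode, where 4K output is upscaled from a 1080p render; other quality presets change the ratio:

```python
# Rough pixel accounting for DLSS 3 frame generation (illustrative only).
# Assumes DLSS Performance mode: 4K output upscaled from a 1080p render,
# with every other displayed frame fully AI-generated.

rendered_pixels = 1920 * 1080     # pixels the GPU traditionally renders per rendered frame
displayed_pixels = 3840 * 2160    # pixels shown on screen per displayed frame

# Over a pair of displayed frames, one is rendered at 1080p and upscaled,
# and the other is generated entirely.
rendered_fraction = rendered_pixels / (2 * displayed_pixels)

print(f"Traditionally rendered: {rendered_fraction:.3f} of all displayed pixels")  # 0.125
print(f"Upscaled or AI-generated: {1 - rendered_fraction:.3f}")                    # 0.875, i.e. 7 of 8
```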

Frame generation, according to Nvidia, will be a separate toggle from super resolution. That’s because frame generation only works on RTX 40-series GPUs for now, while super resolution will continue to work on all RTX graphics cards, even in games that have been updated to DLSS 3. It should go without saying, but if half of your frames are entirely generated, that will boost your performance by a lot.

However, frame generation isn’t just some secret AI sauce. In DLSS 2 and tools like FSR, motion vectors are a key input for upscaling. They describe where objects move from one frame to the next, but motion vectors only apply to geometry in a scene. Elements that don’t have 3D geometry, such as shadows, reflections, and particles, have traditionally been masked out of the upscaling process to avoid visual artifacts.
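
To make that idea concrete, here’s a simplified sketch of how a temporal upscaler might reproject the last frame using motion vectors while masking out elements that have no geometry. This is only an illustration of the general technique, not Nvidia’s or AMD’s actual pipeline, and the function and array layout are my own invention:

```python
import numpy as np

def reproject_previous_frame(prev_frame, motion_vectors, geometry_mask):
    """Warp last frame's colors to where the geometry is predicted to be this frame.

    prev_frame:     (H, W, 3) float color from the previous frame
    motion_vectors: (H, W, 2) per-pixel screen-space motion of the underlying geometry
    geometry_mask:  (H, W) bool; False for shadows, reflections, and particles,
                    which have no geometry and would smear if reprojected
    """
    h, w = geometry_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(xs - motion_vectors[..., 0], 0, w - 1).astype(int)
    src_y = np.clip(ys - motion_vectors[..., 1], 0, h - 1).astype(int)

    warped = prev_frame[src_y, src_x]
    # Masked-out pixels ignore history; a real upscaler would fall back to the
    # freshly rendered low-resolution sample here instead of zeroing them.
    warped[~geometry_mask] = 0.0
    return warped
```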


Masking isn’t an option when an AI is generating a completely new frame, which is where the Optical Flow Accelerator on RTX 40-series GPUs comes into play. It’s like a motion vector, except the graphics card is tracking the movement of individual pixels from one frame to the next. This optical flow field, along with motion vectors, depth, and color, contributes to the AI-generated frame.
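
Put together, you can think of the frame generation step as a function of those inputs. The sketch below only illustrates the data flow Nvidia describes; the actual model and its interface are not public, so every name here is hypothetical:

```python
def generate_intermediate_frame(model, color_prev, color_curr,
                                motion_vectors, optical_flow, depth):
    """Hypothetical frame-generation step (data-flow illustration only).

    Motion vectors cover geometry, the optical flow field covers per-pixel
    motion of everything else (shadows, reflections, particles), and depth
    plus color help the network resolve occlusions.
    """
    inputs = {
        "color_prev": color_prev,          # previous rendered frame
        "color_curr": color_curr,          # current rendered frame
        "motion_vectors": motion_vectors,  # engine-supplied geometry motion
        "optical_flow": optical_flow,      # from the Optical Flow Accelerator
        "depth": depth,                    # current depth buffer
    }
    # The model outputs an entirely new frame, displayed between the two rendered ones.
    return model(inputs)
```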

It sounds like all upside, but there’s a big problem with AI-generated frames: they increase latency. The AI-generated frame never passes through your PC the way a traditionally rendered frame does; it’s a “fake” frame, so you won’t see it in traditional fps readouts in games or tools like FRAPS. So latency doesn’t go down despite all the extra frames, and due to the computational overhead of optical flow, it actually goes up. That’s why DLSS 3 requires Nvidia Reflex to compensate for the higher latency.

Normally, your CPU stores up a render queue for your graphics card to make sure your GPU is never waiting for work (that would cause stutters and frame rate drops). Reflex removes the render queue and synchronizes your GPU and CPU so that as soon as your CPU can send instructions, the GPU starts rendering them. When applied on top of DLSS 3, Nvidia says Reflex can sometimes even result in reduced latency.
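
Here’s a toy latency budget that captures both points: why generated frames don’t shorten the input-to-photon path, and why removing the render queue does. The numbers are invented for illustration, not measurements:

```python
# Toy latency accounting (illustrative numbers, not measurements).
cpu_ms = 8.0        # CPU time to prepare a frame
gpu_ms = 16.0       # GPU time to render a frame
queue_depth = 2     # frames sitting in the render queue ahead of yours

# Without Reflex: your input waits behind the queued frames before rendering.
latency_queued_ms = cpu_ms + queue_depth * gpu_ms + gpu_ms   # 56.0

# With Reflex: the queue is gone and the CPU submits work just in time.
latency_reflex_ms = cpu_ms + gpu_ms                          # 24.0

# Frame generation inserts its frames *after* rendering, so it raises displayed
# fps without shortening either path above -- and adds a little overhead of its own.
print(latency_queued_ms, latency_reflex_ms)
```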

Where AI makes a difference

Microsoft Flight Simulator | NVIDIA DLSS 3: Exclusive First Look

AMD’s FSR 2.0 doesn’t use AI, and as I wrote a while back, it shows that you can get the same quality as DLSS with algorithms instead of machine learning. DLSS 3 changes that with its unique frame generation capabilities, as well as the introduction of optical flow.

Optical flow isn’t a new idea: it’s been around for decades and has applications in everything from video editing apps to self-driving cars. However, calculating optical flow with machine learning is relatively new, thanks to an increase in the datasets available for training AI models. The reason you’d want to use AI is simple: with enough training, it produces fewer visual errors, and it doesn’t carry as much overhead at runtime.

DLSS lives and dies by running in real time. It’s possible to develop an algorithm, free of machine learning, that estimates how each pixel moves from one frame to the next, but it’s computationally expensive, which defeats the point of supersampling in the first place. With an AI model that doesn’t require a lot of horsepower, and enough training data (and rest assured, Nvidia has plenty of training data to work with), you can achieve optical flow that is both high quality and able to run in real time.
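
To see why the brute-force, non-ML route is so expensive, consider the most naive approach, exhaustive block matching, where every block of pixels searches a window of the previous frame for its best match. The sketch below is purely illustrative and deliberately unoptimized; the point is the cost, not the method Nvidia uses:

```python
import numpy as np

def naive_block_matching_flow(prev, curr, block=8, search=16):
    """Estimate per-block motion by exhaustive search over grayscale float frames.

    Illustrative only: this is far too slow for real-time use, which is the point.
    """
    h, w = curr.shape
    flow = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(0, h - block, block):
        for bx in range(0, w - block, block):
            patch = curr[by:by + block, bx:bx + block]
            best, best_cost = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= h - block and 0 <= x <= w - block:
                        cost = np.abs(prev[y:y + block, x:x + block] - patch).sum()
                        if cost < best_cost:
                            best_cost, best = cost, (dx, dy)
            flow[by // block, bx // block] = best
    return flow

# At 4K with 8x8 blocks and a +/-16 pixel search window, that's roughly
# (3840/8) * (2160/8) * 33 * 33, or about 141 million patch comparisons per frame pair.
```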

That leads to a frame rate improvement even in games that are CPU-limited. Supersampling only applies to your resolution, which depends almost entirely on your GPU. With a new frame that bypasses CPU processing, DLSS 3 can double frame rates in games even if you’re entirely CPU-bound. That’s impressive, and it’s currently only possible with AI.
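
The arithmetic behind that claim is simple, assuming a game where the CPU is the bottleneck:

```python
# Why frame generation helps even when CPU-bound (toy numbers).
cpu_limited_fps = 60              # the CPU can only prepare 60 frames per second
rendered_fps = cpu_limited_fps    # upscaling alone can't push past the CPU cap
displayed_fps = rendered_fps * 2  # one generated frame after every rendered frame
print(displayed_fps)              # 120, without asking anything more of the CPU
```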

Why FSR 2.0 can’t catch up (for now)

FSR and DLSS image quality comparison in God of War.

AMD has really bent over backwards with FSR 2.0. It looks fantastic, and the fact that it’s brand-agnostic is even better. I’ve been ready to ditch DLSS for FSR 2.0 ever since I first saw it in Deathloop. But as much as I enjoy FSR 2.0 and think it’s a great achievement from AMD, it’s not going to catch up with DLSS 3 any time soon.

To begin with, developing an algorithm that can track each pixel between frames without artifacts is hard enough, especially in a 3D environment with fine, dense detail (Cyberpunk 2077 is a good example). It’s possible, but difficult. The bigger issue, however, is how demanding that algorithm would need to be. Tracking each pixel through 3D space, doing the optical flow calculation, generating a frame, and cleaning up any mishaps along the way: that’s a lot to ask.

Getting it to run while a game is running, and still delivering a frame rate improvement on the level of FSR 2.0 or DLSS, is asking even more. Nvidia, even with dedicated processors and a trained model, still has to use Reflex to compensate for the higher latency imposed by optical flow. Without that hardware or software, FSR would likely trade too much latency for its generated frames.

I have no doubt that AMD and other developers will get there eventually, or find another way to solve the problem, but that could take a few years. It’s hard to say right now.

Coming Soon: GeForce RTX 4090 DLSS 3 First Preview Trailer

What is easy to say is that DLSS 3 looks very exciting. Of course, we’ll have to wait until it’s here to validate Nvidia’s performance claims and see how the image quality holds up. So far, we only have a short video from Digital Foundry showing DLSS 3 footage (above), which I’d recommend watching until we see further third-party testing. From our current vantage point, though, DLSS 3 certainly looks promising.

This article is part of ReSpec, an ongoing biweekly column that includes in-depth discussions, tips, and reports on the technology behind PC gaming.
