Consumers were confused by the announcement of the RTX 4080. When I talked to other tech enthusiasts, they asked, “Hello, are you reviewing 4080?” which speaks volumes about how consumers received the card’s announcement. Truthfully? It doesn’t matter what it’s called. Perception won’t make you play better or give you more frames. Performance is what matters most. That’s why we’re here.
Due to market conditions and chip shortages, NVIDIA rewrote its usual launch strategy. It launched its flagship product without the base range of cards that would have given consumers a choice at every price point. NVIDIA claimed that the flagship delivered a significant performance boost over the previous generation: double, triple, and in some cases even four times the performance. That was hard to believe when normal generational uplifts are in the 30% range.
This technology can be paired with their DLSS 3 technology to achieve excellent framerates. The newest technology can generate more frames and deliver 4K/120 in almost every game we tested. The generational improvements made throughout the card resulted in a dramatic improvement in the stereoscopic framerate. It’s large at three slots and can drink power like water. But it’s a potent card. A month later, we now have the “good” 4080. It’s time for us to test the NVIDIA GeForce RTX 4080.
A truth. The power limits:
AD103 (DT), 450W, AD103 (Mobile), 175W;
AD104 (DT), 400W, AD104 (Mobile), 175W;
AD106 (DT), 260W, AD106 (Mobile), 140W.
But I don't think we need to use the full power cap.
— kopite7kimi (@kopite7kimi) June 18, 2022
You might be tempted to buy a 4080 because you expect it to be smaller. But I have news: the RTX 4080 has exactly the same dimensions as the RTX 4090. The 4080 will occupy three slots and crowd the next PCIe slot, just like the 4090. Airflow is another important consideration. Although it’s unlikely that your new GPU will be crowded, someone will find a way to pair it with a mini-ITX build.
Many important pieces of tech have received a significant upgrade with the 4000 series cards. Let’s take a look at a few.
NVIDIA GeForce RTX 40 Series Graphics Card Lineup (Rumored):
| GRAPHICS CARD | GPU | PCB VARIANT | SM UNITS / CORES | MEMORY / BUS | MEMORY CLOCK / BANDWIDTH | TBP | POWER CONNECTORS | LAUNCH |
|---|---|---|---|---|---|---|---|---|
| NVIDIA Titan A / GeForce RTX 40? | AD102-450? | PG137-SKU0 | 142 / 18176? | 48 GB / 384-bit | 24 Gbps / 1.15 TB/s | ~800W | 2x 16-pin | TBD |
| NVIDIA GeForce RTX 4090 Ti | AD102-350? | TBD | 144 / 18432? | 24 GB / 384-bit | 24 Gbps / 1.15 TB/s | ~600W | 1x 16-pin | TBD |
| NVIDIA GeForce RTX 4090 | AD102-300? | PG137/139 SKU330 | 128 / 16384? | 24 GB / 384-bit | 21 Gbps / 1.00 TB/s | ~450W | 1x 16-pin | Q4 2022 |
| NVIDIA GeForce RTX 4080 | AD103-300? | PG136/139-SKU360 | 76 / 9728? | 16 GB / 256-bit | 23 Gbps / 760 GB/s | ~340W | 1x 16-pin | Q4 2022 |
| NVIDIA GeForce RTX 4080 | AD103/104? | PG141-SKU340/341 | TBD | 12 GB / 192-bit | 23 Gbps / 552 GB/s | ~285W | 1x 16-pin | Q4 2022 |
| NVIDIA GeForce RTX 4070 | AD104-400? | PG141-SKU331 | 60 / 7680? | 12 GB / 192-bit | 21 Gbps / 504 GB/s | ~285W | 1x 16-pin | Q4 2022 |
| NVIDIA GeForce RTX 4060 Ti | AD104-180? | TBD | 48 / 6144? | 10 GB / 160-bit | 17.5 Gbps / 350 GB/s | ~275W | 1x 16-pin | Q1 2023 |
| NVIDIA GeForce RTX 4060 | AD106-300? | TBD | 31 / 3968? | 8 GB / 128-bit | 17 Gbps / 272 GB/s | ~235W | 1x 16-pin | Q1 2023 |
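The bandwidth column in the table follows from the bus width and the memory clock. As a rough sketch (the function name and the selection of rows are my own, and the inputs are the rumored figures above, not confirmed specs), peak bandwidth is the bus width in bytes multiplied by the per-pin data rate:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: (bus width in bits / 8 bits per byte) * per-pin rate in Gbps."""
    return bus_width_bits / 8 * data_rate_gbps


# (bus width in bits, memory clock in Gbps), copied from the rumored table above.
rumored = {
    "RTX 4090": (384, 21.0),
    "RTX 4080 16 GB": (256, 23.0),
    "RTX 4070": (192, 21.0),
    "RTX 4060": (128, 17.0),
}

for card, (bus, rate) in rumored.items():
    print(f"{card}: {memory_bandwidth_gbs(bus, rate):.0f} GB/s")
```

With these inputs the 4090 row works out to 1008 GB/s, matching the roughly 1.00 TB/s listed. The 16 GB 4080 row works out to 736 GB/s rather than the 760 GB/s listed, which is a reminder that the table is rumor rather than spec.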
NVIDIA introduced CUDA (Compute Unified Device Architecture), its parallel processing platform, in 2007. CUDA cores are the units that process and stream graphics data. The data is sent through SMs (Streaming Multiprocessors), which feed the CUDA cores in parallel through the cache memory. The CUDA cores process each instruction as they receive it. Powerful as it was, the RTX 2080 Ti had only 4,352 cores. The RTX 3080 Ti is even more powerful, with a staggering 10,240 CUDA cores, just 256 fewer than the 3090. The NVIDIA GeForce RTX 4090 ships with 16,384 cores, and the RTX 4080 comes with 9,728. But that’s not the entire story, and we’ll return to it.
So, who is Ada Lovelace? And why am I constantly hearing her name regarding this card? You may not have realized it, but NVIDIA uses code names for its processors, and it names them after well-known scientists. Kepler and Turing are just a few examples. Fermi and Maxwell are others. Ada Lovelace is the most recent. NVIDIA is acknowledging the contributions of these people to some of the most significant technological advances humanity has ever seen, so this is a nice nod. If it sends you down a scientific rabbit hole, that’s mission accomplished.
The 4000 series cards bring a new generation of GPU technology. While we’ll be getting to Tensor cores and RT cores later, the core of the next generation is said to deliver faster and more efficient performance in three areas: streaming multiprocessing, AI performance, and ray tracing. NVIDIA claims that the core can double its performance in all three areas. These claims are easy to quantify and test, which is why I love them.
What’s a Tensor Core?
Another example of “but why should I need it?” in GPU architecture is the Tensor Core. NVIDIA’s Tensor Core technology was first used in data centers and high-performance supercomputing before it finally made its way to consumer cards with the 20X0 series. The RTX 40 series carries the fourth generation. The 2080 Ti shipped with 544 second-gen Tensor cores, while the 3080 Ti came with 320 and the 3090 with 328. The new 4090 ships with 512 Tensor Cores, and the RTX 4080 with 304. But, again, the count is only a tiny part of the story, so I ask you to put a pin in that thought. What do they do?
Simply put, Tensor cores power technologies such as DLSS. The RTX 4090 includes DLSS 3. This is a significant leap over the 3000 series cards and is mainly responsible for NVIDIA’s massive framerate improvements. We will test it to determine whether the improvement is due to DLSS or the hardware itself. This is crucial since not all games support DLSS. However, this may change.
Adoption was one of the problems DLSS 1 and 2 had. Studios had to make a real effort to train the neural network to read images and decide what to do next. The results were terrific, with 2.0 producing cleaner images than the source image. However, the technology still needed the game community to adopt it. Some companies embraced it, and beautiful visuals were produced in games such as Metro: Exodus and Shadow of the Tomb Raider. You can say what you like about the games, but visuals weren’t the problem. Nevertheless, growth would stay slow unless the technology was adopted at the engine level. That’s precisely what they did with DLSS 3.
DLSS 3 is exclusive to the 4000 series cards. Cards from earlier generations remain on DLSS 2.0. This is because the 4000 series cards contain the most advanced cores, namely the 4th-gen Tensor Cores and the new Optical Flow Accelerator. Although this may sound disappointing, there is some light at the end of the tunnel. DLSS 3 is now supported at the engine level by Epic’s Unreal Engine 4/5 and Unity. That should cover the vast majority of future game releases. Although I don’t know exactly what developers need to do, having the support at a deeper level will help. Here is a list of games that support DLSS 3.
DLSS 2.0 is supported by nearly 300 games, with 39 already supporting DLSS 3. It is also natively supported in the Frostbite Engine (the Madden series, the Battlefield series, Need for Speed Unbound, and the Dead Space remake), Unity (Fall Guys, Among Us, Cuphead, Genshin Impact, Pokemon Go, Ori and the Will of the Wisps, and Beat Saber), and Unreal Engine 4 & 5 (the next Witcher, Warhammer 40,000: Darktide, Hogwarts Legacy, Loopmancer, Atomic Heart, and many other unannounced titles), among others. We will likely see a significant increase in titles with native engine support in the future.
DLSS stands for Deep Learning Super Sampling, and it works the way the name suggests. An AI-driven deep learning network analyzes a frame of a game and supersamples it, raising the resolution while speeding up rendering. DLSS 1.0 and 2.0 used a technique that analyzed a frame and projected the next one, and so on throughout your game. DLSS 3 does not rely on this alone. Instead, it uses the new Optical Multi Frame Generation to create new frames. It is not just adding pixels to the scene but also reconstructing parts of it faster and more efficiently.
DLSS 3’s AI reconstructs 3/4 of the first frame and the entire second frame. Then it reconstructs 3/4 of the third frame and the whole fourth. DLSS 3 uses these four frames and data from the Optical Flow Accelerator to predict the next frame based on where objects are in the scene and where they are going. In total, the AI generates 7/8ths of what you see, with only 1/8th of the pixels rendered traditionally. The prediction is nearly invisible to the human eye, and it accounts for light, shadows, and particle effects. I will show you the remarkable improvements in the benchmarks.
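The 7/8ths figure falls out of the frame pattern described above. Here is a minimal sketch of that arithmetic; the variable and function names are illustrative only, and this is a back-of-the-envelope check rather than anything taken from NVIDIA’s implementation:

```python
from fractions import Fraction

# Share of each frame's pixels that the AI reconstructs, per the description above:
# 3/4 of the first frame (super resolution) and all of the second (frame generation).
# The pattern then repeats for the third and fourth frames.
reconstructed_per_frame = [Fraction(3, 4), Fraction(1, 1)]


def generated_share(pattern):
    """Average share of displayed pixels produced by the AI across the frame pattern."""
    return sum(pattern) / len(pattern)


ai_share = generated_share(reconstructed_per_frame)
rendered_share = 1 - ai_share
print(ai_share, rendered_share)  # 7/8 1/8
```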
It’s not all sunshine, rainbows, and DLSS, however. According to a Eurogamer report, there are some early-adopter woes: some scenes showed ghosting and artifacts. I can’t confirm their findings, as it is something that I have yet to see in my tests on the 4090 and 4080. It could be a matter of game patches, updated drivers, or blind luck, but I have not had such experiences. Your mileage may vary, but you likely won’t notice at these insanely high framerates and resolutions. If it causes you heartburn, we’ll also show the rasterized, non-DLSS framerates in our benchmarks. Let’s move on.
The boost clock is not new; it was introduced with the GeForce GTX 600 series. However, it is an essential part of getting every frame from your card. The base clock is the card’s “stock” speed, which you can expect at any given moment regardless of conditions. The boost clock lets the GPU adjust its speed dynamically, drawing more power when necessary. The boost clock for the RTX 3080 Ti was 1.66GHz, and a few third-party cards were capable of overclocked speeds up to 1.8GHz. The RTX 4090 ships with a boost clock of 2.52GHz, and the 4080 comes close behind at 2.51GHz. Although you don’t have to, overclockers have shown plenty of headroom in the 4090, and the 4080 will be no different. It’s the right time to discuss power.
The RTX 4090 is a powerful piece of tech. However, to power it, you will need 100 watts more than the RTX 3090. NVIDIA recommends a power supply of at least 850W for the 4090, which has a TDP of 450W. The RTX 4080 follows suit with a recommendation of 750W for your PSU and a TDP of 320W.
With a PCIe Gen5 power supply, the card takes a dedicated cable, which means you won’t be using more than one power lead. If you don’t have a Gen5 power supply, don’t worry: the adapter can be used instead. With it, you connect three 6+2 PCIe cables to power your card; if you want to overclock, you need to connect four. Overclocking should be possible if the card has the same headroom as its predecessor. However, that is beyond the scope of this review. There will undoubtedly be many people who feel the need to push this beast beyond its natural limits. Dear reader, I suggest that you first read the benchmarks in this review. You may find that overclocking doesn’t make sense for you.
Many videos have been made about the durability of the adapter cable since the launch of the 4090. Tech experts have bent the adapter cable to a breaking point, causing it to malfunction. This level of testing is pointless, as the same outcome would be achieved if you did that to a PCIe Gen4 power supply cable. Manufacturers have also pointed out the cable’s limitations: it is designed to be unplugged and plugged in approximately 30 times. That isn’t many for press people like me. I can guarantee that I have unplugged my 4090 more times than that over the past 30 days as I test different M.2 drives and video cards. You, the consumer, will likely do this once, when you install the card. Despite all my abuse, I still have all the pins intact on my adapter. Remember, the same plug/unplug recommendations applied to the previous generation, and nobody cried then. What I am trying to say is: let’s move on.
The RTX 3080 and 3080 Ti used GDDR6X because its vastly expanded memory pipeline provides the best bandwidth. The GeForce RTX 3090 Ti had 24GB of memory on a memory bus 384 bits wide. That allows for more throughput than the traditional GDDR6 on the 256-bit-wide bus you would find in a GeForce RTX 3070. The RTX 4090 has the same pipeline width as the RTX 3090 Ti and 24GB of GDDR6X. The RTX 4080 has 16GB of GDDR6X and the same 21.2GB/s throughput as its larger brother. This will likely change as we move down the 4000 series stack. However, these cards are still the best.
Shader Execution Reordering
Shader Execution Reordering is a new tech feature only available on the 4000 series cards, and it makes shader processing on them more efficient. Shader operations, which calculate light and shadow values and color gradients, are currently processed in the order they are received. This means many tasks are done out of order relative to when the engine will consume them. Although this works, it could be more efficient. With Shader Execution Reordering, these operations can be reorganized so that they are delivered in a sequence grouped with similar workloads. This can yield a net improvement of up to 25% in framerates and up to 3X in raytracing operations. You’ll see this in our benchmarks later in the review.
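To make the idea concrete, here is a toy model of reordering, not NVIDIA’s implementation. The shader names and the “switch count” metric are hypothetical; the sketch only shows why grouping similar work together means fewer changes of shader between consecutive work items:

```python
def shader_switches(hits):
    """Count how often consecutive work items need a different shader."""
    return sum(1 for prev, cur in zip(hits, hits[1:]) if prev != cur)


def reorder_by_shader(hits):
    """Group work items that run the same shader next to each other."""
    return sorted(hits)


# Hypothetical shader IDs, in the order rays happened to hit surfaces.
arrival_order = ["glass", "metal", "skin", "glass", "metal", "glass", "skin", "metal"]
reordered = reorder_by_shader(arrival_order)

print(shader_switches(arrival_order))  # 7
print(shader_switches(reordered))      # 2
```

The work done is identical in both cases; only the order changes, which is the point of the technique.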
What’s an RT Core?
The RT core is perhaps the most confusing aspect of the RTX series. This core is a dedicated pipeline to the streaming multiprocessor (SM), where light ray and triangle intersections and other calculations are made. It is the math unit responsible for making real-time lighting look and work at its best. Multiple RT cores and SMs work together to interleave instructions. This allows multiple light sources to be processed simultaneously, touching many different objects in the environment in multiple ways. Without it, a group of graphic artists and designers has to place lighting and shadows manually and then adjust the scene according to light intensity and intersection. With RTX, they can place the light source anywhere in the scene and let it do all the work. Of course, that is oversimplifying it, but this is the basic idea.
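For a sense of the math involved, here is a software sketch of a single ray/triangle intersection test using the well-known Möller-Trumbore algorithm. This is an illustration of the kind of calculation being accelerated; I am not claiming it is the method the RT core uses internally, and the helper names are my own:

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_distance(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore: distance along the ray to the triangle, or None on a miss."""
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, edge2)
    det = dot(edge1, p)
    if abs(det) < eps:               # ray is parallel to the triangle's plane
        return None
    inv_det = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv_det      # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, edge1)
    v = dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, q) * inv_det      # distance along the ray
    return t if t > eps else None    # ignore hits behind the ray origin


triangle = ((0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 1.0, 5.0))
print(ray_triangle_distance((0.25, 0.25, 0.0), (0.0, 0.0, 1.0), *triangle))  # 5.0
print(ray_triangle_distance((2.0, 2.0, 0.0), (0.0, 0.0, 1.0), *triangle))    # None
```

A ray-traced scene needs a test like this for a huge number of ray and triangle pairs every frame, which is why dedicated hardware for it matters.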
The RT core powers your real-time lighting processing. You’ll hear this when people talk about “shader processing speed.” For comparison, the RTX 3080 Ti could process 34.10 TFLOPS of light and shadow processing, while the 3090 could handle 35.58. What about the RTX 4090? 82.58 TFLOPS across 128 cores.
Here is the most distinguishing difference between the 4080 and 4090. The RTX 4080’s 76 RT cores deliver 48.74 TFLOPS of light and shadow processing power. The 4090 more than doubled the previous generation’s throughput, while the 4080’s 48.74 TFLOPS is roughly 1.4 times the 3080 Ti’s 34.10. It’s reasonable to weigh that against MSRP: the price difference between the 4090 and the 4080 is only 25%.
The Turing architecture cards (20X0 series) were the first to implement this dedicated RT core technology. The 2080 Ti featured 68 RT cores and 26.9 teraflops of throughput. The RTX 3080 Ti has 80 RT cores, just short of the 82 cores found on the RTX 3090. The RTX 4090 has 128 RT cores, and the 4080 has 76. However, these cores are next-generation, just like the Tensor and CUDA cores. Let’s now talk about the combined performance of all these new cores.
This time, we will take a different approach to benchmarks. You’ll see why when you examine the first graph. First, the card is so powerful that measuring it below 4K is pointless. The graph scales so absurdly that it’s hard to compare it with the highest end of any previous generation. These first games combine DLSS 2.0 with rasterized titles. For your reference, here are the 4090 averages:
Metro: Exodus, Wolfenstein, and Rise of the Tomb Raider were among the first to adopt Deep Learning Super Sampling. We can see the effects of the longer maturity window here. Because the scale is so skewed, we will focus on the large spikes before moving on to a more straightforward graph.
I created a new category of modern titles to push the new generation of cards. It also combines native rasterized 4K performance with DLSS 3. It’s easy to see when a title supports DLSS 3: the scale of the graph increases dramatically. We are bringing Microsoft Flight Simulator, Cyberpunk 2077, A Plague Tale: Requiem, F1 22, the Unity Enemies demo, and the Lyra Unreal Engine 5 demo. This time, we’ve also added Loopmancer and F.I.S.T.: Forged in Shadow Torch to the mix. We include the 4090 again as a reference.
These results were captured with all settings at maximum, all RTX options activated, Reflex Boost on, and at 4K resolution. The benchmarks were run multiple times, and I was stunned by the results. Let’s compare them with DLSS 3 enabled.
For fun, we added the RTX 3080 and 3080 Ti so we could compare our benchmarking suite against the 4090 and 4080. Even though they are thoroughly outpaced, these two former flagship cards can still run in the same race. But, of course, these are just the average numbers. We’ll get into the question of the effect DLSS 3 has on the race in a second.
One thing is now crystal clear, because you can see more detail. The RTX 4090 has the power of the RTX 3080 Ti and the RTX 3080 combined. The 4080 delivers very similar results, and that is with no DLSS 3 in play, only native 4K rendering. It is amazing.
YouTube will show you many channels asking, “Is it worth it?” with red arrows and scrunched faces. Take a moment to look at the graph again. You would be crazy not to enable DLSS 3. You could say you are leaving frames on the table by not enabling DLSS 3, but it is worse than that: you are wasting hundreds of frames that come at no additional cost. Turn it on now!
I noticed the fans were not spinning when I looked at the 4080 in my case. It had cycled down entirely, so the heat pipe passthrough system has been working well. However, it still sounds like the 4080 when it starts up to play a game. Since some might be curious, I pulled out my sound level meter: the card sat at 41dB under load. According to IAC Acoustics, this level is often associated with “quiet library sounds.” My case is approximately 2 feet from my seat, so I should hear this card spin up. But even with the case side off, I don’t.
This card will undoubtedly be compared to the 4090, so the price is essential. Going from the 4090 to the 4080 is a 10-15% performance drop for a 25% price reduction. Some of that drop may be attributable to Windows 11’s garbage code, so we’ll be rerunning these on Windows 10 shortly. The 4090 shipped at $1599, and the 4080 will retail for $1199. That puts it in roughly the same price bracket as AMD’s Radeon RX 7900 XTX, with an MSRP of $999. Although I don’t own any AMD cards, their slide deck suggests 72fps for Cyberpunk. However, they don’t specify the settings. On the 4080, I got an average of 101 FPS with DLSS and 70 with RT at maximum settings. These two seem to be going head-to-head.
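The price-versus-performance trade can be checked with quick arithmetic. This is a rough sketch using only the launch prices and the 10-15% performance range quoted above; the function names are my own, and performance per dollar is a deliberately crude measure:

```python
MSRP_4090 = 1599  # launch prices quoted above, in US dollars
MSRP_4080 = 1199


def percent_drop(reference: float, value: float) -> float:
    """How far `value` sits below `reference`, as a percentage of `reference`."""
    return (reference - value) / reference * 100


def perf_per_dollar_ratio(perf_drop_pct: float) -> float:
    """4080 performance per dollar relative to the 4090 (1.0 means equal value)."""
    relative_perf = 1 - perf_drop_pct / 100
    relative_price = MSRP_4080 / MSRP_4090
    return relative_perf / relative_price


price_cut = percent_drop(MSRP_4090, MSRP_4080)
print(f"Price reduction: {price_cut:.1f}%")  # Price reduction: 25.0%

for drop in (10, 15):
    print(f"{drop}% slower: {perf_per_dollar_ratio(drop):.2f}x the performance per dollar")
```

Under those assumptions, the 4080 lands at roughly 1.13x to 1.20x the performance per dollar of the 4090.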
It is important to remember that these prices are driven not by greed but by market realities. TSMC, a chip manufacturer, has raised its prices across the board. They supply chips to NVIDIA, AMD, and Intel. This price is likely the tip of the iceberg. While I do not intend to make any value judgments for you, I will acknowledge that this price is a high one. It is essential to be clear about how much this price will get you.
I said that I love the borders between generations. I’ll explain why as we get to the end of the review. There are many oohs and ahhs about shiny new gear, but these moments are only the beginning of the improvements. DLSS 3 was new when we benchmarked it against the 4090. Since then, leading up to the launch of the 4080, we’ve seen major technology updates and a significant increase in the number of games that support the new technology. DLSS 2.0 continues its growth, and it will become more important as we get to lower-end cards. It will make the difference between running them at high framerates or not.
To predict how this card will age, we can look back at some historical data. The RTX 3080 Ti’s scores have improved by 30-40% between launch and today. The 4000 series is the next generation, and there’s also an abundance of support for AI-driven innovations.