As AAA games keep raising the bar for visual fidelity, high-resolution material and texture packages have grown accordingly. The once-mainstream 8GB graphics card now frequently runs out of video memory, which leads to stutters, freezes, and forced reductions in image quality, and many players deride such cards as "crippled cards." NVIDIA's new RTX Neural Texture Compression technology (NTC for short) may be able to fundamentally change this situation.

Recently, Tom's Hardware completed a dedicated test of this technology across multiple graphics cards and platforms. NTC is an AI-driven technology released alongside the RTX 50 series graphics cards. It relies on the GPU's Tensor cores (the AI acceleration cores built into NVIDIA graphics cards) to compress and decompress textures, and is rated to reduce video memory requirements by up to 80%, with the highest measured reduction reaching 85%. At the same time, its image quality is better than that of the traditional compression scheme the game industry has used for many years. The result gives players real hope that "8GB graphics cards can fight on for another ten years."
From a technical perspective, NTC is a machine-learning-based texture compression and decompression scheme, and one of the core technologies of NVIDIA's new neural shading rendering paradigm. It breaks free of the fixed 4×4-pixel block limit of the traditional BCn block compression formats (the block texture compression standard commonly used in the game industry): during the compression stage, the original texture is converted into a combination of small neural network weights and latent features.
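For context on the fixed 4×4 block limit mentioned above, here is a minimal sketch of how the footprint of a BCn texture follows directly from its dimensions. The block sizes used (8 bytes for BC1, 16 bytes for BC7) are the standard values for those two formats; the function names are illustrative, and only a single mip level is counted.

```python
# Fixed-rate block compression: every 4x4 texel block costs the same number
# of bytes, regardless of how simple or complex its content is.
BLOCK_DIM = 4
BYTES_PER_BLOCK = {"BC1": 8, "BC7": 16}  # standard block sizes for these formats


def bcn_size_bytes(width, height, fmt):
    """Size of one mip level in a BCn format (dimensions rounded up to blocks)."""
    blocks_x = (width + BLOCK_DIM - 1) // BLOCK_DIM
    blocks_y = (height + BLOCK_DIM - 1) // BLOCK_DIM
    return blocks_x * blocks_y * BYTES_PER_BLOCK[fmt]


def rgba8_size_bytes(width, height):
    """Size of the same mip level stored uncompressed at 4 bytes per texel."""
    return width * height * 4


if __name__ == "__main__":
    w = h = 4096
    mib = 1024 * 1024
    print("RGBA8:", rgba8_size_bytes(w, h) / mib, "MiB")
    print("BC7:  ", bcn_size_bytes(w, h, "BC7") / mib, "MiB")
    print("BC1:  ", bcn_size_bytes(w, h, "BC1") / mib, "MiB")
```

Because the rate is fixed per block, a BCn texture cannot get smaller than this figure, which is the ceiling a learned representation is meant to get past.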
To be clear, NTC is a deterministic decoding technology, not generative AI, so there is no risk of AI hallucination.
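The idea of decoding a texel from latent features and a small network can be illustrated with a toy sketch. This is not NVIDIA's actual network layout: the layer sizes, the ReLU and sigmoid activations, and the randomly generated weights are all assumptions made for illustration (in a real compressor the weights and latents would be produced by training on the source texture). What it does show is why decoding is deterministic: the output is a fixed function of the stored numbers.

```python
import math
import random


def make_toy_mlp(n_in, n_hidden, n_out, seed=0):
    """Build placeholder weights; a real compressor would train these."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2


def decode_texel(latent, mlp):
    """Map a latent feature vector to an RGB value in (0, 1)."""
    w1, w2 = mlp
    # Hidden layer with ReLU activation.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, latent))) for row in w1]
    # Output layer, squashed into the colour range with a sigmoid.
    out = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return [1.0 / (1.0 + math.exp(-v)) for v in out]


if __name__ == "__main__":
    mlp = make_toy_mlp(n_in=4, n_hidden=8, n_out=3)
    latent = [0.25, -0.5, 0.75, 0.1]  # hypothetical latent features for one texel
    first = decode_texel(latent, mlp)
    second = decode_texel(latent, mlp)
    print(first)
    print("identical on repeat:", first == second)
```

No sampling or random generation happens at decode time, so the same latents and weights always yield the same colour.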
To suit different tiers of hardware, NTC provides three operating modes under the DirectX 12 API. Vulkan, the other mainstream API, supports only two of them because it lacks the corresponding feature (Inference on Feedback is not supported).
The first is Inference on Load. It decompresses NTC textures on the GPU during the game or map loading phase and transcodes them into the traditional BCn format at the same time. This mode has exactly the same rendering performance as native BCn textures, with no overhead in the rendering phase. It can also significantly reduce the game's disk footprint and the load on the PCIe bus. Its only drawback is that it does not reduce video memory usage at runtime.
The second is Inference on Sample, which is what most people think of as neural texture compression and is the mode with the strongest memory savings. It decodes the currently required pixel data in real time during texture sampling through a pre-trained multi-layer perceptron (MLP for short, a lightweight small neural network), ultimately achieving up to an 85% reduction in video memory usage.
The third is Inference on Feedback, which is supported only under DirectX 12. It uses Sampler Feedback (a DirectX 12-exclusive graphics feature that can accurately identify the texture tiles needed to render the current frame) and decompresses only the parts of the texture that the current frame needs. It is a compromise between the first two modes: the memory reduction is smaller than with Inference on Sample, but the performance overhead is lower.
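The three modes and their trade-offs can be summarized as a small lookup table. The sketch below is only a restatement of the description above in code; the class, field, and function names are illustrative and do not come from NVIDIA's SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NtcMode:
    name: str
    apis: tuple                  # graphics APIs that support the mode
    reduces_runtime_vram: bool   # does it shrink texture memory while rendering?
    render_overhead: str         # qualitative cost in the rendering phase


MODES = (
    NtcMode("Inference on Load", ("DirectX 12", "Vulkan"), False, "none"),
    NtcMode("Inference on Sample", ("DirectX 12", "Vulkan"), True, "highest"),
    NtcMode("Inference on Feedback", ("DirectX 12",), True, "intermediate"),
)


def modes_for(api):
    """Return the names of the modes available under the given API."""
    return [m.name for m in MODES if api in m.apis]


if __name__ == "__main__":
    for api in ("DirectX 12", "Vulkan"):
        print(api, "->", modes_for(api))
```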

Tom's Hardware ran its quantitative test on Intel Sponza, a standard scene widely used in the industry. The measured data matches the officially stated compression capability. The original lossless reference assets occupy 6830MB of texture memory. After the textures are transcoded into BCn format in Inference on Load mode, they occupy 2041MB of video memory.
In Inference on Sample mode, the textures occupy only 303MB, a reduction of more than 85% relative to the BCn figure. Compared with the original lossless reference assets, the reduction is more than 95%.
At the same time, the measurements show that the image produced in this mode is closer to the original reference assets than the transcoded BCn textures, and reproduces them almost perfectly. In NVIDIA's official Tuscan villa scene test, texture memory usage at the same image quality drops from 6.5GB in the traditional BCn format to 970MB in the NTC format.
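The percentages quoted above can be checked directly from the reported sizes. The short sketch below does so; the only assumption is that 6.5GB is treated as 6500MB (using 1024MB per GB changes the result by well under one percentage point).

```python
def reduction_pct(before_mb, after_mb):
    """Percentage by which memory use falls when going from before to after."""
    return (1.0 - after_mb / before_mb) * 100.0


# Intel Sponza figures reported by Tom's Hardware (MB).
sponza_reference_mb = 6830
sponza_bcn_mb = 2041
sponza_ntc_mb = 303

# NVIDIA's Tuscan villa figures; 6.5GB converted with 1GB = 1000MB (assumption).
tuscan_bcn_mb = 6.5 * 1000
tuscan_ntc_mb = 970

if __name__ == "__main__":
    print(f"Sponza, reference -> BCn: {reduction_pct(sponza_reference_mb, sponza_bcn_mb):.1f}%")
    print(f"Sponza, BCn -> NTC:       {reduction_pct(sponza_bcn_mb, sponza_ntc_mb):.1f}%")
    print(f"Sponza, reference -> NTC: {reduction_pct(sponza_reference_mb, sponza_ntc_mb):.1f}%")
    print(f"Tuscan, BCn -> NTC:       {reduction_pct(tuscan_bcn_mb, tuscan_ntc_mb):.1f}%")
```

Both scenes land at roughly 85% relative to BCn, which is consistent with the headline figure.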
The testing covered a range of NVIDIA graphics cards from flagship to entry-level, as well as notebook platforms. The core metric is frame time (the time required to render a single frame; the lower the value, the smoother the image).
At 4K resolution, the RTX 5090 was tested in Inference on Sample mode with TAA (temporal anti-aliasing, a mainstream smoothing technique used to remove jagged edges and clean up the image). Its frame time is only 0.09ms higher than in the zero-overhead Inference on Load mode, a performance loss that is almost negligible.
On the mainstream RTX 5070 at its target resolution of 1440p, the frame time overhead of this mode is between 0.50ms and 0.70ms. The entry-level RTX 5060 shows a stable frame time overhead of 0.60ms to 0.70ms at its target resolution of 1080p. Even on the notebook RTX 4060 mobile graphics card (8GB of video memory), the frame time overhead at 1080p is only 0.70ms to 0.85ms.


The test team also made it clear that the test scene includes only basic forward rendering and anti-aliasing passes. Actual AAA games have a large number of rendering passes that NTC does not affect, so the relative performance loss of this technology in real games will be lower than the test data suggests.
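The point about relative loss can be made concrete with a little arithmetic. The sketch below applies a fixed 0.70ms overhead (the upper bound measured on the RTX 5060 at 1080p) to several base frame rates. The base frame rates are hypothetical examples, not test results, and the sketch assumes the overhead in milliseconds stays the same as the rest of the frame gets heavier.

```python
def fps_with_overhead(base_fps, overhead_ms):
    """Frame rate after adding a fixed per-frame cost in milliseconds."""
    base_frame_ms = 1000.0 / base_fps
    return 1000.0 / (base_frame_ms + overhead_ms)


def fps_loss_pct(base_fps, overhead_ms):
    """Relative frame rate loss caused by the added per-frame cost."""
    return (1.0 - fps_with_overhead(base_fps, overhead_ms) / base_fps) * 100.0


if __name__ == "__main__":
    overhead_ms = 0.70  # upper bound reported for the RTX 5060 at 1080p
    for base_fps in (120, 60, 30):  # hypothetical base frame rates
        new_fps = fps_with_overhead(base_fps, overhead_ms)
        loss = fps_loss_pct(base_fps, overhead_ms)
        print(f"{base_fps} fps -> {new_fps:.1f} fps ({loss:.1f}% lower)")
```

The longer each frame already takes, the smaller the share that a fixed decode cost represents, which is why a heavier real-world game would see a lower relative loss than the lightweight test scene.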
For a graphics card with 8GB of video memory, as long as the game's base frame rate is sufficient, trading a small performance overhead for texture quality that does not have to be lowered is a real net benefit.
The technology also has clear requirements for use. Inference on Sample mode must be used with stochastic texture filtering (STF for short, which is used to optimize texture quality and reduce visual defects) turned on, and turning off anti-aliasing will produce visible noise. DLSS can completely eliminate this type of noise, while TAA removes most of it but not all. For that reason, this mode is best paired with DLSS.
NTC developer and NVIDIA senior engineer Alexey Panteleev said that Inference on Sample is better suited to high-performance graphics cards, while Inference on Load can cover hardware on all platforms. Game developers can choose whether to enable NTC on a per-texture basis, and can also expose the mode as an option so that players can decide based on their own hardware.
It is worth mentioning that NTC is not exclusive to NVIDIA hardware. It is compatible with the AI acceleration units of AMD and Intel graphics cards. Industry sources say that Sony's PS6 console is also expected to adopt similar technology.
No games officially support this technology yet, but the groundwork across the industry is in place, and large-scale commercial use appears to be close. It not only gives older graphics cards with limited video memory a new lease of life, but also opens up a new technical direction for real-time graphics rendering.
