The Alliance for Open Media (AOMedia) recently released version 1.0.0 of the AV2 video coding specification, the first final version of the standard. The release marks the point at which this new-generation open-source, royalty-free video format, regarded as the successor to AV1, enters a stable stage: the industry can now develop standard-compliant encoders and decoders and carry out long-term optimization against a fixed specification, without worrying about compatibility problems caused by future revisions. AV2 is expected to further reduce bit rates and bandwidth costs in streaming and ultra-high-definition video scenarios. However, industry insiders also warn that its decoding complexity is much higher than that of AV1, and that it will face considerable challenges in end-device support and hardware acceleration.

AOMedia was founded in 2015. Its AV1 coding standard, released in 2018, was designed as a royalty-free video format that could compete with H.265 (HEVC). Adoption has not been fast, but over several years AV1 has gradually gained a foothold in online video and cloud video services. YouTube has been experimenting with AV1 encoding since 2018, Netflix introduced AV1 streaming in its Android mobile app in 2020, and Amazon launched real-time AV1 encoding support through AWS Elemental media services in 2024. Meanwhile, manufacturers such as AMD, Intel and NVIDIA have successively added AV1 decoding and encoding acceleration to their hardware and drivers, so playback of the format on end devices has gradually matured.

Compared with AV1, the core selling point of AV2 is a significantly lower bit rate at the same image quality. According to official evaluation data published by AOMedia, AV2 achieves an average bit rate reduction of approximately 30%–34% relative to AV1 at equal quality, with the exact figure depending on the objective metric used (PSNR, for example). For streaming platforms, this means AV2 can deliver video of the same quality over less bandwidth without degrading the viewing experience, further lowering content distribution costs. The bandwidth and storage savings are most significant in high-resolution, high-dynamic-range scenarios such as 4K, 8K, and HDR.
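To make the scale of that saving concrete, the following is a minimal arithmetic sketch. The 12,000 kbit/s AV1 bit rate for a 4K stream is a hypothetical figure chosen for illustration, not a number from AOMedia, and the helper names are likewise illustrative; only the 30% and 34% reductions come from the reported range.

```python
def reduced_bitrate(av1_kbps: float, reduction: float) -> float:
    """Bit rate needed after a fractional reduction at equal quality."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return av1_kbps * (1.0 - reduction)


def gigabytes_per_hour(kbps: float) -> float:
    """Convert a bit rate in kbit/s to data volume per hour in GB (10^9 bytes)."""
    bits_per_hour = kbps * 1000 * 3600
    return bits_per_hour / 8 / 1e9


# Hypothetical AV1 bit rate for a 4K stream (an assumption, not AOMedia data).
AV1_4K_KBPS = 12_000

if __name__ == "__main__":
    print(f"AV1 baseline: {AV1_4K_KBPS} kbit/s, "
          f"{gigabytes_per_hour(AV1_4K_KBPS):.2f} GB/hour")
    # The two ends of the reported 30%-34% average reduction.
    for reduction in (0.30, 0.34):
        av2_kbps = reduced_bitrate(AV1_4K_KBPS, reduction)
        print(f"{reduction:.0%} reduction: {av2_kbps:.0f} kbit/s, "
              f"{gigabytes_per_hour(av2_kbps):.2f} GB/hour")
```

Because the reduction is an average across test content, the saving for any individual title could be higher or lower than this range suggests.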

The comparison provided by AOMedia shows the difference in compression fidelity between AV1 and AV2 in terms of peak signal-to-noise ratio (PSNR). PSNR is a commonly used mathematical metric that measures how far the compressed video image deviates from the original signal. Higher values generally mean that more detail is preserved. In these tests, AV2 reached the same PSNR at a significantly lower bit rate, reflecting upgrades across its coding tools and algorithms.
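For readers unfamiliar with the metric, PSNR is conventionally defined as 10·log10(MAX² / MSE), where MAX is the largest possible sample value and MSE is the mean squared error between the original and the compressed samples. The sketch below illustrates that general definition on a few 8-bit sample values; it is not AOMedia's test procedure, and the sample data is made up.

```python
import math
from typing import Sequence


def psnr(original: Sequence[int], compressed: Sequence[int],
         max_value: int = 255) -> float:
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(original) != len(compressed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between corresponding samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Made-up 8-bit luma samples, purely for illustration.
    source = [52, 55, 61, 59]
    light_loss = [54, 55, 60, 58]   # small errors
    heavy_loss = [60, 50, 70, 50]   # larger errors
    print(f"light compression: {psnr(source, light_loss):.2f} dB")
    print(f"heavy compression: {psnr(source, heavy_loss):.2f} dB")
```

The lightly distorted samples score higher than the heavily distorted ones, which is the sense in which a higher PSNR indicates that the compressed picture is closer to the original.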

To achieve these compression gains, AV2 introduces a number of improvements in coding technology, including more advanced intra-frame and inter-frame prediction methods, more sophisticated motion modeling, more complex transform and filtering tools, a more flexible block partitioning structure, and an upgraded entropy coding mechanism. These changes give the encoder more decisions to make and more algorithmic freedom, allowing it to match the characteristics of the video content more accurately and thereby remove more redundant information at the same image quality.

However, higher compression efficiency comes at a price. According to VideoLAN project leader Jean-Baptiste Kempf, the computational complexity of AV2 decoding has increased significantly compared with AV1. Current estimates put AV2's decoding complexity at about five times that of AV1, which would make it very difficult for a large number of existing CPUs to decode AV2 video smoothly in software alone. Without widespread hardware decoding support, ordinary end devices may be unable to bear the extra burden AV2 places on power consumption, heat and playback smoothness, which would slow its rollout in browsers, TV boxes and mobile devices.
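A rough way to see why a fivefold increase matters is to compare per-frame decode time with the time available per frame at a given frame rate. The sketch below assumes that decode time scales linearly with the estimated complexity factor, which is a simplification, and the 4 ms AV1 decode time is a hypothetical figure rather than a measurement.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to decode one frame at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def can_decode_in_real_time(av1_ms_per_frame: float, fps: float,
                            complexity_factor: float = 5.0) -> bool:
    """Check whether the scaled decode time still fits in the frame budget.

    Assumes decode time grows linearly with the complexity factor.
    """
    estimated_ms = av1_ms_per_frame * complexity_factor
    return estimated_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    av1_ms = 4.0  # hypothetical software AV1 decode time per frame
    for fps in (30, 60):
        ok = can_decode_in_real_time(av1_ms, fps)
        print(f"{fps} fps: budget {frame_budget_ms(fps):.1f} ms, "
              f"estimated AV2 decode {av1_ms * 5.0:.1f} ms, "
              f"real-time: {ok}")
```

Under these assumed numbers, a CPU that decodes AV1 with ample headroom would still keep up at 30 fps but fall behind at 60 fps, before accounting for power or thermal limits.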

For this reason, Kempf believes that until mainstream chips and platforms provide complete AV2 hardware acceleration, the prospects for large-scale adoption of the standard remain uncertain. In other words, whether AV2 can follow the path of AV1 and actually be deployed on large streaming platforms such as YouTube, Netflix, and Amazon will largely depend on whether, and when, CPU, GPU, and SoC manufacturers are willing to commit sufficient resources to hardware support.

With the AV2 1.0.0 specification now officially released, AOMedia and its members can move ahead with implementing and optimizing encoders, decoders and related development toolchains against a unified standard. The industry generally expects that over the next few years, experimental AV2 support will first appear in open-source players, experimental browser builds, and some cloud transcoding services. Commercial deployment for mass-market users will have to wait until the hardware ecosystem matures and mainstream content platforms set out clearer adoption plans.