
Screen Content Coding in AV1 Explained with Results

Zoe Liu

President & Co-Founder, Visionular

Summary: The Screen Content Coding (SCC) tools in the AV1 video codec are well suited to compressing video with a lot of text, windows, and graphics, as found in screen sharing. Such content has distinctive features, like sharp edges, fine detail, and repeated textures, that AV1’s SCC tools compress very well.

Real-time communication protocols like WebRTC have a vital need for efficient video compression, especially when it comes to screen content. Screen sharing presents certain challenges that are markedly different from traditional video: frequent changes between slides, windows, etc., sharp edges, and repetitive elements.

Achieving the best compression and the highest image quality therefore requires encoding techniques designed specifically for screen content.

This article is dedicated to AV1’s Screen Content Coding tools. We will analyze how this toolset leverages Intra Block Copy (IntraBC) and Palette Coding to reduce bitrates considerably while keeping image quality surprisingly high. We will also compare AV1 SCC with other traditional codecs to show its benefits. Finally, we will take a detailed look at the Aurora1 AV1 encoder and its highly effective implementation of SCC.

What is Screen Content Coding or SCC?

Before we dive into how AV1 compresses screen content, let’s understand what we mean by screen content and what’s so unique about it.

Screen content is whatever is displayed on a computer screen, typically captured by a browser or transmitted by real-time communication tools. It differs a lot from standard video. For example, if someone is sharing their screen and displaying a PowerPoint presentation, then

  • Almost every pixel will change when switching from one slide to the next.
  • Practically nothing changes after that until the next slide change.
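To illustrate this "all or nothing" pattern, here is a minimal Python sketch that measures the fraction of pixels that change between consecutive frames. The function name `changed_fraction` and the tiny synthetic "slides" are assumptions made for illustration; this is not how Aurora1 or any particular encoder detects slide changes.

```python
def changed_fraction(prev, curr):
    """Fraction of pixel positions whose value differs between two frames."""
    total = sum(len(row) for row in curr)
    changed = sum(
        p != c
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )
    return changed / total


# Two hypothetical 3x4 "slides": the first all 0s, the second all 7s.
slide_a = [[0] * 4 for _ in range(3)]
slide_b = [[7] * 4 for _ in range(3)]

# Captured frames: slide A shown twice, then slide B shown twice.
frames = [slide_a, slide_a, slide_b, slide_b]
fractions = [changed_fraction(a, b) for a, b in zip(frames, frames[1:])]
print(fractions)
```

An encoder that sees a run of near-zero values can spend almost no bits on those frames, and save its effort for the rare frame where the value jumps toward 1.0.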

Screen content also features limited colors and graphics with sharp edges and highly repetitive patterns.

For example, the same alphanumeric characters may appear many times on one frame (slide) in the same font and size. And because the picture is static between slide changes, a frame rate as low as 5 FPS may be adequate to represent the shared screen, in contrast to standard webcam video, which requires 30 FPS, or gameplay content, which requires 60 FPS.
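The limited-color property is what palette coding exploits: a block with only a few distinct colors can be stored as a short color table plus a small index per pixel. The Python sketch below shows the idea under simplifying assumptions. The function names are hypothetical, pixels are single integer values, and the limit of eight colors reflects our understanding of AV1's per-plane palette size. A real encoder also entropy-codes the indices and may pick palette colors approximately instead of requiring an exact match.

```python
MAX_COLORS = 8  # assumed palette size limit per plane


def palette_encode(block):
    """Return (palette, index_map), or None if the block has too many colors."""
    colors = sorted({px for row in block for px in row})
    if len(colors) > MAX_COLORS:
        return None  # not a palette candidate; use another coding mode
    index_of = {color: i for i, color in enumerate(colors)}
    index_map = [[index_of[px] for px in row] for row in block]
    return colors, index_map


def palette_decode(palette, index_map):
    """Rebuild the block by looking each index up in the palette."""
    return [[palette[i] for i in row] for row in index_map]


# A 4x4 block of black (0) text pixels on a white (255) background.
block = [
    [255, 255, 0, 0],
    [255, 0, 0, 255],
    [255, 255, 0, 0],
    [255, 255, 255, 255],
]
palette, indices = palette_encode(block)
assert palette_decode(palette, indices) == block  # lossless round trip
```

With two colors, each pixel needs only a one-bit index instead of a full sample value, which is why text on a flat background compresses so well in this mode.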

Video Codecs and Screen Content Coding

The application of AV1 in WebRTC is driving some of the largest-scale commercial implementations in real-time communication today. WebRTC supports four encoders: VP8 and VP9 (both via libvpx), H.264 (via OpenH264), and AV1 (via libaom RT, the real-time configuration of libaom).

Note that VP9 is the direct predecessor of AV1, and libaom grew out of libvpx. A notable addition to the WebRTC encoder implementations is our Aurora1 AV1 encoder.

When we consider that libaom RT inherits features directly from libvpx VP8 and VP9, which have been deployed for WebRTC use cases, it’s easy to see how AV1 fits RTC applications, starting with screen content coding (SCC). Thanks to the IntraBC and Palette modes, AV1 SCC tools yield a much larger bitrate reduction with better quality. Cisco WebEx was among the first to announce AV1 support, replacing the aging H.264 codec.

AV1's Support for Screen Content Coding

AV1 is the first video codec standard that natively includes Screen Content Coding (SCC) tools, with over 100 tools to address multiple encoding situations. These tools are included in its main body, meaning that every AV1 decoder must support the SCC features to be compliant. Other codecs specify SCC coding tools only in their extensions, meaning that not all decoders support them.

The quality impact of AV1 SCC tools can be substantial: where x264 requires 800 kbps, Aurora1 (AV1) needs just half the bits (400 kbps) to produce demonstrably sharper image quality.

Combining AV1’s core coding tools with AI-driven video compression delivers results for screen content coding with 50% fewer bits and noticeably higher quality. Applying the same algorithms used for standard video content to screen content is not effective, as the trade-off between quality and encoding efficiency is not the same for screen content.

Our team has worked hard to ensure that the Aurora1 AV1 encoder dynamically applies the best techniques to screen content, using a combination of the following:

  • Adaptive early terminations for screen content
  • Categorization and identification of screen content-specific motions, such as sliding and scrolling
  • Optimized IntraBC motion search
  • Palette coding speedup
  • SCC-specific motion estimation optimization
  • Optimized hash matching
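To give a feel for what hash matching does for IntraBC, the Python sketch below finds blocks that exactly repeat an earlier block in the same frame and reports the displacement to that earlier copy. It is a deliberately simplified illustration, not Aurora1's implementation: the function names are hypothetical, blocks are only compared on an aligned grid, and it ignores the restrictions AV1 places on which already-coded area may be referenced.

```python
def block_at(frame, y, x, size):
    """Extract a size x size block as a hashable tuple of rows."""
    return tuple(tuple(frame[y + dy][x:x + size]) for dy in range(size))


def find_intrabc_matches(frame, size):
    """Map (y, x) of each repeated block to its (dy, dx) block vector."""
    seen = {}      # block content -> position of its first occurrence
    matches = {}
    height, width = len(frame), len(frame[0])
    for y in range(0, height - size + 1, size):      # raster order
        for x in range(0, width - size + 1, size):
            key = block_at(frame, y, x, size)
            if key in seen:                          # dict lookup = hash match
                src_y, src_x = seen[key]
                matches[(y, x)] = (src_y - y, src_x - x)
            else:
                seen[key] = (y, x)
    return matches


# A tiny frame in which the 2x2 pattern [[1, 2], [3, 4]] appears three times.
frame = [
    [1, 2, 9, 9, 1, 2, 0, 0],
    [3, 4, 9, 9, 3, 4, 0, 0],
    [0, 0, 1, 2, 9, 9, 5, 5],
    [0, 0, 3, 4, 9, 9, 5, 5],
]
matches = find_intrabc_matches(frame, 2)
print(matches)
```

Each matched block can be coded as a short vector pointing at pixels the decoder already has, instead of coding the pixels again. The dictionary lookup replaces an exhaustive pixel-by-pixel search, which is the reason hashing matters for speed.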

Here are a couple of coding tools in the AV1 video codec that are particularly helpful in SCC.

Frame-partitioning

Since AV1 grew out of VP9, let’s compare their abilities to partition a frame. VP9 offers four partition types for a block: leave it whole, split it horizontally, split it vertically, or split it into four squares that can be partitioned further. In comparison, AV1 has ten partition types, allowing the encoder to process different image parts more granularly.
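As a rough illustration, the Python sketch below lists the sub-block rectangles produced by each of AV1's ten partition types for a square block. The short names follow the partition types in the AV1 specification as we understand them; the function `av1_partitions` itself is hypothetical, and the sketch ignores the fact that not every type is available at every block size.

```python
def av1_partitions(n):
    """Return {type: [(x, y, width, height), ...]} for an n x n block."""
    half, quarter = n // 2, n // 4
    return {
        "NONE":   [(0, 0, n, n)],
        "HORZ":   [(0, 0, n, half), (0, half, n, half)],
        "VERT":   [(0, 0, half, n), (half, 0, half, n)],
        "SPLIT":  [(0, 0, half, half), (half, 0, half, half),
                   (0, half, half, half), (half, half, half, half)],
        # T-shaped types: two squares on one side, one rectangle on the other.
        "HORZ_A": [(0, 0, half, half), (half, 0, half, half), (0, half, n, half)],
        "HORZ_B": [(0, 0, n, half), (0, half, half, half), (half, half, half, half)],
        "VERT_A": [(0, 0, half, half), (0, half, half, half), (half, 0, half, n)],
        "VERT_B": [(0, 0, half, n), (half, 0, half, half), (half, half, half, half)],
        # Four equal strips.
        "HORZ_4": [(0, i * quarter, n, quarter) for i in range(4)],
        "VERT_4": [(i * quarter, 0, quarter, n) for i in range(4)],
    }


VP9_SUBSET = ("NONE", "HORZ", "VERT", "SPLIT")  # the four types VP9 also has

parts = av1_partitions(64)
for name, rects in parts.items():
    # Every partition type must tile the whole 64x64 block exactly.
    assert sum(w * h for _, _, w, h in rects) == 64 * 64
```

The thin strips and T-shaped types are a natural fit for screen content, where a line of text or a window border often occupies only a narrow band of a block.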

Single and Compound Motion Compensation Modes

A tool that the Aurora1 AV1 encoder leverages is single and compound motion compensation modes. The new single motion mode called “Warped Motion” allows you to detect motion using four parameters. Altogether, there are 128 different methods to detect motion when you switch to compound mode, enabling Aurora1 to stabilize an image and reduce unnecessary movement, reducing the number of constantly changing pixels. 

Next, let’s compare the coding tools in different codecs to see how they perform when tasked with compressing screen content.

Results: Comparing Codecs for Screen Content Coding

Take a look at Figure 1 below, where Aurora1 is compared to x264 with 1080p30 screen content. Testing was performed on an Intel Core i7 processor with both encoders configured to use a single thread at 100% utilization.

As seen in Figure 1, using four videos, Aurora1 achieves a BD-rate (Overall PSNR) savings of 81.25% with a significant quality improvement (BD-PSNR of 13.95 dB) while maintaining a constant 46 FPS, or 41 FPS with SVC turned on. This is 8x faster than required for screen content video, as many platforms encode screen content at as low as 5 FPS.
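These headline numbers can be unpacked with a little arithmetic. A BD-rate saving of 81.25% means that, averaged over the tested range and at equal PSNR, Aurora1 needs about 18.75% of the x264 bitrate. The short Python sketch below works this out, along with the speed headroom over a 5 FPS target; the variable names are ours, and the inputs are the figures quoted above.

```python
bd_rate_savings = 0.8125        # 81.25% BD-rate savings vs. x264
fps_plain, fps_svc = 46, 41     # measured encoding speed (FPS)
target_fps = 5                  # low end of typical screen-share frame rates

# Share of the x264 bitrate that Aurora1 needs, on average, at equal PSNR.
aurora_fraction = 1 - bd_rate_savings

# Equivalently: how many times more bits x264 needs for the same PSNR.
x264_multiple = 1 / aurora_fraction

# Encoding speed relative to the 5 FPS target.
headroom_plain = fps_plain / target_fps
headroom_svc = fps_svc / target_fps

print(f"Aurora1 bitrate: {aurora_fraction:.2%} of x264")
print(f"x264 needs about {x264_multiple:.2f}x the bits")
print(f"Speed headroom: {headroom_plain:.1f}x, or {headroom_svc:.1f}x with SVC")
```

So the "8x faster" figure corresponds to the SVC case (41 / 5 = 8.2), and without SVC the headroom is a little over 9x.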

NOTE: In all comparisons, we used the following command options for x264:

ffmpeg -threads 1 -r 30 -s 1920x1080 -c:v libx264 -x264-params bframes=0 -tune zerolatency -preset superfast -threads 1

In Figure 2 below, Aurora1 is compared to OpenH264. Across a wide test set of videos, Aurora1 achieved a greater than 50% bitrate reduction while operating just 38% slower. This shows that video engineers can confidently switch from the aging and less efficient OpenH264 encoder to AV1 and enjoy better quality and a tremendous reduction in bandwidth.


Let’s take a look at how Aurora1 compares to other AV1 implementations.

In Figure 3 below, Aurora1 is compared to libaom RT. Across a wide test set of videos, Aurora1 achieved a greater than 50% bitrate reduction while operating 14% faster than libaom RT.

For WebRTC applications that currently leverage VP8 or VP9, and where AV1 is on the roadmap, engineers now have a solution that operates with greater speed while providing a significant efficiency advantage over H.264 and libvpx implementations.


Conclusion

Enabling SCC in Aurora1 can reduce screen content bitrates by more than 50%, or by up to 500 kbps, a result that in our testing no other video codec standard or AV1 encoder implementation, including libaom RT, matched.

What about speed? A common concern with AV1 and any next-gen codec standard is speed. While this argument has been around for some time, it’s somewhat outdated. Read this post to learn about how AV1 speed has improved.

WebRTC testing was conducted across various platforms, including the cloud (data center), desktop, and mobile. The settings and results were as follows:

  • Video camera output was encoded at 24 FPS and screen content at 12 FPS, at resolutions between 720p and 1080p.
  • With standard screen content, Aurora1 preserved the original quality at 1080p and 100 kbps. During intense screen content motion, the bitrate rarely exceeded 500 kbps.
  • CPU usage was reasonable, enabling smooth playback even on entry-level i5 PCs.
  • With Aurora1’s Scalable Video Coding (SVC), the bitrate was 35% of that of OpenH264 or VP9 at the same visual quality.

AV1 offers an exciting set of tools for optimizing content for real-time delivery using WebRTC. With the Aurora1 encoder, you can achieve lower bitrates at the same quality while requiring less processing power, making AV1 a realistic option for any WebRTC application.
