With the industry’s first five-nanometre Application Specific Integrated Circuit (ASIC)-based media accelerator, Advanced Micro Devices has charted a new course in a content-hungry world of interactive media services. Girish Malipeddi, director of product management and marketing at AMD, spoke to Yashasvini Razdan from Electronics For You, explaining the need for efficient media accelerators in a world dominated by GPUs and CPUs and what it means for India.
Q. What are the key features and benefits of the Alveo MA35D media accelerator for video transcoding workloads, and how does it compare to other hardware solutions AMD offers?
A. The Alveo MA35D, our industry-first 5-nanometer ASIC-based media accelerator, is designed to handle interactive streaming at scale, making it ideal for high-volume video transcoding workloads. This purpose-built chip incorporates all the necessary functionality for efficient cloud-based video streaming.
As for its capabilities, the Alveo MA35D fully supports widely used encoding and transcoding frameworks like FFmpeg and GStreamer, along with popular libraries such as LibAV. This compatibility allows software engineers to easily migrate their applications to our platform using these APIs. Additionally, we offer a lower-level C/C++ API for developers who prefer to create their own software stack or framework, ensuring portability across different hardware platforms and catering to various needs.
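To give a sense of how thin that migration layer can be, here is a minimal Python sketch that assembles an FFmpeg transcode command line. The software encoder `libsvtav1` is used purely as a stand-in; the card's actual FFmpeg encoder name is not covered in this interview, so porting would amount to substituting it (plus any device option) while the rest of the pipeline stays the same.

```python
import shlex

def build_transcode_cmd(src, dst, encoder="libsvtav1", bitrate="2M"):
    """Build an FFmpeg command line for a single AV1 transcode.

    Migrating to a hardware accelerator typically means swapping the
    encoder name here; the surrounding workflow is unchanged.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,          # input file or stream URL
        "-c:v", encoder,    # stock software AV1 encoder as a stand-in
        "-b:v", bitrate,    # target video bitrate
        dst,
    ]

cmd = build_transcode_cmd("input.mp4", "out.mp4")
print(shlex.join(cmd))
```

Running the printed command requires an FFmpeg build with the chosen encoder compiled in.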
Comparing our offerings at AMD, while AMD EPYC CPUs provide excellent video quality at low bitrates for applications with low stream counts and acceptable latency, GPUs are more relevant for graphics-intensive applications or multi-application scenarios. However, for pure video transcoding workloads requiring high density, low latency, low cost, and low power consumption, the ASIC-based Alveo MA35D media accelerator is the perfect fit. It excels in scenarios like delivering a single stream to millions of users, software-based live sports streaming, interactive media, cloud gaming, and live e-commerce.
Q. Why did you incorporate ASIC architecture instead of the usual FPGA architecture to build this chip?
A. After successfully launching our first-generation FPGA-based Alveo U30, we gained valuable insights and experience in the media accelerator family. While the Alveo U30 offered flexibility with its programmable logic, it became evident that higher scalability, lower power consumption, and cost reduction were essential for the next-generation product.
To achieve greater scale, density, and power efficiency, we made the strategic decision to build a custom ASIC, the MA35D, for this application. This choice aligns with the principle observed in traditional AI workloads, where GPUs outperform software-only solutions. The ASIC design incorporates multiple hardened IP blocks, including the ABR scaler and compositor, enabling seamless scaling of video streaming. While this venture represents our first foray into ASICs, it is driven by the realisation that as video workloads expand, customised solutions become crucial for optimal performance.
Q. Does this mean that the ASIC solutions are better than FPGA solutions?
A. The superiority of ASICs or programmable logic depends on the specific applications and workload requirements. Custom ASIC solutions generally offer superior price points, density, and scalability when the workload is well-defined and understood. They become particularly advantageous when dealing with higher volumes and fixed functionality.
On the other hand, with their programmable logic, adaptive computing devices like FPGAs provide flexibility for applications where different functionalities may be required at the endpoints. FPGAs and adaptive systems-on-chip (SoCs) offer a more versatile and adaptable architecture, making them suitable for various use cases.
The choice between ASICs and adaptive computing depends on the workload knowledge level and the application’s specific needs.
Q. How does your media accelerator handle video decoding/encoding? What formats and codecs does it support?
A. The Alveo MA35D incorporates two purpose-built, ASIC-based video processing units on a single card, revolutionising live interactive streaming services at scale. Each chip features four encoders, each supporting 4K60 or lower-resolution streams. The top two encoders are multi-format, supporting HEVC, H.264, and AV1, while the bottom encoder is dedicated solely to AV1.
Remarkably, with a power consumption of under 35 watts, the Alveo MA35D can encode at incredibly fast speeds, as fast as eight milliseconds for 4K content. In comparison, typical software solutions or CPUs require approximately 25 watts per stream. This efficiency translates to a low cost per stream, significantly reduced power consumption, and improved video quality at lower bitrates. Service providers can leverage these advantages to optimise their video transcoding infrastructure and deliver high-quality, low-latency streaming services more efficiently.
Moreover, the media accelerator’s exceptional density also reduces the cost, power, and space requirements for video transcoding in data centers. By utilising this hardware solution, cloud providers can enhance their video transcoding capabilities, supporting up to 32 AV1 and 16 H.265 (HEVC) or H.264 (AVC) 1080p60 streams per card, all within a compact, low-power form factor.
In addition to encoding capacity, the chip also includes decoding capability to handle streams from various endpoints, crucial for applications like watch parties, where multiple streams must be decoded, combined into a composite image, and shared with participants.
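To put those density and power figures in perspective, here is a quick back-of-the-envelope calculation using only the numbers quoted above (under 35 W per card, up to 32 AV1 1080p60 streams per card, and roughly 25 W per stream for a typical software/CPU solution):

```python
def per_stream_watts(card_watts, streams_per_card):
    """Effective power per stream when a card is fully loaded."""
    return card_watts / streams_per_card

# Figures quoted in the interview.
asic_w = per_stream_watts(35, 32)   # ~1.1 W per AV1 1080p60 stream
cpu_w = 25.0                        # cited software/CPU figure

print(f"ASIC: {asic_w:.2f} W/stream vs CPU: {cpu_w:.0f} W/stream "
      f"({cpu_w / asic_w:.0f}x difference)")
```

Even allowing for host-server overhead, the per-stream power gap is more than an order of magnitude, which is where the density and cost claims come from.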
Q. Why are you giving more preference and prominence to the AV1 encoding format on your chip?
A. The preference for AV1 encoding on our chip is based on the recognition of its increasing adoption and demand in the future. While AV1 adoption is still in its early stages, we anticipate significant traction over the next few years. Leading industry players, including major smart TV manufacturers like Samsung and LG, have already incorporated AV1 decoding support, and newer Android phones and laptops are following suit.
The growing popularity of AV1 is attributed to its benefits, such as a substantial 50% reduction in bitrate for streaming, resulting in cost savings for service providers. As decoding capability becomes more prevalent on endpoint devices, leveraging AV1’s efficiency and cost-effectiveness is a logical choice for the industry.
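As a rough illustration of what a 50% bitrate reduction means for delivery costs, consider a hypothetical service; the viewer count, watch time, and ladder bitrates below are illustrative assumptions, not figures from the interview:

```python
def monthly_egress_tb(bitrate_mbps, viewers, hours_per_day, days=30):
    """Total delivered data in terabytes for a month of streaming."""
    seconds = hours_per_day * 3600 * days
    bits = bitrate_mbps * 1e6 * seconds * viewers
    return bits / 8 / 1e12  # bits -> bytes -> terabytes

# Hypothetical service: 10,000 concurrent viewers, 4 hours per day.
hevc = monthly_egress_tb(5.0, 10_000, 4)   # e.g. a 5 Mbps HEVC top rung
av1 = monthly_egress_tb(2.5, 10_000, 4)    # ~50% lower bitrate with AV1

print(f"HEVC: {hevc:.0f} TB, AV1: {av1:.0f} TB, "
      f"saved: {hevc - av1:.0f} TB per month")
```

Since cloud egress is billed per byte delivered, halving the bitrate roughly halves the delivery bill, independent of any codec royalty savings.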
Q. Could you explain how the AV1 standard differs from the HEVC and H.264?
A. From an economic standpoint, AV1 offers a significant advantage as it is an open standard without royalties or licensing fees. This means that anyone can use AV1 without incurring additional costs. On the other hand, standards like H.264 and H.265 involve substantial royalties and patent challenges, requiring companies to pay fees for implementing these codecs.
AV1 has gained strong support from major players such as Google, recognising the growing need for efficient streaming algorithms due to the exponential increase in interactive content and media services. AV1 excels at reducing bitrate while maintaining video quality, making it a superior option compared to legacy H.264, H.265, and other codecs. Each codec has its strengths, but AV1’s open and royalty-free nature makes it an appealing choice, and many North American cloud service providers are already evaluating it for adoption.
Q. What is the difference between an open standard and open source?
A. The concept of open source is different and primarily applies to the software side of the world. Open source refers to creating software where anyone can contribute to its development and make it accessible to all users. It fosters a large community of developers who can collaborate and contribute to the software’s improvement and innovation. In open source, no restrictions or royalties are associated with using or contributing to the software.
This concept slightly differs from the discussion on AV1, which focuses on the codec being an open standard without royalties. The open-source model enables the creation and sharing of software that can be freely accessed and enhanced by a diverse community of developers.
Q. Will we see the adoption of open standards increase, and will that impact the pricing of the products in the future?
A. We believe that adopting open standards, such as AV1, which are royalty-free, will incentivise many service providers to embrace these standards. Service providers can reduce expenses and leverage open standards’ benefits by eliminating recurring royalty costs.
Regarding the impact on product pricing, the semiconductor industry has a natural trend of decreasing costs over time. This reduction is driven by factors such as increasing density, advancements in semiconductor process technologies, and the ability to pack more functionality into a given chip. While the royalty aspect is a small component of the cost reduction, the semiconductor industry has consistently witnessed significant cost reductions over decades as more functionality is added to the solutions.
Q. How is a graphics processing unit (GPU) for video transcoding different from a media accelerator chip?
A. There are indeed alternative solutions available in the market for video transcoding, such as GPUs and CPUs. GPUs are commonly used for video transcoding and offer scalability as the density increases. However, GPUs are primarily designed for rendering and gaming, with only a small portion of the chip dedicated to video transcoding. This can lead to disadvantages such as larger chip size, higher cost, and increased power consumption compared to dedicated video transcoding solutions like our ASIC.
In contrast, our solution provides faster scalability and lower power consumption, requiring approximately 35 watts at the server level. GPUs typically consume around 75 to 100 watts and occupy two PCIe slots, while our solution fits into a single slot. It’s worth noting that there are also competitors in this space, including those offering FPGA-based solutions.
Q. Can a media accelerator chip be used instead of a CPU?
A. Yes, a media accelerator chip can indeed be used instead of a CPU for video processing. CPUs have a general-purpose design and are crucial for server functionality, including video processing. They are suitable for software-based solutions due to their programmability and are still a viable choice for dealing with a single stream or a moderate number of streams reaching millions of users.
However, as the stream density increases, alternative solutions become more efficient. CPUs tend to be power-hungry, expensive, and occupy significant physical space. In contrast, dedicated video processing solutions like our ASIC can provide a more efficient approach. By adding multiple MA35D cards to a server, the capacity to handle streams increases exponentially, enabling hundreds of streams to be processed on a single server. This offers significant advantages in terms of scalability and cost-effectiveness.
Q. How do you maintain low bitrates on the chip?
A. The chip incorporates hardware blocks, including adaptive bitrate scalers, to support different resolutions for each stream. The resolution is adjusted based on the user’s available bandwidth, optimising the video quality.
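The selection logic that an adaptive bitrate scaler feeds can be sketched in a few lines of Python. The rendition ladder and the 80% bandwidth headroom below are illustrative assumptions, not the card's actual behaviour:

```python
# A simplified ABR rendition ladder: (frame height, bitrate in kbps),
# ordered from best to worst.
LADDER = [(1080, 5000), (720, 3000), (480, 1500), (360, 800), (240, 400)]

def pick_rendition(bandwidth_kbps, headroom=0.8):
    """Pick the highest rendition whose bitrate fits within a safety
    margin of the viewer's measured bandwidth."""
    budget = bandwidth_kbps * headroom
    for height, kbps in LADDER:
        if kbps <= budget:
            return height, kbps
    return LADDER[-1]  # fall back to the lowest rung

print(pick_rendition(4500))  # 5000 kbps exceeds the 3600 kbps budget,
                             # so the 720p/3000 kbps rung is chosen
```

In the MA35D this per-stream resolution switching is done by hardened scaler blocks rather than in software, which is what makes it cheap to apply across many concurrent streams.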
A compositing engine enables mixing multiple streams, and an AI processor is utilised to enhance pixel quality using AI techniques. On-chip Quality of Experience (QoE) engines measure objective metrics like PSNR and SSIM, providing feedback to the end-user application for real-time video quality adjustment or bitrate optimisation.
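PSNR, one of the objective metrics mentioned above, is straightforward to compute from two frames' pixel buffers. This stdlib-only sketch applies the standard formula, 10·log10(MAX²/MSE), to toy data:

```python
import math

def psnr(ref, deg, max_val=255):
    """Peak signal-to-noise ratio between two equal-length pixel
    buffers. Higher is better; identical frames give infinity."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, deg)) / len(ref)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_val ** 2 / mse)

ref = [100, 120, 130, 140]
deg = [101, 119, 131, 139]          # off by one per pixel -> MSE = 1
print(f"{psnr(ref, deg):.2f} dB")   # 10*log10(255**2) ~ 48.13 dB
```

SSIM, the other metric named, is a perceptual measure computed over local windows rather than a single global error, so it is more involved but used to the same end: feeding a quality score back to the application.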
Additionally, a lookahead engine is employed, allowing improved bitrate compression by analysing up to 40 frames in advance. These functionalities are connected to the host processor via PCIe Gen 5 or Gen 4. Our hardware-based approach ensures efficient and high-quality video processing, achieving significant bitrate savings compared to legacy standards like H.264. The MA35D achieves a 24% bitrate reduction for H.264, a 47% reduction for HEVC (H.265), and an impressive 52% reduction for AV1, all while maintaining or improving visual quality. AI techniques play a crucial role in enhancing pixel quality and improving the visual experience while maintaining low bitrates.
Q. Could you elaborate on the reason why you’ve incorporated AI?
A. The incorporation of AI in our product serves to enhance video pixel quality. While many discussions revolve around application-level AI, such as face recognition or cloud computing, AI has other valuable applications, including video pixel quality enhancement.
Traditional codec tools can be highly compute-intensive, making their implementation in software challenging. As codecs like H.265 and AV1 demand more computational resources, the application of AI techniques becomes essential for efficient processing.
AI techniques offer additional opportunities to improve pixel quality without significantly increasing die size or introducing latency. Techniques like regions of interest can be applied to enhance specific areas of interest in the video without affecting the overall bitrate.
Furthermore, AI can be used for super-resolution scaling, leveraging AI to scale smaller-resolution images to higher resolutions with improved pixel quality and reduced computational requirements.
These AI techniques are, in essence, machine learning applied to video coding, teaching the codec to achieve better encoding results over time.
Q. Please share any use cases where AI has improved the video experience.
A. AI has been instrumental in enhancing the video experience in various applications. For instance, in video conferencing or Zoom calls, our AI processor identifies regions of interest, such as faces, and allocates more bits to those areas while disregarding the background. This approach significantly improves the image quality of faces without affecting the overall bitrate, resulting in a better visual experience.
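A simple way to picture region-of-interest bit allocation is as a weighted split of a fixed frame budget, which keeps the overall bitrate unchanged while favouring the flagged regions. The weights and block layout here are illustrative assumptions:

```python
def allocate_bits(blocks, total_bits, roi_weight=3.0):
    """Split a frame's bit budget across blocks, weighting regions
    of interest (e.g. detected faces) more heavily than background.
    The total budget, and hence the frame bitrate, is unchanged."""
    weights = [roi_weight if is_roi else 1.0 for is_roi in blocks]
    scale = total_bits / sum(weights)
    return [w * scale for w in weights]

# Four blocks, the middle two flagged as a detected face region.
bits = allocate_bits([False, True, True, False], total_bits=8000)
print(bits)  # face blocks get 3x the bits of background blocks
```

Real encoders do this at the level of quantisation parameters per coding block, but the budget-shifting idea is the same.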
Similarly, AI techniques are used in gaming to reduce text artifacts and enhance specific areas of interest, improving the overall visual experience for gamers.
In content-aware encoding during live streams, such as news channels with minimal action, AI-based frame-to-frame comparison can identify areas with little change. This allows for significant bitrate reduction by dynamically adjusting the bitrate, a technique known as variable bitrate (VBR) encoding. AI techniques play a critical role in detecting these low-action frames and optimising the bitrate accordingly.
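The frame-to-frame comparison described here can be caricatured with a mean-absolute-difference check; real content-aware encoders use far more sophisticated, AI-based detection, so treat this purely as a sketch of the idea (thresholds and bitrates are illustrative assumptions):

```python
def mean_abs_diff(prev, curr):
    """Average absolute per-pixel difference between two frames."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)

def target_bitrate(prev, curr, base_kbps=3000, floor_kbps=800,
                   thresh=2.0):
    """Drop toward a floor bitrate when consecutive frames are nearly
    identical (e.g. a static news desk); keep the base rate otherwise."""
    return floor_kbps if mean_abs_diff(prev, curr) < thresh else base_kbps

static_a, static_b = [100] * 8, [101] * 8  # near-identical frames
moving_a, moving_b = [100] * 8, [160] * 8  # large frame-to-frame change

print(target_bitrate(static_a, static_b))  # low-action -> floor bitrate
print(target_bitrate(moving_a, moving_b))  # high-action -> base bitrate
```

The encoder's lookahead buffer mentioned earlier serves the same goal from the other direction: by seeing up to 40 future frames, it can plan where the bits will matter before spending them.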
These are some common use cases where we leverage AI techniques to improve the video experience by enhancing pixel quality and optimising bitrates.
Q. How is this product going to benefit the Indian market? What is the response you’ve received so far from India?
A. While the Indian market is still in the early stages for our current product, we have gained some experience with our older generation product. India has a significant live-stream presence, particularly in the OTT platform space. Although our current product may target slightly different markets, we have observed adoption by local players in India for our previous product, such as IPTV providers and media companies like Tata Sky and Dish TV. These companies have recognised the economic benefits of our solutions compared to existing options.
As India generates a large amount of media content, we anticipate increased demand and a focus on addressing local needs and user preferences in the future. Currently, the Indian market relies heavily on various cloud infrastructures, such as AWS, due to the limited availability of its own data centers. However, as the market develops, local data center providers may play a larger role and adopt hardware solutions like ours to cater to local needs. We are already engaged in talks with many of these OTT platforms.
Q. How are OTT platforms utilizing your product, as there aren’t many data centers in India?
A. While data centers may not be native to India, many OTT platforms in India rely on platforms such as AWS and Google Cloud, which provide data center services in various regions, including Asia. These platforms offer media libraries and infrastructure that OTT platforms can leverage to host and deliver content. While OTT platforms don’t necessarily need local data centers, controlling their own infrastructure allows for optimal implementation and customisation.
However, the availability of cloud instances from providers like AWS and Azure offers flexibility and convenience for OTT platforms to utilise existing infrastructure. Therefore, while local data centers can be beneficial, OTT platforms can still use cloud platforms to meet their requirements effectively.
Q. What are the specific needs and challenges of the Indian market regarding media acceleration technology?
A. While the architecture for streaming is generally similar across different regions, including India, there may be differences in the type and quality of content being streamed. In India, a significant amount of standard-definition content is still being consumed. While our architecture supports standard-definition content, it is expected that as more people adopt smart TVs and the demand for high-definition content increases, there will also be a shift towards higher quality content in India.
It is important to note that the architecture we discussed earlier can accommodate various applications. While specific applications or services may be unique to the Indian market, the overall architecture remains applicable and adaptable to different regions and streaming needs.
Q. Are there any pricing or licensing considerations with respect to the Indian market?
A. The pricing for the hardware remains the same regardless of the region. There are no additional licensing fees or royalties associated with the product. The pricing is consistent globally. However, when users choose to deploy the hardware in their data centers or opt for cloud instances, there may be additional pricing models specific to those instances. Nevertheless, the manufacturer’s suggested retail price (MSRP) for the hardware remains consistent worldwide.
Q. How do you ensure accessibility?
A. AMD is a large company with an extensive network of distribution channels worldwide. We sell our products, including media accelerator chips, through these distribution channels. Additionally, we have direct channels for specific customer needs, ensuring a wide range of options for users to purchase our products, including in India.
Q. Are there any expansion plans concerning business in India?
A. AMD actively promotes its product family to customers worldwide, including India. Recognising the potential for significant growth in the Indian market in the long run, we are proactively reaching out and visiting India regularly. We also aim to increase our participation in trade shows and events in India over time. Being part of a larger company like AMD allows us to leverage its existing presence and participation in trade shows in India, and we will continue to expand our involvement in the future.
Q. How do you plan to expand in India with this product, given that your market there is currently very small?
A. While the market in India may be relatively small currently, there are growth opportunities on the horizon. As more local content is created and platforms like Facebook Live, YouTube, or telecom providers such as Jio Streaming gain traction, the demand for efficient media acceleration technology will likely increase.
By catering to local content creators and providing solutions that can accommodate their growing needs, we aim to foster adoption and expand our presence in the Indian market. Additionally, as the Indian market evolves, we are prepared to support both local data center approaches and the continued utilisation of cloud platforms like AWS and Google Cloud, ensuring accessibility for end-users regardless of the expansion path taken.