How AWS, GCP, and Microsoft Azure Serve the Growing Demand for AI Inference

Google leads in technology, while Amazon Web Services excels on price.



  • [Source photo: Krishna Prasad/Fast Company Middle East]

    Major cloud providers are engaged in a close competitive race to deliver AI at scale, with Google Cloud Platform (GCP) leading in cutting-edge technology, while Amazon Web Services (AWS) focuses on offering cost-efficient solutions. 

    New Omdia research reveals that GCP benefits from Google’s status as a powerhouse of fundamental AI research, while Amazon Web Services benefits both from the enormous scale of its existing business and its excellence in day-to-day operations. 

    GCP will best serve customers looking to adopt the latest technology, while AWS will best serve those focused on price. Microsoft Azure, meanwhile, is concentrating on satisfying OpenAI’s appetite for capacity.

    The research examines how the major cloud infrastructure vendors serve inference, the process of generating content or answers from an AI model once its training is complete. By definition, inference is required when an AI application goes into production, with demand driven by end-user needs. As a result, it represents the intersection of AI projects and practical reality. As more AI applications go into production, Omdia anticipates inference will account for a growing share of overall AI computing demand.

    “The competition in this sector is intense. Google has an edge related to its strength in fundamental AI research, while AWS excels in operational efficiency, but both players have impressive custom silicon,” said Alexander Harrowell, Omdia’s Principal Analyst for Advanced Computing. “Microsoft took a very different path by concentrating on FPGAs initially but is now pivoting urgently to custom chips. However, both Google and Microsoft are considerably behind AWS in CPU inference. AWS’ Graviton 3 and 4 chips were clearly designed to offer a strong AI inference option, and staying on a CPU-focused approach is advantageous for simplifying projects.”

    Hyperscalers are crucial computing service providers to most of the AI industry. They will likely be the first point of contact for those establishing an AI model inference infrastructure to serve users. The report gives enterprises key recommendations for selecting an appropriate provider and options. It analyzes the pricing and availability of custom AI silicon such as Google TPUs; flagship, mid-range, and entry-level GPUs; and the CPU options hyperscalers recommend for AI inference.


