IBM sets Watson chips on the AI case as price war kicks off

Amazon, Microsoft and VMware make their moves in the game of machines


IBM claims it can cut the cost of running AI models in the cloud with custom silicon, as it looks to cash in on the surge of interest in generative models like ChatGPT.

Tech companies have been desperate to exploit the new interest in AI caused by ChatGPT, even though that may now be on the wane, with traffic to OpenAI's website falling by an estimated 10 percent between May and June.

IBM said it is considering the use of its own custom AI chips to lower the costs of operating its Watsonx services in the cloud.

Watsonx, announced in May, is actually a suite of three products designed for enterprise customers trying out foundation models and generative AI to automate or accelerate workloads. These can run on multiple public clouds, as well as on-premises.

IBM’s Mukesh Khare told Reuters that the company is now looking to use a chip called the Artificial Intelligence Unit (AIU) as part of its Watsonx services operating on IBM Cloud. He blamed the failure of Big Blue’s old Watson system on high costs, and claimed that by using its AIU, the company can lower the cost of AI processing in the cloud because the chips are power efficient.

Unveiled last October, the AIU is an application-specific integrated circuit (ASIC) featuring 32 processing cores, and described by IBM as a version of the AI accelerator built into the Telum chip that powers the z16 mainframe. It fits into a PCIe slot in any computer or server.

Amazon aims to cut costs

Meanwhile, Amazon said it is also looking to attract more customers to its AWS cloud platform by competing on price, claiming it can offer lower costs for training and operating models.

The cloud giant's veep of AWS Applications Dilip Kumar said that AI models behind services such as ChatGPT require considerable amounts of compute power to train and operate, and that these are the kinds of costs Amazon Web Services (AWS) is historically good at lowering.

According to some estimates, ChatGPT's underlying model may have been trained on more than 570GB of data, and required upwards of 1,000 of Nvidia's A100 GPUs to handle the processing.
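
For a rough sense of the scale involved, here is a back-of-the-envelope sketch using the widely cited six-times-parameters-times-tokens rule of thumb for training FLOPs and Nvidia's published A100 peak throughput. The model size, token count, and utilization figure are illustrative assumptions, not numbers reported by Amazon or OpenAI.

# Back-of-the-envelope estimate of training time for a GPT-3-class model.
# The 6 * params * tokens FLOP rule of thumb and the A100 peak figure are
# public numbers; the 40 percent utilization is an illustrative assumption.

params = 175e9          # parameters, GPT-3 scale (assumed)
tokens = 300e9          # training tokens, GPT-3 scale (assumed)
train_flops = 6 * params * tokens           # ~3.15e23 FLOPs

a100_peak = 312e12      # A100 BF16 tensor-core peak, FLOP/s
utilization = 0.40      # assumed sustained fraction of peak
gpus = 1000

cluster_flops = gpus * a100_peak * utilization
days = train_flops / cluster_flops / 86_400
print(f"~{days:.0f} days on {gpus} A100s at {utilization:.0%} utilization")
# -> roughly a month of wall-clock time, before any re-runs or tuning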

Kumar commented at the Momentum conference in Austin that the latest generation of AI models are expensive to train for this reason, adding "We're taking on a lot of that undifferentiated heavy lifting, so as to be able to lower the cost for our customers."

Plenty of organizations already have their data stored in AWS, Kumar opined, making this a good reason to choose Amazon’s AI services. This is especially so when customers may be hit with egress charges to move their data anywhere else.
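
Those egress fees mount up quickly at the dataset sizes involved in AI work. The sketch below is a minimal illustration of the arithmetic; the per-gigabyte rate and dataset size are placeholder assumptions, not any provider's actual price list.

# Illustrative data egress cost calculation. The per-GB rate here is a
# placeholder assumption; real cloud pricing is tiered and changes over time.

dataset_tb = 100                    # size of training data to move, in TB (assumed)
rate_per_gb = 0.09                  # assumed flat egress rate, USD per GB

cost = dataset_tb * 1024 * rate_per_gb
print(f"Moving {dataset_tb} TB out at ${rate_per_gb}/GB costs about ${cost:,.0f}")
# -> roughly $9,000 for a single transfer, before any inter-region traffic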

However, cloud providers may not be ready to meet the new demand for AI services, according to some experts. The Wall Street Journal notes that the new breed of generative AI models can be anything from 10 to 100 times bigger than older versions, and needs infrastructure backed by accelerators such as GPUs to speed processing.
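
One reason size translates directly into accelerator demand is memory: model weights alone can overflow a single GPU. The sketch below is a rough illustration, assuming 16-bit weights and an 80GB accelerator similar to Nvidia's A100; the parameter counts are examples, not figures from any of the vendors above.

# Rough estimate of how many 80GB accelerators are needed just to hold
# model weights in 16-bit precision. Parameter counts are illustrative;
# real deployments also need memory for activations and the KV cache.

import math

gpu_memory_gb = 80            # e.g. an 80GB A100-class card (assumed)
bytes_per_param = 2           # FP16/BF16 weights

for params_billion in (7, 70, 175):
    weights_gb = params_billion * 1e9 * bytes_per_param / 1e9
    gpus = math.ceil(weights_gb / gpu_memory_gb)
    print(f"{params_billion:>4}B params -> ~{weights_gb:.0f} GB of weights, "
          f"at least {gpus} GPU(s)")
# A 7B model fits on one card; a 175B model needs several before any
# headroom for serving traffic is considered.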

Only a small proportion of the bit barns operated by public cloud providers are made up of high-performance nodes fitted with such accelerators that can be assigned to AI processing tasks, declared Chetan Kapoor, AWS director of product management for EC2, who said there is “a pretty big imbalance between demand and supply”.

This hasn’t stopped cloud companies from expanding their AI offerings. Kapoor said that AWS intends to expand its AI-optimized server clusters over the next year, while Microsoft’s Azure and Google Cloud are also said to be increasing their AI infrastructure.

Microsoft also announced a partnership with GPU maker Nvidia last year, which essentially involved integrating tens of thousands of Nvidia's A100 and H100 GPUs into Azure to power GPU-based server instances, along with deploying Nvidia's AI software stack.

Meanwhile, VMware is also looking to get in on the act, announcing plans this week to enable generative AI to run on its platform, making it easier for customers to operate large language models efficiently in a VMware environment, potentially using resources housed across multiple clouds. ®
