IBM’s latest mainframe builds on the platform’s traditional attributes of security and reliability for mission-critical workloads, adding AI to support large language models (LLMs), assistants, and agents.
The z17 family introduces an improved Telum II processor and the Spyre AI Accelerator card, both of which were discussed at the Hot Chips conference in Palo Alto last year, for a claimed 7.5-fold increase in AI performance over the z16.
While the Telum II offers improved AI inferencing for running fraud detection checks against transactions – a capability introduced with the z16 – the Spyre cards provide a way to scale AI handling to support generative AI and LLMs, as well as the use of multiple models to improve accuracy and reduce false positives, IBM claims.
“If you look at data as the new fuel, then infrastructure is the engine that allows organizations to drive their AI journeys to success,” said Elpida Tzortzatos, IBM Fellow and IBM Z Architect, referring to the hardware enhancements Big Blue has developed for this latest big iron.
The firm says it has spent a lot of time talking to clients about what they wanted to see in the mainframe, and this informed the development of the z17. Customers apparently told it they wanted to modernize their applications and make their mainframes more AI-driven.
But it isn’t a case of just throwing generative AI into the mix, as some other companies may have done. Big Blue claims to have thought this through carefully.
“GenAI is very critical and important to our clients, but also not the only AI tool. And although there’s a lot of talk around GenAI these days, predictive AI will continue to play a critical role in enterprises,” Tzortzatos said.
“We’ll continue to serve those use cases very, very well, but GenAI opens the aperture for new use cases, such as having assistants and being able to summarize documents, being able to provide support to developers in terms of having copilots that do code autocomplete and so forth.”
These assistants include the firm’s watsonx Code Assistant for Z and watsonx Assistant for Z, for example.
A new trend that the firm sees emerging is combining the strengths of predictive AI with those of large language and code models to extract new features or new insights, and get better and more accurate results out of these AI models, Tzortzatos claimed.
She cited an example from insurance, where companies are pulling the structured information relating to claims from a DB2 database, then extracting key insights, such as the cause or urgency of the claim, from unstructured text and feeding these into a predictive AI model to get better, more accurate results.
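As a rough illustration of that pattern, the Python sketch below merges structured claim fields with features pulled from free-text notes, then passes the combined row to a scoring function. It is not IBM's implementation: the `Claim` fields, the keyword lists, and the weights are all hypothetical, and a simple keyword matcher stands in for the language model that would do the extraction in practice.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim_id: str
    amount: float       # structured field, e.g. read from a database row
    prior_claims: int   # structured field
    notes: str          # unstructured free-text notes


# Hypothetical keyword rules standing in for an LLM extraction step.
CAUSE_KEYWORDS = {
    "water": ("flood", "leak", "burst pipe"),
    "fire": ("fire", "smoke"),
    "theft": ("stolen", "theft", "break-in"),
}
URGENT_WORDS = ("urgent", "immediately", "uninhabitable")


def extract_text_features(notes: str) -> dict:
    """Derive the cause and urgency of a claim from unstructured text."""
    text = notes.lower()
    cause = next(
        (name for name, words in CAUSE_KEYWORDS.items()
         if any(word in text for word in words)),
        "unknown",
    )
    urgent = any(word in text for word in URGENT_WORDS)
    return {"cause": cause, "urgent": urgent}


def build_feature_row(claim: Claim) -> dict:
    """Combine structured fields with the text-derived features."""
    features = {"amount": claim.amount, "prior_claims": claim.prior_claims}
    features.update(extract_text_features(claim.notes))
    return features


def score_claim(features: dict) -> float:
    """Toy predictive model: hand-picked weights, score between 0 and 1."""
    score = 0.4 * min(features["amount"] / 100_000, 1.0)
    score += 0.2 * min(features["prior_claims"] / 5, 1.0)
    score += 0.3 if features["urgent"] else 0.0
    score += 0.1 if features["cause"] != "unknown" else 0.0
    return round(score, 3)


claim = Claim(
    claim_id="C-1",
    amount=50_000,
    prior_claims=1,
    notes="Burst pipe flooded the kitchen; house is uninhabitable, urgent repair needed.",
)
row = build_feature_row(claim)
priority = score_claim(row)
```

The point of the design is that the predictive model sees the text-derived `cause` and `urgent` features alongside the structured columns, rather than the structured columns alone.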
As detailed at Hot Chips, the Telum II processors in the z17 are eight-core chips, like the previous generation, but running at a higher 5.5 GHz clock speed. Telum II also features a 40 percent increase in cache size and another new capability – an on-chip I/O accelerator or data processing unit (DPU), which is designed to offload the handling of the huge volumes of data that the Spyre AI Accelerator cards will churn through while running newer AI models.
“When it comes to large language models and GenAI, we’ve seen a factor bigger than a hundred in terms of model complexity and model size increase, and this leads to higher requirements for AI compute,” Tzortzatos explained.
Those Spyre AI Accelerator cards fit into PCIe slots and feature up to 32 cores each, said to have a similar architecture to the AI accelerator in the Telum II chip itself. IBM says the z17 can take up to 48 of the cards in a single system.
Big Blue is also readying z/OS 3.2, the next release of its chief operating system for IBM Z systems, which is planned for the third quarter of this year. This brings support for hardware-accelerated AI capabilities across the system and uses operational AI for system management capabilities.
The new platform will add support for modern data access methods, NoSQL databases, and hybrid cloud data processing, according to IBM, allowing AI to tap into a broader set of enterprise data from which to derive predictive business insights.
IBM is launching its new big iron at a tricky time for such big-ticket items, with the Trump administration’s approach to international trade shaking business confidence. Traditionally, the introduction of a new mainframe sees a spike in revenue for Big Blue as customers with older systems upgrade, but this year could prove a difficult sell.
However, Mike Chuba, Managing VP in Gartner’s Infrastructure and Operations group, said the company has done its homework on what customers want to see.
“If you look at the last several mainframe generation announcements and continuing with this one, IBM is spending a lot more time in its R&D process involving its big mainframe customers,” Chuba told The Register.
“IBM’s R&D efforts now focus on how the new hardware directly addresses the challenges its customers are facing. The focus on AI with the dedicated accelerator they introduced on the z16 and the turbocharged Version 2 coming with this generation directly addresses, for example, the challenge of fraud detection at the point of the transaction.”
IBM’s z17 systems will be generally available June 18, while the Spyre Accelerator cards are expected to be available in the fourth quarter. ®