Scaling Up vs Scaling Out

This is something I picked up from the internet. It might seem like an AMD fan rant about the current state of x86, but it does bring up some salient points about the scalability of the AMD platform compared with its Intel counterpart.

4-way configuration for the Opteron and Xeon

I would have to agree that the AMD processor (K8 Hammer microarchitecture onwards, circa 2003) has been built from the ground up, from day 1, to scale to multiple CPUs through the use of HyperTransport (HT) links and integrated DDR memory controllers. HyperTransport technology allows two AMD CPUs with HT links to connect directly, without any glue logic, to make a multiprocessor platform, and they can be further scaled up to a maximum of 8 CPUs (on Opterons with 3 HT links each).
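To make the glueless idea concrete, here is a minimal sketch that models CPU-to-CPU HT links as a graph and counts the hops a request makes between sockets. The square 4-socket wiring, the name SQUARE_4WAY, and the helper functions are assumptions for illustration only, not a description of any particular motherboard; the only figure taken from the text is the budget of 3 HT links per CPU.

```python
from collections import deque

# Assumed 4-socket layout: CPUs wired in a square, each using two of its
# HT links for neighbouring CPUs (a third would remain free, e.g. for I/O).
SQUARE_4WAY = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2],
}

MAX_HT_LINKS = 3  # from the text: Opterons with 3 HT links each


def hops_from(topology, start):
    """Breadth-first search: number of HT hops from `start` to every CPU."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cpu = queue.popleft()
        for neighbour in topology[cpu]:
            if neighbour not in dist:
                dist[neighbour] = dist[cpu] + 1
                queue.append(neighbour)
    return dist


def max_hops(topology):
    """Worst-case hop count between any two CPUs (the graph diameter)."""
    return max(max(hops_from(topology, cpu).values()) for cpu in topology)


def links_within_budget(topology, budget=MAX_HT_LINKS):
    """Check that no CPU uses more CPU-to-CPU links than its link budget."""
    return all(len(peers) <= budget for peers in topology.values())


if __name__ == "__main__":
    print(links_within_budget(SQUARE_4WAY))
    print(hops_from(SQUARE_4WAY, 0))
    print(max_hops(SQUARE_4WAY))
```

In this assumed layout every CPU reaches its two neighbours in one hop and the diagonal CPU in two, with no shared bus or external bridge chip in the path. Adding sockets means adding nodes and links to the same graph, which is the sense in which the design scales without glue logic.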

In comparison, Intel still maintains a Front Side Bus based Northbridge/Southbridge platform architecture to this day, and will only introduce integrated memory controllers and point-to-point serial interconnects in its upcoming Nehalem microarchitecture (late 2008).

My only comment is that AMD's scalability advantage was never widely adopted or capitalized on, especially by the enterprise/corporate users who would benefit most from the cost savings and performance gains. In fact, most of the publicity generated for the technology came from the enthusiast community. Perhaps it takes a bigger player (Intel comes to mind) to effect change and promote (read: enforce) technology adoption (e.g. USB, AGP, PCIe, etc.). We'll have to wait until after Nehalem to comment on this again.

AMD's ready to scale you up

When it comes to scaling x86 servers, it's smarter to think inside the box

Architectural traits reaching back to Pentium remain present in the Intel-powered servers of today. The limitations of those servers aren't likely to be noticed as long as the routine of IT and commercial server buyers is to add capacity by scaling out, purchasing new two-socket servers. But the time will come when adding a rack server, or a rack of servers, is no longer the wise person's path to increased capacity. Smart planning will lead you to handle bigger workloads without more servers.

The terms "scale up" and "scale out" are sometimes unfamiliar to x86 buyers. They refer to the locale of capacity expansion, computing ("thinking") capacity in particular. A server that scales up can be made to handle substantially higher workloads through upgrades inside the chassis. These systems cost more at first, but they're designed to have untapped capabilities that you can turn on with an incremental investment far less than that of a new server.

Scale up is the factor that has kept proprietary Unix big iron in business. Linux on a commodity two-socket Intel server was supposed to push HP, IBM, and Sun out of business. It looks that way if you see a rack chassis as a rack chassis without regard for what's inside. But scale-up maximizes everything from power savings and server consolidation ratio to server longevity, with the bonus of lower long-term costs and higher availability. All AMD Opteron servers scale up. It's baked into the CPU, the bus, and the total system architecture. AMD's strategy is to make it possible to scale up any Opteron server for five years with only a CPU swap, no new server required. This stands in stark contrast to Intel's "tick tock" plan that attempts to nail IT to the stereotypical two-year purchasing cycle. Intel's two-year cycle of obsoleting chips makes parts scarce and expensive, so that if you do buy an Intel-based server with empty sockets with plans to scale it up, it's unlikely that CPUs precisely matching the models you have now will be available, and the availability of FB-DIMM memory at your existing Intel servers' speed may be rare as well. AMD's five-year plan is more in line with the way IBM treats, and retains, its customers.

Scale out means bigger racks, more servers, more heat, higher power and cooling costs, another tick on your service contract, another hand to hold in the middle of the night, and so on. The only thing going for it is convenience, and that's a powerful motivator. Most shops have the deployment of new rack servers down to a science, and there's rarely a need to even remove the cover on a server before you slide it into the rack. Opteron servers yield to the very same plug-and-play initial deployment, but in a few months when you'd ordinarily add a new server, you can take the scale-up route of your choice: Swap out your Opteron CPUs with higher speed or more cores, add RAM or use faster RAM, or fill empty CPU sockets with new CPUs. It really is as simple as it sounds, and when you (or your field service person) button up the case, you have a new server, or two, or two and a half, where your two-socket server used to be.

You have to adopt a long-term view to justify buying x86 servers that you can grow without filling more rack units, but the economy has a way of fast-forwarding reality such that the present suddenly laps the plan. If you're not already in spend-it-while-we-have-it mode, all forecasts indicate that you will be. Servers that you buy from now on should put you on course to grow your capacity, or to ready yourself for an overnight recovery, while you gently apply the brakes by reducing your costs now.

If that's too wishy-washy for you, I'll give you a hard example: A copy of Windows Server 2008 costs the same for a one-socket, four-way server as it does for an eight-socket, 32-way server. Each unit of Windows Server 2008 carries a license that permits the operation of an unlimited number of Windows virtual machines on one physical server. Today, expanding Windows server capacity means buying more servers, and therefore more Windows licenses. It may be that you have so many servers that a volume license, as costly as it is, is cheaper or more convenient than one license per server. Using any Opteron scale-up scenario, one Windows license covers all the cores and virtual servers you can squeeze into one physical box. As a bonus, any variety of distributed computing is done faster on scale-up hardware because far more server-to-server communication is handled at the speed of memory rather than the speed of Ethernet.
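The licensing argument comes down to simple arithmetic, sketched below. It takes the per-server licensing premise in the paragraph above at face value; the function names and the price of 1000 are placeholders invented for illustration, not actual Windows Server 2008 pricing.

```python
import math


def licenses_needed(total_sockets, sockets_per_server):
    """One license per physical server, regardless of sockets or cores."""
    return math.ceil(total_sockets / sockets_per_server)


def license_cost(total_sockets, sockets_per_server, price_per_license):
    """Total license spend to host `total_sockets` worth of capacity."""
    return licenses_needed(total_sockets, sockets_per_server) * price_per_license


if __name__ == "__main__":
    PRICE = 1000  # placeholder unit price, not a real list price
    SOCKETS = 8   # capacity target: eight CPU sockets in total

    # Scale out: four two-socket servers, so four licenses.
    print(license_cost(SOCKETS, sockets_per_server=2, price_per_license=PRICE))

    # Scale up: one eight-socket server, so one license.
    print(license_cost(SOCKETS, sockets_per_server=8, price_per_license=PRICE))
```

Under these assumptions the same eight sockets cost four licenses when spread across two-socket boxes and one license in a single eight-socket chassis, whatever the real unit price turns out to be.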

That scenario can be carried further. When you get to know Opteron, especially the quad-core Opteron CPU nicknamed Barcelona (revision B, with the TLB flaw repaired, is now shipping), I'll explain how AMD's redesign of the x86 architecture not only scales up through added components, but scales up through evolved software as well. There are many more features in quad-core Opteron than generic x86 and x64 operating systems use. You will scale up your quad-core Opteron servers merely by installing a Windows or Linux point release that includes Opteron-specific optimizations, or changing the architectural target of the projects you compile in-house. I realize that my strong position on Opteron and desktop derivatives, like the amazing Phenom, might appear to some like bias. Please understand that when I dig into AMD CPUs and platforms as technology and foundation for IT strategy and investment, I simply see so many changes for the better.

Posted by Tom Yager on March 26, 2008 03:00 AM
