The major flaw of the cloud, and ironically its touted strength, is its inability to scale up. The promise of cloud providers is that you can start out with a 512MB node and scale up as much as you want! The sky is the limit, they say! Oh, but no more than 15GB of RAM on the same node. Truth be told, you're going to hit bottlenecks long before you use that much memory: something is going to saturate, whether it's the IO subsystem, the CPU, network IO, or the bus. And that's probably the main reason for capping at 15GB (15GB for domU plus 1GB for dom0). It's balanced and fair. But what if one needs 72GB of RAM on one node without having to worry about hitting IO or CPU bottlenecks? What to do? You'd be out of luck!
You can scale out by allocating new nodes on other blades and load balancing the application across them. This is when one needs to dig into the cloud and get down and dirty: you'd have to worry about provisioning scripts, data replication, synchronization, and so on. So the answer is no, you can't scale up IO, CPU, and RAM on the same blade!
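To make the scale-out path concrete, here is a minimal sketch of the load-balancing half of it. Everything is invented for illustration: the node names, the round-robin policy, the `make_dispatcher` helper. Note that it deliberately leaves out exactly the hard parts mentioned above (provisioning, replication, sync).

```python
from itertools import cycle

# Hypothetical pool of cloud nodes on separate blades (names made up).
NODES = ["node01", "node02", "node03"]

def make_dispatcher(nodes):
    """Return a function that hands out nodes round-robin, one per request."""
    pool = cycle(nodes)
    return lambda: next(pool)

dispatch = make_dispatcher(NODES)

# Five incoming requests get spread across the three nodes.
requests = [dispatch() for _ in range(5)]
print(requests)  # ['node01', 'node02', 'node03', 'node01', 'node02']
```

The dispatcher is trivial; the operational burden of keeping state consistent across those nodes is the part that makes scale-out "down and dirty."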
This is where NUMA comes in. It stands for non-uniform memory access, and it's what I believe the next cloud will be based on. Imagine hundreds of stacked blades connected by very high speed IO and memory buses, each server with 16GB of RAM. Process 108796 on server103, already consuming 132GB of RAM, needs an additional 30GB to run; server105 (the closest and least costly in terms of bus access) responds and delivers! IO is scalable too, since each blade has its own IO chip. So IOPS can scale up, not down, as is the case today.
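The allocation policy in that example can be sketched as a toy model: when a process outgrows its home blade, borrow memory from the node with the lowest bus-access cost that still has room. All hostnames, distances, and sizes below are invented for illustration; real NUMA distances come from the hardware (e.g. the SLIT table on Linux).

```python
# Bus-access cost between blades (lower = closer), keyed by node pair.
DISTANCE = {
    ("server103", "server103"): 10,
    ("server103", "server105"): 20,
    ("server103", "server108"): 40,
}

# Free memory per blade, in GB (server103 is already exhausted).
FREE_GB = {"server103": 0, "server105": 48, "server108": 64}

def nearest_node_with(free_gb, needed_gb, home):
    """Pick the cheapest-to-reach remote node that can satisfy the request."""
    candidates = [n for n, free in free_gb.items()
                  if n != home and free >= needed_gb]
    return min(candidates,
               key=lambda n: DISTANCE.get((home, n)) or DISTANCE.get((n, home)))

# The process on server103 needs 30GB more; server105 is closer than server108.
print(nearest_node_with(FREE_GB, 30, "server103"))  # server105
```

If the request were larger than server105 can spare, the policy would fall through to the next-cheapest node, which is the "non-uniform" part: you still get the memory, you just pay more per access.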
The good news is that the Nehalem architecture is the first NUMA-capable Intel CPU, perhaps paving the way for NUMA across blades. Linux has had NUMA code for ages. The first vendor to turn a bunch of Linux blades into a scalable supercomputer for hosting will take it all home. I think motherboard vendors and Intel will have to coordinate to come up with a bus that connects the blades' fabric. Of course, it's easier said than done.