Archive | November, 2010

An Administrator’s view of Open PaaS and VMforce

23 Nov

After posting about How Development works on Open PaaS and VMforce, I felt it was time to provide an equivalent view from an Administrator’s perspective. Before going deep, I thought I would provide a comparison of what things look like between the Developer’s view of things vs. the Administrator’s.

Comparison of Developer vs. Admin View

Please note that this is derived information and in some cases speculative (but I bet I’m close)
Starting at the top:

  • The URL and Mapping matches a DNS entry with an External IP (Host), a Path, and a Port to Access the Application
  • The Application contains an App Instance matching the Virtual Machine with a Workload (Potentially multiple Workloads)
  • The Internal IP operates off of the assumption that the VM is either multi-homed or has a NAT based interface with an Inside and Outside Address
  • The Service Instance matches a VM with a specific Running Service inside. This could be a shared Service instance or a Multi-User/Tenant Service Instance (There isn’t enough info. from what I have found to know which)
  • The Service Catalog is the equivalent of a Template/Gold Image based VM (in the describe model)
  • There are several different ways VMware could choose to implement isolation and multi-tenancy.

The diagram below gives an Administrator’s view of Open PaaS and its implementation inside of VMforce. The current implemented resource model shows a quota system as the chosen way of limiting/controlling consumption of resources in the Open PaaS Cloud.

Administrator's View of Open PaaS / VMforce

The Account is the line between where the Administrator turns over the resources to the Developer. This seems like it would create an environment like the wild west, but this is a deceptively simple view. The Architects and Administrators both have the ability to constrain the system before any code is pushed into it. This is achieved by decisions on what types of code can supported in the system, potentially constraining allowed frameworks, available services, the ability to create services, and allocated resources. Quota based allocated resources include number of CPU cores, Memory, and Disk space.

From what I have been able to find so far, there is a focus on isolation by Account using a quota system.
The strongest isolation model would be to assign each workload its own VM, this however would consume far more resources than isolation at a process level (a typical trade-off). Implementing isolation at a process level would work well but you wouldn’t want all Workloads (App Instances) for a single Application running in a single VM, because if the VM fails so does your Application. As more is revealed, I will provide more indepth information on how isolation and distribution is done.

There is also an unknown as to if and how a load-balancing mechanism is implemented. I haven’t come across how/if this is implemented, perhaps this is done in the Mapping (via. DNS/round robin?). This is purely speculative.


AppCloud appears to be VMware Open PaaS Cloud backend name

20 Nov

As I continue to go through VMC related code, I have come across a few code entries talking about AppCloud.  At first I thought this might be a reference to EngineYard’s AppCloud solution. This brought to mind the rumors I mentioned in previous posts, but after further digging and reading the following code and code comments:

AppCloud Gist

Code references to AppCloud

I am fairly convinced that AppCloud is referring to the Open PaaS Cloud Controller (and possibly some of the other components collectively), not EngineYard’s AppCloud. The big question in my mind is when will the first version of “AppCloud” be launched/shipped/released.

How Development works on Open PaaS & VMforce

14 Nov

After having gone through the materials available (both the easy to find and the difficult to find) I have created what should be an accurate view of what the environments inside a VMware Open PaaS and VMforce world should look like.  In this post is a series of diagrams that I have created based on what I have concluded is the way the current system works.

Developer's overview of Open PaaS & VMforce

In the diagram above, starting at the top:

  • Organization Context – This is likely a sub-Cloud of the overall Cloud, but I haven’t been able to clarify this in the code yet.  It provides authentication and determines which services are available to be used and shared amongst User Accounts (aka Service Domains)
  • User Account – This is done by registering an e-mail address and a password, each account is allocated a quota which is controlled by an administrator.
  • URL – This is how external Applications, Users, APIs, etc. access the Application
  • Mapping – Connects the URL to the Application (Allowing Applications to be Switched beneath the URL)
  • Application – Also know as a “Droplet”, is made up of App Instances and connect to Service Instances
  • Service Instances – Are useable/consumable/invoked instances of services from the Service Catalog
  • Service Catalog – This is the listing of all available services that can currently be invoked for use by an Application

Logical Diagram of Open PaaS - VMforce

The above diagram shows a slightly more filled out Service Catalog. These are the services that were provided as examples by the VMware presentations and documentation that I have seen so far.  The diagram also shows an even larger number of applications running, although each only has a single App Instances associated with it.

Detailed Logical Diagram of a more realistic example

In this diagram (above), there are two URLs each providing access to an Application.  The first application on the left has a single Application Instance and that App Instance is bound (see Binding Labels) to a MySQL Instance (Service Instance) and a RabbitMQ Instance (Service Instance).  The two Service Instances are created from the Service Catalog’s MySQL and RabbitMQ entries.

The second Application has three App Instances inside of it, all of which are bound to the SAME RabbitMQ Instance that the first Application is (this means that the two Applications can share information through the RabbitMQ Instance).  The MySQL Instance is a separate MySQL Instance from the first Application MySQL Instance, although both are based/invoked from the MySQL Service in the Service Catalog.  The Redis, Memcache, and MongoDB instances are all bound to each of the App Instances in the second application and are used by all three instances.

Diagram of Open PaaS - VMforce Service Tiers

The final Diagram is from information I came across while digging through the VMC Ruby code.  The code has fields in it for “Service Tiers”, which based on some poking around on Salesforce’s website, I came up with the above possibilities.  I don’t know if this is 100% accurate, but based on the information I think it provides a pretty reasonable approach to exposing provider services to Developers looking to write code on a Cloud platform such as VMforce.

There are several more interesting things that I have come across since I began going through the code.  I will be blogging about them soon.

Walk-through of the VMforce / Cloud OS / OpenPaaS Demo

13 Nov

This post attempts to walk-through the demo that was shown at the Ruby Conference.  I was not actually at the conference, but I am reconstructing what happened based on materials and information that was tweeted and the presentation materials.

Diagram of VMware Cloud OS - PaaS - VMforce Demo CLI Walk-through

The walk-through above shows a sophisticated PaaS layer (reminding me of the Google AppEngine PaaS) where code is uploaded/pushed and then inspected and compiled.  The resulting “App Instance” (also referred to as a “Slug” in Heroku terms) is ready to respond to requests.  Can’t service enough requests with just one App Instance?  Type “vmc instances fu 5” and instead of a single App Instance, the App Instance clones/copies/ Scales Out to 5 instances.

Need to Scale Down from 5 App Instances to 3 because demand has fallen?  “vmc instances fu 3” Scales Down the App Instances from 5 to 3 killing the last 2 App Instances – 3 and 4 (note that instances are numbered 0-4).


Diagram of VMware Cloud OS - PaaS - VMforce Demo CLI Walk-through - 2

Now the requests have fallen to the point where there is only 1 App Instance needed, so the command “vmc instances fu 1” shrinks (Scales Down) the App Instances from 3 to 1.  Now we look to see where the fu Application actually is resource utilization wise (This is the point where we see the overlap with the underlying VM) by typing “vmc stats fu”

This allows a Developer on a Public Cloud – in the above case is hosted by Terremark (using vCloud if I had to guess) to deploy their code with little to NO knowledge of the underlying Infrastructure beneath (IaaS).  This ultimately mirrors the functionality of what VMware’s Cloud OS will provide to  Private Clouds and the Enterprise Development groups inside them.  What is demoed is supposed to be the same as the system was designed for VMforce in ‘s Data Center as well.

VMware quietly shows Cloud OS, OpenPaaS, and VMforce at Ruby Conference

13 Nov

Yesterday, VMware previewed the first concrete evidence that they are moving forward on the OpenPaaS initiative, the VMware Cloud OS, and VMforce at the 2010 Ruby Conference in New Orleans. At the conference Derek Collison demonstrated an early preview of the VMware Cloud OS via. a command line interface that he and Ezra Zygmuntowicz created.

The demonstration included showing Ruby code being auto deployed (pushed) into an a VM (where it becomes an App), coming online and then scaling both up and down in real-time. Also shown in the presentation itself was a screenshot of the control panel providing a view into what is being referred to as an application centric view of PaaS.

VMware Cloud OS Dashboard

In addition, the architecture was described, covering how the system is coordinated for auto-scaling and how resources are controlled.  Below is the high level architecture that was presented, a more detailed and in-depth walkthrough of how the system works will be published on this blog tomorrow.

VMware Cloud OS / OpenPaaS Architecture

VMware is moving quickly to fully support all modern/popular languages in the Cloud OS, including:

  • Ruby
  • Java
  • Node.js
  • Python
  • .NET
  • and more!

This strategy is critical to VMware being able to uplift itself from being considered a purely infrastructure company, to that of a Platform company (beyond owning SpringSource).  It is also important to VMware in attempting to grab developer mindshare by trying to better meet the capabilities that developers have traditionally been going to the Public Cloud for, by enabling Enterprises to provide those same capabilities internally (real Private Clouds).

Several more posts to follow very soon…

Where most Enterprise IT Architectures are today

4 Nov

Most Enterprises are architecturally in a rigid and fragile state.  This has been caused by years of legacy practices in support of poor code, design patterns, underpowered hardware (which focused on increasing MHz not parallelism/multi-cores).  What follows is a brief review of what has led us here and is needed background for the follow-on post which exercises a theory that I’m testing.

Architecture Phase 1 – How SQL and ACID took us down a path
Early on in the Client/Server days even low power x86 servers were expensive. These servers would have an entire stack of software put on them (i.e. DB and Application functions with Clients connecting to the App and accessing the DB). This architecture made the DB the most critical component of the system. The DB needed to ALWAYS be online and needed to have the most rigid transactional consistency possible. This architecture forced a series of processes to be put in place and underlying hardware designs to evolve in support of this architecture.
This legacy brought us the following hardware solutions:

RAID 1 (Disk Mirroring) -> Multi-pathed HBAs connecting SANs with even more Redundancy

Two NIC Cards -> NICs teamed to separate physical Network Switches

Memory Parity -> Mirrored Memory

Multi-Sockets -> FT Based in Lock Step CPUs

All of this was designed to GUARANTEE both Availability and Consistency.
Having Consistency and Availability is expensive and complicated.  This also does not take into account ANY Partition tolerance.  (See my Cap Theorem post)













Architecture Phase 2 – The Web
Web based architectures in the enterprise contributed to progress with a 3-Tier model where we separated the Web, Application, and Database functionality into separate physical systems. We did this because it made sense. How can you scale a system that has a Web, Application, and Database residing on it? You can’t, so first you break it out and run many web servers with a load balancer in front. Next you get a big powerful server for the Application tier and another (possibly even more highly redundant than the Application tier server) for the Database. All, set right? This is the most common architecture in the enterprise today. It is expensive to implement, expensive to manage, and expensive to maintain, but it is the legacy that developers have given IT to support.  The benefit being that there is better scalability and flexibility with this model and with adding virtualization (which helps further the life of this architecture).

Where is Virtualization in all of this?
Virtualization is the closest Phase 2 could ever really get to the future (aka Phase 3, which is covered in my next post). Virtualization breaks the bond with the physical machines, but not the applications (and their architectures) that are running on top. This is why IT administrators have had such a need for capabilities in products like VMware ESX in conjunction with VMware vSphere like HA (High Availability), DRS (Distributed Resource Scheduling), and FT (Fault Tolerance). These things are required when you are attempting to keep a system up as close to 100% as possible.


The trend toward Cloud architectures is forcing changes in development practices and coding/application design philosophies.  Cloud architectures are also demanding changes in IT operations and the resulting business needs are creating pressures for capabilities that current/modern IT Architectures can’t provide.

This leads us to what is coming….

CAP Theorem and Clouds

3 Nov

A background on CAP Theorem:

CAP Theorem is firmly anchored in the SOA (Service Oriented Architecture) movement and is showing promise as a way of classifying different types of Cloud Solution Architectures.  What follows is an explanation about CAP Theorem, how it works, and why it is so relevant to anyone looking at Clouds (Public, Private, Hybrid, or otherwise).

Distributed Systems Theory – The CAP Theorem:
CAP Theorem was first mentioned by Eric Brewer in 2000 (CTO of Inktomi at the time) and was proven 2 years later.  CAP stands for Consistency, Availability, and Partitioning tolerance.  CAP Theory states that you can only have TWO of the three capabilities in a system.  So you can have Consistency and Availability, but then you don’t have Partitioning tolerance.  You could have Availability and Partitioning tolerance without rigid Consistency.  Finally you could have Consistency and Partitioning tolerance without Availability.

The KEY assumption is that the system needs to persist data and/or has state of some type, if you don’t need either Data persistence or State ANYWHERE, you can get very close to having Consistence, Availability, and Partitioning simultaneously.

Understanding Consistency, Availability, and Partitioning:

Consistency is a system’s ability to maintain ACID properties of transactions (a common characteristic of modern RDBMS). Another way to think about this is how strict or rigid the system is about maintaining the integrity of reads/writes and ensuring there are no conflicts.  In an RDBMS this is done through some type of locking.
Availability is system’s ability to sucessfully respond to ALL requests made.  Think of data or state information split between two machines, a request is made and machine 1 has some of the data and machine 2 has the rest of the data, if either machine goes down not ALL requests can be fulfilled, because not all of the data or state information is available entirely on either machine.
Partitioning is the ability of a system to gracefully handle Network Partitioning events.  A Network Partitioning event occurs when a system is no longer accessible (Think of a network connection failing). A different way of considering Partitioning tolerance is to think of it as message passing.  If an individual system can no longer send/receive messages to/from other systems, it has been effectively “partitioned” out of the network.
A great deal of discussion has occurred over Partitioning and some have argued that it should be instead referred to as Latency.  The idea being that if Latency is high enough, then even if an individual system is able to respond, the individual system will be treated by other systems as if it has been partitioned.
In Practice:

CAp – Think of a traditional Relational Database (i.e. MS SQL, DB2, Oracle 11g, Postgres), if any of these systems lose their connection or experience high latency they can not service all requests and therfore are NOT Partitioning tolerant (There are ways to solve this problem, but none are perfect)
cAP –  A NOSQL Store (i.e. Cassandra, MongoDB, Voldemort), these systems are highly resilient to Network Partitioning (assuming that you have several servers supporting any of these systems) and they offer Availbility.  This is achieved by giving up a certain amount of Consistency, these solutions follow an Eventual Consistency model.
CaP – This isn’t an attractive option, as your system will not always be available and wouldn’t be incredibly useful in an Cloud environment at least.  An example would be a system that if one of the nodes fails, other nodes can’t respond to requests.  Think of a solution that has a head-end where if the head-end fails, it takes down all of the nodes with it.
A Balancing Act
When CAP Theorem is put into practice, it is more of a balancing act where you will not truly leave out C,A, or P.  It is a matter of which two of the three the system is closest to (as seen below).
NOTE:  Post to follow tying this more closely to the Cloud coming tomorrow.
In my research I came across a number of great references on CAP Theorem (In order most favored order):