Archive | Trends in the Web

Data Gravity – in the Clouds

7 Dec

Today announced at Dreamforce.  I realized that many could be wondering why they decided to do this and more so, why now?

The answer is Data Gravity.

Consider Data as if it were a Planet or other object with sufficient mass.  As Data accumulates (builds mass) there is a greater likelihood that additional Services and Applications will be attracted to this data. This is the same effect Gravity has on objects around a planet.  As the mass or density increases, so does the strength of gravitational pull.  As things get closer to the mass, they accelerate toward the mass at an increasingly faster velocity.  Relating this analogy to Data is what is pictured below.

Data Gravity

Services and Applications can have their own Gravity, but Data is the most massive and dense, therefore it has the most gravity. Data, if large enough, can be virtually impossible to move.
What accelerates Services and Applications to each other and to Data (the Gravity)?
Latency and Throughput, which act as accelerators, creating a stronger and stronger reliance or pull on each other. This is the very reason that VMforce is so important to Salesforce’s long term strategy. The diagram below shows the accelerant effect of Latency and Throughput. The assumption is that the closer you are (i.e. in the same facility), the higher the Throughput and the lower the Latency to the Data, and the more reliant those Applications and Services will become on Low Latency and High Throughput.
Note:  Latency and Throughput apply equally to both Applications and Services
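To put rough numbers on the pull described above, here is a minimal Python sketch. It is an illustration, not a formula for Data Gravity: the function name `transfer_seconds`, the simple cost model (round trips times latency, plus payload size divided by throughput), and all of the example figures are assumptions made for this example.

```python
def transfer_seconds(size_bytes, throughput_bits_per_s, latency_s=0.0, round_trips=1):
    """Rough time to move data: waiting on round trips plus time on the wire.

    Assumed model (not from the original post):
        time = round_trips * latency + (size in bits) / throughput
    """
    wire_time = (size_bytes * 8) / throughput_bits_per_s
    return round_trips * latency_s + wire_time


# Moving a large Data Mass: 1 PB over a dedicated 10 Gbit/s link.
petabyte = 10**15
days_to_move = transfer_seconds(petabyte, 10 * 10**9) / 86_400

# A chatty application making 1,000 small (1 KB) requests to the Data:
# same facility (0.5 ms round trip) versus a remote site (50 ms round trip).
local = transfer_seconds(1_000 * 1_000, 10**9, latency_s=0.0005, round_trips=1_000)
remote = transfer_seconds(1_000 * 1_000, 10**9, latency_s=0.05, round_trips=1_000)

print(f"1 PB at 10 Gbit/s: about {days_to_move:.1f} days")
print(f"1,000 requests, same facility: {local:.3f} s; remote: {remote:.3f} s")
```

Under these assumed numbers the petabyte takes a little over nine days to move, and the remote application spends roughly a hundred times longer waiting than the local one. That is the pull in practice: it is easier to bring the Application to the Data than the Data to the Application.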
How does this all relate back to the announcement? If Salesforce can build a new Data Mass that is general purpose, but still close in locality to its other Data Masses and App/Service Properties, it will be able to grow its business and customer base that much more quickly. It also enables VMforce to store data outside of the construct of ForceDB (Salesforce’s core database), enabling new Adjacent Services with persistence.
The analogy extends further: just as your weight differs from one planet to another, services and applications (compute) have different weights depending on Data Gravity and the Data Mass(es) they are associated with.
Here is a 3D video depicting what I diagrammed at the beginning of the post in 2D.


More on Data Gravity soon (There is a formula in this somewhere)

Where most Enterprise IT Architectures are today

4 Nov

Most Enterprises are architecturally in a rigid and fragile state. This has been caused by years of legacy practices in support of poor code, poor design patterns, and underpowered hardware (which focused on increasing MHz, not parallelism/multi-cores). What follows is a brief review of what has led us here. It is needed background for the follow-on post, which exercises a theory that I’m testing.

Architecture Phase 1 – How SQL and ACID took us down a path
Early on in the Client/Server days even low power x86 servers were expensive. These servers would have an entire stack of software put on them (i.e. DB and Application functions with Clients connecting to the App and accessing the DB). This architecture made the DB the most critical component of the system. The DB needed to ALWAYS be online and needed to have the most rigid transactional consistency possible. This architecture forced a series of processes to be put in place and underlying hardware designs to evolve in support of this architecture.
This legacy brought us the following hardware solutions:

RAID 1 (Disk Mirroring) -> Multi-pathed HBAs connecting SANs with even more Redundancy

Two NIC Cards -> NICs teamed to separate physical Network Switches

Memory Parity -> Mirrored Memory

Multi-Sockets -> FT Based in Lock Step CPUs

All of this was designed to GUARANTEE both Availability and Consistency.
Having Consistency and Availability is expensive and complicated. This also does not take into account ANY Partition tolerance. (See my CAP Theorem post)

Architecture Phase 2 – The Web
Web based architectures in the enterprise contributed to progress with a 3-Tier model where we separated the Web, Application, and Database functionality into separate physical systems. We did this because it made sense. How can you scale a system that has a Web, Application, and Database residing on it? You can’t, so first you break it out and run many web servers with a load balancer in front. Next you get a big powerful server for the Application tier and another (possibly even more highly redundant than the Application tier server) for the Database. All set, right? This is the most common architecture in the enterprise today. It is expensive to implement, expensive to manage, and expensive to maintain, but it is the legacy that developers have given IT to support. The benefit is that there is better scalability and flexibility with this model, and adding virtualization helps further the life of this architecture.

Where is Virtualization in all of this?
Virtualization is the closest Phase 2 could ever really get to the future (aka Phase 3, which is covered in my next post). Virtualization breaks the bond with the physical machines, but not with the applications (and their architectures) that are running on top. This is why IT administrators have had such a need for capabilities in products like VMware ESX and VMware vSphere, such as HA (High Availability), DRS (Distributed Resource Scheduler), and FT (Fault Tolerance). These things are required when you are attempting to keep a system up as close to 100% as possible.


The trend toward Cloud architectures is forcing changes in development practices and coding/application design philosophies.  Cloud architectures are also demanding changes in IT operations and the resulting business needs are creating pressures for capabilities that current/modern IT Architectures can’t provide.

This leads us to what is coming….

CAP Theorem and Clouds

3 Nov

A background on CAP Theorem:

CAP Theorem is firmly anchored in the SOA (Service Oriented Architecture) movement and is showing promise as a way of classifying different types of Cloud Solution Architectures.  What follows is an explanation about CAP Theorem, how it works, and why it is so relevant to anyone looking at Clouds (Public, Private, Hybrid, or otherwise).

Distributed Systems Theory – The CAP Theorem:
CAP Theorem was first presented by Eric Brewer in 2000 (co-founder and Chief Scientist of Inktomi at the time) and was formally proven 2 years later by Seth Gilbert and Nancy Lynch. CAP stands for Consistency, Availability, and Partitioning tolerance. CAP Theorem states that you can only have TWO of the three capabilities in a system. So you can have Consistency and Availability, but then you don’t have Partitioning tolerance. You could have Availability and Partitioning tolerance without rigid Consistency. Finally you could have Consistency and Partitioning tolerance without Availability.

The KEY assumption is that the system needs to persist data and/or has state of some type. If you don’t need either Data persistence or State ANYWHERE, you can get very close to having Consistency, Availability, and Partitioning tolerance simultaneously.

Understanding Consistency, Availability, and Partitioning:

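Consistency is a system’s ability to maintain ACID-like properties of transactions (a common characteristic of modern RDBMS). Strictly speaking, the C in CAP is narrower than the full set of ACID guarantees: it means every read returns the most recent write, no matter which node answers. Another way to think about this is how strict or rigid the system is about maintaining the integrity of reads/writes and ensuring there are no conflicts. In an RDBMS this is typically done through some type of locking.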
Availability is a system’s ability to successfully respond to ALL requests made. Think of data or state information split between two machines: a request is made, machine 1 has some of the data, and machine 2 has the rest. If either machine goes down, not ALL requests can be fulfilled, because not all of the data or state information is available entirely on either machine.
Partitioning is the ability of a system to gracefully handle Network Partitioning events.  A Network Partitioning event occurs when a system is no longer accessible (Think of a network connection failing). A different way of considering Partitioning tolerance is to think of it as message passing.  If an individual system can no longer send/receive messages to/from other systems, it has been effectively “partitioned” out of the network.
A great deal of discussion has occurred over Partitioning and some have argued that it should be instead referred to as Latency.  The idea being that if Latency is high enough, then even if an individual system is able to respond, the individual system will be treated by other systems as if it has been partitioned.
In Practice:

CAp – Think of a traditional Relational Database (e.g. MS SQL, DB2, Oracle 11g, Postgres). If any of these systems lose their connection or experience high latency, they cannot service all requests and therefore are NOT Partitioning tolerant (There are ways to solve this problem, but none are perfect)
cAP – A NoSQL store (e.g. Cassandra, MongoDB, Voldemort). These systems are highly resilient to Network Partitioning (assuming that you have several servers supporting any of these systems) and they offer Availability. This is achieved by giving up a certain amount of Consistency; these solutions follow an Eventual Consistency model.
CaP – This isn’t an attractive option, as your system will not always be available and wouldn’t be incredibly useful, at least in a Cloud environment. An example would be a system in which, if one of the nodes fails, the other nodes can’t respond to requests. Think of a solution that has a head-end where, if the head-end fails, it takes down all of the nodes with it.
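The trade-off between the last two options can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any of the products named above work: `TwoNodeStore`, `Unavailable`, the `"AP"`/`"CP"` mode strings, and the last-write-wins reconciliation are all hypothetical names and choices made for illustration, and a single shared version counter stands in for timestamps.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request rather than risk inconsistency."""


class TwoNodeStore:
    """Toy two-node key/value store. mode is 'AP' or 'CP' (hypothetical names)."""

    def __init__(self, mode):
        if mode not in ("AP", "CP"):
            raise ValueError("mode must be 'AP' or 'CP'")
        self.mode = mode
        self.nodes = [{}, {}]     # per node: key -> (version, value)
        self.partitioned = False  # True while the nodes cannot exchange messages
        self.version = 0          # simplification: stands in for a timestamp

    def _check_reachable(self):
        # A CP store gives up Availability while the peer is unreachable.
        if self.partitioned and self.mode == "CP":
            raise Unavailable("peer unreachable; refusing request")

    def write(self, node, key, value):
        self._check_reachable()
        self.version += 1
        self.nodes[node][key] = (self.version, value)
        if not self.partitioned:
            # Healthy network: replicate to the peer straight away.
            self.nodes[1 - node][key] = (self.version, value)

    def read(self, node, key):
        self._check_reachable()
        entry = self.nodes[node].get(key)
        return None if entry is None else entry[1]

    def heal(self):
        """End the partition and reconcile with last-write-wins."""
        self.partitioned = False
        for key in set(self.nodes[0]) | set(self.nodes[1]):
            newest = max(
                (n[key] for n in self.nodes if key in n),
                key=lambda entry: entry[0],
            )
            for n in self.nodes:
                n[key] = newest


ap = TwoNodeStore("AP")
ap.write(0, "cart", "book")
ap.partitioned = True
ap.write(1, "cart", "book+pen")   # accepted, but node 0 does not see it yet
stale = ap.read(0, "cart")        # node 0 still answers, with the older value
ap.heal()
converged = ap.read(0, "cart")    # after healing, both nodes agree again

cp = TwoNodeStore("CP")
cp.partitioned = True
try:
    cp.write(0, "cart", "book")
    cp_refused = False
except Unavailable:
    cp_refused = True             # the CP store chose Consistency over Availability
```

The AP store keeps answering during the partition and only becomes consistent once the nodes can talk again (Eventual Consistency), while the CP store stays consistent by refusing to answer. The CA case corresponds to assuming `partitioned` never becomes true.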
A Balancing Act
When CAP Theorem is put into practice, it is more of a balancing act where you will not truly leave out C, A, or P. It is a matter of which two of the three the system is closest to (as seen below).
NOTE:  Post to follow tying this more closely to the Cloud coming tomorrow.
In my research I came across a number of great references on CAP Theorem (in most favored order):

The Real Path to Clouds

2 Nov

I’ve been spending a great deal of time as of late researching the background and roots of Cloud Computing in an effort to fully understand it. The goal was to understand what Cloud computing is at all levels, which is quite a tall order. I think I have it figured out and am now looking for the community’s feedback to vet and fully mature my theory.

First a brief review of CAP Theorem, which states that all implemented systems can only successfully focus on two of the three capabilities (Consistency, Availability, and Partitioning tolerance). If you aren’t familiar with CAP Theorem, please check out yesterday’s post and the resources at the bottom of that page.

First some background – Below are the two phases deployed today in >90% of Enterprises.  See “Where most Enterprise IT Architectures are today” for in-depth discussion on Phase 1 and Phase 2.


What follows is the theory:

Architecture Phase 3 – Cloud & The Real Time Web Explosion

The modern web has taken hold and Hyperscale applications have pushed a change in Architecture away from the monolithic-esque 3-Tier that traditional Enterprises still employ to that of a loosely coupled Services Oriented queued/grid like asynchronous design. This change became possible because developers and architects decided that a Rigidly Consistent and Highly Available system wasn’t necessarily required to hold all data used by their applications.

This was brought to the forefront by Amazon when it introduced its Dynamo paper in 2007, where Werner Vogels presented the fact that all of Amazon is not Rigidly Consistent but follows a mix of an Eventual Consistency model on some systems and a Rigidly Consistent model on others. With that, the assumption that everything must depend on a single system running as close to 100% uptime and consistency as possible was finally broken. Since then, we have found that all Hyperscale services follow this model: examples include Facebook, Twitter, Zynga, Amazon AWS, and Google.

Deeper into Phase 3
Why is Phase 3 so different than Phase 1 and 2?
Phase 3 Architecture not only stays up when systems fail, it assumes that systems can and will fail! In Hyperscale Architectures, only specific cases require an RDBMS with Rigid Consistency; the rest of the time an Eventually Consistent model is fine. This move away from requiring systems to be up 100% of the time and maintaining Rigid Consistency (ACID compliance for all you DBAs) lowers not only the complexity of the applications and their architectures, but also the cost of the hardware they are being implemented on.

Moore’s Law is pushing Phase 3
Until around 7 years ago, CPUs were constantly increasing in clock speed to keep up with Moore’s Law. Chip makers then changed their strategy from increasing clock speed to increasing the number of cores operating at the same clock speed. There are only two ways to take advantage of this increase in cores. The first is to use Virtualization to slice up CPU cores and allocate either a small number of cores or even partial cores to an application. The second is to write software that is able to asynchronously use and take advantage of all of the cores available. Currently most enterprise software (Phase 2) is not capable of leveraging CPU resources in the second way described; however, many systems in Phase 3 can.
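As a small illustration of the second approach, the Python sketch below splits a CPU-bound job into chunks and hands them to a pool of worker processes, one per core. The function names and the sum-of-squares workload are assumptions chosen for the example; only the standard library `concurrent.futures` executors are used.

```python
import os
# ThreadPoolExecutor is imported so callers can swap it in for I/O-bound work.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(bounds):
    """CPU-bound work for one chunk: sum i*i for start <= i < stop."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(n, workers=None, executor_cls=ProcessPoolExecutor):
    """Fan the chunks out to a pool of workers and combine the partial sums."""
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split_range(n, workers)))
```

Calling `parallel_sum_of_squares(50_000_000)` spreads the work over every available core. On platforms that start worker processes by spawning a fresh interpreter, make that call from under an `if __name__ == "__main__":` guard. The point is that the software has to be written to be split up like this; nothing speeds up automatically when cores are added.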

How do IT processes fit into this?
Today most Enterprise IT shops have designed all of their processes, tools, scripts, etc. around supporting the applications and architectures mentioned in Phase 1 and Phase 2, NOT Phase 3. Phase 3 requires an entirely different set of processes, tools, etc. because it operates on an entirely different set of assumptions! How many Enterprise IT shops operate with assumptions such as:
  • Replicating Data 3 or more times in Real Time
  • Expecting that 2 or more servers (nodes) in a system can be down (this is acceptable)
  • Expecting that an entire Rack or Site can be down

(These are just a few examples)
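Why replicating data three or more times makes node failures acceptable can be shown with some quick arithmetic. The Python sketch below assumes that nodes fail independently and that any single surviving copy can serve the data; the function names and the 99% figure are hypothetical.

```python
def chance_all_copies_down(node_availability, replicas):
    """Probability that every replica is down at once (independent failures assumed)."""
    return (1.0 - node_availability) ** replicas


def data_availability(node_availability, replicas):
    """Probability that at least one replica is up and can serve the data."""
    return 1.0 - chance_all_copies_down(node_availability, replicas)


# Cheap nodes that are each up only 99% of the time (assumed figure).
for copies in (1, 2, 3):
    print(copies, f"{data_availability(0.99, copies):.6f}")
```

With three copies on 99% nodes, the data stays reachable about 99.9999% of the time even with two nodes down, which is why Phase 3 can accept failures on inexpensive hardware instead of engineering each server to never fail.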

Why will Enterprises be compelled to move to Phase 3 (eventually)?
The short answer is cost savings and operational efficiency; the longer answer is around the reduction in systems complexity. The more complex the hardware and software stack is (think of this as the number of moving parts), the higher the likelihood of a failure. Also, the more complicated the stack of hardware and software becomes, the more difficult it is to manage and maintain. This leads to lower efficiencies and higher costs, at the same time making the environment more brittle to changes (which is why we need ITILv3 in Enterprises today). To gain flexibility and elasticity you have to shed complex systems with hundreds of interdependencies and an operational assumption of all components being up as close to 100% as possible.
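The "moving parts" argument can be made concrete with a back-of-the-envelope calculation. The Python sketch below assumes a stack in which every component must be up for the system to work and components fail independently; the function names and the 99.9% figure are assumptions for illustration.

```python
HOURS_PER_YEAR = 8_760


def stack_availability(component_availability, components):
    """Availability of a stack where ALL components must be up (independent failures assumed)."""
    return component_availability ** components


def downtime_hours_per_year(availability):
    """Expected hours per year that the stack is unavailable."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Each component is individually up 99.9% of the time (assumed figure).
small_stack = stack_availability(0.999, 10)
large_stack = stack_availability(0.999, 100)

print(f"10 parts:  {small_stack:.4f}, {downtime_hours_per_year(small_stack):.0f} h/year down")
print(f"100 parts: {large_stack:.4f}, {downtime_hours_per_year(large_stack):.0f} h/year down")
```

Ten interdependent parts at 99.9% each give a stack that is up about 99.0% of the time. A hundred such parts drop it to roughly 90.5%, even though no individual component got any worse.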


Different systems have different requirements. Most systems do not in fact need an ACID compliant consistency model (even though today most have been developed around one). Some specific cases need ACID properties maintained and a level of 100% Consistency, but these are in the minority (whether in the Cloud or in the Enterprise). This is causing a split in data models between the current CA (Consistency and Availability – “Traditional” Enterprise Applications) and AP (Availability and Partitioning tolerance – many Cloud Applications). A combination of these is the long term answer for the Enterprise and will be what IT must learn to support (both philosophically and operationally).

Follow me on Twitter if you want to discuss @mccrory

More on VMware’s Ruby Plans

9 Oct

Update: I got a response on Twitter from Ezra Zygmuntowicz, see below.

Original Post Begins Here:

Following yesterday’s post looking into Ruby’s creator Yukihiro Matsumoto’s visit to VMware’s Headquarters, I have gotten new information.

VMware has hired former EngineYard Co-Founder & Software Architect and Ruby Developer Ezra Zygmuntowicz to work on Ruby for VMware (Some type of project is underway). According to this source, there are rumors of a Ruby based Cloud Controller of some type being worked on. This may begin to resolve several of the bits of information from yesterday regarding VMware’s interest in EngineYard and Ruby. There is also further confirmation that Ruby is getting a lot of attention inside of VMware and the company is looking at joining Ruby organizations (I believe in an attempt to begin building credibility inside the Ruby community).
Note that I am referring to Ruby broadly; the specific Ruby derivative could easily be JRuby vs. pure Ruby.

VMware looking to Ruby?

8 Oct

I have been following many different trends in Cloud as of late, including:

  • Virtualization (VMware ESX, Citrix XenServer, RedHat KVM, and Microsoft Hyper-V)
  • Programming & Scripting Languages and Frameworks (Ruby, Java, Groovy, PowerShell, Scala, Erlang, Akka, Clojure, to name a few)
  • DevOps
  • Cloud Services (They are too numerous to mention in this post, but I will create a list soon)

It appears that VMware is looking at Ruby (or at least they are talking to Yukihiro Matsumoto, @yukihiro_matz on Twitter) as seen in this FourSquare Checkin screenshot below:

Based on this I did some googling and here is what I have found so far:

  • This The Register Article talks about VMware looking to both Ruby and PHP as part of their Cloud Application Platform.
  • There was also a rumor a few months ago on GigaOM about VMware acquiring EngineYard (There hasn’t been anything announced or said since).
  • An article in the SDTimes about EngineYard using Terremark as a platform for Ruby with VMware as the middle layer and calling it xCloud.
  • And this article from Steve Jin talks about automating vSphere using JRuby
  • Finally this link on The Ruby Reflector site has stories about both New Relic (story here) and Opscode and mentions both Ruby and VMware (Opscode mentioning other Cloud Companies using their Chef solution as well)

So what is the takeaway from all of this?
VMware appears to be looking at Ruby or JRuby as a possible 3rd major programming language to bring under its wing. More interesting is that Ruby/JRuby could be used to weave Cloud Services and Applications together and automate them on the backend. This can be done with Java, but the learning curve for an administrator would be high compared to leveraging Ruby (especially if the admin already understands PowerShell).

On a side note – I’ve become very active on Twitter, feel free to follow me for realtime info. @mccrory

VMforce is Java / SpringSource Apps running on vSphere in Datacenters (PaaS)

27 Apr

The news broke just a little while ago with a few blog posts. One by Stephen Herrod (VMware’s CTO) on the basic strategy of why VMware is becoming an “Open” PaaS provider.

Spring Components and Object in the Cloud

Another by SpringSource’s Rod Johnson (founder) around the architecture and the mention of leveraging the Database (both for existing data such as credentials and account data, and for storing new data from new applications.) also has one blog post accessible, while a bit more short on real content, does purport the ability for developers to see a 5X increase in productivity! There is supposed to be an additional post, however it is not “online” yet.

Is this revolutionary? Not in my opinion; it is more evolutionary. VMware wants to compete with the likes of Microsoft and Google. Microsoft appears to be directly in their sights with this, as it would seem to mirror the Microsoft Azure strategy and platform pretty well. Microsoft does have Java bindings to talk to Azure. I wonder if VMware will have .NET bindings to talk to their PaaS offering?

Update 1: In thinking this through, I came to a realization. When is Oracle going to make their PaaS move? They have all of the components necessary to make a PaaS solution. Think about it, with Sun they have Java, Glassfish, etc. and with Oracle they have Oracle and MySQL DB (hence the software components). Hardware wise, they have the Sun equipment for the hardware and both Sun and Oracle have Datacenters. The only question that remains in my mind is WHEN will this happen?

Update 2: The money for Spring/VMware is when you want to move out of the PaaS Cloud and into vClouds, local vSphere, or elsewhere. You will need Licenses and Support for Spring and VMware!!!! On the side, if you put any reliance on the DB, you will have to figure out how to work with it outside of the PaaS solution. Likely it will be more performant locally than remotely, right?

Update 3: VMforce is due to be available in a developer preview in the second half of 2010, with general availability anticipated either later this year or in 2011. 2011? Really? That is a LONG time from this announcement in April!