Tuesday, August 2, 2011

Enterprise Cloud Governance: Policies and Metamodels



The Law
James Urquhart wrote a good piece for CNET yesterday, titled “Regulation, Automation, and Cloud Computing.” In it, James comments on a blog post by Chris Hoff discussing some of the downsides to automation. Chris had pointed out that heavily automated environments don’t leave much room for human intervention when things go wrong, and that rapid automatic response can actually lead to cascading failure when the world fails in a way the automation’s creator did not expect. James then made the point that automation also interacts with the legal and regulatory spheres. James says:
If we are changing the very configuration of our applications–including location, vendors supplying service, even security technologies applied to our requirements–how the heck are we going to assure that we don’t start breaking laws or running afoul of our compliance agreements?
 
It wouldn’t be such a big deal if we could just build the law and compliance regulations into our automated environment, but I want you to stop and think about that for a second. Not only do laws and regulations change on an almost daily basis (though any given law or regulation might change occasionally), but there are so many of them that it is difficult to know which rules to apply to which systems for any given action.
 
In fact, I long ago figured out that we will never codify into automation the laws required to keep IT systems legal and compliant. Not all of them, anyway. This is precisely because humanity has built a huge (and highly paid) professional class to test and stretch the boundaries of those same rules every day: the legal profession.
Chris is right.
James is right insofar as he identifies the problem and then says that it’s impossible to codify every single law and regulation into the automation system.
But while we can’t codify everything, that isn’t an argument to avoid codifying anything.
The basic problem is that with cloud, we’re no longer building control systems strictly for IT operations personnel. I believe that the whole BIG IDEA with clouds is that we can decentralize and democratize the control systems that drive IT resources. Right now, the IT department controls all IT systems. You want something done? You talk to IT. If and when IT can get around to it, you might get what you want. And ultimately, that’s a slow, inefficient way to run a railroad. There are many ideas that business units have that simply can’t be executed on because the amount of time and energy spent trying to get IT to deliver the right resources is too high. But with that slow inefficiency also comes a control point such that we can enforce enterprise governance requirements. Today, there are enough human review and approval processes in place to put the brakes on most ill-conceived ideas that would violate laws or regulations.
With cloud, however, we have the opportunity to make IT completely self-service. And that’s wonderful for creating increased business value because it means that business units no longer have to beg and plead with the IT department to execute on projects that are important to the business. Rather, the business can make use of self-service resources to do whatever they need. By cutting out the IT middleman from the daily requests, the speed of the solution delivery lifecycle (SDLC) increases, and, if the business is doing its job, so does business value creation.
The challenge with the self-service model is not technical. We can build all the automated systems to execute a self-service model fairly easily, and there are many examples. The big problem with self-service is governance.
If you’re running a large, multinational financial institution of the kind that ServiceMesh deals with every day, is it reasonable to expect every business-unit developer or mid-level manager in the USA to understand all the laws governing financial information in Germany or Hong Kong? Do users and developers in London understand the laws and regulations in Tokyo? The answer is most assuredly not. But with a single click, we could move a workload or dataset across the planet, violating the laws of multiple jurisdictions at the same time.
So, James says that it’s unreasonable to expect to codify the legal system into our automation systems. But it’s equally unreasonable to expect non-lawyers (and frankly even lawyers) to understand the legal and regulatory posture of a company across all its geographies. So, what can we do?
Do we really have to achieve 100% fidelity between automated infrastructure and a constantly changing legal structure? And if we can’t, does that mean that any attempt at control is inevitably fruitless and should not even be attempted?
I don’t believe so. The ServiceMesh Agility Platform was constructed with a very rich policy management system that goes far beyond simple user-based or role-based access control to individual resources. The Agility Platform policy management system was created to allow layering of multiple, possibly conflicting policies, created by a diverse group of governance people. The policies are sorted out and prioritized, and the right things happen. The policy management system operates on a customizable metamodel which allows every high-level object type within the Agility Platform (applications, stacks, scripts, clouds, etc.) to be tagged with attributes that can then be inspected as part of policy decisions.
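To make the layering idea concrete, here is a minimal sketch in Python of how conflicting policies might be sorted out by priority. This is not the Agility Platform's implementation or API. The `Policy` class, the `decide` function, the attribute names, and the rules "higher priority wins", "deny wins a tie", and "deny when nothing matches" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A hypothetical layered policy; not an Agility Platform type."""
    name: str
    priority: int   # assumption: a larger number takes precedence
    effect: str     # "allow" or "deny"
    match: dict     # attribute values a request must carry for the policy to apply


def matches(policy, request):
    """A policy applies only if every attribute it names has the required value."""
    return all(request.get(key) == value for key, value in policy.match.items())


def decide(policies, request):
    """Return (effect, name of the deciding policy).

    Assumed conflict rules: the highest-priority applicable policy wins,
    a deny beats an allow at equal priority, and a request that no policy
    covers is denied.
    """
    applicable = [p for p in policies if matches(p, request)]
    if not applicable:
        return ("deny", None)
    winner = min(applicable, key=lambda p: (-p.priority, p.effect != "deny"))
    return (winner.effect, winner.name)


# Two policies written by different groups that disagree about production.
policies = [
    Policy("team-default", 10, "allow", {"user": "bob", "cloud": "A"}),
    Policy("compliance-freeze", 50, "deny", {"cloud": "A", "environment": "production"}),
]

prod = decide(policies, {"user": "bob", "cloud": "A", "environment": "production"})
test = decide(policies, {"user": "bob", "cloud": "A", "environment": "test"})
other = decide(policies, {"user": "alice", "cloud": "B", "environment": "test"})

print(prod)   # ('deny', 'compliance-freeze')
print(test)   # ('allow', 'team-default')
print(other)  # ('deny', None)
```

The point of the sketch is only that the team's policy and the compliance group's policy can be written independently, with the conflict resolved at decision time rather than by whoever edited the rules last.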
Thus, we can create policies as rich as something like, “Bob is allowed to deploy workload X into Cloud Y. But because X requires SSAE 16 (the follow-on to SAS 70), X can only be deployed into datacenter Z, which has SSAE 16 certification. And all network traffic to and from the workload must be encrypted. And all storage must be encrypted. And only into the non-production environment. And only on Tuesday.” And even more complex than that. Or a lot simpler than that. If you want, you can just specify that Bob is only allowed to deploy things in Cloud A and be done with it.
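As a rough illustration of the Bob example, the sketch below writes that policy as plain data and checks a deployment request against tagged objects. Everything here is hypothetical: the dictionary keys, the tag names, datacenter "W", and the `violations` function are stand-ins I made up, not the Agility Platform's actual metamodel or policy format.

```python
# Hypothetical tagged objects, standing in for metamodel attributes.
workload_x = {"name": "X", "requires_certifications": {"SSAE16"}}
datacenter_z = {"name": "Z", "cloud": "Y", "certifications": {"SSAE16"}}
datacenter_w = {"name": "W", "cloud": "Y", "certifications": set()}

# The policy from the text, expressed as data rather than as script logic.
bob_policy = {
    "user": "bob",
    "workload": "X",
    "cloud": "Y",
    "require_network_encryption": True,
    "require_storage_encryption": True,
    "environments": {"non-production"},
    "weekdays": {"Tuesday"},
}


def violations(policy, request):
    """Return the reasons a request breaks the policy; an empty list means it passes."""
    problems = []
    workload, datacenter = request["workload"], request["datacenter"]
    if request["user"] != policy["user"]:
        problems.append("user not covered by this policy")
    if workload["name"] != policy["workload"]:
        problems.append("workload not covered by this policy")
    if datacenter["cloud"] != policy["cloud"]:
        problems.append("target cloud not permitted")
    # Every certification the workload requires must be held by the datacenter.
    missing = workload["requires_certifications"] - datacenter["certifications"]
    if missing:
        problems.append("datacenter lacks: " + ", ".join(sorted(missing)))
    if policy["require_network_encryption"] and not request["network_encrypted"]:
        problems.append("network traffic must be encrypted")
    if policy["require_storage_encryption"] and not request["storage_encrypted"]:
        problems.append("storage must be encrypted")
    if request["environment"] not in policy["environments"]:
        problems.append("environment not permitted")
    if request["weekday"] not in policy["weekdays"]:
        problems.append("day of week not permitted")
    return problems


good_request = {
    "user": "bob", "workload": workload_x, "datacenter": datacenter_z,
    "environment": "non-production", "weekday": "Tuesday",
    "network_encrypted": True, "storage_encrypted": True,
}
# Same request, but aimed at an uncertified datacenter on the wrong day.
bad_request = dict(good_request, datacenter=datacenter_w, weekday="Wednesday")

print(violations(bob_policy, good_request))  # []
print(violations(bob_policy, bad_request))
```

Returning the list of reasons, rather than a bare yes or no, is a design choice in the sketch: a self-service user who gets refused can see which rule stopped them without needing to understand the regulation behind it.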
In short, almost anything can be expressed in the Agility Platform policy system — it’s that rich. And that’s critically important when, as James says, you’re trying to track the whims of lawyers across the world.
[Image: Agility Platform policy editor]
It’s another matter keeping all those policies up to date, however. James points out that the laws are constantly changing. That’s one reason it would be foolish to hard-code them into the automation system itself, whether that’s a standard management system, an orchestration package built around low-level run-book automation, or a Perl script. With the Agility Platform, we made policies stackable and easily editable by mere mortals (AKA governance and compliance personnel) with a WYSIWYG graphical editor, rather than relying on coders. This means that the job of creating and maintaining policies can be delegated and distributed to the people who are in the best position to implement them. Policies are then checked at the appropriate times by the platform, automatically.
Is this a perfect solution? No. James is right that the problem is hard, and I can’t conceive of a 100% solution. We still rely on humans to codify laws and regulations, and those policies must be kept up to date and applied correctly. But we’re not creating a brittle, completely unmaintainable system where the policies are “baked into” our scripting. We have a system where policies are stacked and interact correctly. In short, it’s built to scale and is about as clean a system as I can imagine.