Policies Rule Cloud and Datacenter Operations – Cloud 2.0

Trust but verify – A new way to think about Cloud Management

Cloud management platforms (CMPs) are widely used to manage cloud servers and applications and have been adopted by small and large enterprises alike. In datacenter (DC) management, decades of practice have produced a sprawl of systems management tools. The common wisdom in both models is to control access at the gates, through CMPs or DC tools, much as forts were once protected by moats and guarded gates.

However, with the increasing focus on agility and on delivering business value to customers faster, developers and application release teams need far more flexibility in working with the cloud than previously imagined. Developers want full control over the tools and APIs they use to interact with the cloud, rather than being stopped at the gates and directed through a single prescribed entrance. Application owners want to allow this freedom but still want cloud workloads to be managed, compliant, secure and optimized. This drive for freedom and agility is leading to a reimagined Cloud 2.0, which does not stop you at the gates but lets you in while continuously checking policies to ensure that you behave well in the cloud.

The ability to create and apply policies will play a key role in this emerging model of governance, where freedom is tied to responsibility. We believe that the next-generation cloud operational plane will shape how workloads are deployed, operated, changed, secured and monitored in clouds. Enterprises should embrace policies at all stages of the software development lifecycle and of operations, for datacenters in the cloud and on-prem. Defining and evaluating policies, and taking corrective action based on them, will be a strategic enabler for all enterprises in the new Cloud 2.0 world.

Defining Cloud Operational Plane

In this new cloud management world, you are not stopped at gates but checked continuously. Trust but verify is the new principle of governance in Cloud 2.0.  Now, let us review the 5 key areas for a cloud operational plane and how policies will play a critical role in governance.

  • Provisioning and deployment of cloud workloads
    • Are my developers or app teams provisioning the right instance types?
    • Is each app team staying within its allocated quota of cloud resources?
    • Is the workload or configuration change being deployed secure and compliant?
    • How many pushes are happening per hour, per day and per week?
    • Are any failing, and why?
  • Configuration changes
    • Is this change approved?
    • Is it secure and compliant?
    • Tell me about all the changes happening in my cloud.
    • Can I audit these changes to know who did what, and when?
    • How can I apply changes to my cloud configurations and resources, upgrade to new machine images, etc.?
  • Security and compliance
    • Continuously verify that my cloud is secure and compliant
    • Alert me to security or compliance events instantly or on a daily/weekly basis
    • Remediate these automatically or with my approval
  • Optimization
    • Are my resources being used optimally? Do they have the right capacity? Do I have scaling where I need it?
    • Showback of my resources
    • Tell me where I am wasting resources.
    • Tell me how I can cut costs and waste.
  • Monitoring, state and health
    • Is my cloud workload healthy?
    • Tell me the key monitoring events and the unhealthy events.
    • Remediate these automatically or with my approval
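
To make the first area concrete, here is a minimal Python sketch of a continuous check that answers two of the provisioning questions above: are teams using approved instance types, and is each team within its quota? The allowed types, team names, quotas and the `Instance` record are all assumptions made up for illustration; a real check would read them from your cloud inventory and policy store.

```python
from dataclasses import dataclass

# Hypothetical policy values, chosen only for illustration.
ALLOWED_INSTANCE_TYPES = {"t2.micro", "t2.small", "m4.large"}
TEAM_QUOTA = {"payments": 3, "search": 2}  # max instances per app team


@dataclass
class Instance:
    instance_id: str
    team: str
    instance_type: str


def check_provisioning(instances):
    """Return human-readable findings for one inventory snapshot."""
    findings = []
    counts = {}
    for inst in instances:
        counts[inst.team] = counts.get(inst.team, 0) + 1
        if inst.instance_type not in ALLOWED_INSTANCE_TYPES:
            findings.append(
                f"{inst.instance_id}: instance type {inst.instance_type} is not allowed"
            )
    # Teams without a configured quota are treated as having a quota of 0.
    for team, used in sorted(counts.items()):
        quota = TEAM_QUOTA.get(team, 0)
        if used > quota:
            findings.append(f"team {team}: {used} instances exceeds quota of {quota}")
    return findings
```

Nothing here blocks a deployment. The check runs against whatever is already provisioned and reports findings, which is the "checked continuously" half of trust but verify.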

How the Cloud Operational Plane Can Be Enabled through Policies

The following table compares old-world and new-world cloud management.  In the old world of cloud management platforms (CMPs), we block without trust.  In the new world, the gates are open, so policies become the central tenet of cloud operations.  This is the cloud operational plane (COP).

| Area | CMP: Block without Trust | COP: Trust but Verify | Recommended Practices |
|---|---|---|---|
| Deployment | | | |
| Deployment to multi-cloud | Single API across all clouds, forced to use this; catalog-driven provisioning | Various tools, no single point of control; no single API; no single tool; use the best API/tool for each cloud; no catalog | DevOps: your choice |
| Manage/start/stop your resources | Single tool | Various tools, no single point of control | DevOps/cloud tool: your choice |
| DevOps continuous deployment | Hard to integrate; API of CMP is a hindrance to adoption | Embraces this flexibility; allows changes through any toolset | Policies for DevOps process for compliance |
| Config Changes | | | |
| Unapproved config changes | Block if not approved | Allow usually, or block if more control desired | Change policies |
| Config changes API | Single API | No single API | DevOps tool |
| Audit config changes | Yes | Yes | Audit: capture all changes |
| Rollback changes | No | Yes, advanced tools for Blue-Green, Canary etc. | DevOps tool |
| Change monitoring | No | Yes | Change monitoring |
| Change security | No | Yes | Policy for change compliance/security |
| Security & Compliance | | | |
| Security in DevOps process | N/A | Yes | Policy for DevOps security |
| Monitor, scan for issues, get notified | N/A | Continuously monitor for compliance & security | Multi-tool integrations |
| Prioritize issues | N/A | Yes, multiple manual and automated prioritization | Policy-based prioritization |
| Security and compliance of middleware and databases | N/A | Yes | Compliance and security policies for middleware and databases |
| Optimization | | | |
| Quota & decommissioning | Block deployment if out of quota; decommission on lease expiry | Allow but notify, or remove later with resource usage policies; decommission on lease expiry | Policies for quota and decommissioning |
| Optimization | N/A | Yes | Policies for optimization and control |
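
The quota row of the table is a good place to see the difference between the two models in code. The Python sketch below is only an illustration: the function names, the numeric quota and the `notify` callback are assumptions, not part of any CMP or COP product.

```python
def cmp_gate(requested, used, quota):
    """Old model: refuse the deployment up front if it would exceed quota."""
    if used + requested > quota:
        raise PermissionError("deployment blocked: quota exceeded")
    return used + requested


def cop_verify(requested, used, quota, notify):
    """New model: let the deployment through, then flag any overage."""
    new_used = used + requested
    if new_used > quota:
        # The workload is already running; the policy result drives follow-up.
        notify(f"quota exceeded by {new_used - quota}; flagged for follow-up")
    return new_used
```

With `used=8` and `quota=10`, a request for 3 more instances raises an error under `cmp_gate`, while `cop_verify` lets it through and hands a message to `notify`, which a resource usage policy could later act on.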

Policies in Enterprises

As enterprises move into a world of freedom and agility with Cloud and DevOps, it becomes increasingly important to use policies to manage cloud operations.  The diagram below illustrates how policies can be used to manage everything from the DevOps process to on-prem and cloud environments, production environments, cloud infrastructure, applications, servers, middleware and databases.

Figure: Policies rule the world

For agile DevOps, policy checks can be embedded early in the process (shift left), or wherever else they are needed, to catch compliance, security or cost violations in source code and libraries. For example, consider a DevOps process starting with a continuous integration (CI) tool such as Jenkins®. Developers and release managers can trigger OWASP (Open Web Application Security Project) checks to scan the libraries used by the source code and block the pipeline if any insecure libraries are found.
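
As a rough sketch of such a pipeline gate, the Python below takes a scan report and decides whether the build should fail. The report shape (a list of entries with `library`, `version` and `severity`) and the severity ranking are assumptions made for this example; they are not the output format of any particular OWASP tool, so a real pipeline would first convert the scanner's own report into this form.

```python
# Severity ranking used by this sketch; not taken from any specific scanner.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def blocking_findings(report, fail_on="high"):
    """Return the findings at or above the `fail_on` severity."""
    threshold = SEVERITY_RANK[fail_on]
    return [
        finding for finding in report
        if SEVERITY_RANK.get(finding["severity"], 0) >= threshold
    ]


def pipeline_exit_code(report, fail_on="high"):
    """Print blocked libraries and return 1 if the build should fail, else 0."""
    blocked = blocking_findings(report, fail_on)
    for finding in blocked:
        print(f"BLOCKED: {finding['library']} {finding['version']} ({finding['severity']})")
    # CI tools generally treat a non-zero exit code as a failed step.
    return 1 if blocked else 0
```

A CI step could pass the result to `sys.exit()` so that an insecure library stops the pipeline, while lower-severity findings are allowed through and reported.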

Production environments have applications consisting of servers, middleware, databases and networks hosted in clouds such as AWS and Azure.  All these need to be governed by policies as shown above.  For example, RHEL servers in cloud are governed by 4 policies – cost control, patch policy, compliance policy and a vulnerability remediation policy.  Similarly there are security, compliance, scale and cost policies for other cloud resources such as databases and middleware.  Finally, the production environment itself is governed by change, access control and DR policies.

In Cloud 2.0, all of these policies will be expressed as code.  Policy as code can be written in a format such as JSON or YAML.  Sample policies include:

  • If an S3 bucket is open to the public, then it is non-compliant.
  • If a firewall security group is open to the public, then it is non-compliant.
  • If the environment is DEV and the instance type is m4.xlarge, then the environment is non-compliant.
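
One way these three rules might look as code is sketched below: the policies are plain JSON data, and a small Python function evaluates a resource against them. The schema (`id`, `resource_type`, `deny_when`) and the resource attribute names are assumptions invented for this example rather than the format of any particular policy engine. For simplicity, the third rule flags the offending instance rather than the whole environment.

```python
import json

# Hypothetical policy schema: a resource of `resource_type` is non-compliant
# when every attribute listed under `deny_when` matches.
POLICIES_JSON = """
[
  {"id": "s3-no-public", "resource_type": "s3_bucket",
   "deny_when": {"public": true}},
  {"id": "sg-no-public", "resource_type": "security_group",
   "deny_when": {"open_to_public": true}},
  {"id": "dev-no-m4-xlarge", "resource_type": "instance",
   "deny_when": {"environment": "DEV", "instance_type": "m4.xlarge"}}
]
"""


def evaluate(resource, policies):
    """Return the ids of the policies that `resource` violates."""
    violations = []
    for policy in policies:
        if policy["resource_type"] != resource.get("type"):
            continue
        conditions = policy["deny_when"]
        if all(resource.get(key) == value for key, value in conditions.items()):
            violations.append(policy["id"])
    return violations


policies = json.loads(POLICIES_JSON)
```

Because the policies are data, they can be stored, versioned and updated centrally without touching the evaluator. For example, `evaluate({"type": "s3_bucket", "public": True}, policies)` reports a violation of `s3-no-public`, while the same bucket with `"public": False` reports none.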

Using policy-as-code will ensure that these policies are created, evaluated, managed and updated in a central place and all through APIs.  Additionally, enterprises will choose to remediate resources and processes on violation of certain policies to ensure that cost, security, compliance and changes are governed.
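
Remediation can be sketched in the same spirit. The Python below maps a violated policy to a corrective action and records whether that action may run automatically or must wait for approval. The policy ids, the actions and the in-memory resource dictionaries are all hypothetical; a real remediation would call the cloud provider's APIs instead of editing a dictionary.

```python
def make_private(resource):
    """Hypothetical remediation: mark a bucket as private (in memory only)."""
    resource["public"] = False


def downsize_instance(resource):
    """Hypothetical remediation: switch to a smaller instance type."""
    resource["instance_type"] = "m4.large"


# Hypothetical remediation table: policy id -> (action, needs_approval).
REMEDIATIONS = {
    "s3-no-public": (make_private, False),
    "dev-no-m4-xlarge": (downsize_instance, True),
}


def remediate(resource, violated_policy_id, approved=False):
    """Apply the remediation if allowed and return a short status string."""
    entry = REMEDIATIONS.get(violated_policy_id)
    if entry is None:
        return "no remediation defined"
    action, needs_approval = entry
    if needs_approval and not approved:
        return "awaiting approval"
    action(resource)
    return "remediated"
```

Keeping the approval flag next to the action lets an enterprise decide, policy by policy, which violations are corrected automatically and which wait for a human decision.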

Recommendations

Cloud management is changing from a “block on entry” model to a “trust but verify” model.  Enterprises that wish to govern with absolute control at the gates will continue to use cloud management platforms extensively and effectively.  However, many enterprises are beginning to move to a new Cloud 2.0 model in which the agility and flexibility of DevOps tools and processes are critical to their success.  Instead of prescribing a single entry choke point or a single “CMP tool” for working with the cloud, this model lets everybody in with their own tools and processes, while continuously verifying that policies for deployment, resource usage, quota, cost, security, compliance and changes are tracked, monitored and enforced. Simple, effective, API-based policy-as-code definition, management, evaluation and remediation will be a central capability that enterprises need to run new clouds effectively.

Full disclosure:  I work for BMC Software, and my team has built a cloud-native policy SaaS service. Check out the 2-minute video here: https://www.youtube.com/watch?v=hSFP5-kzbT0

Acknowledgement: A few of my colleagues at work, JT and Daniel, proposed the analogy of forts and the cloud operational plane, which I find fascinating and cool.  This motivated me to write this blog post to show how cloud management itself is evolving from a guard-at-the-gates model to a trust-but-verify model.

 

 

 

 
