Running Docker containers on Swarm, Mesos and Kubernetes Clusters

We believe that the next generation of cloud and datacenter will be based on cluster managers such as Kubernetes and Mesos, which some call the “Datacenter operating system”.  These cluster architectures can support not only big data analytics and real-time frameworks like Hadoop and Storm, but also Docker containers and many other types of application workloads.

Mesos is a cluster management framework heavily used by Twitter, Airbnb and Google.  Kubernetes is a cluster manager based on Google’s internal cluster management systems (such as Omega), which Google has used for over a decade to manage its Linux containers.  Although large-scale web and social internet companies have run these infrastructures for years, these architectures are only now slowly finding their way into traditional enterprises.  I ran two experiments to get familiar with these technologies and convince myself that running Docker container workloads is truly as easy as it sounds.

I was able to successfully deploy Mesos, Marathon and ZooKeeper to build a datacenter OS on Amazon EC2 and provision several Docker containers on it through a UI and a REST API.  A couple of hours on a lazy Saturday was enough to get this done.  I also set up an Amazon ECS cluster.  Once these two clusters were ready, it was very easy to provision containers to either of them.
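
To give a flavor of how simple this REST API is, below is a minimal sketch of launching a Docker container through Marathon’s /v2/apps endpoint; the Marathon URL and the nginx image are assumptions for illustration:

```python
# A minimal sketch of launching a Docker container via Marathon's REST API.
# The Marathon URL and the nginx image are illustrative assumptions.
import requests

MARATHON_URL = "http://marathon.example.com:8080"  # assumed Marathon endpoint

app_definition = {
    "id": "/demo/nginx",
    "cpus": 0.25,
    "mem": 128,
    "instances": 2,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "nginx:latest",
            "network": "BRIDGE",
            # hostPort 0 lets Mesos assign a free port on the agent
            "portMappings": [{"containerPort": 80, "hostPort": 0, "protocol": "tcp"}],
        },
    },
}

resp = requests.post(f"{MARATHON_URL}/v2/apps", json=app_definition, timeout=30)
resp.raise_for_status()
print("Deployment accepted:", resp.json()["id"])
```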

In a later post, I will share the instructions to get this done.  All you need is an AWS account.  It is really exhilarating to see that, without IT, I can deploy large complex infrastructures and push Docker workloads onto them so easily.  This is clearly going to be the wave of the future.

Best practices for containerizing your application – Part-I

We have had a team dockerizing a few of our applications over the past few months. As we went through this process, we collected a set of challenges and best practices in using Docker.  We don’t yet have all the answers, but at least we know the questions and issues that any application team will face when converting a traditional application into Docker containers.  Broadly, we have grouped our best practices into three categories:

  1. Design containers for your apps
  2. Getting the DevOps pipeline ready
  3. Running containerized apps in operations (production)

We will cover each of these in three parts.

1. Design your containerized App

Break your app down

One of the first steps is to understand the overall architecture of your application: its tiers, complexity, stateful or stateless nature, datastore dependencies, and so on.  It is also useful to understand the running state of the application, such as the number of processes and how distributed it is.  Based on this information, you can decide whether to break a monolithic application into components, each of which becomes a container, or to keep the monolithic application as one container.  Multiple containers increase the complexity of the solution, since communication/links between the containers must be designed.  On the other hand, one big monolithic container is also difficult to work with if it becomes huge (multiple GB), and replacing components then requires a full forklift upgrade in which you lose the benefits of containers.  The rest of this discussion applies to each component container.

Base image selection – standardize and keep it updated

It is important to standardize on a base image, or a small set of base images, for your containers.  These can be RHEL, Ubuntu or other operating system images.  Traditionally these are maintained by an infrastructure team, but they can be maintained by the apps team as well.  Note that these images will go through a number of compliance checks and may need to be patched and rebuilt due to vulnerabilities, so they require careful selection and attention throughout their lifecycle to keep them updated.  Finally, standardization is key so that multiple application teams use a common set of base images.  This helps enterprises simplify day 2 operations and management of these images.

Configuration and Secrets

Configuration and secrets should never be burned into a container (Docker) image and must always be externalized.  There are a number of ways to achieve this, such as environment variables, scripts with Chef/Puppet, and tools such as ZooKeeper and Consul.  Secrets also require a store outside the container ecosystem, especially in production, such as HashiCorp Vault, Keywhiz, or an HSM.  Externalizing configuration ensures that when the containers are provisioned in DEV, QA or PROD, each environment gets its own set of environment variables and secrets.
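
As a minimal sketch of the environment-variable approach, a containerized app (Python here, with illustrative variable names) might read all of its settings at startup and fail fast if a secret was not injected:

```python
# Read externalized configuration at container startup.
# Variable names are illustrative, not from any specific application.
import os
import sys

def require_env(name: str) -> str:
    """Fail fast if a required setting was not injected into the container."""
    value = os.environ.get(name)
    if value is None:
        sys.exit(f"Missing required environment variable: {name}")
    return value

# Non-secret configuration: defaults may differ across DEV/QA/PROD
DB_HOST = os.environ.get("DB_HOST", "localhost")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")

# Secrets: never defaulted, never baked into the image; injected at run
# time from an external store (e.g. Vault) or by the orchestrator
DB_PASSWORD = require_env("DB_PASSWORD")
```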

Datastores and databases

Stateful containers such as databases and datastores require careful analysis of whether they should be containerized at all.  It is alright to keep them non-containerized in the initial releases of the application.  Docker does offer data containers and mounted volumes for this purpose, but we haven’t investigated these further.

Once you have made a few of these critical decisions and have a plan to containerize your application, you are on your way.  In part II, we will cover how to build the DevOps pipeline and best practices for tagging builds and testing images.  Finally, in part III, we will cover best practices for running containerized applications in production.

Can OpenStack be used without a Cloud Management Platform? Five Challenges that you would face if you did just that.

Many customers are attracted by the fact that OpenStack is freely available and wonder whether they need a cloud management platform (CMP), such as BMC CLM, at all. However, market experience and a number of cloud false starts have shown that cloud computing—at least the successful kind—is not easy. Heterogeneous infrastructure and multiple platforms are the reality for most enterprises today and, combined with increasing levels of IT security threats, make cloud management a complex and sometimes daunting task. Enterprises struggle to manage this complexity with OpenStack alone. To explain why, we analyze five challenges of running a private cloud on OpenStack without an accompanying CMP such as CLM.
Challenge 1: Breadth of functionality
Building a cloud solution takes more than just the technical infrastructure functions and management tools that OpenStack provides. It involves implementing a set of business, architectural, and functional requirements that OpenStack usually lacks.

Challenge 2: When is “free” really free?
Although OpenStack is marketed as free software, industry experience so far has been quite the contrary: there are hidden costs to implement, operate, and support OpenStack. There is growing agreement among customers that a skilled engineering team is needed to develop missing capabilities and then customize, integrate, and maintain OpenStack to make it usable in the enterprise. Most deployments require five to ten engineers for development, customization, integration, and operations. The development team typically enhances OpenStack with needed cloud management capabilities such as governance, UI enhancements, compliance, automation, and policies. With BMC CLM, this additional development effort would not be needed. Of course, both BMC CLM and OpenStack require integration with enterprise systems as well as day-to-day operations.

Challenge 3: Depth of functionality
Governance, policies, and pooling of resources
OpenStack does not have deep and flexible functionality in governance, policies, and pooling of resources into higher-level logical constructs, such as logical data centers and configurable, user-extensible policies to map workloads to logical data centers. BMC CLM offers an extensive mechanism to group resources into pools and logical data centers, mark them as shared or private, and apply flexible, configurable policies for workload placement based on tenants, tags, or custom workflows.  It also has deep governance, ranging from reclamation of resources and quota management (a capability OpenStack does have) to change management and CMDB integration.

Platform support
Even though OpenStack has good breadth of platform support, the deep functionality required for enterprise cloud management is often lacking in many of the drivers. OpenStack Nova provides full support for KVM/QEMU but limited support for Microsoft Hyper-V, Citrix XenServer, and VMware vSphere (all of which are fully supported by BMC CLM). Hence, if the deployment uses KVM, OpenStack has full functionality; for the others, it is better to use the platform support that BMC CLM provides directly for these hypervisors.
Service catalog
While BMC CLM has a very extensive service catalog that allows administrators to define offerings and entitlements per tenant, OpenStack lacks this level of flexibility.

Challenge 4: Managing risk
We have all heard about the huge increase in IT security threats over the last year or so. Hacking incidents, viruses, and vulnerabilities such as Heartbleed, Ghost, and Shellshock have hit many companies hard. No IT organization can afford to ignore risk management for both legacy and new cloud infrastructures. Compliance, security, patching, governance, and policies are not built into OpenStack. Again, additional effort is required to integrate OpenStack with Chef, Puppet, or another tool to provide policies such as server hardening, server compliance, and server patching. BMC CLM can perform automated compliance and patching on services across all legacy data center infrastructure as well as public and private cloud infrastructure, including OpenStack private clouds, in a consistent manner to reduce risk from provisioning and throughout the lifecycle of the service.
Challenge 5: Heterogeneous platforms and hybrid cloud infrastructure are a reality
If an organization has a single OpenStack infrastructure; no other platforms such as VMware vCenter, Microsoft Hyper-V, or public clouds; and few governance or automation requirements, then the need for a CMP is questionable. However, most enterprises have a hybrid infrastructure with multiple platforms such as Hyper-V, vCenter, and KVM; multiple private clouds; and possibly even multiple public clouds. Sourcing policies seeking to avoid vendor lock-in, as well as mergers and acquisitions, dictate that heterogeneous infrastructure is the new reality. Managing across all of these platforms becomes very complex: with different people, processes, and technologies required for each infrastructure, IT costs can quickly skyrocket. To deliver agile services while keeping costs under control and minimizing risk, IT organizations require a management platform that abstracts the complexity of provisioning and managing across heterogeneous infrastructures and provides a single pane of glass for users as well as administrators.  BMC CLM orchestrates the agile delivery and ongoing management of IT services across hybrid cloud and legacy infrastructures to reduce costs while applying consistent compliance and governance policies across all platforms.

Conclusion

Running OpenStack without a cloud management platform is sufficient only in basic cloud use cases. OpenStack has a number of gaps that preclude it from being a complete, enterprise-grade cloud solution. OpenStack and CMPs such as BMC CLM are not competitive but complementary; using them together makes private clouds truly enterprise grade.

Cloud Lifecycle Management for CI/CD DevOps – Part II

In part I, we described the challenges in a typical CI/CD environment.  In this blog, we will show how BMC Cloud Lifecycle Management (CLM) can be used to address them.

AUTOMATING CI/CD DEVOPS PIPELINE USING CLM

By using CLM along with Jenkins and a few other automated testing tools and scripts, we were able to address the challenges described in part I and build a complete, fully automated DevOps pipeline that resulted in huge developer time savings, sanitized and consistent environments, and deployment automation.  Let us see how CLM was used to run the DevOps pipeline for CLM itself, leading to an ROI far beyond our expectations in the areas shown below.

As seen in Figure 2 below, as soon as a build succeeds, automated testing of the CLM application takes place, running thousands of tests by provisioning test infrastructure using CLM.  After the tests pass, the hardened CLM application is automatically converted into a “service offering” in CLM for that specific build.  The CLM service catalog is updated with this new service offering and made available to the development and test teams for downstream activities, such as provisioning the latest CLM application stack for their individual testing.  Hundreds of developers create CLM application stacks (“application environments”) each day, as shown below.


Figure 2. CLM is used to provision hundreds of CLM stacks each day

CLM Service Catalog

Figure 3 below shows an example service catalog used by our engineering team for on-demand deployment of CLM application stacks for the current and older releases.  Offering deployable application environments for daily builds and prior releases of CLM has resulted in high developer productivity.


Figure 3. Service catalog of CLM service offerings

Converting CLM deployable artifacts into service offerings

Once CLM has been built and tested through the DevOps cycle, the CLM deployable artifacts are automatically converted into service offerings that developers can request through the service catalog, as shown above.  This consists of a number of automated steps (a condensed sketch follows the list):

  1. A virtual machine is provisioned through vCenter
  2. The CLM deployable artifact is installed on it
  3. Sanity tests are executed
  4. A template is then created using the vCenter CLI
  5. Finally, CLM SDK/API calls update or create service offerings based on the newly created template, which reflects the new version of the CLM application
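
The sketch below condenses these five steps into a single Python driver.  The shell scripts, the CLM base URL, the endpoint path, and the payload fields are hypothetical placeholders for our internal tooling and the CLM API, not documented interfaces:

```python
# Hypothetical sketch of the build-to-service-offering automation.
# Script names, the CLM base URL, endpoint, and payload are placeholders.
import subprocess
import requests

CLM_API = "https://clm.example.com/csm"        # placeholder CLM base URL
HEADERS = {"Authorization": "Bearer <token>"}  # auth mechanism assumed

def run(cmd):
    """Run one pipeline step, aborting on any non-zero exit code."""
    subprocess.run(cmd, check=True)

build = "clm-build-1234"

# Steps 1-2: provision a VM through vCenter and install the new artifact
run(["./provision_vm.sh", build])
run(["./install_artifact.sh", build, f"{build}.zip"])

# Step 3: sanity tests gate the rest of the pipeline
run(["./run_sanity_tests.sh", build])

# Step 4: turn the hardened VM into a vCenter template via the vCenter CLI
run(["./create_template.sh", build, f"{build}-template"])

# Step 5: create/update the service offering to point at the new template
resp = requests.post(
    f"{CLM_API}/serviceofferings",  # hypothetical endpoint
    headers=HEADERS,
    json={"name": f"CLM {build}", "template": f"{build}-template"},
)
resp.raise_for_status()
print("Service offering updated for", build)
```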

Day 2: CLM application – Take environment snapshot

A developer who has provisioned their own CLM application stack can take snapshots of the complete stack using a day 2 action such as “TakeVMSnapshot”.  This is shown in Figure 4 below and is useful for saving application and machine state for debugging during the dev cycle, or for reverting to a consistent state.


Figure 4. Taking a snapshot of developer’s CLM environment

This new custom action was implemented by creating a BMC Atrium Orchestrator (AO) workflow that takes the snapshot, configuring the Callout Provider, and then using API calls to import the AO workflow into BMC Cloud Lifecycle Management.  The high-level steps are given below:

  • Step 1: Define and write the AO workflow.  This workflow accepts the context information for the virtual machine identifier (the additionalInfo parameter holds this information about ComputeContainer, VirtualGuest, User, and Service Offering Instance) and calls an NSH script that connects to BMC Server Automation (BSA), which then invokes the snapshot on vCenter.  Alternatively, the AO workflow can use the vCenter adapter to execute the snapshot call against vCenter directly (a minimal sketch of that call appears after this list).
  • Step 2: Configure Callout provider in providers.json
  • Step 3: Use REST API to import AO workflows
  • Step 4: Customize parameters in the Reference Action Catalog.  In our case, we set these parameter attributes: optional, encrypted, parameter order, and end-user input required
  • Step 5: Use REST API to refresh the Action Catalog
  • Step 6: Add i18n labels
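
For reference, here is a minimal pyVmomi sketch of the snapshot call that the workflow ultimately issues against vCenter; the hostname, credentials, and VM name are placeholders, and our actual implementation routes the call through AO/BSA as described above:

```python
# Minimal pyVmomi sketch of taking a VM snapshot on vCenter.
# Host, credentials, and VM name below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; use verified TLS in production
si = SmartConnect(host="vcenter.example.com", user="svc-clm",
                  pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    # Walk the inventory to find the developer's stack VM by name
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "clm-dev-stack-42")
    # Take the snapshot and block until vCenter reports completion
    WaitForTask(vm.CreateSnapshot_Task(
        name="TakeVMSnapshot",
        description="Day 2 action: snapshot of a CLM application stack",
        memory=False,   # skip memory state for a faster snapshot
        quiesce=True))  # quiesce guest file system if VMware Tools is present
finally:
    Disconnect(si)
```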

See the user documentation for more information on creating custom server actions: https://docs.bmc.com/docs/display/public/clm45/Creating+custom+server+operator+actions?src=search&src=search.

Day 2: CLM application – Update a component

In addition to taking snapshots of the CLM application stack, a developer can update a specific component within the stack, either by pushing code directly to the stack or by using another day 2 action such as ‘Update Component on CLM Application’.  This helps developers run integration tests of their component against a complete, consistent CLM application environment.

Benefits

The benefits for several of the stages in our DevOps pipeline are summarized below:

| Stage | Product used for automation | Metrics | Savings |
| --- | --- | --- | --- |
| CI builds | Jenkins | 5,000 builds/release in 6 months | |
| Automated deployment | CLM | Hourly, nightly, daily and weekly deployments done using CLM | |
| Automated testing | Silk and Selenium | Thousands of tests run each day automatically | |
| Environment infrastructure provisioning and deprovisioning | CLM | 30 per day | No static infrastructure – $$$ savings |
| Day 2 actions – snapshots | CLM | Hundreds each day by developers | |
| Service catalog and portal | CLM | 20 service offerings, 200 users and 150 SOIs | Productivity benefits with consistent build and deploy environments – developers can deploy environments in one click |
| CLM application lifecycle management | CLM | Developers and QA manage the lifecycle of their private CLM application stacks, starting, stopping and updating them continuously as needed | A consistent, simplified UI for common dev tasks saves developer time |
| Automated reclamation | CLM | Unused CLM stacks are automatically reclaimed | $$$ savings |
| Testing and application infrastructure on any target (on-premises and cloud) | CLM | Application environments provisioned on-premises as well as to AWS and other cloud infrastructures | $$$ savings and flexibility |
| Maintenance of older releases | CLM | Service offerings for prior releases available with pre-baked data | Huge savings, as prior releases can be deployed in one click through service offerings |

INNOVATION AND LABS USING CLM

In addition to offering CLM application stacks as service offerings for our developer community, we offer many other container and PaaS stacks for developers to innovate and experiment with.  For example, we have made “Docker Hosts” available as a service offering in our service catalog, which allows any developer to request a Docker host in a few minutes and start using containers.  We also plan to make PaaS environments and other middleware application environments available so that developers can experiment with new technologies.  Finally, during our conferences, we have used CLM to run our hands-on labs, provisioning hundreds of CLM application environments on the AWS cloud for training at scale.

CONCLUSIONS

At BMC, we take drinking our own champagne very seriously.  We continuously build, test and deploy CLM using CLM itself.  This has resulted in huge improvements in product quality, faster and more agile releases, and infrastructure savings.  We believe CLM can be used very effectively to manage any application DevOps pipeline in three critical areas: a) infrastructure as code; b) dynamic infrastructure, where on-demand creation and decommissioning of test environments yields cost savings; and c) service offerings that give developers machine, PaaS, middleware and application environments, increasing developer agility and happiness.

Cloud Lifecycle Management for CI/CD DevOps – Part I

At BMC Software, we build, test and deploy the BMC Cloud Lifecycle Management (CLM) product using CLM itself in our DevOps cycle.  This continuous integration/continuous deployment (CI/CD) DevOps pipeline enables our engineers to provision the latest consistent application environment on demand with a single click.  We run the DevOps pipeline for the CLM application itself by taking advantage of CLM’s service catalog, service offerings, deployment capabilities and “infrastructure as code”.  CLM is used by our engineering team across continents to provision and deprovision several hundred development and testing infrastructures and application environments each day.  This has allowed us to build enterprise-grade quality into CLM through automated pipelines, faster agile deployments, more efficient infrastructure management, and a consistent, up-to-date development environment for our engineering team, all available through the service catalog.

In this part I, we highlight the challenges in a typical DevOps CI/CD environment; in part II, we will show how CLM solves them.

CHALLENGES

Our DevOps CI/CD pipeline is shown below in Figure 1.  As you can see, the CLM product development process goes through a number of stages, including build, test and deployment of the CLM product.


Figure 1. CI/CD pipeline for CLM

Early on, we faced a number of challenges implementing and automating our DevOps CI/CD process.


Challenge 1. A complex, slow, manual CI/CD process hurt agility and productivity

One of the first challenges we faced was that our CI/CD pipeline required many manual steps by engineers to get daily or weekly builds deployed to multiple target environments.  It took days to weeks to produce a good working environment, which hurt developer productivity until we automated the pipeline using CLM and other tools.

Challenge 2. Consistent development application environments were hard to maintain without automation

As developers checked in code continually, the team needed a complete, consistent view of multiple check-ins for unit, integration and system testing.  Before CLM, we had many application and test environments that were neither easily reproducible nor traceable, which led to wasted resources.

Challenge 3. Static infrastructure and application environment sprawl increased our costs

As part of our CI/CD pipeline, a large number of infrastructure environments need to be provisioned and decommissioned each day for unit, integration, quality, system, performance and security testing.  This led to an explosion in the number of environments we had to maintain, and to rising costs for keeping infrastructure always available and ready when it was only used for a limited time during testing.  We also experienced application environment sprawl, since there was no automated reclamation of unused environments, which increased our costs further.

In part II, we will show how CLM addresses these challenges and simplifies the developer experience.