Implementing Cloud Governance as Code Using Cloud Custodian

Why?

You might expect me to start with what Cloud Custodian is, but in this case the why matters more.

As organizations continue to increase their footprint in the public cloud, the biggest challenge they face is applying governance and enforcing policies effectively.

Most organizations drive this process (detecting violations and enforcing policies to remediate them) with a collection of custom scripts. Tools such as AWS Config and Azure Policy solve the same problem that Custodian does, but each approach has its pros and cons.

AWS Config and Azure Policy are fully managed services, whereas with Custodian you manage the setup yourself. On the other hand, Custodian is an open source tool that is free to use, while AWS Config is a paid service.

Another reason to prefer Custodian is that it is not as tightly bound as AWS Config and Azure Policy, which lean on predefined rules that limit how far you can customize them.


What is Cloud Custodian?

Cloud Custodian is an open source rules engine in which you define policies in YAML. By enforcing these policies you can manage your public cloud resources for compliance, security, tagging and cost savings.


Scenario – Enforce a policy that detects missing tags in EC2 instances and adds those tags.

Prerequisites –

  • An AWS account
  • Python 3.7 or above
  • Basic understanding of cloud resources
  • Proficiency in YAML

Installation –

For AWS, the installation is straightforward. Log in to your AWS account, open AWS CloudShell and run the following commands:

python3 -m venv custodian
source custodian/bin/activate
pip install c7n

Defining a Policy –

A Custodian policy consists of –

  • Resource – Custodian can target resources in AWS, Azure and GCP. The resource is the type of target you want to enforce your policy on, such as EC2, S3 or a VM.
  • Filters – Filters let you narrow the policy to a subset of resources, or to resources with a particular attribute. A common way of defining a filter is with a JMESPath expression.
  • Actions – Actions are how Custodian enforces a policy on the resources that pass the filters. Examples include marking a resource, deleting it or sending a report.
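
To make the three parts concrete, here is a minimal sketch in plain Python of how resources, filters and actions fit together. It is only an illustration and not Custodian's actual implementation: run_policy, is_unattached, mark and the sample volumes are all hypothetical names made up for this example. Filters are combined with AND here, which is my understanding of how a top-level filters list behaves, and the dry_run flag simply skips the actions.

```python
from typing import Callable, Dict, List

Resource = Dict[str, object]


def run_policy(
    resources: List[Resource],
    filters: List[Callable[[Resource], bool]],
    actions: List[Callable[[Resource], None]],
    dry_run: bool = False,
) -> List[Resource]:
    """Select resources that pass every filter, then apply each action to them."""
    matched = [r for r in resources if all(f(r) for f in filters)]
    if not dry_run:  # a dry run only reports what matched
        for resource in matched:
            for action in actions:
                action(resource)
    return matched


# Hypothetical stand-ins for cloud resources; real ones come from the provider's API.
volumes = [
    {"id": "vol-1", "attached": True, "marked": False},
    {"id": "vol-2", "attached": False, "marked": False},
]


def is_unattached(resource: Resource) -> bool:
    """Filter: keep only volumes that are not attached to anything."""
    return not resource["attached"]


def mark(resource: Resource) -> None:
    """Action: flag the resource (a stand-in for tagging or deleting)."""
    resource["marked"] = True


preview = run_policy(volumes, [is_unattached], [mark], dry_run=True)
print([r["id"] for r in preview], volumes[1]["marked"])

applied = run_policy(volumes, [is_unattached], [mark])
print([r["id"] for r in applied], volumes[1]["marked"])
```

The first call reports vol-2 without touching it; the second call reports the same resource and marks it.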

For our scenario, below is a sample policy file written in YAML. It targets EC2 instances that are missing the CI or SupportGroup tag and then uses a tag action to apply those two tags to them.

policies:
  - name: ec2-tag-compliance
    resource: ec2
    comment: |
      Find EC2 instances missing the CI or SupportGroup tag and add both tags
    filters:
      - or:
          - "tag:CI": absent
          - "tag:SupportGroup": absent
    actions:
      - type: tag
        tags:
          CI: Test
          SupportGroup: Test
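
To see which instances a filter like this would select, here is a small sketch in plain Python. It does not call Custodian or AWS. It assumes the instance data is shaped like the EC2 describe output (a Tags list of Key/Value pairs), and the instance IDs, tag values and helper names (missing_tags, matches_policy) are made up for illustration.

```python
from typing import Dict, List

REQUIRED_TAGS = ("CI", "SupportGroup")  # tag keys taken from the policy above


def missing_tags(instance: Dict, required=REQUIRED_TAGS) -> List[str]:
    """Return the required tag keys that the instance does not carry."""
    present = {tag["Key"] for tag in instance.get("Tags", [])}
    return [key for key in required if key not in present]


def matches_policy(instance: Dict) -> bool:
    """Mirror the `or` block: match if at least one required tag is absent."""
    return bool(missing_tags(instance))


# Made-up instances shaped like EC2 describe output; IDs and values are illustrative.
instances = [
    {"InstanceId": "i-aaa", "Tags": [{"Key": "CI", "Value": "App1"},
                                     {"Key": "SupportGroup", "Value": "Ops"}]},
    {"InstanceId": "i-bbb", "Tags": [{"Key": "CI", "Value": "App2"}]},
    {"InstanceId": "i-ccc"},  # no tags at all
]

non_compliant = {
    i["InstanceId"]: missing_tags(i) for i in instances if matches_policy(i)
}
print(non_compliant)
```

Note the effect of the or: i-bbb is selected even though it already has a CI tag. Since the tag action in the policy lists both tags, it is worth confirming on a test instance whether an existing value would be overwritten before running this against real workloads.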

TRY IT OUT –

In AWS CloudShell, create a file named ec2-tag-compliance.yaml.

touch ec2-tag-compliance.yaml

Using an editor such as vi, paste the policy above into the file, then save and quit.

If you are not familiar with vi, take a look at this blog, which covers the basics.

Let’s first try a dry run, in which the actions part of the policy is ignored. A dry run tells you which resources would be impacted, and it is always good practice to test a policy before applying it.

custodian run --dryrun --region me-south-1 ec2-tag-compliance.yaml -s custodian/

Syntax –
custodian run --dryrun --region <region code> <name of policy file> -s <path to export the output>

As you can see in the image below, when this command runs, Cloud Custodian checks all EC2 instances for the configured tags.

It located one instance that was missing them, hence the count of 1 (highlighted in the yellow box).

To get a grid view of the impacted resources, you can use custodian report:

custodian report --region me-south-1 ec2-tag-compliance.yaml -s custodian/

The result is a grid listing the InstanceId of each EC2 instance that is missing the tags named in the policy.

Now that we know how our policy will impact our resources, let’s go ahead and run the custodian command to enforce the policy (add the missing tags).

custodian run --region me-south-1 ec2-tag-compliance.yaml -s custodian/

You can see in the image above that the tag action was applied successfully to the one resource (EC2 instance) that was missing the tags.

Logging –

The following files are created under the output directory (here custodian/ec2-tag-compliance/) when we run the custodian command –

  • custodian-run.log – Detailed console logs
  • metadata.json – Metadata about the policy run in JSON format
  • resources.json – A list of the filtered resources in JSON format
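
Because resources.json is plain JSON, it is easy to post-process. The sketch below pulls the instance IDs out of it. It assumes the file sits at <output dir>/<policy name>/resources.json and holds a JSON list of EC2 records that each carry an InstanceId key; matched_instance_ids is a hypothetical helper, and the demo writes a made-up file to a temporary directory so that it runs without any real Custodian output.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def matched_instance_ids(output_dir: str, policy_name: str) -> List[str]:
    """Read <output_dir>/<policy_name>/resources.json and return the InstanceIds."""
    path = Path(output_dir) / policy_name / "resources.json"
    with path.open() as handle:
        resources = json.load(handle)
    return [resource["InstanceId"] for resource in resources]


# Self-contained demo: write a made-up resources.json to a temp directory,
# then read it back. The instance ID below is illustrative, not real output.
with tempfile.TemporaryDirectory() as tmp:
    policy_dir = Path(tmp) / "ec2-tag-compliance"
    policy_dir.mkdir()
    sample = [{"InstanceId": "i-0123456789abcdef0", "Tags": []}]
    (policy_dir / "resources.json").write_text(json.dumps(sample))
    ids = matched_instance_ids(tmp, "ec2-tag-compliance")

print(ids)
```

Against the run from this post you would call matched_instance_ids("custodian", "ec2-tag-compliance") instead, assuming the layout described above.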

WHAT’S NEXT?

While this is a simple and straightforward way of running Custodian locally, it is not how Custodian would be used in live environments.

Custodian is usually deployed in one of the following ways –

  • As an independent Lambda function
  • With a CI tool like Jenkins, packaged in a Docker image

We will try to cover these two methods in upcoming blog posts.
