This blog post shows why AWS CDK and higher-level languages are the future of Infrastructure as Code, based on my experiences adopting CDK. I first started using CDK with Python in a high-compliance environment.
I wrote this after deploying production infrastructure and 10 months of writing infrastructure in CDK. I will first give a short introduction to AWS CDK. You can skip it if you are already familiar with AWS CDK.
Short introduction to AWS CDK
Using the command-line tool you can deploy or synth. Using the deploy command, you directly create resources in AWS. You can do this directly from a developer machine or a deployment server like CodePipeline. This command is similar to Terraform apply.
Using the synth command, you can synthesize to CloudFormation. We opted to use this command for two reasons. First, synthesizing allowed us to reuse CloudFormation pipelines that we had already built. Second, we could argue that we provisioned our infrastructure with CloudFormation, so we would not need CDK to be approved in our high-compliance environment. Sometimes you have to apply some lean compliance to get somewhere.
CDK creates constructs inside stacks. These constructs contain AWS resources like SQS queues or Lambda functions. The stacks are converted into CloudFormation templates in JSON files. You can deploy these templates using CloudFormation.
I would highly recommend seeing the AWS CDK re:Invent talk for an introduction to AWS CDK.
A developer is only as fast as their IDE is powerful. With AWS CDK, you have an extremely powerful IDE because you use a standard programming language. IDE developers have invested decades in making IDEs powerful for languages like TypeScript and Python. You can use all your shortcuts. With CDK, you can use your favourite IDE without any need for plugins.
With CDK, you have the full code completion and error checking that is typical for your language. This power is a fantastic advantage for a developer. You get immediate feedback on the code you write.
The type lookup is fully working in Python, so you never have to leave the IDE to look up documentation. If you want to know what arguments you need to create a Lambda, use your shortcut to see the constructor and all the documentation is there. Any unknown type in the arguments? Just use your hotkey to jump to that type and read what it is and how it works.
Having documentation in code is much faster than working with Terraform or CloudFormation, where you are frequently looking at the documentation website. In CDK, you explore and discover while you are writing code, instead of reading up beforehand on everything you expect to need. You never know what you will need. The typings in CDK are far better than those of Boto3, the AWS SDK for Python, where you are also frequently stuck reading documentation online. You need to try it to experience the difference.
If I go back to writing CloudFormation, it feels like I am writing CloudAssembly. The CloudFormation plugin for JetBrains, combined with the CLI tool cfn-lint, gives some help, but it is very limited. CloudFormation is just YAML or JSON.
The Terraform auto-completion and type checking have come a long way in the last couple of years. As it is JSON compatible, it will always have limitations in comparison to a full language.
CDK still has one trick up its sleeve: debugging. You can run your program as a standard TypeScript or Python program with breakpoints. Debugging helps in fixing small bugs, like a wrongly interpolated name. Better yet, you can write unit tests for this in the same way as you usually do!
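As a minimal sketch of that last point, a name-building helper can be tested like any other Python function. The `queue_name` helper and its naming convention are hypothetical and only serve as an illustration; the example needs nothing beyond the standard library.

```python
import unittest


def queue_name(service: str, stage: str) -> str:
    """Build a resource name (hypothetical naming convention)."""
    return f"{service}-{stage}-queue"


class QueueNameTest(unittest.TestCase):
    def test_interpolates_service_and_stage(self):
        self.assertEqual(queue_name("orders", "prod"), "orders-prod-queue")

    def test_stage_is_part_of_the_name(self):
        self.assertIn("accept", queue_name("orders", "accept"))
```

Saved in a test module, this runs with `python -m unittest`, and a breakpoint inside `queue_name` works the same way as in application code.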
Train Engineering not Technology
With CloudFormation or Terraform, developers have to learn another technology. Either you need to learn arcane templating skills for CloudFormation, or you need to learn the HashiCorp Configuration Language (HCL) for Terraform. You cannot hope to find candidates whose skillset precisely overlaps with the particular technologies that you have chosen in the past. So you will need to train them.
Having the same language for application and infrastructure with AWS CDK allows your new devs to quickly start making small changes to the infrastructure with the familiar language. These small changes teach them how to add bigger changes. With CDK, you do not have any ramp-up time of first learning new technology.
Getting a dev up to speed already requires loads of investment in knowledge sharing. So having one big learning task less is a great advantage. You can start investing earlier in training the new hires for better software delivery engineering and teaching domain knowledge. Teaching these skills has a more meaningful benefit for the customer and the organization than teaching them a different technology.
AWS CDK still produces a declarative output. The declarative way of thinking in IaC is a stumbling block, and CDK makes it worse because it hides the declarative model from the developer. So you will still need to teach this mindset. What works in regular application code does not necessarily work in infrastructure code.
An example pitfall is using if statements in the code to determine which resources get created. CDK evaluates these if statements during the synthesis step and not during the deployment of the infrastructure. So these if statements do not translate to conditions in the CloudFormation template that CDK outputs. Here is an example of adding an SMS subscription to an alert topic if it is a production-like account.
    from aws_cdk import aws_sns, core

    topic = aws_sns.Topic(stack, 'AlertTopic')

    # Pitfall: the ACCOUNT_ID is not known during synthesis, so this check
    # runs at synth time and does not become a deployment-time condition.
    if core.Aws.ACCOUNT_ID == ACCEPT_ACCOUNT_ID or core.Aws.ACCOUNT_ID == PROD_ACCOUNT_ID:
        subscription = aws_sns.Subscription(
            stack, 'SMS',
            endpoint=phone_number,
            protocol=aws_sns.SubscriptionProtocol.SMS,
            topic=topic)

    # Instead, add a condition to the resource.
    expression = core.Fn.condition_or(
        core.Fn.condition_equals(core.Aws.ACCOUNT_ID, ACCEPT_ACCOUNT_ID),
        core.Fn.condition_equals(core.Aws.ACCOUNT_ID, PROD_ACCOUNT_ID))
    account_condition = core.CfnCondition(
        stack, 'AccountCondition', expression=expression)
    subscription = aws_sns.Subscription(
        stack, 'SMS',
        endpoint=phone_number,
        protocol=aws_sns.SubscriptionProtocol.SMS,
        topic=topic)
    # Attach the condition to the underlying CloudFormation resource.
    subscription.node.default_child.cfn_options.condition = account_condition
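To make the difference concrete, the condition-based variant should end up in the template roughly in the shape below, where CloudFormation evaluates the condition at deployment time. This is a hand-written sketch in plain Python dictionaries, not real CDK output: the account IDs and phone number are placeholders, and CDK would append a hash suffix to the logical IDs.

```python
import json

# Placeholder values, assumed for illustration only.
ACCEPT_ACCOUNT_ID = "111111111111"
PROD_ACCOUNT_ID = "222222222222"
PHONE_NUMBER = "+15555550100"


def account_condition_template() -> dict:
    """Return a simplified template with a deploy-time account condition."""
    account_id = {"Ref": "AWS::AccountId"}  # resolved by CloudFormation
    return {
        "Conditions": {
            "AccountCondition": {
                "Fn::Or": [
                    {"Fn::Equals": [account_id, ACCEPT_ACCOUNT_ID]},
                    {"Fn::Equals": [account_id, PROD_ACCOUNT_ID]},
                ]
            }
        },
        "Resources": {
            "AlertTopic": {"Type": "AWS::SNS::Topic"},
            "SMS": {
                "Type": "AWS::SNS::Subscription",
                # The subscription only exists when the condition holds.
                "Condition": "AccountCondition",
                "Properties": {
                    "Protocol": "sms",
                    "Endpoint": PHONE_NUMBER,
                    "TopicArn": {"Ref": "AlertTopic"},
                },
            },
        },
    }


print(json.dumps(account_condition_template(), indent=2))
```

With the plain if statement there is no `Conditions` section at all: the subscription is either in the template or absent, decided once at synthesis.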
Resources that you use multiple times have common characteristics and dependent resources. In AWS CDK, these are called constructs. You can create higher-level constructs by composing resources. This is where CDK truly shines.
"Not being able to create these abstractions was the main motivation for us to start talking and thinking about CDK." - Elad Ben Israel
In CDK, it is as easy as subclassing a class. We heavily use subclassing to construct our resources that are compliant and come with default monitoring. For example, you want to add default metrics and alarms for a Lambda.
We usually deploy code from S3, but sometimes we keep the code inside our CDK project because it is a very simple worker function. By adding two subclasses, we can now easily create both types of functions, all with our default alerting and metrics.
We are now building a library that provides resources that are fully compliant with our organizational compliance rules. Teams can use our library and get compliance-ready resources, instead of everyone figuring out how to build them themselves. This library speeds up anyone deploying infrastructure within the organization and creates a culture of inner sourcing. You can even reach abstraction levels of complete services.
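The shape of that idea can be sketched without CDK installed. All classes below are hypothetical stand-ins, not the CDK API; with the real library the base class would be the Lambda function construct and the alarms would be CloudWatch alarms. The two leaf subclasses mirror the S3 and inline variants described above.

```python
from typing import Dict, List


class Function:
    """Hypothetical stand-in for a Lambda construct; not the CDK class."""

    def __init__(self, name: str, code: Dict[str, str]) -> None:
        self.name = name
        self.code = code
        self.alarms: List[str] = []

    def add_alarm(self, metric: str) -> None:
        self.alarms.append(f"{self.name}-{metric}-alarm")


class MonitoredFunction(Function):
    """Adds the default alarms to every function built from it."""

    DEFAULT_METRICS = ("errors", "throttles")  # assumed defaults

    def __init__(self, name: str, code: Dict[str, str]) -> None:
        super().__init__(name, code)
        for metric in self.DEFAULT_METRICS:
            self.add_alarm(metric)


class S3Function(MonitoredFunction):
    """Function whose code lives in an S3 bucket."""

    def __init__(self, name: str, bucket: str, key: str) -> None:
        super().__init__(name, {"bucket": bucket, "key": key})


class InlineFunction(MonitoredFunction):
    """Function whose code is kept inline in the infrastructure code."""

    def __init__(self, name: str, source: str) -> None:
        super().__init__(name, {"inline": source})


worker = InlineFunction("cleanup", "def handler(event, context): return 'ok'")
print(worker.alarms)  # ['cleanup-errors-alarm', 'cleanup-throttles-alarm']
```

The point of the design is that callers only choose where the code comes from; the monitoring defaults live in one place and cannot be forgotten.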
AWS has begun building and providing these higher-level abstractions with the AWS Solutions Constructs. It is a library containing architecture patterns with sensible defaults. The library contains simple patterns like a Lambda connected to an API Gateway that writes to a DynamoDB or a Lambda that reads from a DynamoDB stream. These patterns can be combined so that you can quickly deploy an API that processes items.
In Terraform, you have modules that provide a similar concept. With modules, you can build small and single-purpose units of code. The modules are easily testable with infrastructure testing. There are high-quality community modules that I have used in production. But developers have a hard time starting to use modules in Terraform. It is much easier to subclass and extend resources in CDK as they are used to it from application coding.
In AWS CloudFormation, you can use nested stacks. But the learning curve is pretty steep to start using these. So most developers end up copy-pasting code.
The most common problem I see with IaC at companies is that it is just a big monolithic codebase! Teams deploy all their infrastructure in just a couple of stacks in CloudFormation or remote states in Terraform. The codebase can be huge files with no thought of architecture. Somehow they treat IaC differently and do not put the same engineering effort into it. But IaC code is much harder to change than application code as it translates to physical components that need to change.
With Infrastructure as Code (IaC), we need to relearn how to apply the better practices that we already learned with application code. Somehow developers commonly do not carry the same practices over from application code to infrastructure code. AWS CDK helps solve this problem by allowing devs to use the same language. They know what good Python or TypeScript looks like and can translate this to IaC.
Stability of AWS CDK
We have used AWS CDK for eight months now. During that time, CDK went from 1.16.* to 1.48.*. Due to an unrelated issue, we had to pin our versions to 1.17.* for a very long time. We have now upgraded without any significant backward compatibility issues. Such issues are a common struggle that you see talked about on Twitter. But we did not experience them, even with an overdue upgrade of 20 versions.
We did run into one bug in the API Gateway that was reported and open on the GitHub issue tracker. By restructuring our architecture, we were able to work around this bug. We did not lose much more than a few hours on this bug. This bug is now fixed.
In light of this stability, our choice to adopt CDK is straightforward. We choose a definite speed-up that applies every day now over a potential small inconvenience in the future.
Interoperability with CloudFormation
Likely, you are already deploying infrastructure with CloudFormation. So a common question about AWS CDK is how backward compatible it is with CloudFormation. CDK and CloudFormation use the same import and export system. So it is fully compatible with CloudFormation.
We never had any problems mixing CDK with CloudFormation. We deploy most of our lower layers with CloudFormation due to legacy. These layers export the outputs that higher layers depend on, like S3 bucket names. But you can also build CloudFormation on top of layers deployed by CDK, as it can export values in the same way. We are migrating lower layers to CDK where the investment makes sense.
The benefits of AWS CDK are immense for us. We can develop and deploy our resources faster. This speed-up will only increase when we can stand on previous work, as reuse is much higher in CDK. We are sharing this work with other teams inside our organization, which makes compliance easier.
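At the template level, the shared mechanism is an exported output in the lower layer and an `Fn::ImportValue` in the higher layer. Below is a sketch of both sides in plain Python dictionaries; the export name and resource names are hypothetical. In CDK v1 for Python, the equivalents are `core.CfnOutput(..., export_name=...)` for the export and `core.Fn.import_value(...)` for the import.

```python
import json

EXPORT_NAME = "storage-layer-BucketName"  # hypothetical export name

# Lower layer: creates the bucket and exports its name.
lower_layer = {
    "Resources": {"DataBucket": {"Type": "AWS::S3::Bucket"}},
    "Outputs": {
        "BucketName": {
            "Value": {"Ref": "DataBucket"},  # Ref on a bucket yields its name
            "Export": {"Name": EXPORT_NAME},
        }
    },
}

# Higher layer: imports the value by export name, regardless of whether
# the lower layer was written in CloudFormation or synthesized by CDK.
higher_layer = {
    "Resources": {
        "BucketNameParameter": {
            "Type": "AWS::SSM::Parameter",
            "Properties": {
                "Type": "String",
                "Value": {"Fn::ImportValue": EXPORT_NAME},
            },
        }
    }
}

print(json.dumps(higher_layer, indent=2))
```

Because the contract is only the export name, either side can be migrated to CDK independently, as long as the name stays the same.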
In any new project with AWS, I would adopt CDK over CloudFormation. I would propose to migrate to CDK in any existing project with CloudFormation. In multi-cloud environments or with third-party providers, Terraform can still be a good option, but I would first investigate Pulumi for new projects.
Give AWS CDK a try. You are sure to get a return on your investment. We tried it for one day and never looked back!