Hacker News
Fractal Architectures: A Software Craftsman's take to Infrastructure as Code (yanchware.com)
70 points by tinodelletna on Dec 14, 2020 | 48 comments


"Infrastructure As Code" is a misnomer in my opinion. The underlying mechanism is more "Infrastructure as Configuration". Terraform's syntax is called HCL, the "HashiCorp Configuration Language". The "code" aspect of (consuming) Terraform is oriented around providing dynamic configuration capabilities utilising reusable code modules. As GraphQL has led people to question the applicability of REST-type APIs in SPA contexts, so I believe IaC will eventually suggest a transition towards single cloud API endpoints where the entire required state is described, reducing IaC to configuration. Currently the programming or "code" aspect is required as a function of API design and the corresponding wiring of components together. Potentially not necessary and not particularly helpful.
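To illustrate the point, a typical Terraform resource block reads much more like configuration than like code (hypothetical AWS example):

```hcl
# Declarative: you describe the desired end state, not the steps to get there.
resource "aws_s3_bucket" "logs" {
  bucket = "example-log-bucket"

  tags = {
    Environment = "production"
  }
}
```

The only "code-like" parts are the interpolation, `count`/`for_each`, and modules layered on top of this configuration core.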


The term makes a lot of sense if you know when and how it was introduced. When the early proponents of "Infrastructure As Code" started to write about it, infrastructure was often defined in multiple config files or directly in the UI of tools like load balancers, firewalls, proxies, etc. The big value of the movement, for me at least, was the proposal to treat the infrastructure definitions and the corresponding scripts as code:

- all infrastructure definitions are under source control

- there is a well-defined process for how changes are made, with review and approval

- the infrastructure can be reproduced and re-created for integration environments, load testing, staging, etc.

- using a common language like Terraform allows better visibility, helps with code reviews, etc.

The cloud providers enabled all that.

To summarize: even if most of "Infrastructure As Code" is implemented via declarative configuration rather than procedural code, it still follows the best software development practices.


Yes, you apply some software development principles: version control, builds, continuous integration, versioning, automation.

The software industry is stupendously large and has invested many person-centuries in these concepts and tooling. It has been able to create these tools for itself because the tools are software. These hard-won fruits of labor could be applied to many fields, and are.

Of course some tooling doesn't fit so well directly, there needs to be some adaptation of the concept, and writing of new software and creation of new processes.


> "Infrastructure As Code" is a misnomer in my opinion. The underlying mechanism is more "Infrastructure as Configuration". Terraform syntax is called HCL "HashiCorp Configuration Language". The "code" aspect of (consuming) Terraform is oriented around providing dynamic configuration capabilities utilising reusable code Modules.

While TF is the most popular interpretation of IaC these days, it is definitely not the only way to do it. I have maintained that HCL is the worst part of Terraform -- it's likely that the DSL will grow until it very nearly rivals a full-grown language, when they could have gone with a full-blown language to start (like Pulumi[0]). Terraform does now support native programming languages via its CDK[1]. This meets the "code" requirement.
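For what it's worth, Terraform also accepts plain JSON (`*.tf.json` files), so you can see what a full language buys you even without CDK or Pulumi: ordinary loops and functions replace copy-pasted HCL. A minimal sketch (bucket names are made up):

```python
import json

def s3_bucket(name: str, env: str) -> dict:
    """Build one aws_s3_bucket resource body in Terraform's JSON syntax."""
    return {"bucket": name, "tags": {"Environment": env}}

# Three near-identical buckets that would be copy-paste in hand-written HCL.
envs = ["dev", "staging", "prod"]
config = {
    "resource": {
        "aws_s3_bucket": {
            f"logs_{env}": s3_bucket(f"example-logs-{env}", env) for env in envs
        }
    }
}

# Terraform picks this up like any other configuration file.
with open("buckets.tf.json", "w") as f:
    json.dump(config, f, indent=2)
```

This is exactly the kind of thing CDK/Pulumi do for you, with proper typing and a resource graph on top.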

There is a trade-off of course -- opening up the door to a fully powered programming language means opening the door to infinite complexity, but I think I'd rather have that than have the DSL that has spiky (though constrained) complexity.

> As GraphQL has led people to question the applicability of REST type API's in SPA contexts, so I believe IaC will eventually suggest a transition towards single Cloud API endpoints where the entire required state is described; reducing IaC to configuration

Agree -- the interface will likely be kubernetes-like.

> Currently the programming or Code aspect is required as a function of API design and the corresponding wiring of components together. Potentially not necessary and not particularly helpful.

On the meta level, it looks like we've run into the ol' "are declarative languages code" (alternatively, "is HTML code?") question.

A free unsolicited hot take on GraphQL: in my opinion GraphQL is just funny-looking SQL, which offers every codebase the opportunity to become as complex as the time-tested query execution engines in real production-ready databases. It's almost like how Mongo lets you pull schema validation and constraint checking out of your relational database and up into your application code, which as far as I'm concerned is a bad idea most of the time. That said, just like Mongo, the productivity brought about by GraphQL is undeniable (whether real or imagined), so I tread lightly and often don't speak ill of it.
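To make the "funny looking SQL" point concrete, here is a hypothetical query in both (the schema is invented for illustration):

```graphql
# GraphQL: the response shape mirrors the query shape.
query {
  user(id: 42) {
    name
    posts(limit: 10) {
      title
    }
  }
}
```

versus roughly the same fetch in SQL:

```sql
SELECT u.name, p.title
FROM users u
JOIN posts p ON p.user_id = u.id
WHERE u.id = 42
LIMIT 10;
```

The difference is where the join planning lives: the database does it for SQL, while GraphQL resolvers reimplement it in your application.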

[0]: https://www.pulumi.com/docs

[1]: https://www.hashicorp.com/blog/cdk-for-terraform-enabling-py...


In Crossplane, we have a similar notion for apps deployed to Kubernetes. Platform builders define the schema for their API and then create several `Composition` objects, each representing a different way of satisfying the same API. App developers then include a claim YAML together with their artifact. The `Composition` to be used is chosen via selectors or by the platform builder.

This way you can depend on a generic `AcmeMySQLInstance` (think of it as an interface in Go or a protocol in other languages), and then one of the Compositions is selected to satisfy the requirement: `AWSRDSInstance`, `AzureSQLServer`, `GCPCloudInstance`, or some in-house configuration. The contract between app dev and platform builder is the schema plus the credential object (`Secret`) that gets published.
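A rough sketch of what the claim side looks like (the group/version and parameter names here are invented; see the docs for the real schema):

```yaml
# App developer ships this alongside their artifact; the platform team's
# Composition (AWS RDS, Azure SQL, ...) satisfies it behind the scenes.
apiVersion: database.acme.example/v1alpha1
kind: AcmeMySQLInstance
metadata:
  name: orders-db
spec:
  parameters:
    storageGB: 20
  compositionSelector:
    matchLabels:
      provider: aws
  writeConnectionSecretToRef:
    name: orders-db-conn
```

The app then only reads the `orders-db-conn` Secret; it never knows which cloud backed the claim.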

See details in docs: https://crossplane.io/docs/v1.0/getting-started/package-infr...

Disclosure: I'm one of the Crossplane maintainers.


Awesome, thanks for sharing!


"When you build infrastructure as code, you get infrastructure by coders."

- Michael Scott


So straight off the bat I won't be reading this article.

The title sounds interesting and I think there are insights to be gained, but I had to disable NoScript for the page to load; I had to disable NoScript, the means by which I protect myself online (one of several layers), just so some text would render.

And after disabling NoScript, the text looked terrible anyway.

This has led me to believe that a "software craftsman" that can't get text to me over the Internet without JavaScript doesn't really have much to teach me.

Sorry, friend. I'm sure your intentions are good, but the delivery of those intentions was frustrating and utterly pointless.


Really sorry about that. Our site is a SPA, so it really relies on JavaScript. We have put a copy of the article on Medium: https://tinodelletna.medium.com/fractal-architectures-a-soft... I hope that will work.


Cheers! Much better. (Screencap of how it renders on Windows: https://i.imgur.com/jhxhjH0.jpg)


Thanks! We will get on that CSS ASAP :)


Why is it a SPA?


Single Page Application. A single page running JavaScript to render different views without page reloading.

https://medium.com/@NeotericEU/single-page-application-vs-mu...


Really good question, though a bit off-topic. The main reason was to get offline capabilities and other features we have in the pipeline atm.


I wish the author had spent more time looking at Pulumi. I think I can boil the article down to the following assertion:

"True"/Better infrastructure as code means packaging infrastructure with code. The ability to define structure (a la classes/data types) and instantiate instances (a la class instances/value types) at the macro system level is important.

This is expressly supported by Pulumi's component and custom resources[0]. I think they don't even know how powerful this feature is, because it's buried so deep in the documentation.

[0]: https://www.pulumi.com/docs/intro/concepts/programming-model...
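Not Pulumi's actual API, but the shape of the idea can be sketched in plain Python: define a component type once, then stamp out instances that each expand into primitive resources (all names and fields here are invented):

```python
# Hypothetical sketch of the component-resource idea: a reusable
# "MySQLComponent" that expands into the primitives it is made of.
class MySQLComponent:
    def __init__(self, name: str, storage_gb: int):
        self.name = name
        self.storage_gb = storage_gb

    def resources(self) -> list[dict]:
        """Expand this component into its primitive resources."""
        return [
            {"type": "db_instance", "name": self.name, "storage_gb": self.storage_gb},
            {"type": "db_subnet_group", "name": f"{self.name}-subnets"},
            {"type": "security_group", "name": f"{self.name}-sg"},
        ]

# Structure is defined once; instances are stamped out as needed.
stacks = [MySQLComponent("orders", 20), MySQLComponent("billing", 50)]
all_resources = [r for c in stacks for r in c.resources()]
```

Pulumi's component and custom resources give you exactly this, plus dependency tracking and state management around it.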


While I'm not sure I find the fractal metaphor clarifying (fractals are the absolute _last_ kind of structure I want to see anywhere near my infrastructure definitions), I do sympathize with the "false advertising" of infrastructure-as-code when it's merely configuration checked into source control. As the age-old saying goes, programming in XML (or JSON, or ...) is hell when you really want to reach for a proper programming language.

This is part of the issue with CloudFormation. I've seen monstrosities like an entire lambda function source code definition written into a plaintext key in a YAML file, and worse yet, seen this described as normal. Yes. Normal in hell is what it is.

So with all of this said, I think developments like the AWS CDK are very cool. CDK takes the approach of a high-level SDK over programming in YAML or anything like it. I like it a lot. It's higher level than something like `boto`, which is just raw function calls, but not so high level that you're trying to write deployment logic in YAML. You get a nice in-between, which behaves like whatever language you use it in. I believe this approach was spearheaded by AWS internally after seeing how CFN became something widely considered a failure.


The fractal metaphor is intended to be seen in the light of fractals' self-similar property. The idea is to have micro-structures of standardised infrastructure components that can be replicated indefinitely through the invocation of interface operations within a macro-structure (the blueprint).

About the AWS CDK, I completely agree. We refer to CloudFormation in the article as that is ultimately the "output" of CDK, but we definitely agree it is a really good step in the right direction for IaC tooling. We "just" need something portable across clouds now :)


I think that's what Pulumi wants to be, but their pricing puts me off.


There's a fully functional open-source version.


I'm sorry, but this was a lot of fluff to read. Good visuals, though; I see what you mean.

DevSecOps might be interesting. Looking forward to it.

I'm an ops guy.

Edit: we need management to understand this.


Yes, I guess too much intro... if you have any specific feedback on parts I could have left unsaid, it would be great to know for next time.

About management, totally agree. That's why we are already working on the next two articles.


I would skip:

   1. introducing IaC, 
   2. and the OOP/functional bits
This stuff really just repeats itself everywhere.

I would instead put my method directly, in RFC style:

   1. "this is what this is" (jpg here, maybe two comparison graphics: of the old arch, and yours), 
   2. and here is how it works. 
I did find the pet/cattle metaphor great. I'm not against prose.


This is really great feedback! Thanks, we will definitely take this into account when writing more technical documentation.


Surprised to see there's still people using the term "Software Craftsman" in a positive way.


When I hear 'software craftsman' I assume it's a regular software engineer who wants to elevate their status.


I cringe whenever I hear developers refer to themselves as craftsmen.


The thin font is very difficult to read.


Sorry, we will fix the CSS asap, until then if you are still interested you could look at the copy on Medium: https://tinodelletna.medium.com/fractal-architectures-a-soft...


Pretty good article. As someone who has recently started using Terraform a fair bit (in AWS), I have to agree that it leaves me thinking "surely there's a better way". I agree, it's not code (any more than HTML is), and I agree, it does inevitably get heavily copy-pasted and become hard to maintain.

But I'm not sure that the article presents an alternative, at least not one that's fundamentally different. The conclusion says "hint: Ansible + Terraform". Ok, that's a start, but it's still just "config + config". Where's the code?


The Serverless Framework (https://www.serverless.com/) is another popular tool which can be used with multiple cloud providers. Unfortunately, it has similar limitations to the tools described in the article. I just wanted to mention it because, for me, the learning curve was gentler than Terraform's or Pulumi's.


While reading, I was thinking that you could implement the framework using an SDK (like the AWS SDK) plus an interactive shell program that checks the current infra and prompts for changes to it. This should be well tested and could include a kind of integration test for changes and their impact, considering the entire database.

This could be a way to mitigate problems with environment variable rotation.


The grey font is unreadable.


And drastically too thin. They didn’t design this high DPI display to provide less detail. Thankfully I was able to get what I assume is the gist of the post with reader mode.


Can't get the page to load despite disabling all of my trackers.



Oh, that's strange. I will post a copy on Medium and get that link in here too.


http://archive.is/hBtjp

I'm sorry, but there is no sane reason why you need to enable JavaScript to read a site that is 100% static content of text and images.


You are absolutely right. I am utterly sorry. We will work on a fully-compatible Links (the browser) version of the site as penitence.


We've had chroot jails, FreeBSD jails, and a ton of other options since the 1980s/1990s. Then someone decided to create a landing page with lipstick and an SV pitch, and voilà, Docker is the revolution!

We've had make, autotools, ssh-expect/pexpect, PXE boot, not to mention the shell itself, and a ton of other options. Oh, but a few folks decided to create landing pages with lipstick and SV pitches, and voilà, we have Ansible, Terraform, Kubernetes, the whole IaC revolution!

The same folks who shat on autotools all their lives without investing time in learning it properly are now willing to invest 10x more time, putting up with all the warts, bugs, and crap documentation of neo-IaC, and all-in-all being carpet-rag fanboys of the new tools.


I agree with the assertion that the core "hard tech" in K8s, Docker, Ansible, etc. has been with us a long time, and some of us did indeed use it.

That said, knowing what to mix and match to create a pattern that everyone can learn and use does contribute value; it's essentially parallel to the value provided by Linux distros.

I can "docker pull" from many different distros, with lots of premade apps.

I think the lesson for the "hard technology" folks is: when you solve a hard technical problem, DO think about setting up and promoting high-level standards with sane defaults. You are the most knowledgeable person(s) for your hard tech problem, and thereby often in the best position to standardize the default pattern by which it can be used.

That can contribute as much or more value than the hard technical work itself.

EDIT: also, don't forget to publish and promote STANDARDS, giving a well-documented "if you don't have a reason not to, do it this way" path for integrating useful tools together. That's essentially what Docker is/did, and even though it can be recreated in ~100 lines of bash (https://github.com/p8952/bocker), the branding led to ~100,000 developers publishing containers that are fairly easy for anyone else to understand and build from.


I sometimes wonder how many of us are running infrastructure across multiple nodes in support of an application that could practically be run on a single instance.

For us, having production go down for 2~5 minutes while we spin up a prior VM snapshot is totally acceptable for our customers. I have a hard time believing this RTO does not exceed what would be meaningfully required for most businesses (i.e. actual $$$ impact vs someone's paranoid fantasies).

I totally concur on your 10x point. If you find your engineering staff arguing over containerization technologies or multi-cloud event-driven virtual actor architectures, you are probably wasting a lot of time. I would state that simply making the arbitrary decision to use containers is a massive mistake. All that overhead and complexity for what? You better have a damn good reason. Maintaining a typical Dev/QA/Staging/Prod stack is not sufficient justification. Unless you seriously fucked something up, your software should be able to be cloned+built+deployed+started in a few lines of PowerShell or the like. What is stopping you from running this script on 4 servers, or writing a little tool to do it from a web interface every time you click a button? Oh right, can't write your own CI/CD tools when there are so many on the market. So it goes. Down the rabbit hole and to the right.

For those who can handle a 2-5 minute RTO, and have business application performance requirements that can be addressed by a single 32+ core server, there is no reason you should be screwing around with anything beyond your basic language+framework tooling, SQLite, source control, and project management tools. This is an engineering paradise if your constraints allow for it. I would never squander this opportunity with shiny bullshit.


> I would state that simply making the arbitrary decision to use containers is a massive mistake. All that overhead and complexity for what?

What complexity? I've been hacking on a small Node app in my dev environment for the past few weeks. Decided to stand it up in prod. It took me literally 10 minutes to set up the Dockerfile, build the image, and deploy it to the server.

Later this week, I'll probably throw it onto a GKE cluster. It will take maybe 15 minutes to write and test the Kubernetes YAML. (It will also save 50%+ on the hosting costs, since self-healing means I'll be able to put it on GCE preemptible nodes.)
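For reference, the sort of Dockerfile that takes ten minutes for a small Node app (the entrypoint and port here are assumptions):

```dockerfile
# Small production image for a typical Node app.
FROM node:14-alpine
WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY package*.json ./
RUN npm ci --only=production

COPY . .
EXPOSE 3000
CMD ["node", "index.js"]
```

That plus a ~20-line Deployment/Service YAML is the whole Kubernetes story for an app this size.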


When I refer to complexity, there are dimensions aside from time to prod. You also have to consider the added risk of introducing all of these additional vendors and code paths into your application stack. For some, this is not a concern at all. For others, myself included, there are practical concerns about minimizing the attack surface for the types of applications we need to deliver.

I really enjoy being able to tell our clients (finance industry) that our software is entirely on first-party Microsoft dependencies. It makes dealing with audits so much easier. We have clients that will scan our servers and bug us about specific DLLs that show up on the various enterprisey security scanning tools. We got hit with an audit on one of our 3rd party DLLs and had to spend a week rewriting for a compliant implementation. This kind of thing can kill us at our scale, so we don't even risk it.

There will be those who rail against writing everything in house, but there are some serious advantages to it, especially if/when your team actually gets good at doing it. We can crank out a fairly complex dashboard in 1-2 hours using Blazor and our existing platform services. Add in another 5-10 minutes for a code review, 5 minutes for a build, and then it's in all required environments within 2-3 more minutes (total). All of this is managed via a system that is part of our application's codebase. So, you can certainly get some fun numbers going either way you attack the puzzle. The difference is: in my case, if I want to make a very nuanced change to the behavior of a build/deploy/hosting item, I can quickly locate the code and make the required adjustments. If you need Kubernetes to do something magical that it's not quite prepared for yet, you could spend a long time screwing around fruitlessly.


Thanks for the response! Good points and interesting perspective.


What happened was that folks realized that the technology isn't enough--you need to build communities around the technology.

Docker for example includes a bunch of features to help build a strong community of Docker users (image standards, tooling to make building & sharing images easy, public registries, etc.).


Yet Docker still doesn't implement 2FA, which means you can't trust the community. Software supply-chain management is an often-overlooked problem, and Docker just contributes to it.


Two things:

All the other tools are more approachable and show the user more value, quicker.

This is similar to all the Dropbox naysayers: "I can just use (rsync|ftp|etc)".


You honestly believe the new generation of tools provides 0 value over their predecessors?



