Quick automation tips for clearing out your AWS S3 buckets.

I see that the rule has removed 30,222,969 objects since 2/20. I would say give it a few more days and it should empty the buckets.

So I was cleaning up some S3 buckets. These buckets, for better or for worse, had versioning enabled, and each contained hundreds of thousands, if not millions, of objects. AWS does not let you delete a non-empty bucket in one go, and definitely not a bucket with versioning on: you have to remove all of the objects first (docs here).
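As a rough illustration of letting S3 do the emptying for you, here is a minimal sketch, assuming the "rule" mentioned above is an S3 lifecycle rule. The function names, rule IDs and the one-day default are made up for this example and are not taken from the original post. The first rule expires current and noncurrent versions; the second removes leftover delete markers and aborts incomplete multipart uploads.

```python
import json


def build_empty_bucket_lifecycle(days: int = 1) -> dict:
    """Build a lifecycle configuration intended to drain a versioned bucket.

    Rule IDs and the default of one day are illustrative assumptions.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                # Expire current versions and permanently delete noncurrent ones.
                "ID": "expire-all-versions",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = every object
                "Expiration": {"Days": days},
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            },
            {
                # Clean up delete markers with no versions left behind them,
                # plus any unfinished multipart uploads.
                "ID": "cleanup-markers-and-uploads",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"ExpiredObjectDeleteMarker": True},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            },
        ]
    }


def apply_lifecycle(bucket_name: str, days: int = 1) -> None:
    """Attach the configuration to a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=build_empty_bucket_lifecycle(days),
    )


if __name__ == "__main__":
    # Print the configuration for review; nothing is sent to AWS here.
    print(json.dumps(build_empty_bucket_lifecycle(), indent=2))
```

Lifecycle expiration runs asynchronously on the S3 side, so with millions of objects it can take days before the bucket is actually empty and can be deleted.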

The fact that…

GCP Cloud Composer is a wonderful product aimed at delivering a managed, easy-to-use Apache Airflow deployment. It comes as a single resource (or so we think), runs on GKE (Google Kubernetes Engine), and allows data engineers to run their pipelines without knowing much about infrastructure (or so we think).

We use Cloud Composer a lot. We see graphs like this, a lot:

Figure: scheduler health and CPU usage monitoring graphs of a moderately used GCP Cloud Composer environment

Sometimes, when redeploying or deleting a Composer environment from Terraform or the UI, you will run into errors that cannot be recovered from.

Cloud Composer operations — be it creation, modification or deletion — can take anywhere between 10…

“It WORKS!!”

You know that sweet, sparkling feeling when something you’ve been debugging for days finally works?

As a platform engineer who accidentally became the resident GCP expert, I’ve been working closely with our data engineers throughout the last year. I help with their architecture, define — or help them define — infrastructure-as-code, remind them to not test in production and to consider security. Speaking of which…

The more we go into cloud technologies, embrace microservices and venture into the land of serverless, the more data is floating around, ready to be used - or breached. For those of us…

