Posts

Terraform vs. Manual Scripting - provisioning and maintaining modern platforms - Snowflake

Terraform vs. Manual Scripting: In today’s cloud-centric data world, managing complexity is less about raw technical ability and more about choosing the right tools for repeatable, scalable, and auditable processes. When it comes to provisioning and maintaining modern platforms like Snowflake, two foundational approaches stand out: Infrastructure as Code (IaC) with Terraform, and manual scripting (SQL scripts or CLI commands) maintained in version control. While both can be committed to Git and automated, they represent fundamentally different paradigms for how you model, govern, and evolve your data infrastructure. Understanding their differences isn’t just an academic exercise; it’s strategic. It shapes your ability to support agility, compliance, and operational excellence as your data estate grows in scale and sophistication. This post explores the key differences between Terraform and manual scripts for managing Snowflake or any cloud data platform, highlighting where T...

Infrastructure as Code for the Modern Data Cloud

Terraforming Snowflake: Infrastructure as Code for the Modern Data Cloud. Modern data platforms like Snowflake have fundamentally changed how organizations think about analytics, scaling, and governance in the cloud. Yet as cloud-native data estates grow more sprawling and dynamic, the risk of configuration chaos and governance drift grows with them. That’s where “Terraforming Snowflake” comes in: the use of Infrastructure as Code (IaC) with tools like Terraform to bring order, discipline, and automation to cloud data infrastructure. This post explores why this paradigm is becoming crucial, how it works, the challenges it addresses, and why it forms the backbone of tomorrow’s data-driven organizations. Introduction: What Does “Terraforming Snowflake” Really Mean? Picture a world where every warehouse, database, schema, role, and privilege in your Snowflake environment is defined in version-controlled code, not managed by frantic mouse clicks or ...

How to Organize and Divide an ML Project Team

A Three-Sub-Team Approach. Introduction: Machine learning (ML) projects are becoming increasingly common and important across domains and industries. However, ML projects are also complex and challenging, requiring a diverse set of skills and expertise, as well as a collaborative and efficient workflow. How can we organize and divide an ML project team to ensure smooth and successful project delivery? In this article, we will explore one possible way of structuring an ML project team into three sub-teams: data, modeling, and deployment. We will also discuss the roles and responsibilities of each sub-team, as well as the communication and coordination among them. Finally, we will provide some general principles and best practices for managing an ML project team. The Data Sub-Team: Data Collection, Preprocessing, and Transformation. The data sub-team is responsible for the first, crucial step of any ML project: data collection, preprocessing,...