The Zeigarnik Effect & The DevOps Brain

Ari Yonaty

It’s 8:30 AM. Standup time. Coffee in hand. The baton has been passed to me, but I’m at a loss for words.

Yesterday was a complete war zone. I remember the high-stakes deployment, the issues with the database migration, the failing CI builds. I remember the constant Slack threads and huddles, the skipped lunch, and of course, finally resolving the issue. However, when my turn comes to give a status update, my mind goes blank. All I can recall is the IAM refactor I had to abandon midway because of the outage. I remember exactly where I stopped and which policies need to be deprecated. Strangely, I don’t feel like my day was productive, despite having put out multiple fires.

As a {DevOps/SRE/Platform} Engineer, the constant firefighting likely resonates with you; it’s a given in the role. You fix the immediate crisis, everything is green again, and then… where were you?! Oh yeah, the tech debt with the IAM refactor. Those who know me are familiar with my obsession with finding connections between non-technical fields and my daily working environment. So welcome to the bridge between psychology and sysops engineering; welcome to the Zeigarnik Effect.

Gestalt psychologist Kurt Lewin noticed that waiters had better recollection of unpaid orders than of paid ones. However, the second the “loop” was closed (the order had been paid), it was as if someone had rm -rf’d the memory from the waiter’s head. He relayed this observation to his graduate student Bluma Zeigarnik, who in the late 1920s conducted and published research describing the underlying phenomenon, now known as the Zeigarnik Effect. In short, our brains remember uncompleted or interrupted tasks significantly better than completed ones. Furthermore, each uncompleted task carries a cognitive overhead: it periodically demands our attention and drains mental resources until it is completed.

Since I enjoy tying outside fields into computing and technology, let’s give this a shot. Think of your brain as a server with limited RAM. Each time you start a task, a background process spins up. When you finish a task, the process receives a SIGTERM. The RAM is cleared. You feel great, but you also forget the details because the brain is no longer tracking them. This is a “closed loop”. But what happens when you get paged mid-task? The original process stays open, still consuming precious mental RAM. Worse, it keeps pinging your consciousness about its existence. This is an “open loop”.
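To make the analogy concrete, here is a toy sketch in Python. It is only an illustration of the metaphor, not a model of how memory actually works: the `Brain` class, the task names, and the RAM numbers are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Brain:
    """Toy model of the analogy: a fixed pool of 'mental RAM' shared by tasks."""
    ram_total: int = 100  # arbitrary units, purely illustrative
    open_loops: Dict[str, int] = field(default_factory=dict)

    def start(self, task: str, cost: int) -> None:
        # Starting a task spins up a "background process" that holds memory.
        self.open_loops[task] = cost

    def finish(self, task: str) -> None:
        # Finishing is the SIGTERM: the process exits and its memory is freed,
        # along with the details it was tracking.
        self.open_loops.pop(task, None)

    def ram_free(self) -> int:
        return self.ram_total - sum(self.open_loops.values())

    def nagging(self) -> List[str]:
        # Every open loop keeps pinging for attention until it is closed.
        return sorted(self.open_loops)


brain = Brain()
brain.start("iam-refactor", 30)       # interrupted mid-task by the page
brain.start("db-migration-fire", 50)  # the outage takes over
brain.finish("db-migration-fire")     # the fire is out: closed loop
print(brain.ram_free())  # 70
print(brain.nagging())   # ['iam-refactor']
```

The extinguished fire leaves no trace, while the abandoned refactor is still holding memory and asking for attention.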

In DevOps, Platform, and SRE roles, our entire job is basically interrupt-driven development. We live in a state of perpetual open loops. You start developing a new Helm chart (Open Loop 1). You then get pinged about an alert in a Kubernetes cluster (Open Loop 2). Then, a developer requests help with a failing CI pipeline (Open Loop 3). By the time standup comes around, your brain has “garbage-collected” all the fires that were extinguished because they are now closed loops. But those unfinished tasks are like a bright-red siren to your subconscious. We likely have to-do lists miles long. And, hot take here, I don’t believe tracking them in a Jira-like environment actually benefits the contributor in these cases. It may even make things worse, since maintaining a backlog requires upkeep of its own and carries its own cognitive burden.

As I reflect on this idea more deeply, it explains my deep passion for DevOps and Continuous Delivery culture. It’s about being truly agile and getting stuff out as soon as possible. As in The Phoenix Project, work that is stuck sitting in a queue, such as a PR waiting for code review, is frustrating and keeps the loop open. There’s a reason there’s such a dopamine rush when the PR is merged and closed. The task at hand is finally behind you.

While not everything is in our control, here’s a simple solution that works for me: the context dump. Before jumping into a fire, or after resolving an issue, write it down. Keeping a journal of what you accomplished each day not only answers the brain-freeze moment during standup (or during performance reviews), but also clears the mental cache and recovers valuable energy. The Zeigarnik Effect isn’t there to cause issues; it’s actually a pretty nifty cognitive mechanism for remembering important goals and ensuring we follow through. But in the fast-paced, incident-driven world of DevOps, it can become a serious mental drain. This ties in to another core practice the DevOps movement borrowed from Kanban/Lean methodologies: limit work in progress.
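A context dump needs no special tooling; a text file is enough. As a minimal sketch, assuming a plain-text journal with one timestamped line per entry, something like the following could work. The function names (`dump_context`, `entries_for`), the line format, and the sample date are my own choices for illustration, not an established tool.

```python
import tempfile
from datetime import date, datetime
from pathlib import Path
from typing import List, Optional


def dump_context(journal: Path, note: str, status: str = "open",
                 now: Optional[datetime] = None) -> str:
    """Append one timestamped line to a plain-text journal and return it."""
    now = now or datetime.now()
    line = f"{now:%Y-%m-%d %H:%M} [{status}] {note}"
    with journal.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line


def entries_for(journal: Path, day: date) -> List[str]:
    """Return every journal line written on the given day, e.g. for standup."""
    if not journal.exists():
        return []
    prefix = day.isoformat()
    lines = journal.read_text(encoding="utf-8").splitlines()
    return [entry for entry in lines if entry.startswith(prefix)]


# Demo in a throwaway directory, with a fixed (hypothetical) timestamp.
with tempfile.TemporaryDirectory() as tmp:
    journal = Path(tmp) / "journal.txt"
    when = datetime(2024, 1, 15, 14, 5)
    # Dump the interrupted task before jumping into the fire...
    dump_context(journal, "IAM refactor: stopped before deprecating old policies", now=when)
    # ...and record the fire once it is out.
    dump_context(journal, "DB migration outage resolved", status="done", now=when)
    standup = entries_for(journal, when.date())

for entry in standup:
    print(entry)
```

At standup, reading back the previous day’s entries covers both the closed loops your brain has already dropped and the open ones it keeps nagging you about.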

Now, if you’ll excuse me, I have a half-finished Terraform script from yesterday that’s been living rent-free in my head for the last 14 hours. Time to close that loop.
