Canaries in the coal mine: Why your IT department may be in worse shape than you think

by Peter Kretzman

Think about it: you can’t really tell the difference, on a day-to-day basis, between a car that has had its oil changed every 3,000 miles and one that has had its oil changed every year or two.  Only eventually.

Similarly, the stability of most IT departments proves very difficult to discern from outside.  Even insiders within IT can have myopia.  And non-technical senior management (CEO, COO, CFO)?  They usually can’t really tell either; they often don’t even know the right questions to ask, and their gut instincts on IT matters can actually run dizzyingly counter to best practices.  In short: to many or most people, it can look like things in IT are going pretty well, but in fact it’s all getting shakier and riskier every day.  Truth is, if a company is passionate about excellence, IT has to function well both on the surface and to the careful trained observer. IT is a service organization, and getting a few key things wrong means that the entire company suffers as a result.  Eventually.

My claim here sounds like an admittedly rather pessimistic one: that your IT department may be in much worse shape than appears to the eye. Yet industry statistics indicate that’s probably the case. Having worked for and/or consulted for a lot of companies in the past decade, I’ve walked into a lot of “opportunities”, places where there was a lot of unchanged oil, so to speak. In fact, I’d be willing to bet that most companies have at least one, if not several, of the situations I’m going to describe in this post.

On the optimistic side, though, there are identifiable common root causes, all of which can be addressed, over time, by the appropriate focus and leadership.  As people always say, the first (and often hardest) step is simply recognizing that there is a problem. Let’s dive into the specifics, at a high level.

Here’s a reverse top-10, David Letterman-style, loosely ranked list of IT “anti-patterns”. I’ve actually seen companies where all of these situations existed. How many hold true where you work?  These gaps represent failures at meeting important best practices; like canaries in the coal mine, you should consider them to be potent indicators of looming instability in one or many of the dimensions where IT needs to serve the company.  Each of these deserves a separate post, or more, to treat fully; in some cases, I’ve already written posts on the item, so for those I provide a link below.

  • 10. There’s no published record of system uptime and failures. IT departments that don’t monitor, measure, and publish their operational success rate probably aren’t doing all that well, and although people may suspect that’s the case, they have no fact-based way of truly knowing.  And what you don’t know will hurt you here.
  • 9. Development doesn’t separate maintenance work from new development. Combining maintenance and new development in the same set of developers is a pretty guaranteed way of neglecting one or the other, often on an alternating basis.
  • 8. Developers are performing production-level operational tasks on a regular basis. If you want to deliver new work consistently, you can’t afford to have your developers worrying about production tasks and issues. As with #9, one or the other, or both, will be neglected. Worst example of this? Having your developers push code live into production.
  • 7. There’s no crisp handoff of new production code to the operations group, and no formal Operations Acceptance Test (OAT). Putting code into production needs to be a big deal, with metrics established, gates in place, and confirmation required prior to doing the deed itself.  As I’m fond of saying, the most destabilizing thing you can do to a software/hardware system is to change it.  You need a well-tuned system of checks and balances to avoid problems and instability.
  • 6. Capital expenditures aren’t well planned or tracked, and often there’s a “last minute purchase order” frenzy. This is a CIO-level gap, often, compounded with CFO tolerance or lack of awareness. There should never be a situation where anyone needs to run to the CFO or CEO waving a purchase order, usually with no documentation or justification attached, insisting that it must be signed before close of business or systems will fail.  Nearly all expenditures, with few exceptions, should be in the context of a pre-established, signed-off plan.  Discipline (and accountability) in how IT spends money is a key hallmark of excellence.
  • 5. The IT desktop/laptop inventory is murky, with no clear and published unit replacement plan.  What you don’t monitor, you can’t measure; what you don’t measure, you can’t control. Lack of discipline here not only costs money, it’s indicative of a general lack of due diligence.
  • 4. Development and test environments are all different, with no clear change control for new builds or tweaks. Without adherence to an airtight process for installing new code, software environments inevitably diverge, and at that point, chasing bugs can be quite frustrating.  A dogged drive to keep environments consistent with one another is a key benchmark of world-class development shops.
  • 3. There’s no published list of projects underway, along with commitment dates: a public report card. If IT isn’t formally and publicly accountable for delivery, that’s the first step down the road to chronic slippages in schedule.
  • 2. Projects are begun one-by-one. Estimates, if done at all, are unrelated to actual resource allocation. Without a project portfolio approach, projects fall prey to the “oh, that’s gonna take us three months” throw-off estimate which gets cemented into a commitment.  Without doing the math (figuring out competing projects and actual resource availability), you’re stacking up the dependencies and the likelihood of frequent failure to deliver.
  • 1. The business doesn’t sit down regularly with IT to do a formal prioritization of major projects. Most critical of all: what gets worked on needs to be constantly examined and revisited, with the priorities set, the resource commitment understood, and the trade-offs recognized by business stakeholders.
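
To make item 10 a bit more concrete: publishing an uptime record starts with nothing fancier than an outage log and some arithmetic. Here is a minimal sketch in Python, assuming outages are recorded as non-overlapping start/end times. The function name `uptime_percent` and the sample outage data are hypothetical, invented purely for illustration.

```python
from datetime import datetime

def uptime_percent(period_start, period_end, outages):
    """Return availability as a percentage of the reporting period.

    outages: list of (start, end) datetime pairs, assumed non-overlapping.
    """
    total = (period_end - period_start).total_seconds()
    if total <= 0:
        raise ValueError("period_end must be after period_start")
    down = 0.0
    for start, end in outages:
        # Clip each outage to the reporting window before counting it.
        s = max(start, period_start)
        e = min(end, period_end)
        if e > s:
            down += (e - s).total_seconds()
    return 100.0 * (total - down) / total

# Hypothetical one-month reporting window with two logged outages.
start = datetime(2011, 11, 1)
end = datetime(2011, 12, 1)
outages = [
    (datetime(2011, 11, 3, 2, 0), datetime(2011, 11, 3, 3, 30)),      # 90 minutes
    (datetime(2011, 11, 17, 14, 0), datetime(2011, 11, 17, 14, 45)),  # 45 minutes
]
print(f"Uptime for the month: {uptime_percent(start, end, outages):.2f}%")
```

The point is less the code than the habit: once the number is computed the same way every month and posted where everyone can see it, suspicion gets replaced by facts.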

You’ll notice that the most important (last) three items all deal with project management. That’s because strong project management, handled on a portfolio basis, is the most critical success factor facing an IT department, and the most common place where IT organizations are pressured to cut corners and “just get it done.”  Again, ensuring that you have that kind of project management will require focus, as well as active, informed leadership inside and outside IT.
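
“Doing the math” on a portfolio can be sketched in a few lines. The following Python is only an illustration under stated assumptions: the `Project` class, the `plan_portfolio` function, the project names, the team size, and the 70% availability figure are all hypothetical, and the greedy rule (commit in business-priority order, skipping anything that no longer fits) is one simple approach, not the only one.

```python
from dataclasses import dataclass

@dataclass
class Project:
    name: str
    priority: int        # 1 = highest, as set jointly by business and IT
    effort_weeks: float  # estimated effort in person-weeks

def plan_portfolio(projects, capacity_weeks):
    """Commit projects in priority order while capacity remains.

    Returns (committed names, deferred names, unused capacity).
    """
    committed, deferred = [], []
    remaining = capacity_weeks
    for p in sorted(projects, key=lambda p: p.priority):
        if p.effort_weeks <= remaining:
            committed.append(p.name)
            remaining -= p.effort_weeks
        else:
            # Does not fit in what is left: an explicit trade-off to discuss.
            deferred.append(p.name)
    return committed, deferred, remaining

# Hypothetical quarter: 4 developers, 13 weeks, 70% of time free for projects.
capacity = 4 * 13 * 0.7
projects = [
    Project("Billing rewrite", 1, 20),
    Project("Partner portal", 2, 12),
    Project("Reporting upgrade", 3, 10),
    Project("Login cleanup", 4, 4),
]
committed, deferred, remaining = plan_portfolio(projects, capacity)
print("Committed:", committed)
print("Deferred:", deferred)
```

With these made-up numbers, the third-priority project does not fit in the quarter while a smaller, lower-priority one does. That is exactly the kind of trade-off that should be surfaced to business stakeholders rather than buried in a throw-off estimate.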

This has been a scary post, perhaps.  You know what’s even scarier? This list could actually have been much longer than 10 items — these are just the most potentially damaging of the situations I’ve witnessed, but they are by no means the only canaries to look for.


jerome baxter November 9, 2011 at 1:32 pm

Excellent post Peter.
I’d be interested in hearing of the next 10 or 20 key canary warning signs – as you say at the end “This list could actually have been much longer than 10 items”. Would you do a follow-up article?

Rgds – Jerome

Peter Kretzman November 10, 2011 at 6:22 am

Thanks, Jerome. Yes, it’s on my (extensive) list of future blog posts. I actually have more such “canary in the coal mine” warning signs occur to me every month or so. I’ll try to get to a post on these.
