!!! ~DevOps Blocker

I've been spending a lot of time lately thinking about this whole [DevOps|http://www.wikimedia.org/wikipedia/en/wiki/DevOps] thing. Briefly, ~DevOps is the pursuit of merging the disparate cultures of Development and Operations groups in the computer software (particularly web-based) world. ~DevOps is the __big thing__ in computer system administration these days.

For many good reasons, Development and Operations groups have historically operated relatively independently. The biggest reason for this is that Operations is concerned primarily with stability and Development with creating new features. This leads to a natural conflict, particularly since Operations gets blamed for outages, and outages are associated with new features. The result of this is the __throw it over the wall__ mentality - Development creates a new thing, and tosses it to Operations to deploy and manage. Operations resists because of perceived threats to stability and the risk of the dreaded *site downtime*. Tempers flare and resentment grows.

~DevOps recognizes that this is not an optimum way to get things done and tries to tear down the wall between Ops and Dev. I think that's a noble and necessary goal.

The ~DevOps movement also acknowledges that many of the techniques and tools of the software development world can be applied in Operations. I'm particularly interested in the idea of [sysadmins using development tools like formal code review|http://sysadvent.blogspot.com/2010/12/day-5-why-arent-you-doing-code-reviews.html]. Software developers have created many excellent tools and methodologies; why shouldn't we all use them? This leads directly to the idea of defining our Operations environment by applying programmatic rules or, as it's more generally known today, [configuration management|http://ww.wikimedia.org/wikipedia/en/wiki/Configuration_management].
In the unix/open source world I inhabit, this revolves around three tools: [Cfengine|http://www.cfengine.org/], [Puppet|http://www.puppetlabs.com/] and [Chef|http://wiki.opscode.com/display/chef/Home]. I think this is the purest expression of the idea of leveraging Development in Operations - define your environment with formal rules, and apply that configuration consistently. This is analogous to defining a specification and writing code to turn that specification into reality.

Thus, I find myself in agreement with many of the ideas and philosophies of ~DevOps. To be clear, I do not agree with all of ~DevOps. In particular, it smells too much like a 'movement', with all the negative connotations of intolerance and inflexibility that can lead to. However, I want to set that aside for the purpose of this post. Whether or not I agree with the label __~DevOps__, I agree with the basic ideas (to be fair though, I do have a Bachelor's in Computer Science, so I'm probably biased towards the development angle).

Here's the question that I'm stuck on though: how do you implement ~DevOps in a large, established organization? Is it even possible?

Some background: I currently work in one of those large, established organizations. My company owns some of the most prominent sites on the web. Just about everyone on the internet interacts with us in one way or another every day. My department is the Operations team responsible for one of the biggest internet properties at the company, and it's a property that has been around for most of the life of the world wide web. This means we've got a lot of people working on a lot of mature systems.

The passage of time brings both maturity and complacency. Our team does a great job of managing the tools and procedures we currently have. In particular, we keep servers all over the world running with amazingly minimal downtime. We know how to analyze our failures and remediate them. We track metrics; we strive to reduce outages.
*Site Up* is our mantra above all else.

The downside of all this is that gigantic systems take an extremely long time to change. Several years ago I was part of a team that worked on a major software conversion on 10,000 servers. That project consumed the lives of 6 people for a year. We got it done and I'm proud of the work we did, but a year is forever in internet time. That project dealt with just one part of our infrastructure. How then do you make all the day-to-day changes you need __and__ move forward on these long-term projects __and__ move towards something like a ~DevOps methodology?

Part of the issue here is that changing culture requires conscious effort from everyone in your organization. ~DevOps is a methodology and a movement. That means you have to win the hearts and minds of your peers. Large organizations actively resist change. This makes sense, since large organizations are safe and comfortable. Your job is generally well-defined - do what your boss tells you, keep in line with your peers, and you will do all right.

I am in no way belittling the people in large operations organizations - I'm one of them myself, and I'm proud of the work we do. I just want to acknowledge the fact that the people in the group are somewhat self-selected. If you want excitement and risk, you go work in a startup. If you need a regular salary (particularly if you have a family to support), you migrate towards the larger company that can provide these guarantees.

From what I can tell, ~DevOps comes from the other direction. You might call it __bottom up__ vs. __top down__. ~DevOps comes from the startup world (or at least that's where I hear the most about it these days). Startups are all about wearing a lot of hats. Most importantly, startups don't start with system administrators. They start with developers (or salespeople, but let's ignore that case). A few people come up with an idea and write some software.
That software has to run on servers, so they start buying equipment (or space in the cloud) and configuring it. The good startups begin with configuration management immediately. Since the developers are running the show, everything is naturally developer-driven. Sysadmins and Ops folks tend to come in later, after much of the initial design of the infrastructure has been laid down. Compare this with existing large companies - the software developers who started things are either long gone, or the company has evolved so much that virtually none of their original designs remain.

This conflict, then, is my ~DevOps blocker. The culture at small companies comes from the software developers. Big companies are already established and have procedures for dealing with their world. Those procedures can involve some ~DevOps ideas and tools (such as code review and agile programming) but in general the large company world is not a ~DevOps world.

Now maybe there are large companies out there which have fully integrated ~DevOps. I don't think there are, but maybe I'm wrong. What are those companies doing differently? I imagine it has to revolve around the people that work at those companies. All I can draw on is my experience, and I'm stuck - how do you put a large organization on the path to ~DevOps? Is that a worthwhile goal, or is ~DevOps only appropriate for small groups?

*Update, Next Day*: I was [reminded by @wastedcarbon|http://twitter.com/#!/wastedcarbon/status/31329984752656385] about the wonderful [DevOps presentation|http://www.youtube.com/watch?v=Fx8OBeNmaWw] given by Adam Jacob at Velocity 2010. Go watch that short talk for a great take on what ~DevOps means.

-----
CategoryGeekStuff CategoryDevops CategoryBlog