Do You Really Need an Ops Team?
Startups and DevOps
The following is an excerpt from a conversation I had w/ a developer who is part of a small startup. Most of the conversation was related to DevOps principles and toolchains, but after he told me about their staff of five I became worried that this was really too much. He was already worrying about CI/CD, Configuration Management, and Automation w/o really having the time.
The question from Reddit:
"I am trying to setup, a deployment workflow for our nodejs apps. Its a small startup as of now, but we have big plans and a lot of services planned for execution. I am planning to setup the infrastructure in a way that it enables us to develop, deploy, test and measure as fast as possible.
What according to you should be strategy to start small and keep building from there? I am already aware of docker etc.. We are using AWS as our backbone to deploy apps. Should we leverage things like opswork or have our own chef server, puppet server? What in the long run would you suggest has helped you grow and manage IT infra better?
I was thinking of creating our CI stack using chef, jenkins how will move from here to CD? I would really appreciate your insights here."
My Response:
There are two schools of thought for your particular scenario: use what the cloud gives you, or roll your own. But first you have to decide, "Do I really need an ops team?"
What I mean by that is that maintaining infrastructure is hard and usually requires more than just one or two people. It's usually hard to justify having an Ops team when you are in startup mode. In most cases the company would be better served by hiring another developer and forgoing the complicated infrastructure decisions that come with rolling your own solution. Additionally, I am opposed to doing new things just for the sake of doing them. You should always be trying to solve a problem with whatever new technology or methodology you are going to use.
When in Start Up Mode
Consider hosted solutions first. You've already decided on using AWS, so it would be worth evaluating Elastic Beanstalk, CloudFormation, and OpsWorks. OpsWorks is built on Chef, so the structure should look familiar to anyone who has touched Chef. I personally would rather have my own Chef server, but only because I would want to be able to support non-EC2 instances with it. CloudFormation lets you describe a collection of AWS resources that can be spun up when your developers/customers need them. Elastic Beanstalk further simplifies things by taking care of everything for you. You just upload your code and it deploys. When it comes to CI, there are many excellent hosted solutions that offer great integration with AWS. See Travis CI, CircleCI, Drone, or Codeship.
If using AWS for everything isn't your bag, OR if there are technical challenges or security concerns that you are either currently facing or know about that preclude you from using AWS, then it's time to look at rolling your own solution. Here again I would always look at hosted solutions first. Chef offers a hosted solution that has served my personal systems well for a year now. Travis, Drone, etc. are all great hosted platforms. Past that you are looking at your own Chef/Puppet server. Both are great and I have clients that use both. Try each one out and go with the one that makes the most sense to you, or whoever is going to be managing it. When it comes to CI servers, Jenkins seems to be king, although Travis CI is right up there. Again, this is largely a preference thing.
The important thing to remember here is that if you are going to roll your own infrastructure you are going to have to maintain it yourself. That means patching, upgrades, monitoring, etc. As a 5 person shop, do you really have the bandwidth to dedicate at least two resources to that?
The CI Pipeline and the Road to CD
IMO a CI pipeline has three big parts: local/cloud dev, automated testing, and automated deployment. First, your developers should have a common platform to develop on. I solve this with Docker containers and/or Vagrant and VirtualBox. It's imperative to avoid "it works on my box," as there is no greater time waster than trying to figure out why something works in some special snowflake environment and not in production. Another way to solve this, and one I'm becoming fond of, is to create a cloud-based dev platform that all your developers use instead of their own boxes.
The next part, testing, is a hybrid. Automated testing should happen both at the local level and again on the CI server. I use Test Kitchen to stand up multi-node tests locally before I ever commit my code to the main repo. I believe Node has Tuts and Jasmine, which let you write tests, so you can use them to help with TDD. You may have heard people talking about "test driven development." While this is great in theory, it's really hard to do in practice. Often there is a lot of pressure to get code out the door, but if you can write tests that your developers can run locally, you will be able to catch simple things like syntax errors, misspellings, and regressions before they become a problem and cost you time. These tests need to be expanded upon when they get to CI land. Your CI server not only needs to validate syntax and compile, but also needs to actually deploy the code to a node and run acceptance tests. Only after all the tests pass do you actually deploy the code to an env that real people are going to see.
Writing quality tests will allow you to move to the third part of CI, which is automated deployment. Until you are sufficiently comfortable that your tests are catching all of the silly stuff, it is hard to let your CI server pull the trigger on deployments.
Eventually you will feel confident and code will move from local development to deployment in an extremely quick fashion.
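To make the "only after all the tests pass" rule concrete, here is a minimal sketch in Node. It is not how any particular CI product works; the stage names and the `runPipeline` helper are hypothetical, and each stage is a stand-in for a real linter, test runner, or deploy script.

```javascript
// Hypothetical pipeline sketch: every stage name and check below is a
// placeholder, not the API of any real CI product.

// Run stages in order and stop at the first failure, so that a later
// stage (such as deploy) never runs after an earlier one has failed.
function runPipeline(stages) {
  const completed = [];
  for (const stage of stages) {
    let ok;
    try {
      ok = stage.run() === true;
    } catch (err) {
      ok = false; // a thrown error counts as a failed stage
    }
    if (!ok) {
      return { passed: false, failedAt: stage.name, completed };
    }
    completed.push(stage.name);
  }
  return { passed: true, failedAt: null, completed };
}

// Stand-ins for real work. In practice each would shell out to a linter,
// a test runner, or a deploy script.
const stages = [
  { name: "lint", run: () => true },
  { name: "unit-tests", run: () => true },
  { name: "acceptance-tests", run: () => true },
  { name: "deploy", run: () => true },
];

const result = runPipeline(stages);
console.log(result.passed ? "deployed" : `stopped at ${result.failedAt}`);
```

The point of putting deploy last in the same list is that it is just another stage: it cannot run unless everything in front of it returned true.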
Continuous Delivery, despite what its name suggests, isn't about deploying code in an automated fashion. CD is a development style that emphasizes making small changes in a rapid development cycle. Each of these changes should deliver some new feature, and you should be soliciting feedback from your clients. CD NEEDS CI in order to work at its optimum level, but CI does NOT require CD. Once all of your CI pipeline is in place, you can automate deployments to production as part of adopting CD. CD should have a solid feedback loop, and no individual feature should be able to break everything. CD assumes that you should be able to deploy any commit at any time, so things like on/off switches (feature toggles) are super important.
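An on/off switch can be as small as the sketch below. Treat it as one possible shape rather than a recommendation: the flag names, the `isEnabled` helper, and the `FEATURE_<NAME>` environment-variable convention are all assumptions I made up for the example.

```javascript
// Hypothetical feature-toggle sketch. The flag names and the
// FEATURE_<NAME> environment-variable convention are assumptions made
// for this example, not part of any library.

const defaults = {
  newCheckout: false, // merged and deployed, but switched off
  searchSuggestions: true,
};

// Read a flag, letting an environment variable such as
// FEATURE_NEWCHECKOUT=on override the default without a code change.
function isEnabled(flag, env = process.env) {
  const override = env[`FEATURE_${flag.toUpperCase()}`];
  if (override === "on") return true;
  if (override === "off") return false;
  return defaults[flag] === true; // unknown flags are treated as off
}

// Both code paths ship in every commit; the flag decides which one runs.
function checkout(cart, env = process.env) {
  if (isEnabled("newCheckout", env)) {
    return `new checkout flow for ${cart.length} items`;
  }
  return `old checkout flow for ${cart.length} items`;
}

console.log(checkout(["book", "pen"]));
```

Because the unfinished feature is off by default, any commit that contains it is still safe to deploy, and flipping it back off is the quick way out if it misbehaves.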
Enable Your Developers
My final parting thought, and I know this has been long-winded so thanks for sticking with me, is that all of this requires two things: buy-in from everyone AND enabling constraints. The first is vital to any of this working, because if your devs don't want to do it then it will just be a fight. Include them in the decisions, or at least be ready to explain to them WHY you chose a particular thing. The second is vital to you not being called on the weekends. You have to place constraints on the system so that people aren't breaking it all the time, but these have to enable your devs to do their jobs. I like to give my Ruby developers a Rakefile that allows them to type things like "rake test," "rake commit," and "rake deploy" instead of having to memorize a new toolchain. Make it easy for them to use your new pipeline and they will thank you.
Feel free to reach out to me if you want to clarify anything or poke holes in any of this. One of the most important things in DevOps is being willing to take new ideas and facts and reform your opinions.
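For a Node shop, the same single-entry-point idea could look something like this. It is only a sketch of the Rakefile approach translated to Node, not a drop-in tool: the file name `tasks.js`, the task names, the npm scripts, and the `./scripts/deploy.sh` path are all hypothetical.

```javascript
// tasks.js: a hypothetical single entry point for the pipeline.
// Every task name, npm script, and path below is a placeholder.
const { execSync } = require("child_process");

const tasks = {
  test: ["npm run lint", "npm test"],
  deploy: ["npm test", "./scripts/deploy.sh"],
};

// Look up the shell commands behind a task name, or null if unknown.
function commandsFor(name) {
  return Object.prototype.hasOwnProperty.call(tasks, name) ? tasks[name] : null;
}

// Run each command in order. execSync throws on a non-zero exit code,
// so a failing test run stops the task before the deploy step.
function runTask(name, exec = execSync) {
  const commands = commandsFor(name);
  if (commands === null) {
    throw new Error(`Unknown task "${name}". Try: ${Object.keys(tasks).join(", ")}`);
  }
  for (const command of commands) {
    exec(command, { stdio: "inherit" });
  }
  return commands.length;
}

// Usage: node tasks.js test
const requested = process.argv[2];
if (commandsFor(requested) !== null) {
  runTask(requested);
} else {
  console.log(`Available tasks: ${Object.keys(tasks).join(", ")}`);
}
```

The constraint here is that deploy always runs the tests first; the enabling part is that nobody has to remember how.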