Startups can be unpredictable; traffic spikes and drops happen at a moment's notice without rhyme or reason. Because it scales so easily, cloud infrastructure has become the standard answer for most startups, which of course means the need for DevOps engineers (infrastructure/platform/new buzzword) at startups has increased too.
Many DevOps engineers come from big companies where they focused on a few pieces of very large-scale infrastructure. In my experience, working at a startup as a DevOps engineer is very different: multiple projects are spread across a few people. That variety is why I, and many others with a similar background, chose to work for a startup.
Why work at a startup?
For starters, why should you even make the jump? Startups don’t have nearly the same scale of infrastructure as a Fortune 500 company so what is there to learn? The answer to this is personal and everyone has their own reasons for wanting to make the jump. Here are some of mine:
- Autonomy - no red tape, no limits, no micromanaging. You are here because the people around you trust you and your opinion is valued. You decide what needs to be done, how it’s going to be done, and how important it is. This was my biggest reason for making the jump.
- Flexibility - hours and deadlines are what your team decides to make them, and depending on the culture, that can vary a lot. Every company is different and finding a group of people that match your style of work makes doing your job significantly more enjoyable.
- Purpose - have you ever worked on a project for months, mulled over tiny bugs, spent hours upon hours writing tests, only to be told the project was scrapped and you need to move on to something else? I have. Multiple times. And that’s okay. It happens at all companies, but it certainly happens much less frequently at small ones. When you build something out, it’s almost certainly going into production and will be used by customers.
- Growth - lifecycles in startups are short. Very short. Since scale is small, projects are designed and rolled out in the blink of an eye. What I would do in 6 months at a large company would be done in 2 weeks at a startup. That is not an exaggeration.
Don’t think it’s all skittles and rainbows though. What makes working at a startup great also makes it very difficult:
- Autonomy means more responsibility and stress
- Flexibility means less routine
- Purpose means your mistakes can be catastrophic
- Growth means you are in a constant state of catching up
This post is not meant to convince you to make the jump, but only to help you understand what there is to gain in moving and how I handled making the switch. Now that all of that has been covered, I want to share some principles (in somewhat of a priority order) that helped me through growing pains in my first year as a DevOps engineer at a startup!
Phase 1: Understand the system
If you take anything away from this post, take this section. Spend as long as you need to understand the application and the underlying infrastructure you’re inheriting. This will make everything in the future significantly easier. When I first joined Anvil I wanted to start building things fast. After a few months I had to circle back and clean up a lot of my original work because it didn’t fit the direction we needed to move in. Don’t be hasty!
Diagram and document
Everyone learns differently, but infrastructure systems tend to be understood best visually. If there aren’t architecture design diagrams, then create them! Work with engineers to understand how all the pieces of your system work together. Create a system diagram to connect all your cloud resources. Then another for the network with all the protocols being used. And then another for all third party services and where they are implemented. As you do all this you will not only gain a deep understanding of your system, but you will also start noticing potential pitfalls. Is all traffic encrypted? Are there multiple points of entry for services? Is there a lot of drift between multiple environments?
Furthermore, these diagrams are a great contribution to the organization as a whole, not just for you. Engineers, managers, and even clients will likely use these diagrams to understand how data is transferred and to maintain security compliance.
Check out draw.io as a free option to get started with.
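If you want a quick way to answer the drift question while you diagram, here is a minimal sketch in Python. It assumes you can export each environment’s settings into a flat dictionary; the `find_drift` helper and the example settings are hypothetical and only there for illustration, not something from a specific tool.

```python
def find_drift(staging: dict, production: dict) -> dict:
    """Return settings that differ or exist in only one environment."""
    drift = {}
    for key in sorted(staging.keys() | production.keys()):
        left = staging.get(key, "<missing>")
        right = production.get(key, "<missing>")
        if left != right:
            # Record both values so the difference is easy to read.
            drift[key] = (left, right)
    return drift


# Hypothetical settings, made up for this example.
staging = {"db_engine": "postgres14", "tls": True, "replicas": 1}
production = {"db_engine": "postgres13", "tls": True, "replicas": 3, "cdn": True}

for setting, (stg, prod) in find_drift(staging, production).items():
    print(f"{setting}: staging={stg} production={prod}")
```

Anything this prints is a candidate for a note on your diagram: either the difference is intentional and worth documenting, or it is drift worth fixing.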
Respect what came before you
As you create an agenda for work that needs to be done, be respectful of what already exists. Sure, your last organization had a much cooler way of handling secrets, but why was it implemented like this here? Compliance requirements, protocols, and lack of resources or technology could have left no other option. On the other hand, solutions are often implemented quickly and then the team moves on. Before going down the path of removing or upgrading something, do your due diligence and understand the context. On multiple occasions I implemented a new workflow only for it to hit an edge case months later. Then I would have to roll back, only to realize the original workflow had already taken that edge case into account.
Build a sandbox environment
Developers get development environments to test locally before ever pushing, so why shouldn’t you? Breaking a staging environment will stop all developers from being able to test until you have fixed it. Do everyone a favor and build a test application that is a simplified version of your main application and replicate development workflows. You will understand pain points for the developers and have a safe space to test infrastructure changes without ever slowing down your development team.
Have a plan
Software is complicated, but now you should have somewhat of an understanding of the architecture and development lifecycles of your organization. Take note of pain points and areas of improvement and notice patterns. Is there a core service that’s causing a lot of issues? Could there be one solution that addresses many pain points? Identify what’s high priority and what addresses the most concerns. Set yourself up for success with goals and a vision for how you want your organization to look in the future!
Creating a 30, 60, 90 plan is a great way to get started with this.
Phase 2: Minimize complexity
As you follow through on your plan and build out new infrastructure and systems, make an effort to keep it simple. Your team is likely small, or it may just be you, so being very picky about what you incorporate into your system is a necessity. The fewer moving parts, the easier problems are to diagnose and resolve.
Plan for the near future
Piggybacking on having a plan, don’t plan too far ahead. Things change, and at startups they change fast. Don’t bog yourself down by planning for when you have 100x the user base you have now. Plan ahead to support 2x or 3x your current usage. There will likely be many more factors to consider when you surpass that threshold. Furthermore, tooling and systems built for much more complex environments usually come with several features that will go unused, which again adds to complexity.
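To make the 2x or 3x idea concrete, here is a rough back-of-the-envelope sketch in Python. The traffic numbers and the `instances_needed` helper are assumptions I made up for illustration; swap in your own peak traffic and whatever a single instance comfortably handles in your system.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float, growth: float = 1.0) -> int:
    """Instances required to serve peak traffic multiplied by a growth factor."""
    return math.ceil(peak_rps * growth / rps_per_instance)


# Hypothetical numbers for illustration only.
PEAK_RPS = 400          # busiest observed requests per second
RPS_PER_INSTANCE = 150  # what one instance handles comfortably

for growth in (1, 2, 3, 100):
    count = instances_needed(PEAK_RPS, RPS_PER_INSTANCE, growth)
    print(f"{growth}x traffic -> {count} instances")
```

With these made-up numbers, 3x growth means going from 3 instances to 8, which a small team can reason about. The 100x case lands at 267 instances, and at that size you would likely be rethinking the architecture anyway, so designing for it today mostly adds complexity.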
New tools can reduce complexity
I know this whole section is about reducing complexity, but often there is legacy code that bridged a gap before there was an accepted solution in place. Adding new technology to replace it can simplify your system a lot. Be careful with brand new tools, though: when you’re the subject matter expert and you don’t have the answer to something, you have to hope someone else has run into your issue and posted about it online. New tools don’t have the biggest communities, and you likely won’t find the answer you’re looking for if it’s a unique issue. Stick with widely adopted tools that have stable communities. They may not be the most fun, but they are the most stable. You can always update them later.
Flex your muscles
Everyone has their own expertise. Use what you already know to make an impact early. This could be CI/CD, containers, security, database optimization, really anything that you are confident in. This could also be what you’ve worked on most recently since it is fresh in your mind. You will often be working on something you have never touched before, so use what you know while you can.
Trust your gut
There’s a sixth sense you develop after several years of experience. If something feels overly complex, fickle, or just down right wrong, look into it. A lot of solutions are put in place with strict time constraints just to get something working and then never looked at again. Maybe that clunky script is made obsolete by some new feature in a tool that didn’t exist before. You won’t know unless you dig into it.
Phase 3: Zoom out
Making an impact, designing and building new systems, and implementing new technologies is tough, but really fun and rewarding. As you get the hang of things you’ll be itching to keep growing, and that’s what working at a startup is all about! However, don’t get too lost in the sauce. You were brought in to maintain stability, improve resilience, and ease development lifecycles. That is always the direction you should be heading.
Be comfortable with the unknown
Scope is incredibly broad at startups. You now own everything that isn’t the application code, and you still kind of own that too. The most important skill to have now is self-education. Dig deep into your system and discover what you don’t know well. Research it, identify if what you have is up to snuff, and implement any changes if needed. There are so many ways to make an impact, so make an impact where it’s needed, even if you need to spend most of the time learning about it.
Stay on top of new technology
New technologies spring up all the time. No one is going to tell you that it’s time to move away from the tool you used for the last 10 years because something better is out. Use podcasts, online articles, YouTube, Reddit, and your own network to stay on top of the new standards. Your job is to understand these tools and bring them in if they solve problems. There are tons of great YouTubers, podcasts, and online forums for this! Some that I recommend:
Don’t be afraid to redo your work
As the aforementioned new technologies come up, there will inevitably be services that make your custom tooling obsolete. That’s okay! Don’t get attached to your work. Your goal is to help the organization, and a tool built by a team instead of one person is always going to have an advantage. Your life will be made easier in the end by not having to support custom code. Not to mention, as your company grows so will the demand on your tooling. Some of the systems you put in place may not have been meant to scale that far.
What this looks like in practice
In your first week at a startup you will realize that nothing happens in a vacuum. You will have multiple streams of work with varying importance all the time. These phases should be taken with a grain of salt; your priority will always be what the team needs first. Every organization is different and what they need from their DevOps engineer will vary greatly based on the current state of the business. Just keep these phases in mind as you start contributing.
The first phase, understanding the system, is probably a couple weeks up to a month, and usually never really stops since there’s always something undocumented to discover. Make diagrams, understand limitations that dictate systems, create an isolated environment just for infrastructure tests, and then make a plan to address shortcomings you have found.
Just like repainting a car, your second phase starts with stripping things down: remove excess complexity and streamline your system. Begin acting on your plan with solutions for the near future, or where your organization plans to be in one to two years. This will likely mean replacing custom tooling with popular third party tools and services, which is expected. Also take this time to strut your stuff and make an impact in the area of your system that you know best. The length of this phase is hard to pin down and could be two to six months. It’s possible, based on the infrastructure you inherited, that this takes up to a year.
Now that everything makes sense, is documented well, and has been well established in your organization, you move into Phase 3 and zoom out. Your goal is to make your infrastructure world class from now on. This means a lot of research, all the time. Go to conferences, listen to podcasts, be active in forums, and stay involved in the DevOps community to learn about successful companies and how they handle infrastructure. Go back and update systems and tools you put in place if needed. Staying on top of technology is a never-ending process.
Have a different experience working at startups as a DevOps engineer? We would love to hear about it! Let us know about your experience by emailing us at firstname.lastname@example.org!