4 Strategic Approaches to Retaining Security Operations Staff

Frank McClain

Search the internet on the subject of “InfoSec talent shortage” and you will get enough results to keep you busy for a long time. But if you’re in management or another leadership role, you don’t need the internet to prove there’s a problem. You feel the pain every time you search for a good candidate to fill an open position, or hear that an employee is moving on to “greener pastures.”

We all know there is a substantial cost associated with finding, hiring, and onboarding new people in security roles. Not only is it challenging to find the right people (thus taking time, effort, and ultimately money), but there are business costs associated with onboarding and getting new hires “up to speed” to work efficiently and effectively as part of your team. I have heard and read different numbers, but a rough guideline is 1.5 times the hiring salary. So in addition to the money you’re paying in salary, you have to nearly double that just to get the person in the door and working.

Think about that for a moment. If your new analyst/engineer/architect moves on in six months to a year (which seems to be the norm these days), where does your initial investment stand? In terms of a mortgage, you’re still way upside down; if it were an investment, you’d have lost money.
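To put rough numbers on that, here is a minimal sketch in Python. It assumes one reading of the guideline above: that the up-front cost of a hire is about 1.5 times annual salary, paid on top of the salary itself. The salary figure, the function name, and the tenure values are hypothetical and only there for illustration.

```python
def hiring_overhead_per_month(salary: float, tenure_months: int,
                              multiplier: float = 1.5) -> float:
    """Spread the up-front hiring cost over the months the person stays.

    Assumption: up-front cost = multiplier * annual salary (the rough
    1.5x guideline), counted separately from the salary actually paid.
    """
    if tenure_months <= 0:
        raise ValueError("tenure_months must be positive")
    return salary * multiplier / tenure_months


salary = 100_000  # hypothetical annual salary, not a real benchmark
monthly_salary = salary / 12

for months in (6, 12, 24, 48):
    overhead = hiring_overhead_per_month(salary, months)
    # Ratio above 1 means the hire costs more per month in overhead
    # than you pay them in salary for that month.
    ratio = overhead / monthly_salary
    print(f"{months:>2} months: ${overhead:,.0f} overhead per month "
          f"({ratio:.2f}x monthly salary)")
```

Under these assumptions, the overhead per month only drops below the monthly salary once someone stays longer than 18 months, which is why short tenures are so expensive.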

So what can you do to change the situation? What can WE do as an industry? This post will discuss some strategic approaches to improving retention from a security operations perspective. The goal here is to “stop the bleeding” in the near term, buying time to shift our focus to how we hire (which will be the topic of another article down the road).

To take a deeper dive, watch an on-demand webcast: Improving SecOps Retention From Day One

It’s Not Them…It’s Us

On average, employees switch jobs every six months to a year, and those who have been at a company for two years are often considered “long term” employees. So why can’t we keep people? It’s easy to say that times have changed, and that people just don’t stay anywhere very long anymore. There may be some truth to that, but I don’t think that’s the full scope of the situation.

I think the bigger issue is that as companies, as an industry, we’ve gotten ourselves into a rut and just keep spinning our wheels without making any progress. We need to take a step back and strategically change what we’re doing, or we will continue expending energy without making positive progress.

Time to Change Gears

So far, this is quite a “doom and gloom” picture we’re painting, but it doesn’t have to be that way. And it certainly doesn’t have to end that way! We’re in an exciting industry (regardless of vertical), with lots of exciting opportunities (a big reason for the talent shortage), and there is no reason why we can’t make the future even more exciting with better employee retention.

Have you ever been excited about employee retention before? Let’s get the party started!

A few business benefits of improved retention:

  • Lower recruiting costs
  • Lower initial payout for onboarding, benefits package, signing bonus, etc.
  • Reduced time for training and getting someone up to speed
  • Employees stay because they want to, which makes you an attractive employer
  • Employees invest in the business to make it more successful
  • Operations become more streamlined and efficient, opening the doors for new opportunities

Let’s say at this point you agree, and you want to take action to improve retention. What do you do? It can be challenging, even daunting, to face this type of situation, where it seems you have to turn everything upside down to find a solution. That’s why I propose starting from the ground up.

Training for the Long Term

One thing I have observed (and experienced firsthand) is that many companies throw new hires directly into the fire. By this, I mean that they put new employees to work immediately—perhaps in an attempt to shortcut some of the upfront cost associated with hiring and onboarding. After all, it takes time and money to train someone and get them up to speed, and if they leave in six months, then you just have to start all over again and the long-term cost continues to rise. So if you can short-cut part of that process, you should be better off, right?

The problem with this approach is that no matter how much experience someone has, it takes time for them to become acclimated. New employees need to understand your business strategy, internal processes, how to thrive within the environment, and how to do their part to help the business grow and succeed over the long term. They need to learn about your people and how to work as a team, what your internal processes and workflow systems are like, and how to use your technology effectively.

Attempting to rush the process sets the employee up for failure. They never get the chance to learn the things described above, so they get frustrated and burn out more quickly. When that happens, long-term growth becomes nonexistent as employees look for other ways to reach their own professional goals. As a result, the business has to spend more time and money to fill the open role. Does this cycle sound familiar?

Analyst Onboarding: A Case Study

Sharing stories can be helpful, whether they’re of success or failure. With that in mind, I’d like to take you back to early 2016, when I started as a threat analyst at Red Canary. Back then, we did not have a solid, repeatable process for onboarding and training new hires. It makes sense, given that we had a total of 10 people in the company with my hire. I was only the fourth person on the analyst team, all of whom worked remotely.

I flew to Denver early on Monday, and flew back home on Thursday. So basically, I had three days to onboard and train with our team lead, who lived near Denver, before being on my own to work. It had been close to a year since the company’s last new hire, and there had been a lot of changes in that time. We had little documentation, and no defined process—the perfect mix for a challenging situation.

Recognizing the potential for issues, our CEO tasked me with taking notes and reporting back to the company on how things went and on areas for improvement. Shortly afterward, the company’s growth began to take off, and as our team expanded within a few months, we really started to feel the associated pain points. This combination of events prompted us to change the way we did things, so that as we brought on new analysts, they would have a better chance for success and job fulfillment.

We are neither perfect nor at an “end state” with our process, but we have moved from a fair amount of chaos into a more organized state in several different ways. Here are four key areas where we focused efforts to enhance the onboarding process for our detection engineering team and, over the long term, improve retention.

1: Establish consistent new hire training.

Regardless of past experience or skillset, people learn in different ways and at different rates. We look for candidates with diverse backgrounds and experience, to help maximize our team’s capabilities, which means we also have to account for that when training. As a result, we built out a detailed training process to help smooth out any rough areas and ensure that everyone has the same experience during onboarding.

To that end, we have a detailed topical matrix to ensure we cover everything from channels of communication (and associated protocols) to additional duties and time off. We want to make sure that everyone has a solid understanding of expectations both individually and for the team as a whole, as well as our processes and workflows.

Everyone on the team has a part to play in the training process, and is directly involved with the new hire at least once. The initial training cycle is structured over four weeks, touching all shifts, and is focused on an end goal of being able to work independently and remotely, and do our core job (event analysis) in a confident, consistent, repeatable fashion. Our detection engineering team all works remotely, and training is done the same way, leveraging technology for video calls and screen-sharing.

The four-week training schedule gradually moves new hires through learning phases, from shoulder-surfing to “pseudo-driving” to independent driving. Our goal is not necessarily to have an analyst who knows everything we do to the nth detail; it’s to have a self-sufficient analyst who can effectively fulfill our core responsibility: to analyze events and push detections to customers. Neither the schedule nor the topics are written in stone; we are flexible on both, so long as we accomplish the primary goal.

To view an example 4-week training schedule, watch an on-demand webcast: Improving SecOps Retention From Day One

2: Build out documentation.

Documentation is hard, and something that most organizations struggle with. It’s no different for us; we have no magic solution. What we have done is take steps to acknowledge and address those challenges. Our documentation is largely focused on operational things, so we moved from housing everything in a shared document-centric environment, to keeping it alongside our detectors in our code repository.

That’s right: our documents essentially live as code. It makes sense for us because many of our daily duties exist in, or center around, our code repository. Plus, this gives us extremely granular control over the creation and modification of documents, tracking via tickets, pull requests, and merges into the repo. We have organized them into sections based on document types and areas of operation, cross-referenced, and focused on things from an operational standpoint, with what matters most to us at the forefront.

All of our documents are updated frequently (they really are “living documents”), as we make a concerted effort to initiate changes as needed to keep pace with operations. Key among these is a detailed analysis “handbook,” which goes deep into the operational aspects of the work we do on a daily basis. This ties back directly to training topics and helps ensure consistency across the board.

3: Eliminate tiers.

Let’s face it, if you’re involved in security operations, there’s a fair amount of “toil” involved. In this context, the “toil” to which I refer is based on the Google SRE definition, meaning things that are primarily manual and repetitive. And for security operations, this means high noise, low fidelity, which gets tiring very quickly. The worst part of it is that in a more traditional setting, noise and triage are pretty much the only things that lower level analysts (such as “Tier 1”) get to do. Noise, and triage.

If they think something might actually be a valid threat, their fun stops, and they escalate it to someone at the next level. That person does additional analysis and validation, and may also escalate to the next tier. Essentially, only the people at the top level are doing any of the exciting things; for everyone else, it’s toil. This doesn’t lead to much in the way of job satisfaction or career progression, and plays a decided part in that six-month retention rating.

We decided that wasn’t how we wanted to structure our team, so we don’t have analyst tiers. We do have more “senior” and more “junior” analysts from an experience and capability perspective, but not work-related tiers. From an analysis perspective, we all do the same work, have the same requirements and workflow, and focus on delivering timely and actionable detections (aka, “human-confirmed threats”) to our customers. We have built-in safety checks to help prevent human errors throughout our processes, but if anyone makes a mistake (we all have), then we address it appropriately (e.g., it doesn’t mean somebody’s written up or fired), learn from it, figure out what we can do to improve and prevent recurrence, and move on.

We all have the opportunity and responsibility to work on other things to benefit the team and company, aside from event analysis. This includes tuning our behavioral detectors and improving workflows or other areas of inefficiency, in order to reduce overall toil. There are also opportunities to work on code for our platform and web portal, analyze threat intelligence, hunt for new techniques, and contribute to blog posts, webinars, and other outreach events. To facilitate these types of growth, we established different areas of focus within the team, which leads to our next topic: “operational practices.”

4: Initiate “operational practices.”

Many organizations have distinct operational groups that segregate and silo employees from one another, creating unnecessary tension. (For example, network versus endpoint; or more specifically, firewall versus antivirus.) Our operational practices are broken down into different areas that focus on improving the work we do on a daily basis, rather than on a given technology or data type. Essentially, each area that has a potential to reduce toil and improve efficiency has a “practice” associated with it, such as detector development or threat intelligence.

Individual contributors can participate in multiple practices, without detracting from or changing their primary role on the team. Each practice lead is a more senior analyst who is responsible for setting the direction and pace of the practice, and who is accountable for ensuring that tasks and goals are accomplished appropriately.

The practice lead is NOT a personnel manager role; it is a technical lead, and represents just one of several opportunities for growth, expansion of responsibilities, and internal movement (both lateral and upward). We structured things this way to provide those opportunities without having to go into management, which tends to be a bad move for most technical staff (yet at the same time, typically the only available option for any sort of growth and upward movement). This provides another means for all of our staff to grow and develop knowledge, skills, and abilities based on areas of interest, without ending up in a silo.

Key Lessons and Takeaways

There’s no easy answer to the cybersecurity talent shortage. But by focusing on the different facets of employee engagement (from finding the right candidates to hiring, training, ongoing development, and retention), we can start to make a difference. With the right combination of elements, both the employer and employee benefit exponentially.

What we’re doing at Red Canary may not be the absolute answer for every other organization (possibly not for any other organization), but I hope that our team’s learnings can help other organizations who are looking to retain security operations staff. To paraphrase Miyamoto Musashi from The Book of Five Rings, “Do nothing without purpose.” Our goal has been to do everything with a purpose, being deliberate and intentional to change as needed in order to get better results and continue growing.

As with our other processes, we want to be as transparent as possible with our approach, as we believe this can benefit the larger community. To that end, we’ll be doing a series of posts on related topics, and hope others will join in the conversation. (Don’t worry, we’ll still be doing our regular technical content as well!) Stay tuned for more articles on hiring, team development, and addressing the overall talent shortage. Subscribe below to receive new articles automatically!