One of the most important parts of my Application Security program is Security Tooling. It is usually one of the first programs that I put into place, and it helps me with a number of my overall security goals.
This post is the first of a six-part series on Security Tooling. Each of the upcoming posts will cover one of my foundational tools (Bug bounty, SCA, SAST, Secrets Scanning, API security). In those posts, I will dive deeper into each tool, my philosophy of use, and my rubric for selecting vendors.
In this first post, we will dive into the overall Security Tooling strategy and philosophy.
Security Tooling goals
There should always be a purpose for whatever you do with your security program. Don’t create a program because everyone else is doing it; rather, work on a program because you think it is important.
For Security Tooling, my goals include:
- Fixing vulnerabilities
- Building security culture in engineering
- Understanding gaps in security
- Building up security incident capabilities for engineering
Fixing vulnerabilities
At the end of the day, this is the main reason that we are hired within organizations. Our goal as security engineers is to find and fix vulnerabilities. We want to reduce the attack surface and the blast radius if something were to happen.
While tools are great at discovering vulnerabilities, *your* internal processes and the engineering org have to be good at fixing vulnerabilities. At the end of the day, if the engineering org isn’t fixing vulnerabilities, I don’t care how many findings my tool has.
Whenever I procure tools, my main focus is on the fixing side of things and what that will look like.
Building security culture in engineering
Speaking of fixing vulnerabilities, I like bringing in tools slowly to build up the security muscle in engineering. It is common for security engineers to bring in tools, only for the engineering org to never fix the resulting vulnerabilities. When bringing in security tools, make sure that the findings are valid (low false-positive rate) and don’t overwhelm the engineering org. I typically like to start with critical vulnerabilities, and when those are nearly remediated, I will bring in a subset of high vulnerabilities. Managing the load of vulnerabilities helps encourage engineering to regularly work on security problems.
Understanding the security gaps
Whenever I join a new organization, the first thing that I do is look at the vulnerabilities and understand what the data is trying to tell me.
What are the common themes of the vulnerabilities? Do we have a lot of XSS vulnerabilities? Do we have a lot of Broken Access Controls? What have the tools discovered? Once I’ve determined the biggest problem areas, it’s easier to make the case for remediating full classes of vulnerabilities. It isn’t fun playing whack-a-mole with vulnerabilities; it is better to decimate the entire mole population with a larger fix.
Building security incident capabilities for engineering
Every security engineer has a breach mindset. We know that it is a matter of time before an adversary will find a hole in our armor and be able to do something harmful to our systems.
Whenever we discover critical vulnerabilities, things that are truly critical, I like to create incidents and have the engineering team work through them quickly. While it is important to fix vulnerabilities in general, it is also important to build the incident muscle as well. Real security incidents are ugly, stressful, and difficult. The more practice one has, the better-prepared one is for real incidents.
What are security orgs doing wrong?
There are a number of challenges that I have encountered with security tooling within organizations.
Challenges:
- Security orgs do not select the right tools for the business
- Most security orgs don’t fully operationalize their tools
- Operational costs too high
- Overwhelming engineers with vulnerabilities
- Not involving engineers in the decision-making process
Security orgs do not select the right tools for the business
I have worked at a lot of organizations where the wrong tool was selected. To me, the wrong tool is something that doesn’t provide value or can even be detrimental to the team.
Those previous teams had optimized for certain criteria that didn’t work for the long-term for the org. Sometimes the team optimized for cost because they didn’t have a strong budget; other times, the team optimized for the brand name. Whatever the reason, I have had to remove many tools in my career.
When selecting tools, I create a rubric to ensure that I am selecting the right tool for the business. The rubric will help me determine which tool is the right one for business in the longer term (3-5 years). While the cost of the tool is in my rubric, it isn’t at the top of my list. Typically I optimize for low false positives and integration into my workflows. Both of those items will reduce my team’s operational costs and ensure that I get more value out of the tool.
(bad) Example SAST Rubric
| Criteria | Description | Weight | Rating |
|---|---|---|---|
| False positives | Of the repos that we have scanned, what percentage of findings were false positives? | 3x | |
| Maintenance/Integration | How easily does the tool integrate into our environment? Is there an easy way to integrate into all of the repos? Is there an easy way to pull the tooling data into our security data lake? What is my maintenance cost going to be supporting this tool? | 3x | |
| Languages | Does the tool support the languages that we have at the organization? | 2.5x | |
| Cost | Does the tool fit within our budget? | 2x | |
| Vendor Support | (internal rubric – don’t share with vendor) Does the vendor see us as a partner or a payer? Will this vendor bend over backwards to help us with our security program, or are we just revenue for them? | 2x | |
| Leftness | How far does this tool push left? Does it integrate into GitHub? CI/CD? | 1.5x | |
| Custom Rules | How easy is it to write custom rules? | 1.5x | |
| Performance | How fast do the rules run? | 1x | |
| Other tools | (internal rubric – don’t share with vendor) Does the vendor have other tools that would work in our environment? | 0.5x | |
| Future | (internal rubric – don’t share with vendor) What direction is this vendor going? Are they solving my today’s problems or my future problems? | 0.5x | |
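A weighted rubric like this is easy to score programmatically during a PoC. Here is a minimal sketch; the criterion keys, the weights, and the 1–5 rating scale are illustrative assumptions, not a prescribed scheme.

```python
# Minimal sketch of weighted rubric scoring. The criteria and weights
# mirror the example table; the 1-5 rating scale is an assumption.
WEIGHTS = {
    "false_positives": 3.0,
    "maintenance_integration": 3.0,
    "languages": 2.5,
    "cost": 2.0,
    "vendor_support": 2.0,
    "leftness": 1.5,
    "custom_rules": 1.5,
    "performance": 1.0,
    "other_tools": 0.5,
    "future": 0.5,
}

def score_vendor(ratings: dict) -> float:
    """Weighted average of per-criterion ratings (1 = poor, 5 = excellent)."""
    total_weight = sum(WEIGHTS[c] for c in ratings)
    weighted = sum(WEIGHTS[c] * r for c, r in ratings.items())
    return round(weighted / total_weight, 2)

# Hypothetical vendor scored against the rubric.
vendor_a = {"false_positives": 4, "maintenance_integration": 5, "languages": 3,
            "cost": 2, "vendor_support": 4, "leftness": 3, "custom_rules": 4,
            "performance": 3, "other_tools": 2, "future": 3}
print(score_vendor(vendor_a))
```

Because false positives and maintenance carry 3x weight, a vendor that is cheap but noisy will score visibly lower than one that is pricier but clean, which is exactly the long-term trade-off the rubric is meant to surface.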
Recommendations
- Talk to vendors regularly – Even if you are not going to procure their tool any time soon, this will help you understand the tools and what you may want in the future. If possible, try to understand their pricing model as well. That way, you can work with Finance when you need the budget.
- Create your rubric – You should have had many conversations with vendors and you should now know what good tools look like. Create a comprehensive rubric and make sure that you evaluate those vendors based on it. I typically share about 80% of the rubric with the vendors, so that they know what I am evaluating them on. I keep 20% hidden because they don’t need to know everything.
- Get your budget – You should know by now how much the right tool may cost the business. Work closely with your leadership team to make the business case for the budget.
- Don’t make the wrong choice – If you can’t afford the ideal tool(s), then make sure that you can afford a less-than-perfect solution. Understand the total cost of ownership and how much operational workload will be added to your team. If it doesn’t make sense to select an inferior tool, don’t procure it. Instead, log the security tooling gap as a risk and get one of the execs to sign off on it, which will hopefully make it easier to procure in the future.
The wrong tool will slowly drain your team’s operational capacity and morale. I would rather have no tool than the wrong tool.
Most security orgs don’t fully operationalize their tools
You might notice a theme: I care about reducing the operational cost of a tool, but also ensuring that vulnerabilities get fixed.
At every organization that I have worked for, none of the tools were fully operationalized. What I mean by that is:
- The tool should be 100% integrated into your organization
- Vulnerabilities should be auto-created and auto-closed in your JIRA
- The tool should be tuned to ensure that there are very few false positives
The most important part of Security Tooling is fixing vulnerabilities. If the engineering team is not fixing the most critical vulnerabilities, why even have the tool in the first place? This is where you need to invest a lot of time with engineering leadership and ensure that they understand the importance of fixing vulnerabilities. This should be a top-down mandate for the organization.
I will dive much deeper into how to operationalize your tools in the Security Tooling Maturity Model.
Operational costs too high
There is a balance to maintaining a security tool. You need to build the infrastructure around it to ensure that it works well in your organization and workflows, but you shouldn’t let the tool consume a significant share of your team’s operational capacity. If you get to that point, you must remove the tool from your ecosystem.
You should do a thorough PoC to learn what percentage of the tool’s findings are false positives. Take your time to investigate that properly: a higher false-positive rate means more time spent unblocking engineers, and it lowers the confidence that engineers have in the tool.
Some tools are just bad, requiring considerable maintenance. I don’t like investing time into tooling, other than the initial investment of getting it up and running.
Overwhelming engineers with vulnerabilities
Please do not overwhelm engineers by adding every single vulnerability ticket from the security tool. This will put too much pressure on the engineering org, and they will not be able to get anything done.
A more productive method is to give them a slower trickle of vulnerabilities and build up their ability to fix and patch. You will see a much better outcome when you do that.
Not involving engineers in the decision-making process
I have also made the mistake of not including engineers as a part of the decision-making process for selecting a tool. Once I made that switch, the buy-in I saw from the engineering team went through the roof.
Involve engineers, and make sure that they are active stakeholders in the process. They will provide their opinions on the tools that you may select and you will get a much better outcome. If they feel that their voices were heard during the selection process, they will advocate on the tool’s behalf going forward.
Security Tooling Maturity model
I have a crawl, walk, run, and (sometimes) sprint model for Security Tooling.
Crawl
In the crawl phase, I want the security tool integrated into the ecosystem with 100% coverage. I want all of the Security Tooling data centralized in the security data lake and I want to create dashboards of the data for security and engineering leadership.
In the crawl phase, I need to understand how bad (or good) the security posture is for the tooling, and I can only do so once the tool is fully integrated into the ecosystem and I can look at the data via dashboards. Once the crawl phase is done, I can work with engineering leadership and let them know how much time they will have to invest to get into good standing. I am not expecting them to fix things immediately, but the keen teams will.
Walk
In the walk phase, my goal is to use the Security Tooling data to auto-create and auto-close vulnerability tickets.
As mentioned, my goal is to have engineering fix vulnerabilities, which means that I will only create tickets for a number of vulnerabilities that they can realistically fix. If there are 2 million vulnerabilities in the security tool, I might only create tickets for the 1,000 most severe and work with leadership to ensure that they are resolved.
The goal here is to slowly build engineering muscle and to get engineers comfortable with fixing a new type of vulnerability.
As engineers finish up the first batch, I will continue to load them with more vulnerabilities until the most severe vulnerabilities have been dealt with.
I also want the security engineering team to auto-close vulnerability tickets when they are fixed, for two reasons. The most important is that engineers need to see that they have fixed the vulnerabilities; the dopamine hit of seeing a ticket close successfully will hopefully motivate them to fix more.
The second reason is that it reduces the operational burden on the security team, who shouldn’t have to manually review that the vulnerabilities have been fixed.
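The walk-phase loop can be sketched as a pure sync function: compare the scanner’s current findings against the tickets already open, auto-close what has been fixed, and open new tickets for the most severe findings up to a cap. This is a hedged illustration; the finding shape, severity ranks, and the 1,000-ticket cap are assumptions, and in practice the returned plans would drive calls to your ticketing system’s API (e.g. JIRA).

```python
# Illustrative sketch of the walk-phase sync loop: given current scanner
# findings and previously created tickets, decide which tickets to open
# (capped, most severe first) and which to auto-close.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def plan_sync(findings, open_tickets, batch_cap=1000):
    """findings: {finding_id: severity}; open_tickets: set of finding ids.

    Returns (to_create, to_close) lists of finding ids.
    """
    # Auto-close tickets whose finding no longer appears in scan results,
    # so the security team never has to verify fixes by hand.
    to_close = sorted(t for t in open_tickets if t not in findings)

    # Open tickets for the most severe untracked findings, but only up to
    # the cap, so engineering is never handed more than it can digest.
    untracked = [f for f in findings if f not in open_tickets]
    untracked.sort(key=lambda f: SEVERITY_RANK[findings[f]])
    budget = max(0, batch_cap - (len(open_tickets) - len(to_close)))
    to_create = untracked[:budget]
    return to_create, to_close
```

Run on a schedule, this keeps a steady, bounded trickle of the worst vulnerabilities in front of engineers while closed-out work disappears automatically.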
Run
During the run phase, the security engineering team will put gates in place to prevent engineering from introducing Critical- or High-level vulnerabilities into the system.
Many folks wonder why the gating phase comes after the ticket-creation phase: why not stem the bleeding sooner? The goal of the Walk phase is to build the security muscle of the engineering team. We want engineers to be comfortable with fixing vulnerabilities before we start blocking their work; gates introduced too early create friction without improving outcomes.
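In practice, a gate of this kind is often just a small CI step that fails the build when blocking-severity findings appear in the scan report. A minimal sketch, assuming a JSON report format of `{"id": ..., "severity": ...}` objects; the file format and severity threshold are illustrative assumptions, not any specific tool’s output.

```python
# Illustrative CI gate: exit non-zero if the scan report contains any
# finding at or above the blocking severity, which fails the pipeline.
import json
import sys

BLOCKING = {"critical", "high"}  # assumed severity threshold

def gate(report_path: str) -> int:
    """Return a process exit code: 1 if any blocking finding exists."""
    with open(report_path) as fh:
        findings = json.load(fh)
    blocked = [f for f in findings if f["severity"].lower() in BLOCKING]
    for f in blocked:
        print(f"BLOCKED: {f['id']} ({f['severity']})")
    return 1 if blocked else 0

if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(gate(sys.argv[1]))
```

Wired into the pipeline after the scanner step, a non-zero exit blocks the merge, so new Critical/High vulnerabilities never land while the existing backlog is being worked down.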
Sprint
The sprint phase is rare and only used with certain tools and the strongest of engineering teams.
In the sprint phase, the goal is to auto-fix issues and to decouple engineering from security work. This works for certain tools like SCA where you can auto-patch libraries, or for newer tools that provide SAST fixes. The engineering team would need to have a strong suite of tests in place to ensure that auto-fixing doesn’t break things even more.
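For the SCA case, the sprint phase can be approximated by turning the scanner’s fix recommendations into dependency bumps and letting the test suite decide whether each bump merges. A hedged sketch; the finding shape and the pinned-requirements output format are assumptions for illustration.

```python
# Illustrative sketch: turn SCA findings that have a known fixed version
# into pinned requirement bump lines. Whether a bump actually merges
# should depend on the project's test suite passing, not on this script.
def plan_bumps(findings):
    """findings: list of {"package": ..., "current": ..., "fixed_in": ...}.

    Returns sorted, de-duplicated pinned requirement lines for packages
    that have a released fix; findings without one are skipped.
    """
    bumps = []
    for f in findings:
        if f.get("fixed_in"):  # skip vulns with no released fix yet
            bumps.append(f"{f['package']}=={f['fixed_in']}")
    return sorted(set(bumps))
```

The auto-fix bot proposes the bump as a pull request; strong test coverage is what makes it safe to merge with little or no engineer involvement, which is why this phase is reserved for the strongest teams.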
Real-life problems
Here are a few real-life issues that I have run into in my career.
Company didn’t fully integrate SCA into all of their repos
If you do not have full coverage, you are not going to discover all of the vulnerabilities.
At a previous company, we spent a lot of time ensuring that our SCA tool was fully integrated into all of the appropriate GitHub organizations and repositories. This was a large effort in itself, and it took us over one full quarter to integrate the tooling.
Once everything was flowing into our security data lake, we started to analyze the data and noticed that there were still many instances of Log4Shell in our ecosystem. We were not worried about exploitation of the vulnerability because there were other controls in place, but we still wanted to resolve the issue. It took the team about one month to get 95% of the instances resolved.
Lesson: Focus on coverage and analyze the data. You will not understand what is in your ecosystem until you have full coverage and have done a deep dive into the data itself.
The company invested in the wrong SAST tool
At another company I worked for, I learned that we had procured the wrong tool for the business. There was a SAST tool that was partially integrated into the ecosystem, but it wasn’t being used much, and it caused a lot of friction with the engineering team.
This led to a number of issues:
- We were self-hosting the tool, which isn’t a problem in itself, but every time we updated the version of the tool, new problems would arise.
- The development team for this particular tool was overseas, which meant that every time we reported a problem, it would take a few days to solve.
- The pricing model of the tool was based on the number of users within the tool itself, so the vendor was incentivized not to offer a GitHub integration. This caused friction with engineering.
There were a number of great things about the tool, but unfortunately, the operational and maintenance costs of the tool were too high for us to continue with it.
Lesson: Ensure that you have a strong rubric for each security tool and make sure that it includes asking questions about their development and support centers.
Company overwhelmed by results of vulnerabilities
At a different company, we had a security engineer who focused their efforts on integrating security tooling vulnerabilities into our JIRA instance. There was a particular security tool that produced millions of results and the security engineer ensured that all of the vulnerabilities were in our JIRA instance.
Unfortunately, the tool added so many vulnerabilities into JIRA that JIRA slowed down for everyone. It ended up being so overwhelming that no one fixed any vulnerabilities at all.
Lesson: When integrating a new tool, only import the most critical vulnerabilities and keep the volume digestible so that they actually get fixed.
5 Security Tooling Takeaways
To summarize, please focus on the following rules:
- Encourage a culture of fixing vulnerabilities within your organization.
- Ensure full coverage of the security tool.
- Prioritize the developer’s experience.
- Centralize security data to understand themes of vulnerabilities.
- Automate where possible to reduce the operational load.
As mentioned, I will put out more articles on specific tools and how you can take advantage of them in the future.