Bandit Severity Levels: Understanding High, Medium, and Low Findings


The first time I ran Bandit on a legacy codebase, it returned 312 security findings. My immediate thought was: “Where do I even start?” Some issues seemed catastrophic, others looked like nitpicks, and many fell somewhere in between. The raw number of findings was overwhelming, but what really saved my sanity was understanding Bandit’s two-dimensional classification system.

Bandit doesn’t just tell you what’s wrong; it tells you how severe the problem is and how confident it is in that assessment. This approach transforms an overwhelming flood of alerts into a manageable prioritization framework.

The Two-Dimensional Security Matrix

Think of Bandit’s classification system like a hospital emergency room. A doctor doesn’t treat patients in the order they arrive; they triage based on severity and on confidence in the diagnosis. Bandit applies the same logic to code vulnerabilities.

import random
import subprocess

# High severity, high confidence - immediate attention required
subprocess.run(user_command, shell=True)  # B602: Shell command built from input

# Medium severity, high confidence - significant security risk
eval(user_input)  # B307: Enables direct code execution

# Low severity, medium confidence - needs review in context
password = "hardcoded_secret"  # B105: Hardcoded password

# Low severity, high confidence - security hygiene issue
session_id = random.randint(1000, 9999)  # B311: Weak randomness

When Severity Levels Really Matter

High severity findings are the ones that keep security teams awake at night. These represent vulnerabilities that could lead to immediate system compromise: command injection through subprocess calls that pass user input to a shell, fundamentally broken cryptography like DES encryption, or unencrypted network protocols such as Telnet and FTP that expose sensitive data in transit.

Medium severity issues require more context to exploit but still represent real security risks. These are the vulnerabilities that sophisticated attackers love because they often fly under the radar. Unpickling data from another service might not seem urgent if that service sits on an internal network, but it becomes critical the moment that network boundary is compromised.

Low severity findings are about security hygiene and defense in depth. Using Python’s random module instead of secrets for generating tokens might not create an immediate vulnerability, but it weakens your security posture over time.
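As a minimal sketch of that swap, the snippet below generates a session token with the standard-library secrets module instead of random. The function name new_session_token and the 32-byte default are illustrative assumptions, not something Bandit prescribes.

```python
import secrets


def new_session_token(num_bytes: int = 32) -> str:
    """Return a URL-safe token drawn from the OS's cryptographic RNG."""
    # secrets uses the operating system's CSPRNG; the random module uses a
    # Mersenne Twister, whose output can be predicted once enough of it leaks.
    return secrets.token_urlsafe(num_bytes)


token = new_session_token()
```

The token is longer than a four-digit integer, which also makes guessing it impractical, independent of the generator used.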

The Confidence Factor

Confidence levels tell you how certain Bandit is about its assessment, which is crucial for prioritizing your remediation efforts and avoiding false positive fatigue.

High confidence findings are almost certainly real problems. When Bandit sees ssl._create_unverified_context(), there’s no ambiguity: you’re explicitly disabling TLS certificate verification. These findings rarely require deep investigation; they need fixing.
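For this particular finding the fix is usually mechanical. A small sketch, assuming the code only needs a standard client-side context: ssl.create_default_context() keeps both certificate and hostname checks on.

```python
import ssl

# The default context verifies the certificate chain and the hostname.
secure_context = ssl.create_default_context()

# The unverified-context helper that Bandit reports returns a context with
# both of these checks disabled, so the two attributes below would differ.
certificate_required = secure_context.verify_mode == ssl.CERT_REQUIRED
hostname_checked = secure_context.check_hostname
```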

Medium confidence findings usually involve pattern recognition that might miss important context. Bandit might flag string formatting in SQL queries without knowing whether the variables come from user input or internal constants.
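To make the ambiguity concrete, here is a self-contained sketch using an in-memory SQLite database. The table, the two function names, and the malicious input are all made up for illustration. The first function builds its query by string formatting, the kind of pattern Bandit flags; the second passes the value as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))


def find_user_formatted(name: str) -> list:
    # String formatting puts the value directly into the SQL text, so a
    # crafted value can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_user_parameterized(name: str) -> list:
    # The driver binds the value separately; it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
leaked_rows = find_user_formatted(malicious)
safe_rows = find_user_parameterized(malicious)
```

If name only ever came from an internal constant, the first function would be harmless, which is exactly what Bandit cannot see from the pattern alone. Switching to the parameterized form removes the question.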

Building a Practical Triage System

The key to effective security remediation isn’t fixing everything immediately; it’s building a systematic approach that addresses the most critical issues first while establishing processes to prevent new vulnerabilities.

Start with high severity, high confidence findings. These are your “drop everything and fix this now” category. Next, tackle medium severity, high confidence issues. The medium confidence findings require more nuanced handling: set aside time for security reviews where you can properly investigate each finding.
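The ordering above can be sketched as a small bucketing function over the entries in Bandit's JSON report, which carry issue_severity and issue_confidence fields. The bucket names and the choice to send everything else to a single review bucket are assumptions made for this example, and the sample records are hand-written rather than real scan output.

```python
from __future__ import annotations

from collections import defaultdict


def triage(findings: list[dict]) -> dict[str, list[dict]]:
    """Group findings into fix_now, fix_next, and review buckets."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for finding in findings:
        severity = finding["issue_severity"].upper()
        confidence = finding["issue_confidence"].upper()
        if confidence == "HIGH" and severity == "HIGH":
            buckets["fix_now"].append(finding)
        elif confidence == "HIGH" and severity == "MEDIUM":
            buckets["fix_next"].append(finding)
        else:
            # Lower confidence or lower severity: investigate in a review.
            buckets["review"].append(finding)
    return dict(buckets)


# Hand-written sample records shaped like entries in the report's "results".
sample = [
    {"test_id": "B608", "issue_severity": "MEDIUM", "issue_confidence": "LOW"},
    {"test_id": "B307", "issue_severity": "MEDIUM", "issue_confidence": "HIGH"},
    {"test_id": "B602", "issue_severity": "HIGH", "issue_confidence": "HIGH"},
]
buckets = triage(sample)
```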

Understanding Context and Impact

Bandit’s classifications provide a starting point, but they can’t replace human judgment about your specific application context. A hardcoded password in a test file is only low severity according to Bandit, but if your test environment contains copies of production data, it suddenly becomes much more critical.

The best security teams use Bandit’s classifications as a foundation but overlay their own understanding of the application architecture, data sensitivity, and threat model to make final prioritization decisions.
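One way to encode that overlay is a small adjustment step applied to each finding before prioritizing. This is a hypothetical sketch: the function name effective_severity, the one-level bump, and the idea of a list of sensitive path prefixes are all assumptions for illustration, not Bandit features.

```python
from __future__ import annotations

ORDER = ["LOW", "MEDIUM", "HIGH"]


def effective_severity(finding: dict, sensitive_prefixes: tuple[str, ...]) -> str:
    """Raise severity by one level for findings under sensitive paths."""
    severity = finding["issue_severity"].upper()
    if severity not in ORDER:
        return severity  # Leave unknown levels untouched.
    if finding["filename"].startswith(sensitive_prefixes):
        bumped = min(ORDER.index(severity) + 1, len(ORDER) - 1)
        return ORDER[bumped]
    return severity


# Hypothetical example: fixtures known to hold copies of production data.
finding = {"issue_severity": "LOW", "filename": "tests/fixtures/db_setup.py"}
adjusted = effective_severity(finding, ("tests/fixtures/",))
```

Keeping the adjustment separate from the scan means Bandit's own rating stays visible next to the team's judgment.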

Avoiding Alert Fatigue

One of the biggest challenges with security scanning tools is alert fatigue: the tendency to ignore or suppress warnings when there are too many to process effectively. Bandit’s severity and confidence levels help prevent this by providing a clear path through the noise.

Start by configuring your CI/CD pipeline to fail builds only on high-severity, high-confidence findings. This ensures that critical issues get caught immediately while allowing less urgent problems to be addressed through normal development processes.
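A gate like this can be sketched as a short script that reads a Bandit JSON report and returns a failing exit code only when a finding is rated HIGH on both axes. The function names and the report path are assumptions; the report itself could come from something like bandit -r . -f json -o report.json --exit-zero, so that the scan step does not fail the build on its own.

```python
from __future__ import annotations

import json


def blocking_findings(report: dict) -> list[dict]:
    """Return only findings rated HIGH for both severity and confidence."""
    return [
        finding
        for finding in report.get("results", [])
        if finding.get("issue_severity") == "HIGH"
        and finding.get("issue_confidence") == "HIGH"
    ]


def gate(report_path: str) -> int:
    """Return 1 (fail the build) if blocking findings exist, else 0."""
    with open(report_path, encoding="utf-8") as handle:
        report = json.load(handle)
    blockers = blocking_findings(report)
    for finding in blockers:
        # One line per blocker so the CI log shows where to look.
        print(f"{finding['filename']}:{finding['line_number']} {finding['test_id']}")
    return 1 if blockers else 0
```

In a pipeline step, the script would end with sys.exit(gate("report.json")). Bandit's own -lll and -iii flags filter on the same two thresholds if you prefer not to post-process the report.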

Making Security Sustainable

The most successful security programs treat Bandit findings not as a compliance checkbox but as opportunities to improve overall code quality and developer security awareness. By understanding the reasoning behind severity and confidence levels, you can build remediation processes that are both effective and sustainable.

Remember that security is a continuous process, not a one-time fix. The goal is building systems and practices that prevent vulnerabilities rather than just detecting them after they’re written. Bandit’s classification system provides the framework, but lasting security comes from teams that understand why these classifications exist and how to use them effectively.
