
ADR-0014: Scan Metric Collection

Status: DRAFT
Date: 2022-09-01
Author(s): Max Maass <max.maass@iteratec.com>

Context

It can be difficult to notice that a scan setup has broken as long as the scan is still running and returning at least some results. For example, if a ZAP scan is configured with authentication but the credentials expire or the authentication system is changed, the scan will not error out - it will simply run without authentication and thus return far less data than it otherwise would. We currently do not have a good way of monitoring failure cases like this. We are therefore considering adding a metrics collection system to the secureCodeBox, so that we at least have the data on which alerting could be built.

The general consensus in the team is that the monitoring itself is not the job of the secureCodeBox - we want to provide the data on a best-effort basis, but we do not want to replace Grafana / Prometheus for the actual monitoring. This ADR thus deals with how metrics can be collected and sent on to a dedicated monitoring platform.

Metrics to Collect

In general, the metrics that could be collected can be split into two types:

  • Standard metrics for all scanners (runtime, number of findings, success or failure)
  • Specific metrics for the individual scanners (sent requests and responses, number of responses with specific HTTP status codes, lines of code scanned, ...)

These two types of metrics must be handled very differently, so we discuss them separately.

Standard Metrics

The standard set of metrics can be collected through the operator in combination with Kubernetes functionality. The following metrics should be collected:

  • Running time of the scanner pod
  • Running time of the parser pod (maybe)
  • Number of findings identified by the parser
  • ... TODO: Any more?

Scanner-specific Metrics

Scanner-specific metrics are a bit more complicated. We need to differentiate two cases: either the data we are interested in is part of the machine-readable output format generated by the scanner, or it is only provided in the log output of the tool (but not in the results file). Metrics that are available in neither of these two places and cannot be collected by Kubernetes in a standardized way are out of scope for this ADR.

The exact metrics we are interested in would vary depending on the scanner, and would have to be decided on a case-by-case basis.

Metrics from the Results File

Metrics that are contained in the results file that the lurker pulls from the scanner pod can be extracted directly by the parser.
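
As an illustration, the metrics could be computed as a by-product of parsing. The following sketch assumes a ZAP-style JSON report with a site[].alerts[] structure; the idea of returning metrics alongside the findings is an assumption about how the proposal could look, not the current parser SDK contract.

// Sketch: deriving metrics as a by-product of parsing a ZAP-style JSON report.
// The Finding shape and the idea of returning metrics next to the findings are
// assumptions for illustration, not the current parser SDK contract.

interface Finding {
  name: string;
  severity: string;
  attributes: Record<string, unknown>;
}

interface ParseResult {
  findings: Finding[];
  metrics: Record<string, number>;
}

export function parse(fileContent: string): ParseResult {
  const report = JSON.parse(fileContent);
  const findings: Finding[] = [];

  for (const site of report.site ?? []) {
    for (const alert of site.alerts ?? []) {
      findings.push({
        name: alert.name,
        severity: alert.riskdesc,
        attributes: { host: site["@name"] },
      });
    }
  }

  // The metrics fall out of data the parser walks through anyway.
  const metrics = {
    numberOfFindings: findings.length,
    numberOfSites: (report.site ?? []).length,
  };

  return { findings, metrics };
}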

Metrics from the Tool Logs

Metrics that are only contained in the stdout output of the tool are more challenging. Assuming we are interested enough in them to invest some time, we could pull the logs from the Kubernetes pod and provide them to the parser pod, where they could be searched using arbitrary JavaScript code to extract the necessary information. An example would be the number of responses received per HTTP status code, as reported by ZAP.
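
As a rough sketch, such a log-based extractor could count responses per HTTP status code with a regular expression. The log line format matched below is a made-up placeholder, not ZAP's actual output; a real implementation would need a pattern tailored to (and maintained against) the tool's logs.

// Sketch of a log-based metrics extractor. The pattern below assumes log lines
// that contain something like "HTTP/1.1 200"; this is a placeholder, not ZAP's
// actual log format, and would have to be adapted per scanner (and version).

export function extractStatusCodeMetrics(logs: string): Record<string, number> {
  const statusCodePattern = /HTTP\/1\.[01]\s+(\d{3})\b/g;
  const counts: Record<string, number> = {};

  for (const match of logs.matchAll(statusCodePattern)) {
    const key = `${match[1]}_response`;
    counts[key] = (counts[key] ?? 0) + 1;
  }

  return counts;
}

// Example: extractStatusCodeMetrics(podLogs) might yield
// { "200_response": 20, "403_response": 32, "404_response": 82 }.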

Alternatively, we could add a sidecar container that monitors the execution, collects the metrics, and provides them to the parser via the lurker. This would give us a unified workflow that works both for official (third-party) Docker images (with a custom monitoring sidecar provided by us) and for containers we build ourselves (where we may, in some cases, be able to dispense with the sidecar and emit the metrics directly from our wrapper script).

Challenges:

  • The log format may change between tool versions
  • In setups with more than one pod (for example, ZAP Advanced), which pods' logs are pulled? All of them?
  • Requires writing custom metrics collectors for every ScanType (or at least for every one we are interested in)

How to Record the Metrics

There are multiple options for how the collected metrics could then be recorded.

Included in Findings

The metrics could be written into the findings by the parser. This would require changes to the findings format, which might then look something like this:

Old Format:

[
  { "Finding1": "content 1" }
]

New Format:

{
  "findings": [
    { "Finding1": "content 1" }
  ],
  "metrics": {
    "runtime": 2413,
    // ... Extra standard metrics
    "scannerMetrics": {
      "200_response": 20,
      "403_response": 32,
      "404_response": 82,
      // etc.
    }
  }
}
Advantages
  • Having everything in one place makes it simpler to work with the data
Drawbacks
  • Requires changes to the findings format, which may cause breakage for third-party scripts developed against the findings format specification
  • Requires rewriting several parts of the SCB platform to deal with the new format

Extra File

Alternatively, the metrics could be provided as an extra file that sits next to the (unchanged) findings and results files. It would be an extra JSON file that contains just the metrics, and may look something like this:

// Filename: metrics.json
{
  "runtime": 2413,
  // ... Extra standard metrics
  "scannerMetrics": {
    "200_response": 20,
    "403_response": 32,
    "404_response": 82,
    // etc.
  }
}

The extra file would be generated by the parser, and would be accessible to ScanCompletionHooks.

Advantages
  • No changes to the findings format are necessary, so compatibility with existing consumers of the findings is maintained
  • Arbitrary format for the metrics file can be chosen
Disadvantages
  • Requires providing additional presigned URLs for this purpose
  • Requires changes to the parser and hook SDK, thereby potentially breaking third-party scanners maintained outside of the SCB

Transmitting Metrics to the Monitoring System

At this point, we have the metrics, but we still need to pass them on to the monitoring system. This would be achieved through a ScanCompletionHook that reads the data and converts it to a format suitable for the relevant platform. In the case of push-based metrics systems like Grafana / Elastic, this can be as simple as pushing the data to the platform. For scraping-based systems like Prometheus, a push gateway may have to be used. The exact implementation would be up to the hook implementer.
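
As a sketch of what such a hook could look like for the "extra file" variant: fetch the metrics file and forward it to a Prometheus Pushgateway. The getMetrics helper, the metric naming, and the Pushgateway address are assumptions; the current hook SDK does not provide such a helper.

// Sketch of a ScanCompletionHook that pushes metrics to a Prometheus
// Pushgateway. getMetrics() is a hypothetical helper for downloading the
// proposed metrics.json via a presigned URL; it does not exist in the current
// hook SDK. The Pushgateway address is an assumed example value.

const PUSHGATEWAY_URL = "http://pushgateway.monitoring.svc:9091";

interface HookContext {
  scan: { metadata: { name: string } };
  getMetrics: () => Promise<Record<string, number>>; // hypothetical helper
}

export async function handle({ scan, getMetrics }: HookContext): Promise<void> {
  const metrics = await getMetrics();

  // Convert the flat metrics object into the Prometheus text exposition format.
  const body =
    Object.entries(metrics)
      .map(([name, value]) => `securecodebox_${name} ${value}`)
      .join("\n") + "\n";

  // Scrape-based Prometheus needs the Pushgateway as an intermediary;
  // metrics are grouped by scan name so that repeated runs form a time series.
  const url = `${PUSHGATEWAY_URL}/metrics/job/securecodebox/scan/${scan.metadata.name}`;
  await fetch(url, { method: "PUT", body, headers: { "Content-Type": "text/plain" } });
}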

Open questions:

  • When several scans use the same ScanType but target different systems, and we want to track them separately, how do we differentiate them for the metrics platform?
    • The goal is that several executions of the same scan can be monitored over time, without the time series data being contaminated by other scans
    • This would probably have to be done based on a scan annotation, similar to how we do it for DefectDojo

Decision

Consequences