Check out the dev branch and make sure you’re up to date with the latest changes. Creating a dedicated branch for your work (git checkout -b parser-name) is optional; that’s up to you.
It is probably easier to use the docker-compose stack (and benefit from the hot-reload capability for uWSGI). Set up your environment to use the debug environment, such as:
$ docker/setEnv.sh debug
Please have a look at DOCKER.md for more details.
You’d want to build your docker images locally, and optionally pass in your local user’s uid to be able to write to the image (handy for database migration files). Assuming your user’s uid is 1000, then:
$ docker-compose build --build-arg uid=1000
File | Purpose |
---|---|
dojo/tools/<parser_dir>/__init__.py | Empty file for class initialization |
dojo/tools/<parser_dir>/parser.py | The meat. This is where you write your actual parser. The class name must be the Python module name without underscores plus Parser. Example: when the name of the Python module is dependency_check, the class name shall be DependencyCheckParser |
unittests/scans/<parser_dir>/{many_vulns,no_vuln,one_vuln}.json | Sample files containing meaningful data for unit tests. The minimal set. |
unittests/tools/test_<parser_name>_parser.py | Unit tests of the parser. |
dojo/settings/settings.dist.py | If you want to use a modern hashcode based deduplication algorithm |
doc/content/en/integrations/parsers/<file/api>/<parser_file>.md | Documentation, what kind of file format is required and how it should be obtained |
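The naming rule from the table can be checked mechanically. The helper below is only a minimal sketch of that rule, not DefectDojo's factory code: `parser_class_name` is a hypothetical name introduced here, and the CamelCase capitalization is an assumption inferred from the dependency_check example.

```python
def parser_class_name(module_name: str) -> str:
    """Derive the expected parser class name from a Python module name.

    Hypothetical helper: it illustrates the naming rule only.
    """
    # Drop the underscores, capitalize each part, then append the suffix.
    parts = [part.capitalize() for part in module_name.split("_") if part]
    return "".join(parts) + "Parser"


print(parser_class_name("dependency_check"))  # DependencyCheckParser
print(parser_class_name("my_tool"))  # MyToolParser
```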
Parsers are loaded dynamically with a factory pattern. To have your parser loaded and working correctly, you need to implement the contract:
- Your parser must be in a sub-module of dojo.tools, for example the dojo.tools.my_tool.parser module.
- Your parser must be a class in that module, for example dojo.tools.my_tool.parser.MyToolParser.
- The name of this class must have the Parser suffix, for example dojo.tools.my_tool.parser.MyToolParser.
- def get_scan_types(self): This function returns a list of all the scan_type identifiers supported by your parser. These identifiers are used internally. Your parser can support more than one scan_type. For example, some parsers use different identifiers to modify the behavior of the parser (aggregate, filter, etc…).
- def get_label_for_scan_types(self, scan_type): This function returns a string used to provide some text in the UI (short label).
- def get_description_for_scan_types(self, scan_type): This function returns a string used to provide some text in the UI (long description).
- def get_findings(self, file, test): This function returns a list of findings.
- def set_mode(self, mode) method, which the example below uses to select the aggregation mode.
Example:
class MyToolParser(object):

    def get_scan_types(self):
        return ["My Tool Scan", "My Tool Scan detailed"]

    def get_label_for_scan_types(self, scan_type):
        if scan_type == "My Tool Scan":
            return "My Tool XML Scan aggregated by ..."
        else:
            return "My Tool XML Scan"

    def get_description_for_scan_types(self, scan_type):
        return "Aggregates findings per cwe, title, description, file_path. SonarQube output file can be imported in HTML format. Generate with https://github.com/soprasteria/sonar-report version >= 1.1.0"

    def requires_file(self, scan_type):
        return False

    # mode:
    # None (default): aggregates vulnerabilities per sink filename (legacy behavior)
    # 'detailed' : No aggregation
    mode = None

    def set_mode(self, mode):
        self.mode = mode

    def get_findings(self, file, test):
        <...>
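The body of get_findings is elided in the example above. As a rough illustration of how the mode switch could drive aggregation, here is a minimal sketch under stated assumptions: `SketchFinding` is a hypothetical stand-in for dojo.models.Finding so that the code runs without Django, and the report shape (`{"issues": [{"title": ..., "file": ...}]}`) is invented for illustration. A real parser would build Finding objects from its tool's actual format.

```python
import io
import json
from dataclasses import dataclass


@dataclass
class SketchFinding:
    # Hypothetical stand-in for dojo.models.Finding (assumption, not the real model).
    title: str
    file_path: str
    occurrences: int = 1


class MyToolParser(object):
    mode = None

    def set_mode(self, mode):
        self.mode = mode

    def get_findings(self, file, test):
        # Assumed report shape: {"issues": [{"title": ..., "file": ...}, ...]}
        data = json.load(file)
        findings = []
        by_file = {}
        for issue in data.get("issues", []):
            finding = SketchFinding(title=issue["title"], file_path=issue["file"])
            if self.mode == "detailed":
                # 'detailed': no aggregation, one finding per issue.
                findings.append(finding)
            elif finding.file_path in by_file:
                # Default mode: fold the issue into the first finding for that file.
                by_file[finding.file_path].occurrences += 1
            else:
                by_file[finding.file_path] = finding
                findings.append(finding)
        return findings


def sample_report():
    # Three issues, two of them in the same file.
    issues = [
        {"title": "SQL injection", "file": "a.py"},
        {"title": "XSS", "file": "a.py"},
        {"title": "SSRF", "file": "b.py"},
    ]
    return io.StringIO(json.dumps({"issues": issues}))
```

With this sketch, the default mode returns two findings for the sample report (one per file), while calling set_mode("detailed") first returns three.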
Some reports cannot be obtained as a file that the user or pipeline can upload; the scan results have to be downloaded via an API instead (or we just want to add support for multiple methods). In that case, an “API parser” is needed. The core code is the same as for a regular parser, but there are some additional requirements.
File | Purpose |
---|---|
dojo/tools/api_<parser_dir>/api_client.py | API client should perform all HTTP API calls and return JSON with data from the API |
dojo/tools/api_<parser_dir>/importer.py | Importer should prepare the API client and process its results |
dojo/tools/api_<parser_dir>/parser.py | Parser should fetch processed data from the importer |
unittests/tools/test_api_<parser_name>_parser.py | Unit tests of the parser. |
unittests/tools/test_api_<parser_name>_importer.py | Unit tests of the importer. |
dojo/tool_config/factory.py | Parser must be listed in SCAN_APIS |
unittests/test_tool_config.py | Unit tests for content of hints and other metadata |
- The parser directory name must start with api_, for example dojo/tools/api_mytool.
- The parser class name must start with Api, for example ApiMytoolParser.
- def api_scan_configuration_hint(self) returns a string with a hint on how to configure service keys in Product …TODO. Use of the HTML tag <b> is required. The help will be rendered on the website. For example: return 'the field <b>Service key 1</b> has to be set to ID of the project. <b>Service key 2</b> has to be set to the version of the project'
- def requires_tool_type(self, scan_type) returns the name of the required Tool_Type.
- The Tool_Type will be created automatically based on the function requires_tool_type.
- def test_connection(self): and def test_product_connection(self, api_scan_configuration): make it possible to test connectivity and permissions. Each should return a string with a successful status (like you have access to 125 projects) or raise an exception.
Use the template parser to quickly generate the files required. To get started you will need to install cookiecutter.
$ pip install cookiecutter
Then generate your scanner parser from the root of django-DefectDojo:
$ cookiecutter https://github.com/DefectDojo/cookiecutter-scanner-parser
Read more on the template configuration variables.
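Whether or not you generate the files from the template, the API parser requirements listed earlier (Api class prefix, configuration hint, required tool type, connection tests) can be pictured with the following minimal sketch. Everything named Mytool here is hypothetical, the client holds canned data instead of making HTTP calls, and representing the scan configuration as a dict with a `service_key_1` entry is an assumption made only so the sketch runs on its own.

```python
class MytoolApiClient:
    """Hypothetical API client; a real one would perform the HTTP calls."""

    def __init__(self, projects):
        # Stand-in for the project list a real client would fetch from the API.
        self.projects = projects

    def test_connection(self):
        if self.projects is None:
            raise RuntimeError("Unable to reach the API")
        return f"You have access to {len(self.projects)} projects"

    def test_product_connection(self, api_scan_configuration):
        # Assumption: the configuration exposes the project ID as service_key_1.
        project_id = api_scan_configuration["service_key_1"]
        if project_id not in (self.projects or []):
            raise RuntimeError(f"Project {project_id} is not accessible")
        return f"You have access to project {project_id}"


class ApiMytoolParser(object):
    def get_scan_types(self):
        return ["Mytool API Import"]

    def requires_file(self, scan_type):
        # Results come from the API, so no uploaded file is needed.
        return False

    def requires_tool_type(self, scan_type):
        return "Mytool API"

    def api_scan_configuration_hint(self):
        return "the field <b>Service key 1</b> has to be set to ID of the project."
```

Raising an exception on failure and returning a short status string on success mirrors the behavior described for the connection tests.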
Here is some advice that will make your parser future-proof.
We use 2 modules to handle endpoints:
- hyperlink
- dojo.models, with a specific class, Endpoint, to handle processing around URLs to create endpoints.
All the existing parsers use the same code to parse URLs and create endpoints.
Using Endpoint.from_uri() is the best way to create endpoints.
If you really need to parse a URL, use the hyperlink module.
Good example:
if "url" in item:
    endpoint = Endpoint.from_uri(item["url"])
    finding.unsaved_endpoints = [endpoint]
Very bad example:
u = urlparse(item["url"])
endpoint = Endpoint(host=u.host)
finding.unsaved_endpoints = [endpoint]
Parsers may have many fields, many of which may be optional.
It is better not to set an attribute when you don’t have data than to fill it with values like NA, No data, etc.
Check the class dojo.models.Finding.
Always include checks to avoid potential KeyError errors (e.g. a field does not exist) for those fields you are not absolutely certain will always be in the file that gets uploaded. These translate to a 500 error and do not look good.
Good example:
if "mykey" in data:
    finding.cve = data["mykey"]
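When a report has several optional fields, the same check can be driven by a small mapping. This is a minimal sketch, not code from DefectDojo: `OPTIONAL_FIELDS`, `apply_optional_fields`, and the `weakness` key are hypothetical, and `SimpleNamespace` stands in for a Finding so the snippet runs on its own.

```python
from types import SimpleNamespace

# Hypothetical mapping: report key -> finding attribute.
OPTIONAL_FIELDS = {"mykey": "cve", "weakness": "cwe"}


def apply_optional_fields(finding, data):
    """Copy only the optional values that are actually present in the report."""
    for key, attribute in OPTIONAL_FIELDS.items():
        # Absent or empty values leave the attribute unset instead of using "NA".
        if key in data and data[key] not in ("", None):
            setattr(finding, attribute, data[key])
    return finding


finding = apply_optional_fields(SimpleNamespace(title="test title"), {"mykey": "CVE-2020-36234"})
print(finding.cve)  # CVE-2020-36234
print(hasattr(finding, "cwe"))  # False
```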
Data can have CVSS vectors or scores. Don’t try to write your own CVSS score algorithm.
For parsers, we rely on the cvss module.
It’s easy to use and will keep the parser aligned with the rest of the code.
Example of use:
from cvss.cvss3 import CVSS3
import cvss.parser

vectors = cvss.parser.parse_cvss_from_text("CVSS:3.0/S:C/C:H/I:H/A:N/AV:P/AC:H/PR:H/UI:R/E:H/RL:O/RC:R/CR:H/IR:X/AR:X/MAC:H/MPR:X/MUI:X/MC:L/MA:X")
if len(vectors) > 0 and isinstance(vectors[0], CVSS3):
    print(vectors[0].severities())  # these are the 3 severities
    cvssv3 = vectors[0].clean_vector()
    severity = vectors[0].severities()[0]
    vectors[0].compute_base_score()
    cvssv3_score = vectors[0].scores()[0]
    print(severity)
    print(cvssv3_score)
Good example:
vectors = cvss.parser.parse_cvss_from_text(item['cvss_vect'])
if len(vectors) > 0 and isinstance(vectors[0], CVSS3):
    finding.cvss = vectors[0].clean_vector()
    finding.severity = vectors[0].severities()[0]  # if your tool does generate severity
Bad example (DIY):
def get_severity(self, cvss, cvss_version="2.0"):
    cvss = float(cvss)
    cvss_version = float(cvss_version[:1])
    # If CVSS Version 3 and above
    if cvss_version >= 3:
        if cvss > 0 and cvss < 4:
            return "Low"
        elif cvss >= 4 and cvss < 7:
            return "Medium"
        elif cvss >= 7 and cvss < 9:
            return "High"
        elif cvss >= 9:
            return "Critical"
        else:
            return "Informational"
    # If CVSS Version prior to 3
    else:
        if cvss > 0 and cvss < 4:
            return "Low"
        elif cvss >= 4 and cvss < 7:
            return "Medium"
        elif cvss >= 7 and cvss <= 10:
            return "High"
        else:
            return "Informational"
By default a new parser uses the ‘legacy’ deduplication algorithm documented at https://documentation.defectdojo.com/usage/features/#deduplication-algorithms
Each parser must have unit tests, at least for 0 vulns, 1 vuln and many vulns. You can look at how other parsers do this for starters. The more quality tests, the better.
It’s important to add checks on the attributes of findings. For example:
with self.subTest(i=0):
    finding = findings[0]
    self.assertEqual("test title", finding.title)
    self.assertEqual(True, finding.active)
    self.assertEqual(True, finding.verified)
    self.assertEqual(False, finding.duplicate)
    self.assertIn(finding.severity, Finding.SEVERITIES)
    self.assertEqual("CVE-2020-36234", finding.cve)
    self.assertEqual(261, finding.cwe)
    self.assertEqual("CVSS:3.1/AV:N/AC:L/PR:H/UI:R/S:C/C:L/I:L/A:N", finding.cvssv3)
    self.assertIn("security", finding.tags)
    self.assertIn("network", finding.tags)
    self.assertEqual("3287f2d0-554f-491b-8516-3c349ead8ee5", finding.unique_id_from_tool)
    self.assertEqual("TEST1", finding.vuln_id_from_tool)
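The minimal set of tests (0, 1 and many vulns) could be laid out as in the sketch below. It is self-contained for illustration only: `SketchParser` and `report` are hypothetical stand-ins, and findings are plain dicts. In DefectDojo the test class would instead open the sample files under unittests/scans/<parser_dir>/ and call your real parser.

```python
import io
import json
import unittest


class SketchParser:
    """Hypothetical stand-in so the test class runs outside DefectDojo."""

    def get_findings(self, file, test):
        issues = json.load(file).get("issues", [])
        return [{"title": issue["title"]} for issue in issues]


def report(*titles):
    # Build an in-memory report; real tests would open the sample scan files.
    return io.StringIO(json.dumps({"issues": [{"title": t} for t in titles]}))


class TestSketchParser(unittest.TestCase):
    def test_no_vuln(self):
        findings = SketchParser().get_findings(report(), None)
        self.assertEqual(0, len(findings))

    def test_one_vuln(self):
        findings = SketchParser().get_findings(report("test title"), None)
        self.assertEqual(1, len(findings))
        self.assertEqual("test title", findings[0]["title"])

    def test_many_vulns(self):
        findings = SketchParser().get_findings(report("a", "b", "c"), None)
        self.assertEqual(3, len(findings))
        with self.subTest(i=2):
            self.assertEqual("c", findings[2]["title"])
```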
To run your unit tests locally, you first need to grant some rights. Get your MySQL root password from the docker-compose logs, log in as root and issue the following commands:
MYSQL> grant all privileges on test_defectdojo.* to defectdojo@'%';
MYSQL> flush privileges;
This local command will launch the unit tests for your new parser:
$ docker-compose exec uwsgi bash -c 'python manage.py test unittests.tools.<your_unittest_py_file>.<main_class_name> -v2'
Example for the blackduck hub parser:
$ docker-compose exec uwsgi bash -c 'python manage.py test unittests.tools.test_blackduck_csv_parser.TestBlackduckHubParser -v2'
To run all the unit tests:
$ docker-compose exec uwsgi bash -c 'python manage.py test unittests -v2'
Some types of parsers create a list of endpoints that are vulnerable (they are stored in finding.unsaved_endpoints). DefectDojo requires storing endpoints in a specific format (which follows RFCs). Endpoints that do not follow this format can be stored, but they will be marked as broken (red flag 🚩 in the UI). To be sure your parser stores endpoints in the correct format, run the .clean() function for all endpoints in unit tests:
findings = parser.get_findings(testfile, Test())
for finding in findings:
    for endpoint in finding.unsaved_endpoints:
        endpoint.clean()
Not only the parser but also the importer should be tested.
The patch method from unittest.mock is usually useful for simulating API responses.
It is highly recommended to use it.
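A minimal sketch of simulating an API response with unittest.mock follows. `MytoolApiClient`, `MytoolImporter` and `get_issues` are hypothetical names introduced for illustration; your own client and importer will have different methods, so treat this as the shape of the technique rather than the project's actual classes.

```python
from unittest.mock import patch


class MytoolApiClient:
    """Hypothetical client; get_issues would normally call the remote API."""

    def get_issues(self, project_id):
        raise RuntimeError("no network access in unit tests")


class MytoolImporter:
    """Hypothetical importer that prepares the client and processes its results."""

    def get_findings(self, project_id):
        issues = MytoolApiClient().get_issues(project_id)
        return [issue["title"] for issue in issues]


def run_importer_with_fake_api():
    fake_response = [{"title": "SQL injection"}, {"title": "XSS"}]
    # Replace the network call with canned data for the duration of the block.
    with patch.object(MytoolApiClient, "get_issues", return_value=fake_response) as mocked:
        titles = MytoolImporter().get_findings("project-1")
    # The importer should have asked the client for exactly this project.
    mocked.assert_called_once_with("project-1")
    return titles
```

Patching the client method rather than the importer keeps the importer's own processing under test while removing the dependency on a live service.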
In the event where you’d have to change the model, e.g. to increase a database column size to accommodate a longer string of data to be saved:
Change what you need in dojo/models.py.
Create a new migration file in dojo/db_migrations by running the following command, and include it as part of your PR:
$ docker-compose exec uwsgi bash -c 'python manage.py makemigrations -v2'
If you want to be able to accept a new type of file for your parser, take a look at dojo/forms.py around line 436 (at the time of this writing) or locate the 2 places (for import and re-import) where you find the string attrs={"accept":.
Formats currently accepted: .xml, .csv, .nessus, .json, .html, .js, .zip.
Of course, nothing prevents you from having more files than the parser.py file. It’s Python :-)
If you want to take a look at previous parsers that are now part of DefectDojo, take a look at https://github.com/DefectDojo/django-DefectDojo/pulls?q=is%3Apr+sort%3Aupdated-desc+label%3A%22Import+Scans%22+is%3Aclosed
Please update [docs/content/en/integrations/parsers.md] with the details of your new parser.