To classify reports with the Phase 02 models, use nmrezman.phase02.classify.classifier(). This function takes a raw report, preprocesses the note, loads the model weights, and

  • determines whether there are lung findings, adrenal findings, or no findings,

  • if findings are present, identifies the portion of the note that led to that decision,

  • and, if there are lung findings, determines whether a chest CT is recommended.
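
The three-stage flow described above can be sketched as follows. This is only an illustration of the decision logic, not the package's implementation: the function `classify_sketch`, its three model callables, the label strings, and the keyword rules standing in for the trained models are all hypothetical.

```python
from typing import Callable, Dict, Optional


def classify_sketch(
    report: str,
    findings_model: Callable[[str], str],
    comment_model: Callable[[str], str],
    lung_proc_model: Callable[[str], bool],
) -> Dict[str, Optional[object]]:
    """Illustrative three-stage flow; all model callables are hypothetical."""
    text = report.strip().lower()  # stand-in for the real preprocessing

    # Stage 1: lung findings, adrenal findings, or no findings.
    finding = findings_model(text)
    if finding == "none":
        return {"finding": "none", "relevant_text": None, "chest_ct": None}

    # Stage 2: only when findings exist, extract the relevant portion.
    relevant_text = comment_model(text)

    # Stage 3: only for lung findings, check for a chest CT recommendation.
    chest_ct = lung_proc_model(text) if finding == "lung" else None

    return {"finding": finding, "relevant_text": relevant_text, "chest_ct": chest_ct}


# Toy keyword rules standing in for the trained models (assumptions).
def toy_findings(text: str) -> str:
    if "pulmonary" in text or "lung" in text:
        return "lung"
    if "adrenal" in text:
        return "adrenal"
    return "none"


def toy_comment(text: str) -> str:
    # Return the first sentence that mentions a finding keyword.
    for sentence in text.split("."):
        if any(k in sentence for k in ("pulmonary", "lung", "adrenal")):
            return sentence.strip()
    return ""


def toy_lung_proc(text: str) -> bool:
    return "chest ct" in text


result = classify_sketch(
    "Several pulmonary micronodules. Chest CT in one year recommended.",
    toy_findings, toy_comment, toy_lung_proc,
)
print(result)
```

Note that the later stages run only when the earlier ones call for them, which mirrors the conditional wording of the list above.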

A script, nmrezman.phase02.classify.run_classifier, is provided to run report text through the classifier:

python -m nmrezman.phase02.classify.run_classifier --data_path /path/to/data/report.txt --model_path /path/to/checkpoints/phase02/
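
To run several report files through the script, the same command can be assembled programmatically. A minimal sketch, assuming only the two flags shown above; the helper name `build_command` is hypothetical, and the paths are the placeholders from the command above.

```python
import sys
from typing import List


def build_command(report_path: str, model_path: str) -> List[str]:
    """Assemble the run_classifier invocation for one report file."""
    return [
        sys.executable, "-m", "nmrezman.phase02.classify.run_classifier",
        "--data_path", report_path,
        "--model_path", model_path,
    ]


cmd = build_command("/path/to/data/report.txt", "/path/to/checkpoints/phase02/")
print(" ".join(cmd))
# To execute it (requires nmrezman to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.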

nmrezman.phase02.classify.classifier(data: str, model_path: str) → Dict[str, object]

Results Management Classifier using masked language models (MLM) according to Phase 02 of the project.

Parameters:

  • data (str) – Radiologist report text.

  • model_path (str) – Path to the folder containing the model checkpoint folders. The model weights should be in folders named

    • findings_model

    • comment_model

    • lung_recommended_proc_model

    for the (i) Lung, Adrenal, or No Findings model, (ii) Comment Extraction model, and (iii) Lung Recommended Procedure model, respectively.
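
Because the classifier expects three specific subfolders, it can help to check the layout before loading any weights. A minimal sketch, assuming only the folder names listed above; the helper `missing_model_folders` is hypothetical and not part of the package.

```python
import tempfile
from pathlib import Path
from typing import List

# Folder names taken from the parameter description above.
EXPECTED_FOLDERS = ("findings_model", "comment_model", "lung_recommended_proc_model")


def missing_model_folders(model_path: str) -> List[str]:
    """Return the expected checkpoint folder names absent from model_path."""
    root = Path(model_path)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


# Demonstration on a temporary directory that holds only one of the folders.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "findings_model").mkdir()
    missing = missing_model_folders(tmp)
    print(missing)  # ['comment_model', 'lung_recommended_proc_model']
```

An empty list means all three folders are present; the check says nothing about whether the weights inside them are valid.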


Returns: A dictionary with (1) the recommended procedure, (2) the nodule type (if found), (3) a boolean indicating whether a follow-up is required, and (4) the follow-up text (i.e., the text of the report that indicates the finding), stored under the keys “procedure”, “noduleType”, “followUpFlag”, and “followUpText”, respectively.


>>> from nmrezman.phase02.classify import classifier
>>> report_txt = "a string with the radiology report text"
>>> model_path = "/path/to/checkpoints/phase02/"
>>> output = classifier(report_txt, model_path)
>>> print("Output:")
Output:
>>> for key, value in output.items():
...     print(f"  {key}:", value)
...
  procedure: Chest CT
  noduleType: Lung
  followUpFlag: Findings Present
  followUpText: several pulmonary micronodules. follow-up in one year recommended.
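
Downstream code will usually want to confirm that all four keys are present before using the result. A minimal sketch, assuming only the four key names documented above; the helper `format_result` is hypothetical, the sample values are copied from the example output, and no assumption is made about the type of "followUpFlag".

```python
from typing import Dict

# Key names taken from the return-value description above.
EXPECTED_KEYS = ("procedure", "noduleType", "followUpFlag", "followUpText")


def format_result(output: Dict[str, object]) -> str:
    """Check that all documented keys exist and render them one per line."""
    missing = [key for key in EXPECTED_KEYS if key not in output]
    if missing:
        raise KeyError(f"classifier output is missing keys: {missing}")
    return "\n".join(f"{key}: {output[key]}" for key in EXPECTED_KEYS)


# Sample dictionary mirroring the example output above (not a real model run).
sample = {
    "procedure": "Chest CT",
    "noduleType": "Lung",
    "followUpFlag": "Findings Present",
    "followUpText": "several pulmonary micronodules. follow-up in one year recommended.",
}
print(format_result(sample))
```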