Classify

To classify reports with the Phase 02 models, use nmrezman.phase02.classify.classifier(). This function takes a raw report, preprocesses the note, loads the model weights, and:

- determines whether there are lung findings, adrenal findings, or no findings;
- if findings are present, identifies the portion of the note that led to that decision; and
- if there are lung findings, determines whether a chest CT is recommended.
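
The decision flow described above can be sketched as a small cascade. This is only an illustration under stated assumptions: the three callables are hypothetical stand-ins for the trained models, the label strings and result keys are invented for the sketch, and the preprocessing step is a placeholder. It does not reproduce the internals of classifier().

```python
from typing import Callable, Dict, Optional


def classify_sketch(
    report: str,
    findings_model: Callable[[str], str],    # hypothetical: returns "Lung", "Adrenal" or "No Findings"
    comment_model: Callable[[str], str],     # hypothetical: returns the relevant portion of the note
    lung_proc_model: Callable[[str], bool],  # hypothetical: True if a chest CT is recommended
) -> Dict[str, Optional[object]]:
    # Placeholder preprocessing (assumption): collapse whitespace and lowercase.
    text = " ".join(report.split()).lower()

    # Step 1: lung findings, adrenal findings, or no findings.
    finding = findings_model(text)
    result: Dict[str, Optional[object]] = {
        "finding": finding,
        "relevant_text": None,
        "chest_ct_recommended": None,
    }
    if finding == "No Findings":
        return result

    # Step 2: only when findings are present, extract the relevant text.
    result["relevant_text"] = comment_model(text)

    # Step 3: only for lung findings, check for a chest CT recommendation.
    if finding == "Lung":
        result["chest_ct_recommended"] = lung_proc_model(text)
    return result
```

Each later step runs only when the earlier one calls for it, so a report with no findings never reaches the second or third model.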

The nmrezman.phase02.classify.run_classifier script runs report text through the classifier from the command line:

python -m nmrezman.phase02.classify.run_classifier --data_path /path/to/data/report.txt --model_path /path/to/checkpoints/phase02/
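
To run the script over several report files, the same command can be assembled and launched from Python. The sketch below assumes one report per .txt file in a directory; the helper names build_command and run_all are hypothetical, and only the module name and the --data_path and --model_path flags come from the command above.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_command(report_path: str, model_path: str) -> List[str]:
    # Mirrors the documented invocation, using the current Python interpreter.
    return [
        sys.executable, "-m", "nmrezman.phase02.classify.run_classifier",
        "--data_path", report_path,
        "--model_path", model_path,
    ]


def run_all(report_dir: str, model_path: str) -> None:
    # Assumption: every *.txt file in report_dir holds a single report.
    for report in sorted(Path(report_dir).glob("*.txt")):
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(build_command(str(report), model_path), check=True)
```

Each file starts a separate process, so the model weights are loaded once per report. For large batches, calling classifier() directly from one Python process is likely to be faster.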

nmrezman.phase02.classify.classifier(data: str, model_path: str) → Dict[str, object]

Results Management Classifier using masked language models (MLM) according to Phase 02 of the project.

Parameters

data (str) – Radiologist report text
model_path (str) – Path to the folder containing the model checkpoint folders
Note
The model weights should be in folders named:
- findings_model
- comment_model
- lung_recommended_proc_model
for the (i) Lung, Adrenal, or No Findings model, (ii) Comment Extraction model, and (iii) Lung Recommended Procedure model, respectively.
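
Before calling the classifier, it can help to confirm that model_path has the expected layout. The helper below is a hypothetical sketch, not part of nmrezman; it only checks that the three folder names listed in the note exist, not that they contain valid weights.

```python
from pathlib import Path
from typing import List

# Folder names taken from the note above.
EXPECTED_FOLDERS = ("findings_model", "comment_model", "lung_recommended_proc_model")


def missing_model_folders(model_path: str) -> List[str]:
    # Return the expected folder names that are not directories under model_path.
    root = Path(model_path)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
```

An empty list means all three folders are present.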

Returns

A dictionary with the keys “procedure”, “noduleType”, “followUpFlag”, and “followUpText”, which hold, respectively, (1) the recommended procedure, (2) the nodule type (if found), (3) a boolean indicating whether a follow-up is required, and (4) the follow-up text (i.e., the text of the report that indicates the finding).
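
Downstream code may want to read the four documented keys in a uniform way. The sketch below is a hypothetical helper, not part of nmrezman. It uses dict.get so that a missing key becomes None, because this page does not say whether keys such as “noduleType” are omitted or given a placeholder value when nothing is found.

```python
from typing import Dict, Optional

# Key names taken from the return description above.
DOCUMENTED_KEYS = ("procedure", "noduleType", "followUpFlag", "followUpText")


def pick_documented_fields(output: Dict[str, object]) -> Dict[str, Optional[object]]:
    # Keep only the documented keys; absent keys map to None.
    return {key: output.get(key) for key in DOCUMENTED_KEYS}
```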

Example:

>>> from nmrezman.phase02.classify import classifier
>>> report_txt = "a string with the radiology report text"
>>> model_path = "/path/to/checkpoints/phase02/"
>>> output = classifier(report_txt, model_path)
>>> print("Output:")
Output:
>>> for key, value in output.items():
...     print(f"  {key}:", value)
...
  procedure: Chest CT
  noduleType: Lung
  followUpFlag: Findings Present
  followUpText: several pulmonary micronodules. follow-up in one year recommended.