Classify

To classify reports with the Phase 01 models, use nmrezman.phase01.classify.classifier(). This function takes a raw report, preprocesses the note, loads the model weights, and

  • determines whether the report contains findings;

  • if findings are present, determines whether they are lung or adrenal findings;

  • if findings are present, identifies the portion of the note that drove that decision;

  • if there are lung findings, determines whether a chest CT is recommended.
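
The order of these decisions can be sketched as plain control flow. This is only an illustration, not the package's implementation: the four callables are hypothetical stand-ins for the trained models, and the result keys, the "Lung" / "Adrenal" labels, and the use of None for undecided fields are assumptions made for the sketch.

```python
from typing import Callable, Dict, Optional


def cascade(
    report: str,
    has_findings: Callable[[str], bool],         # hypothetical: findings vs. no findings
    finding_type: Callable[[str], str],          # hypothetical: returns "Lung" or "Adrenal"
    relevant_text: Callable[[str], str],         # hypothetical: extracts the deciding text
    recommends_chest_ct: Callable[[str], bool],  # hypothetical: lung recommendation
) -> Dict[str, Optional[object]]:
    """Apply the four decisions in the order described above."""
    # Assumed result layout for this sketch; fields stay None until decided.
    result: Dict[str, Optional[object]] = {
        "findings": False, "type": None, "text": None, "chest_ct": None,
    }
    if not has_findings(report):
        return result  # no findings: the later models are never consulted

    result["findings"] = True
    result["type"] = finding_type(report)
    result["text"] = relevant_text(report)
    # The chest CT question is only asked for lung findings.
    if result["type"] == "Lung":
        result["chest_ct"] = recommends_chest_ct(report)
    return result


if __name__ == "__main__":
    # Toy keyword rules in place of real models, purely for demonstration.
    demo = cascade(
        "several pulmonary micronodules. follow-up in one year recommended.",
        has_findings=lambda r: "nodule" in r,
        finding_type=lambda r: "Lung" if "pulmonary" in r else "Adrenal",
        relevant_text=lambda r: r,
        recommends_chest_ct=lambda r: "follow-up" in r,
    )
    print(demo)
```

Passing the models in as callables keeps the branching visible on its own: an adrenal finding never reaches the chest CT step, and a report without findings returns immediately.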

A script nmrezman.phase01.classify.run_classifier is provided to easily run report text through the classifier:

python -m nmrezman.phase01.classify.run_classifier --data_path /path/to/data/report.txt --model_path /path/to/checkpoints/phase01/

nmrezman.phase01.classify.classifier(data: str, model_path: str) → Dict[str, object]

Results Management Classifier using biLSTMs according to Phase 01 of the project.

Parameters
  • data (str) – Radiologist report

  • model_path (str) –

    Path to the folder with model checkpoints and tokenizer

    Note

    The model weights and tokenizer should be located in the specified folder as:
    • findings_best_model.h5

    • comment_best_model.sav

    • lung_adrenal_best_model.h5

    • lung_recommend_best_model.h5

    • tokenizer.gz

    for the (i) Findings vs. No Findings model, (ii) Comment Extraction model, (iii) Lung vs. Adrenal Findings model, (iv) Lung Recommended Procedure model, and (v) tokenizer, respectively.
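
    Before calling the classifier, it can be convenient to confirm that the folder actually contains these files. The helper below is a minimal sketch and is not part of nmrezman: the function name missing_checkpoint_files is hypothetical, and it only checks that each file exists, not that its contents are valid.

```python
from pathlib import Path
from typing import List

# File names as listed in the note above.
EXPECTED_FILES = (
    "findings_best_model.h5",
    "comment_best_model.sav",
    "lung_adrenal_best_model.h5",
    "lung_recommend_best_model.h5",
    "tokenizer.gz",
)


def missing_checkpoint_files(model_path: str) -> List[str]:
    """Return the expected file names that are absent from model_path."""
    folder = Path(model_path)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files("/path/to/checkpoints/phase01/")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```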

Returns

A dictionary with the (1) recommended procedure, (2) nodule type (if found), (3) flag indicating whether a follow-up is required, and (4) follow-up text (i.e., the text of the report that indicates the finding), stored under the dictionary keys “procedure”, “noduleType”, “followUpFlag”, and “followUpText”, respectively.

Example:

>>> from nmrezman.phase01.classify import classifier
>>> report_txt = "a string with the radiology report text"
>>> model_path = "/path/to/checkpoints/phase01/"
>>> output = classifier(report_txt, model_path)
>>> print("Output:")
Output:
>>> for key, value in output.items():
...     print(f"  {key}:", value)
...
  procedure: Chest CT
  noduleType: Lung
  followUpFlag: Findings Present
  followUpText: several pulmonary micronodules. follow-up in one year recommended.
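
To run many reports at once, the same call can be wrapped in a small loop. The sketch below is an assumption-laden illustration rather than a utility shipped with the package: classify_folder is a hypothetical name, the reports are assumed to be UTF-8 .txt files in a single folder, and the classifier is passed in as a callable so the loop does not depend on how the models are loaded.

```python
from pathlib import Path
from typing import Callable, Dict, Iterator, Tuple

# Keys documented in the Returns section above.
EXPECTED_KEYS = {"procedure", "noduleType", "followUpFlag", "followUpText"}


def classify_folder(
    report_dir: str,
    classify: Callable[[str], Dict[str, object]],
) -> Iterator[Tuple[str, Dict[str, object]]]:
    """Yield (file name, result) for every .txt report in report_dir."""
    # Sorted so that results come back in a stable, reproducible order.
    for path in sorted(Path(report_dir).glob("*.txt")):
        result = classify(path.read_text(encoding="utf-8"))
        # Fail early if a result does not carry the documented keys.
        missing = EXPECTED_KEYS - set(result)
        if missing:
            raise KeyError(f"{path.name}: result lacks keys {sorted(missing)}")
        yield path.name, result
```

With the package installed, classify could be functools.partial(classifier, model_path=model_path), which matches the classifier(data, model_path) signature documented above.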