zensols.sdoh package¶

Submodules¶

zensols.sdoh.app module¶

A model that predicts Social Determinants of Health.

class zensols.sdoh.app.Application(config_factory)[source]¶

Bases: object

A model that predicts social determinants of health.

__init__(config_factory)¶

config_factory: ConfigFactory¶

few_shot_process(clear=False, max_sents=None)[source]¶

Predict SDOHs on the configured corpus as few-shot examples.

Parameters:

clear (bool) – whether to first clear previous results
max_sents (int) – the max number of sentences to process

test_process(clear=False, max_sents=None)[source]¶

Predict SDOHs on the configured test corpus.

Parameters:

clear (bool) – whether to first clear previous results
max_sents (int) – the max number of sentences to process
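
The clear and max_sents parameters shared by few_shot_process and test_process can be read as in the sketch below. It is a minimal illustration, not the package's implementation: the process function, the in-memory result cache, and the placeholder prediction are all hypothetical.

```python
from typing import Dict, Iterable, Optional

# Hypothetical stand-in for results stored by an earlier run.
_results: Dict[str, str] = {}


def process(sents: Iterable[str], clear: bool = False,
            max_sents: Optional[int] = None) -> Dict[str, str]:
    """Illustrate ``clear`` and ``max_sents`` (hypothetical helper)."""
    if clear:
        # drop results left over from an earlier run
        _results.clear()
    for i, sent in enumerate(sents):
        if max_sents is not None and i >= max_sents:
            # stop once the sentence budget is used up
            break
        if sent not in _results:
            # placeholder for a model prediction
            _results[sent] = 'predicted'
    return dict(_results)
```

With these assumed semantics, max_sents=None processes every sentence, and clear=True discards earlier results before any new ones are stored.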

zensols.sdoh.binlabel module¶

A class to reduce multilabel output.

class zensols.sdoh.binlabel.BinaryHotCodeOutcomeReducer(model_settings)[source]¶

Bases: object

__init__(model_settings)¶

model_settings: ModelSettings¶

Configures the model.

class zensols.sdoh.binlabel.BinarySingleOutputBatchIterator(executor, logger)[source]¶

Bases: BatchIterator
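
This page does not describe how the multilabel output is reduced. As one plausible reading, a multi-hot ("binary hot code") prediction row can be reduced to the names of the labels it switches on. The sketch below assumes that reading; reduce_multi_hot and the label names are hypothetical and are not part of this module.

```python
from typing import Sequence, Tuple


def reduce_multi_hot(row: Sequence[int],
                     labels: Sequence[str]) -> Tuple[str, ...]:
    """Return the labels whose position in ``row`` is set to 1."""
    if len(row) != len(labels):
        raise ValueError('row and labels must have the same length')
    return tuple(lb for bit, lb in zip(row, labels) if bit == 1)


# placeholder label names, used for illustration only
LABELS = ('label_a', 'label_b', 'label_c')
active = reduce_multi_hot([1, 0, 1], LABELS)
```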

zensols.sdoh.cli module¶

Command line entry point to the application.

class zensols.sdoh.cli.ApplicationFactory(*args, **kwargs)[source]¶

Bases: ApplicationFactory

__init__(*args, **kwargs)[source]¶

zensols.sdoh.cli.main(args=sys.argv, **kwargs)[source]¶

Return type: ActionResult
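
Python evaluates a default argument once, when the function is defined. Assuming the source declares the default of args as sys.argv (an assumption, since the source code is not shown on this page), any tool that imports the module sees the command line of its own process as the default value. The sketch below uses a hypothetical stand-in for main to show that binding; it does not call the real entry point.

```python
import inspect
import sys
from typing import List


def main(args: List[str] = sys.argv) -> List[str]:
    """Hypothetical stand-in: return the arguments after the program name."""
    return args[1:]


# the default is bound at definition time to the list sys.argv referred to
default = inspect.signature(main).parameters['args'].default
print(default is sys.argv)

# passing an explicit list bypasses the default entirely
print(main(['prog', '--help']))
```

Passing an explicit argument list, as in the last line, is the usual way to drive such an entry point from a test or an interactive session.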

zensols.sdoh.corpus module¶

SDoH Corpus.

class zensols.sdoh.corpus.SdohCorpusStash(dataframe_path, split_col, installer, resource, mimic_corpus_file, mimic_corpus_columns, none_label, synthetic_files)[source]¶

Bases: ResourceFeatureDataframeStash

__init__(dataframe_path, split_col, installer, resource, mimic_corpus_file, mimic_corpus_columns, none_label, synthetic_files)¶

get_labels(**kwargs) → Tuple[str, ...]¶

Return type: Tuple[str, ...]

mimic_corpus_columns: str¶

mimic_corpus_file: str¶

none_label: str¶

synthetic_files: Sequence[str]¶

zensols.sdoh.facade module¶

Application facade overrides default behavior.

class zensols.sdoh.facade.SdohModelFacade(config, config_factory=<property object>, progress_bar=True, progress_bar_cols='term', executor_name='executor', writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, predictions_dataframe_factory_class=<class 'zensols.deeplearn.result.pred.MultiLabelPredictionsDataFrameFactory'>, model_result_reporter_class=<class 'zensols.sdoh.facade.SdohModelResultReporter'>, result_name=None, suppress_transformer_warnings=True)[source]¶

Bases: MultilabelClassifyModelFacade

__init__(config, config_factory=<property object>, progress_bar=True, progress_bar_cols='term', executor_name='executor', writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, predictions_dataframe_factory_class=<class 'zensols.deeplearn.result.pred.MultiLabelPredictionsDataFrameFactory'>, model_result_reporter_class=<class 'zensols.sdoh.facade.SdohModelResultReporter'>, result_name=None, suppress_transformer_warnings=True)¶

model_result_reporter_class¶

alias of SdohModelResultReporter

class zensols.sdoh.facade.SdohModelResultReporter(result_manager, include_validation=True)[source]¶

Bases: ModelResultReporter

__init__(result_manager, include_validation=True)¶

zensols.sdoh.lituievner module¶

A vectorizer for the SDoH NER.

class zensols.sdoh.lituievner.SdohFeatureDocumentParser(config_factory, name, lang='en', model_name=None, token_feature_ids=<factory>, components=(), token_decorators=(), sentence_decorators=(), document_decorators=(), disable_component_names=None, token_normalizer=None, special_case_tokens=<factory>, doc_class=<class 'zensols.nlp.container.FeatureDocument'>, sent_class=<class 'zensols.nlp.container.FeatureSentence'>, token_class=<class 'zensols.nlp.tok.SpacyFeatureToken'>, remove_empty_sentences=None, reload_components=False, auto_install_model=False)[source]¶

Bases: SpacyFeatureDocumentParser

This works around a conflict in which two spaCy span extensions, one from MedCAT and one from the Lituiev et al. NER model, step on each other's toes.

The fix removes the cui extension added by MedCAT, since the SDoH NER component adds it again. The extension is a class-level attribute that should be shareable by both.

__init__(config_factory, name, lang='en', model_name=None, token_feature_ids=<factory>, components=(), token_decorators=(), sentence_decorators=(), document_decorators=(), disable_component_names=None, token_normalizer=None, special_case_tokens=<factory>, doc_class=<class 'zensols.nlp.container.FeatureDocument'>, sent_class=<class 'zensols.nlp.container.FeatureSentence'>, token_class=<class 'zensols.nlp.tok.SpacyFeatureToken'>, remove_empty_sentences=None, reload_components=False, auto_install_model=False)¶
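
spaCy keeps custom span extensions in a class-level registry, and Span.set_extension raises an error when a name is registered a second time without force=True. The sketch below imitates that behavior with a hypothetical Span stand-in written in plain Python (it does not import spaCy) to show why removing the cui extension first lets a second component register it again. In spaCy itself the corresponding calls are Span.has_extension, Span.remove_extension, and Span.set_extension.

```python
from typing import Any, Dict


class Span:
    """Hypothetical stand-in with a class-level extension registry."""
    _extensions: Dict[str, Any] = {}

    @classmethod
    def set_extension(cls, name: str, default: Any = None,
                      force: bool = False) -> None:
        # registering an existing name fails unless forced
        if name in cls._extensions and not force:
            raise ValueError(f"extension '{name}' already exists")
        cls._extensions[name] = default

    @classmethod
    def has_extension(cls, name: str) -> bool:
        return name in cls._extensions

    @classmethod
    def remove_extension(cls, name: str) -> Any:
        return cls._extensions.pop(name)


# the first component (MedCAT in the description above) registers ``cui``
Span.set_extension('cui', default='set by first component')

# the workaround: remove the attribute so it can be added again
if Span.has_extension('cui'):
    Span.remove_extension('cui')

# the second component (the SDoH NER) now registers it without error
Span.set_extension('cui', default='set by second component')
```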

class zensols.sdoh.lituievner.SdohSpacyFeatureVectorizer(name, config_factory, second_level, *args, **kwargs)[source]¶

Bases: SpacyFeatureVectorizer

Vectorizes SDoH features. This vectorizer must use sdoh_ (note the trailing underscore) as the feature name, because the reverse lookup on spacy.vocab.Vocab.strings does not work: the strings appear to be added only when the parser first predicts a class during the lifetime of the Python interpreter.

Citation:

Lituiev et al. (2023). Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients.

__init__(name, config_factory, second_level, *args, **kwargs)[source]¶

property first_level_labels: Tuple[str, ...]¶

The labels used by the SDoH NER prediction model for just the first level models (see the paper).

property labels: Dict[str, Tuple[str, ...]]¶

The labels used by the SDoH NER prediction model.

second_level: bool¶

Whether to use the second level models’ labels (see the paper).

property second_level_labels: Tuple[str, ...]¶

The labels used by the SDoH NER prediction model for models across both levels.
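
The relationship between second_level and the three label properties can be sketched as below. This is an illustration under stated assumptions, not the vectorizer's code: the LabelSelector class, the 'first' and 'second' dictionary keys, the active_labels property, and the label names are all hypothetical placeholders. The real labels come from the NER model described in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# placeholder label names keyed by (assumed) level name
_LABELS: Dict[str, Tuple[str, ...]] = {
    'first': ('label_a', 'label_b'),
    'second': ('label_a', 'label_b', 'label_b_detail'),
}


@dataclass
class LabelSelector:
    """Hypothetical holder that mirrors the properties documented above."""
    second_level: bool

    @property
    def labels(self) -> Dict[str, Tuple[str, ...]]:
        return dict(_LABELS)

    @property
    def first_level_labels(self) -> Tuple[str, ...]:
        return self.labels['first']

    @property
    def second_level_labels(self) -> Tuple[str, ...]:
        # labels for models across both levels
        return self.labels['second']

    @property
    def active_labels(self) -> Tuple[str, ...]:
        # pick the label set according to the ``second_level`` flag
        if self.second_level:
            return self.second_level_labels
        return self.first_level_labels
```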