zensols.sdoh package¶
Submodules¶
zensols.sdoh.app module¶
A model that predicts Social Determinants of Health.
zensols.sdoh.binlabel module¶
A class to reduce multilabel output.
zensols.sdoh.cli module¶
Command line entry point to the application.
- class zensols.sdoh.cli.ApplicationFactory(*args, **kwargs)[source]¶
Bases:
ApplicationFactory
zensols.sdoh.corpus module¶
SDoH Corpus.
- class zensols.sdoh.corpus.SdohCorpusStash(dataframe_path, split_col, installer, resource, mimic_corpus_file, mimic_corpus_columns, none_label, synthetic_files)[source]¶
Bases:
ResourceFeatureDataframeStash
- __init__(dataframe_path, split_col, installer, resource, mimic_corpus_file, mimic_corpus_columns, none_label, synthetic_files)¶
zensols.sdoh.facade module¶
Application facade overrides default behavior.
- class zensols.sdoh.facade.SdohModelFacade(config, config_factory=<property object>, progress_bar=True, progress_bar_cols='term', executor_name='executor', writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, predictions_dataframe_factory_class=<class 'zensols.deeplearn.result.pred.MultiLabelPredictionsDataFrameFactory'>, model_result_reporter_class=<class 'zensols.sdoh.facade.SdohModelResultReporter'>, result_name=None, suppress_transformer_warnings=True)[source]¶
Bases:
MultilabelClassifyModelFacade
- __init__(config, config_factory=<property object>, progress_bar=True, progress_bar_cols='term', executor_name='executor', writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, predictions_dataframe_factory_class=<class 'zensols.deeplearn.result.pred.MultiLabelPredictionsDataFrameFactory'>, model_result_reporter_class=<class 'zensols.sdoh.facade.SdohModelResultReporter'>, result_name=None, suppress_transformer_warnings=True)¶
- model_result_reporter_class¶
alias of
SdohModelResultReporter
zensols.sdoh.lituievner module¶
A vectorizer for the SDoH NER.
- class zensols.sdoh.lituievner.SdohFeatureDocumentParser(config_factory, name, lang='en', model_name=None, token_feature_ids=<factory>, components=(), token_decorators=(), sentence_decorators=(), document_decorators=(), disable_component_names=None, token_normalizer=None, special_case_tokens=<factory>, doc_class=<class 'zensols.nlp.container.FeatureDocument'>, sent_class=<class 'zensols.nlp.container.FeatureSentence'>, token_class=<class 'zensols.nlp.tok.SpacyFeatureToken'>, remove_empty_sentences=None, reload_components=False, auto_install_model=False)[source]¶
Bases:
SpacyFeatureDocumentParser
This fixes the issue of two spaCy span extensions (the one from MedCAT and the one from Lituiev et al.) conflicting with each other.
It does so by removing the
cui
extension added by MedCAT, since it will be added again in the SDoH NER component. This is a class-level attribute that should be shareable by both.
- __init__(config_factory, name, lang='en', model_name=None, token_feature_ids=<factory>, components=(), token_decorators=(), sentence_decorators=(), document_decorators=(), disable_component_names=None, token_normalizer=None, special_case_tokens=<factory>, doc_class=<class 'zensols.nlp.container.FeatureDocument'>, sent_class=<class 'zensols.nlp.container.FeatureSentence'>, token_class=<class 'zensols.nlp.tok.SpacyFeatureToken'>, remove_empty_sentences=None, reload_components=False, auto_install_model=False)¶
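The extension clash described for SdohFeatureDocumentParser can be sketched with a toy model in plain Python. This is not spaCy's implementation and not this package's code: ToySpan, register_first, register_second, and register_both are hypothetical names, and the default values are made up. The sketch only assumes what the description states, namely that the extension registry lives on the class and is therefore shared by every component in the process. In spaCy itself the corresponding class methods are Span.has_extension, Span.remove_extension, and Span.set_extension.

```python
class ToySpan:
    """Toy stand-in for a span class whose extensions live on the class."""

    # Class-level registry: shared by every component in the process.
    _extensions = {}

    @classmethod
    def has_extension(cls, name):
        return name in cls._extensions

    @classmethod
    def set_extension(cls, name, default=None):
        # Registering the same name twice is treated as an error.
        if name in cls._extensions:
            raise ValueError(f"extension '{name}' already exists")
        cls._extensions[name] = default

    @classmethod
    def remove_extension(cls, name):
        return cls._extensions.pop(name)


def register_first(span_cls):
    # Hypothetical first component (plays the role of MedCAT here).
    span_cls.set_extension("cui", default=-1)


def register_second(span_cls):
    # Hypothetical second component (plays the role of the SDoH NER here).
    span_cls.set_extension("cui", default=None)


def register_both(span_cls):
    register_first(span_cls)
    # Drop the first registration so the second component can add it again.
    if span_cls.has_extension("cui"):
        span_cls.remove_extension("cui")
    register_second(span_cls)
```

Calling register_first and then register_second directly raises ValueError in this toy model, while register_both succeeds because the shared registration is removed in between.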
- class zensols.sdoh.lituievner.SdohSpacyFeatureVectorizer(name, config_factory, second_level, *args, **kwargs)[source]¶
Bases:
SpacyFeatureVectorizer
Vectorizes SDoH features. Note that this vectorizer needs to use
sdoh_
(note the underscore) as the feature name. This is because the reverse lookup on
spacy.vocab.Vocab.strings
does not work: the strings appear to be added only when the parser sees classes it has not yet predicted during the lifetime of the Python interpreter.
Citation:
Lituiev et al. (2023) Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients
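The reverse-lookup caveat noted for SdohSpacyFeatureVectorizer can be illustrated with a toy string store in plain Python. This is a sketch under stated assumptions, not spaCy's StringStore: ToyStringStore, key_of, safe_lookup, and the label "housing" are all hypothetical. It assumes only what the note above says, that a label's string becomes retrievable by key after the label has been added, which may happen late in the life of the process.

```python
class ToyStringStore:
    """Toy key-to-string table; strings are retrievable only once added."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def key_of(text):
        # Deterministic toy key; a real store would use a hash function.
        return sum((i + 1) * ord(c) for i, c in enumerate(text))

    def add(self, text):
        key = self.key_of(text)
        self._by_key[key] = text
        return key

    def lookup(self, key):
        # Raises KeyError when the string was never added.
        return self._by_key[key]


def safe_lookup(store, key):
    """Return the string for ``key``, or None if it has not been added yet."""
    try:
        return store.lookup(key)
    except KeyError:
        return None


store = ToyStringStore()
key = ToyStringStore.key_of("housing")   # hypothetical label, not yet added
before = safe_lookup(store, key)         # reverse lookup fails at this point
store.add("housing")                     # the label is seen for the first time
after = safe_lookup(store, key)          # reverse lookup now succeeds
```

A feature that already carries the string form of the label never needs the key-to-string step, which is one way to read the requirement to use the underscore feature name.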