Healthcare NLP for Data Scientists
Hello everyone, and welcome to the Healthcare NLP for Data Scientists course, offered by John Snow Labs, the creator of the Healthcare NLP library!
In this course, you will explore the extensive functionality of John Snow Labs’ Healthcare NLP & LLM library. The course is designed to provide practical skills and industry insights for data science professionals in healthcare.
The course covers foundational NLP techniques, including clinical entity recognition, entity resolution, assertion status detection (negation detection), relation extraction, de-identification, text summarization, keyword extraction, and text classification. There are over 13 hours of lectures with 70+ Python notebooks for you to review and use. You’ll learn to leverage pre-trained models and train new models for your specific healthcare challenges.
We offer hands-on coding notebooks with lectures, along with accompanying blog posts, for you to review and apply. By the end of the program, you’ll be equipped with the skills and insights needed to excel in the dynamic landscape of healthcare NLP and LLMs.
To make the most of the course, we recommend that you first take the Spark NLP for Data Scientists course to gain an understanding of our library and platform, and that you have working experience with Python, some knowledge of the Spark DataFrame structure, and some knowledge of NLP. Of course, having some healthcare experience is always a plus.
You will need a Healthcare NLP trial license for the course, so please reach out and get one to get started with learning. Looking forward to seeing you in the course.
-
4. AverageEmbeddings (video lesson)
Learning Objectives:
Understand how to use AverageEmbeddings.
Become comfortable using the different parameters of the annotator.
-
5. BertSentenceChunkEmbeddings (video lesson)
Learning Objectives:
Understand how to use BertSentenceChunkEmbeddings.
Become comfortable using the different parameters of the annotator.
-
6. ChunkSentenceSplitter (video lesson)
This annotator splits documents or sentences at the provided chunks. The resulting parts can be named after the chunks used to split them.
Learning Objectives:
Understand how it is useful when you need to apply different models or analyses to different sections of your document (for example, different headers, clauses, items, etc.).
Become comfortable using the different parameters of the annotator.
-
7. EntityChunkEmbeddings (video lesson)
Learning Objectives:
Comprehend the need for Entity Chunk Embeddings and their relationship with BERT Sentence embeddings.
Understand the concept of a weighted average vector representation of related entity chunks.
Become comfortable using the different parameters of the annotator.
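The weighted-average idea named in the objectives above can be illustrated in plain Python. This is only a minimal sketch of the arithmetic, not the annotator's implementation: the vectors, the weights, and the `weighted_average` helper are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def weighted_average(vectors: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Return the weighted mean of equally sized vectors."""
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need one weight per vector")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


# Hypothetical 3-dimensional embeddings for a drug chunk and a related chunk.
drug_vec = [1.0, 0.0, 2.0]
strength_vec = [0.0, 4.0, 2.0]

# Hypothetical weights: the drug chunk counts three times as much.
combined = weighted_average([drug_vec, strength_vec], [3.0, 1.0])
print(combined)
```

A larger weight pulls the combined vector toward that chunk's embedding, which is why the choice of weights matters when related chunks are merged into one representation.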
-
8. AnnotationMerger (video lesson)
Learning Objectives:
Merge two or more annotation results of the same type in a Spark NLP pipeline.
-
9. Replacer (video lesson)
Replacer replaces entities in the original text with the values produced by the NameChunkObfuscatorApproach or DateNormalizer annotators.
Learning Objectives:
Understand how Replacer works.
Understand how Replacer can be used with the DateNormalizer annotator and in the de-identification process.
Become comfortable using the setUseReplacement parameter of the annotator.
-
10. Chunk2Token (video lesson)
Learning Objectives:
Understand how to use Chunk2Token.
Become comfortable using the different parameters of the annotator.
-
11. ChunkKeyPhraseExtraction (video lesson)
Learning Objectives:
Understand how to extract key phrases from texts.
Become comfortable using the different parameters of the ChunkKeyPhraseExtraction.
-
12. DateNormalizer (video lesson)
This annotator normalizes date chunks into a chosen format.
Learning Objectives:
Understand how it is useful when working with data from different sources, sometimes from different countries, that use different formats to represent dates.
Become comfortable using the different parameters of the annotator.
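To make the idea of normalizing date chunks concrete, here is a small standard-library sketch. It does not use the DateNormalizer annotator; the list of input formats, the day-first assumption for slash-separated dates, and the `normalize_date` helper are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional

# Hypothetical input formats that different data sources might use.
# Slash-separated dates are assumed to be day-first here.
KNOWN_FORMATS = ["%d/%m/%Y", "%Y-%m-%d", "%B %d, %Y", "%d %b %Y"]


def normalize_date(chunk: str, out_format: str = "%Y/%m/%d") -> Optional[str]:
    """Try each known format and re-emit the date in one target format."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(chunk.strip(), fmt)
        except ValueError:
            continue  # this format does not fit, try the next one
        return parsed.strftime(out_format)
    return None  # the chunk could not be read as a date


for raw in ["25/12/2021", "2021-12-25", "December 25, 2021", "next week"]:
    print(raw, "->", normalize_date(raw))
```

The first three inputs all normalize to the same string, while a chunk that matches no known format is returned as `None` rather than guessed.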
-
13. DrugNormalizer (video lesson)
This annotator is designed to normalize raw text extracted from clinical documents, such as web pages or XML files, converting it into sentences.
Learning Objectives:
Understand how it normalizes mentions of drugs in clinical text. You can utilize it to remove unwanted characters according to specific rules and perform lowercase formatting.
Become comfortable using the different parameters of the annotator.
-
14. IOBTagger (video lesson)
IOBTagger merges token tags and NER labels from chunks in the specified format. For example, the output columns of NerConverter and Tokenizer can be used as its inputs.
This notebook will cover the different parameters and usages of IOBTagger.
Learning Objectives:
Become comfortable using the different parameters of the IOBTagger.
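The merge of tokens and chunks into IOB tags can be sketched without Spark. The snippet below is an illustration of the tagging scheme only: the tuple layouts, the example chunks, and the `iob_tags` helper are hypothetical and do not reflect the annotator's internal data structures.

```python
from typing import List, Tuple

Token = Tuple[str, int, int]   # (text, begin, end), end is inclusive
Chunk = Tuple[str, int, int]   # (label, begin, end), end is inclusive


def iob_tags(tokens: List[Token], chunks: List[Chunk]) -> List[str]:
    """Assign B-/I-/O tags to tokens based on the chunk that covers them."""
    tags = []
    for _, t_begin, t_end in tokens:
        tag = "O"
        for label, c_begin, c_end in chunks:
            if c_begin <= t_begin and t_end <= c_end:
                prefix = "B" if t_begin == c_begin else "I"
                tag = f"{prefix}-{label}"
                break
        tags.append(tag)
    return tags


text = "Patient takes aspirin 100 mg daily"

# Whitespace tokenization with character offsets.
tokens: List[Token] = []
pos = 0
for word in text.split():
    begin = text.index(word, pos)
    end = begin + len(word) - 1
    tokens.append((word, begin, end))
    pos = end + 1

# Hypothetical chunks, as a chunk-producing step might return them.
chunks: List[Chunk] = [("DRUG", 14, 20), ("STRENGTH", 22, 27)]

tags = iob_tags(tokens, chunks)
for (word, _, _), tag in zip(tokens, tags):
    print(word, tag)
```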
-
15. NerDisambiguator (video lesson)
This annotator links words of interest, such as names of persons, locations, and companies, from an input text document to a corresponding unique entity in a target Knowledge Base (KB).
Learning Objectives:
Background: understand NER disambiguation.
Colab setup
Become comfortable using the different parameters of the annotator.
-
16. NerChunker (video lesson)
This annotator extracts phrases that fit a known pattern using the NER tags.
Learning Objectives:
Understand how NerChunker works.
Become comfortable using the Regex parameter of the annotator.
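One way to picture pattern-based chunking over NER tags is to encode the tag sequence as a string and run a regular expression over it. The sketch below does exactly that in plain Python. The `<TAG>` encoding, the pattern syntax, and the `chunk_by_tag_pattern` helper are assumptions for illustration and may differ from how the annotator's regex parameter is actually written.

```python
import re
from typing import List


def chunk_by_tag_pattern(tokens: List[str], tags: List[str],
                         pattern: str) -> List[str]:
    """Return token spans whose tag sequence matches the given pattern."""
    # Encode the tag sequence as one string, e.g. "<O><DRUG><STRENGTH>".
    encoded = "".join(f"<{t}>" for t in tags)
    # Record where each tag starts inside the encoded string.
    starts = []
    pos = 0
    for t in tags:
        starts.append(pos)
        pos += len(t) + 2  # two extra characters for "<" and ">"
    chunks = []
    for m in re.finditer(pattern, encoded):
        idx = [i for i, s in enumerate(starts) if m.start() <= s < m.end()]
        if idx:
            chunks.append(" ".join(tokens[i] for i in idx))
    return chunks


tokens = ["She", "takes", "aspirin", "100", "mg", "and", "metformin"]
tags = ["O", "O", "DRUG", "STRENGTH", "STRENGTH", "O", "DRUG"]

# Hypothetical pattern: a drug followed by one or more strength tokens.
found = chunk_by_tag_pattern(tokens, tags, r"<DRUG>(?:<STRENGTH>)+")
print(found)
```

The second drug mention is not returned because no strength token follows it, which shows how the pattern controls what counts as a phrase.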
-
17. Flattener (video lesson)
The Flattener converts annotation results into a format that is easier to use. This annotator produces a DataFrame with flattened and exploded columns containing annotation results, making it easier to interpret and analyze the information. It is particularly useful for extracting and organizing the results obtained from Spark NLP pipelines.
Learning Objectives:
Understand how to use the annotator.
Become comfortable using the different parameters of the annotator.
-
18. NerQuestionGenerator (video lesson)
Learning Objectives:
Understand how to use NerQuestionGenerator.
Become comfortable using the different parameters of the annotator.
Programmatically generate questions to be used by question-answering models.
-
19. InternalDocumentSplitter (video lesson)
This lecture covers the uses of InternalDocumentSplitter. This annotator is specifically designed to split documents into relevant sections.
Learning Objectives:
Understand how InternalDocumentSplitter works.
Become comfortable using the parameters of the annotator.
-
20. RegexMatcher (video lesson)
This notebook will cover the different parameters and usages of the RegexMatcher annotator. This annotator provides the ability to tag occurrences of regex patterns in raw text.
Learning Objectives:
Find occurrences of regular expression (regex) patterns in text
Set one or more regex rules and assign an identifier for each regex rule
Create and use external regex rules file
Change the matching strategy of RegexMatcher
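The objectives above (rules with identifiers, and a matching strategy) can be illustrated with the standard `re` module. This is a plain-Python sketch rather than the RegexMatcher API: the rules, the `"first"` and `"all"` strategy names, and the `match_rules` helper are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical rules: each regex pattern is paired with an identifier.
RULES = [
    (r"\b\d{1,3}\s?mg\b", "DOSAGE"),
    (r"\b\d{2}/\d{2}/\d{4}\b", "DATE"),
]


def match_rules(text: str, rules: List[Tuple[str, str]],
                strategy: str = "all") -> List[Tuple[str, str]]:
    """Tag regex matches with their rule identifier.

    strategy="all" keeps every match of each rule;
    strategy="first" keeps only the first match of each rule.
    """
    results = []
    for pattern, identifier in rules:
        found = [(m.group(0), identifier) for m in re.finditer(pattern, text)]
        results.extend(found[:1] if strategy == "first" else found)
    return results


text = "Given 50 mg on 01/02/2023, then 100mg on 05/02/2023."
all_matches = match_rules(text, RULES, strategy="all")
first_matches = match_rules(text, RULES, strategy="first")
print(all_matches)
print(first_matches)
```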
-
21. NerConverter (video lesson)
This annotator converts an IOB or IOB2 representation of a named entity to a user-friendly one by associating the tokens of recognized entities with their labels. Chunks with no associated entity (tagged “O”) are filtered out.
Learning Objectives:
Understand how NerConverterInternal works.
Become comfortable using the different parameters of the annotator.
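The conversion from IOB tags to entity chunks described above can be sketched in a few lines of plain Python. This is an illustration of the general technique, not the annotator's code; the `iob_to_chunks` helper and the example tags are hypothetical.

```python
from typing import List, Optional, Tuple


def iob_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (chunk text, label) pairs, dropping "O"."""
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag == "O":
            starts_new, label = False, None
        else:
            prefix, label = tag.split("-", 1)
            starts_new = prefix == "B" or label != current_label
        # Close the open chunk when we leave it or a new one begins.
        if current_label is not None and (label is None or starts_new):
            chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
        if label is not None:
            current_tokens.append(token)
            current_label = label
    if current_label is not None:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks


tokens = ["Patient", "denies", "chest", "pain", "and", "takes", "aspirin"]
tags = ["O", "O", "B-SYMPTOM", "I-SYMPTOM", "O", "O", "B-DRUG"]
entities = iob_to_chunks(tokens, tags)
print(entities)
```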
-
22. Ner Model Inference (video lesson)
We will examine the MedicalNerApproach to train a MedicalNerModel.
Learning Objectives:
Understand the meaning and usage of Named Entity Recognition.
Learn how to preprocess your data before training a model.
Understand how to train a Named Entity Recognition model using MedicalNerApproach.
-
23. NerModel (video lesson)
This named entity recognition annotator is a generic NER model based on neural networks. The architecture is Char CNNs - BiLSTM - CRF, which achieves state-of-the-art results on most datasets.
In the original framework, the CNN extracts a fixed-length feature vector from character-level features. For each word, these vectors are concatenated and fed to the BiLSTM network and then to the output layers. The authors employed a stacked bidirectional recurrent neural network with long short-term memory units to transform word features into named entity tag scores. The extracted features of each word are fed into a forward LSTM network and a backward LSTM network. The output of each network at each time step is decoded by a linear layer and a log-softmax layer into log-probabilities for each tag category. These two vectors are then simply added together to produce the final output. In the architecture proposed in the original paper, 50-dimensional pretrained word embeddings are used for word features, 25-dimensional character embeddings are used for character features, and capitalization features (allCaps, upperInitial, lowercase, mixedCaps, noinfo) are used for case features.
Learning Objectives:
Understand how to detect Named Entities by using pre-trained models.
Become comfortable using the different parameters of the annotator.
-
24. BertForTokenClassifier (video lesson)
We will examine the MedicalBertForTokenClassifier annotator for Named Entity Recognition (NER) tasks.
Learning Objectives:
Understand how to detect named entities by using BERT models for token classification.
Become comfortable using the different parameters of the annotator.
-
25. ChunkFilterer (video lesson)
Learning Objectives:
Understand how to use ChunkFilterer.
Become comfortable using the different parameters of the annotator.
-
26. ChunkFilterer Model Inference (video lesson)
Learning Objectives:
Understand how to set filters, via a white list of terms or a regular expression.
Become comfortable using the different parameters of the ChunkFilterer.
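The two filtering modes mentioned above, a white list of terms or a regular expression, can be sketched as follows. This is plain Python for illustration only: the `filter_chunks` helper, its parameter names, and the case-insensitive white list comparison are assumptions, not the ChunkFilterer API.

```python
import re
from typing import Iterable, List, Optional


def filter_chunks(chunks: Iterable[str],
                  white_list: Optional[Iterable[str]] = None,
                  regex: Optional[str] = None) -> List[str]:
    """Keep chunks that are on the white list or that match the regex."""
    allowed = {w.lower() for w in white_list} if white_list is not None else None
    kept = []
    for chunk in chunks:
        if allowed is not None and chunk.lower() in allowed:
            kept.append(chunk)
        elif regex is not None and re.search(regex, chunk):
            kept.append(chunk)
    return kept


chunks = ["aspirin", "Advil", "chest pain", "metformin 500 mg"]

by_list = filter_chunks(chunks, white_list=["advil", "aspirin"])
by_regex = filter_chunks(chunks, regex=r"\d+\s?mg")
print(by_list)
print(by_regex)
```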
-
27. ChunkMerge Model Inference (video lesson)
We will examine the ChunkMapperApproach to create a custom mapper model based on the given JSON file.
This annotator creates a mapper that maps chunks based on a pre-defined dictionary, without any machine learning or deep learning model.
Learning Objectives:
Understand how to create a mapper model by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
-
28. ChunkMergeModel (video lesson)
We will cover the different parameters and usages of ChunkMergeModel. This annotator provides the ability to merge chunk columns coming from two or more annotators using a model generated by ChunkMergeApproach. Parameters shared with ChunkMergeApproach can be used in ChunkMergeModel in the same way.
Learning Objectives:
Merge two or more chunk results in a Spark NLP pipeline.
Use the ChunkMergeModel annotator's parameters to get the desired outputs.
-
29. ChunkConverter (video lesson)
Learning Objectives:
Understand how to use ChunkConverter.
Become comfortable using the different parameters of the annotator.
-
30. ContextualParserModel (video lesson)
Learning Objectives:
Understand how to use ContextualParserModel.
Become comfortable using the different parameters of the annotator.
Train a ContextualParserApproach annotator and use that model with ContextualParserModel in the future.
-
31. ZeroShotNerModel (video lesson)
Learning Objectives:
Understand how to use ZeroShotNerModel.
Become comfortable using the different parameters of the annotator.
Identify clinical entities in text without training data.
-
32. ContextualParser Model Inference (video lesson)
Learning Objectives:
Understand how to use ContextualParserModel.
Become comfortable using the different parameters of the annotator.
Train a ContextualParserApproach annotator and use that model with ContextualParserModel in the future.
-
33. EntityRuler (video lesson)
This notebook will cover the different parameters and usages of EntityRuler. There are two annotators that perform this task in Spark NLP: EntityRulerApproach and EntityRulerModel.
EntityRulerApproach fits an annotator that matches exact strings or regex patterns provided in a file against a document and assigns them a named entity. The definitions can contain any number of named entities.
EntityRulerModel is the instantiated model of EntityRulerApproach.
Learning Objectives:
Understand how to extract entities with predefined regex patterns or match predefined exact strings.
Understand the difference between the EntityRulerApproach and EntityRulerModel.
Become comfortable using the different parameters of these annotators.
-
34. TextMatcher (video lesson)
In this notebook, we will examine the TextMatcherInternal annotator and its model version, TextMatcherInternalModel.
This annotator matches exact phrases provided in a file against a document.
Learning Objectives:
Understand how to match exact phrases by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
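Exact phrase matching against a dictionary can be pictured with a short standard-library sketch. It does not use TextMatcherInternal; the phrase list stands in for the contents of a dictionary file, and the `match_phrases` helper and its word-boundary rule are assumptions for illustration.

```python
import re
from typing import List, Tuple


def match_phrases(text: str, phrases: List[str]) -> List[Tuple[str, int, int]]:
    """Return (phrase, begin, end) for each exact match, end inclusive."""
    matches = []
    for phrase in phrases:
        # Escape the phrase so it is matched literally, on word boundaries.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for m in re.finditer(pattern, text):
            matches.append((m.group(0), m.start(), m.end() - 1))
    return sorted(matches, key=lambda item: item[1])


# Hypothetical dictionary entries, one phrase per line in a real file.
phrases = ["chronic kidney disease", "diabetes mellitus"]
text = "History of diabetes mellitus and chronic kidney disease."

matches = match_phrases(text, phrases)
for phrase, begin, end in matches:
    print(phrase, begin, end)
```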
-
35. AssertionChunkConverter (video lesson)
Learning Objectives:
Understand the meaning and use of assertion status.
Learn how to create a chunk column with metadata for training assertion status detection models.
Customize your assertion model by using the different parameters of the annotator.
-
36. AssertionFilterer (video lesson)
Learning Objectives:
Understand the meaning and use of assertion status.
Learn how to create a chunk column with metadata for training assertion status detection models.
Customize your assertion model by using the different parameters of the annotator.
-
37. AssertionDLModel (video lesson)
Learning Objectives:
Understand how to use AssertionDLModel.
Become comfortable using the different parameters of the annotator.
Build a pretrained pipeline using the AssertionDLModel annotator.
-
38. AssertionLogReg Model Inference (video lesson)
Learning Objectives:
Understand how to use AssertionLogRegApproach.
Become comfortable using the different parameters of the annotator.
-
39. AssertionLogRegModel (video lesson)
Learning Objectives:
Understand how to use AssertionLogRegModel.
Become comfortable using the different parameters of the annotator.
-
40. AssertionDL Model Inference (video lesson)
Learning Objectives:
Understand how to use AssertionDLApproach.
Become comfortable using the different parameters of the annotator.
Train an AssertionDLApproach annotator and use the resulting model with AssertionDLModel in the future.
-
41. RelationExtractionDLModel (video lesson)
This Relation Extraction annotator extracts and classifies instances of relations between named entities. In contrast with RelationExtractionModel, RelationExtractionDLModel is based on BERT.
Learning Objectives:
Understand how to extract and classify the relations between named entities by using pre-trained BERT models.
Become comfortable using the different parameters of the annotator.
-
42. RelationExtraction Model Inference Pt1 (video lesson)
We will examine the RelationExtractionApproach to train a RelationExtractionModel.
Learning Objectives:
Learn how to preprocess your data before training a Relation Extraction model.
Understand how to train a Relation Extraction model using RelationExtractionApproach.
Understand how to resume Relation Extraction model training.
-
43. RelationExtraction Model Inference Pt2 (video lesson)
We will examine the RelationExtractionApproach to train a RelationExtractionModel.
Learning Objectives:
Learn how to preprocess your data before training a Relation Extraction model.
Understand how to train a Relation Extraction model using RelationExtractionApproach.
Understand how to resume Relation Extraction model training.
-
44. RENerChunksFilter (video lesson)
The RENerChunksFilter annotator filters NER chunks, keeping only those that contain the desired NER entity pairs.
Learning Objectives:
Understand how to filter the desired relation pairs for the RelationExtractionDLModel.
Become comfortable using the different parameters of the annotator.
-
45. RelationExtractionModel (video lesson)
This Relation Extraction annotator extracts and classifies instances of relations between named entities.
Learning Objectives:
Understand how to extract and classify the relations between named entities by using pre-trained models.
Become comfortable using the different parameters of the annotator.
-
46. ZeroShotRelationExtractionModel (video lesson)
Learning Objectives:
Understand how to use ZeroShotRelationExtractionModel.
Become comfortable using the different parameters of the annotator.
Identify clinical relations in text without training data.
-
47. FeaturesAssembler (video lesson)
Learning Objectives:
Understand how to use FeaturesAssembler.
Use FeaturesAssembler with several columns.
Use FeaturesAssembler with embeddings.
-
48. DistilBertForSequenceClassification (video lesson)
Learning Objectives:
Become comfortable using the different parameters of the annotator.
-
49. BertForSequenceClassification (video lesson)
Learning Objectives:
Become comfortable using the different parameters of the annotator.
-
50. GenericClassifier Model Inference (video lesson)
Learning Objectives:
Understand how the annotator trains a TensorFlow model for generic classification of feature vectors.
Become comfortable using the different parameters of the annotator.
-
51. GenericSVMClassifierModel (video lesson)
Learning Objectives:
Background: Understand the 'GenericSVMClassifierModel' annotator.
Colab setup.
Become comfortable with using the different parameters of the annotator.
-
52. GenericLogRegClassifier Model Inference (video lesson)
Learning Objectives:
Understand how to use GenericLogRegClassifierApproach.
Become comfortable using the different parameters of the annotator.
-
53. GenericClassifierModel (video lesson)
Learning Objectives:
Understand how to map chunks by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
-
54. GenericSVMClassifier Model Inference (video lesson)
Learning Objectives:
Understand how the annotator trains a TensorFlow model for SVM classification of feature vectors.
Become comfortable using the different parameters of the annotator.
-
55. DocumentMLClassifier Model Inference (video lesson)
Learning Objectives:
Understand how to train a model to classify documents with a logistic regression algorithm.
Become comfortable using the different parameters of the annotator.
-
56. DocumentMLClassifierModel (video lesson)
Learning Objectives:
Background: Understand the 'DocumentMLClassifierModel' Annotator.
Colab setup.
Become comfortable with using the different parameters of the annotator.
-
57. FewShotClassifier (video lesson)
This lecture covers the uses of FewShotClassifierModel. This annotator specifically targets few-shot classification tasks, which involve training models to make accurate predictions with limited labeled data.
Learning Objectives:
Understand how FewShotClassifierModel works.
Become comfortable using the parameters of the annotator.
-
58. WindowedSentenceModel (video lesson)
This lecture will cover the different parameters and usages of WindowedSentenceModel. This annotator merges each sentence with the sentences before and after it, so that the surrounding context is added.
Learning Objectives:
Understand how it is especially useful for context-rich analyses that require a deeper understanding of the language being used.
Become comfortable using the different parameters of the annotator.
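Merging each sentence with its neighbours can be sketched in a few lines. This is a plain-Python illustration of the windowing idea, not the annotator itself; the `windowed_sentences` helper and its `window_size` and `glue` parameters are hypothetical names chosen for this example.

```python
from typing import List


def windowed_sentences(sentences: List[str], window_size: int = 1,
                       glue: str = " ") -> List[str]:
    """Join each sentence with up to window_size neighbours on each side."""
    windows = []
    for i in range(len(sentences)):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        windows.append(glue.join(sentences[start:end]))
    return windows


sentences = [
    "The patient was admitted with chest pain.",
    "She denies shortness of breath.",
    "Aspirin was started.",
]

for window in windowed_sentences(sentences, window_size=1):
    print(window)
```

Each output row still corresponds to one original sentence, but it now carries the text around it, which is the context a downstream classifier would see.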
-
59. DocumentLogRegClassifier (video lesson)
This lecture covers the uses of DocumentLogRegClassifier. This annotator uses a supervised learning algorithm that learns to classify documents (or text) into predefined categories or classes based on the content of the text.
Learning Objectives:
Understand how DocumentLogRegClassifier works.
Become comfortable using the parameters of the annotator.
-
60. DocumentFiltererByClassifier (video lesson)
This lecture covers the uses of DocumentFiltererByClassifier. This annotator is designed to filter documents based on outcomes generated by classifier annotators.
Learning Objectives:
Understand how DocumentFiltererByClassifier works.
Become comfortable using the parameters of the annotator.
-
61. Resolution2Chunk (video lesson)
Learning Objectives:
Become comfortable using the different parameters of the Resolution2Chunk.
-
62. ChunkMapperModel (video lesson)
This annotator maps chunks based on a pre-defined dictionary, without any machine learning or deep learning model.
Learning Objectives:
Understand how to map chunks by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
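Dictionary-based chunk mapping needs no trained model, which a short sketch makes clear. The example below is plain Python, not the ChunkMapperModel API: the dictionary contents, the relation names, and the `map_chunks` helper are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping dictionary: chunk text -> relation -> mapped value.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "metformin": {"action": "hypoglycemic", "treatment": "diabetes"},
    "aspirin": {"action": "analgesic", "treatment": "pain"},
}


def map_chunks(chunks: List[str], mappings: Dict[str, Dict[str, str]],
               relation: str) -> List[Tuple[str, Optional[str]]]:
    """Look each chunk up in the dictionary; unmapped chunks yield None."""
    return [
        (chunk, mappings.get(chunk.lower(), {}).get(relation))
        for chunk in chunks
    ]


chunks = ["Metformin", "aspirin", "warfarin"]
mapped = map_chunks(chunks, MAPPINGS, relation="treatment")
print(mapped)
```

Chunks that are missing from the dictionary come back as `None`, which is the kind of success or failure signal a later filtering step can act on.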
-
63. DocMapperModel (video lesson)
This annotator maps document-type strings based on a pre-defined dictionary, without any machine learning or deep learning model.
Learning Objectives:
Understand how to map document-based strings by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
-
64. DocMapper Model Inference (video lesson)
We will examine the DocMapperApproach to create a custom mapper model based on the given JSON file.
This annotator creates a mapper that maps document-type strings based on a pre-defined dictionary, without any machine learning or deep learning model.
Learning Objectives:
Understand how to create a mapper model by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
-
65. ChunkMapper Model Inference Pt1 (video lesson)
We will examine the ChunkMapperApproach to create a custom mapper model based on the given JSON file.
This annotator creates a mapper that maps chunks based on a pre-defined dictionary, without any machine learning or deep learning model.
Learning Objectives:
Understand how to create a mapper model by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
-
66. ChunkMapper Model Inference Pt2 (video lesson)
We will examine the ChunkMapperApproach to create a custom mapper model based on the given JSON file.
This annotator creates a mapper that maps chunks based on a pre-defined dictionary, without any machine learning or deep learning model.
Learning Objectives:
Understand how to create a mapper model by using a pre-defined dictionary.
Become comfortable using the different parameters of the annotator.
-
67. ChunkMapperFilterer (video lesson)
We will examine the ChunkMapperFilterer annotator.
This annotator filters the chunks based on whether the ChunkMapperModel annotator successfully mapped the chunk or not.
Learning Objectives:
Understand how to filter the chunks that were passed through the ChunkMapperModel.
Become comfortable using the different parameters of the annotator.
-
68. Doc2Chunk (video lesson)
Learning Objectives:
Understand how to use Doc2ChunkInternal.
Become comfortable using the different parameters of the annotator.
-
69. Router (video lesson)
This annotator provides the ability to split the output of an annotator by a selected metadata field and the value of that field.
Learning Objectives:
Use the Router annotator's parameters to get the desired outputs.
Use Router to optimize pipelines that combine sentence embeddings with multiple sentence entity resolver models.
-
70. SentenceEntityResolverModel (video lesson)
This annotator maps clinical entities to a particular ontology or curated dataset using sentence embeddings.
Learning Objectives:
Understand the application and relevance of these models in healthcare data analysis
Map clinical entities to standard codes (ICD-10, RxNorm, SNOMED, etc.)
Become comfortable using the different parameters of the annotator.
-
71. SentenceEntityResolver Model Inference (video lesson)
This annotator trains a SentenceEntityResolverModel that maps sentence embeddings to entities in a knowledge base.
Learning Objectives:
Understand the application and relevance of these models in healthcare data analysis, particularly in coding and classification tasks related to healthcare ontologies like ICD-10, RxNorm, SNOMED, etc.
Become comfortable using the different parameters of the annotator.
-
72. DeIdentification_DeIdentificationModel Pt1 (video lesson)
This annotator provides the ability to obfuscate or mask entities that contain personal information.
Learning Objectives:
Background: understand the de-identification module.
Colab setup
Become comfortable with de-identification using the different parameters of the annotator.
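The difference between masking and obfuscating an entity can be shown with a small plain-Python sketch. It is not the DeIdentification annotator: the entity offsets, the fake values, and the `deidentify` helper with its `mode` parameter are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Entity = Tuple[int, int, str]  # (begin, end, label), end is inclusive


def deidentify(text: str, entities: List[Entity], mode: str = "mask",
               fake_values: Optional[Dict[str, str]] = None) -> str:
    """Replace each entity with its label (mask) or a fake value (obfuscate)."""
    out = text
    # Work from right to left so earlier offsets stay valid.
    for begin, end, label in sorted(entities, reverse=True):
        if mode == "mask":
            replacement = f"<{label}>"
        else:
            replacement = (fake_values or {})[label]
        out = out[:begin] + replacement + out[end + 1:]
    return out


text = "John Smith was admitted on 2023-01-05."
name, day = "John Smith", "2023-01-05"

# Hypothetical entities, as an upstream NER step might produce them.
entities: List[Entity] = [
    (text.index(name), text.index(name) + len(name) - 1, "NAME"),
    (text.index(day), text.index(day) + len(day) - 1, "DATE"),
]

masked = deidentify(text, entities, mode="mask")
obfuscated = deidentify(
    text, entities, mode="obfuscate",
    fake_values={"NAME": "Jane Doe", "DATE": "2023-02-11"},
)
print(masked)
print(obfuscated)
```

Masking makes it obvious that something was removed, while obfuscation keeps the text readable by substituting realistic-looking values.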
-
73. DeIdentification_DeIdentificationModel Pt2 (video lesson)
This annotator provides the ability to obfuscate or mask entities that contain personal information.
Learning Objectives:
Background: understand the de-identification module.
Colab setup
Become comfortable with de-identification using the different parameters of the annotator.
-
74. ReIdentification (video lesson)
This annotator re-identifies entities that were obfuscated by DeIdentification. It requires the outputs of the de-identification step as input. The input columns need to be the de-identified document and the de-identification mappings set with DeIdentification.setMappingsColumn.
Learning Objectives:
Background: understand de-identification followed by re-identification.
Colab setup
Become comfortable using the different parameters of the annotator.
-
75. ResolverMerger (video lesson)
This annotator provides the ability to merge the output columns of sentence entity resolver and chunk mapper models.
Learning Objectives:
Merge sentence entity resolver and chunk mapper results in a Spark NLP pipeline.
-
76. NameChunkObfuscator Model Inference (video lesson)
This module can replace name entities with consistent fake names (fakers).
Learning Objectives:
Obfuscation background
Colab setup
Become comfortable using the different parameters of the annotator.
-
77. NameChunkObfuscator (video lesson)
This annotator transforms a dataset with an input annotation of type CHUNK into its obfuscated version by obfuscating the given chunks. It can replace name entities with consistent fake names (fakers) while leaving everything else unchanged.
Learning Objectives:
Obfuscation background
Colab setup
Become comfortable using the different parameters of the annotator.
-
78. DocumentHashCoder (video lesson)
The DocumentHashCoder annotator determines date-shift information for de-identification purposes.
This annotator computes the hash of the specified column and creates a new document column containing day-shift information.
Learning Objectives:
Understand how to shift days in Deidentification tasks by using DocumentHashCoder.
Become comfortable using the different parameters of the annotator.
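The idea of deriving a day shift from a hash can be sketched with the standard library. The lesson text does not specify the annotator's actual algorithm, so everything here is an assumption: the SHA-256 hash, the 1 to 60 day range, and the `day_shift` and `shift_date` helpers are hypothetical.

```python
import hashlib
from datetime import date, timedelta


def day_shift(patient_id: str, max_shift: int = 60) -> int:
    """Derive a stable shift between 1 and max_shift days from an identifier."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % max_shift + 1


def shift_date(original: date, patient_id: str) -> date:
    """Move a date forward by the shift that belongs to this identifier."""
    return original + timedelta(days=day_shift(patient_id))


admitted = date(2023, 1, 5)
discharged = date(2023, 1, 12)

# Hypothetical identifier column value.
pid = "patient-001"
shifted_admitted = shift_date(admitted, pid)
shifted_discharged = shift_date(discharged, pid)

print(day_shift(pid))
print(shifted_admitted, shifted_discharged)
```

Because the shift depends only on the identifier, every date for the same patient moves by the same number of days, so the interval between admission and discharge is preserved.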
-
79. Summarizer (video lesson)
Learning Objectives:
Background: Understand the MedicalSummarizer Annotator.
Colab setup.
Become comfortable with using the different parameters of the annotator.
-
80. ExtractiveSummarization (video lesson)
Learning Objectives:
Background: Understand the 'ExtractiveSummarization' Annotator.
Colab setup.
Become comfortable with using the different parameters of the annotator.