Human evolution is all about people becoming ever smarter and more technology-savvy. At the outset of the third millennium, we have become intelligent enough to delegate much of our work (especially repetitive and monotonous assignments) to machines. Moreover, the digitalization drive known as Industry 4.0 has ushered in artificial intelligence know-how that aims to make machines as human-like as possible.
The umbrella notion of AI refers to a number of technologies that make computers and automatic systems think and act like humans. Among them are robotics, visual recognition, machine learning, big data, and natural language processing (NLP).
Natural language processing is a technology that enables computers to recognize and interpret human language, both spoken and written. Being a high-tech blend of linguistics and computer science, NLP utilizes both disciplines in its operational algorithms. The latter fall into two main categories.
The first comprises linguistically powered rule-based systems that use a network of pre-defined grammatical criteria to interpret texts. The second relies on statistical models and learns gradually as new data are fed into the system. In either case, NLP begins by preprocessing the input data into a logical format, typically dividing a text into smaller chunks, or tokens. This tokenization procedure facilitates further interpretation of datasets of any complexity and underpins the following NLP techniques.
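To make the tokenization step concrete, here is a minimal sketch in Python. The splitting rule is deliberately simplified for illustration; production tokenizers also handle clitics, hyphens, and abbreviations.

```python
import re

def tokenize(text: str) -> list[str]:
    # Lowercase the text and pull out alphanumeric chunks,
    # optionally with an apostrophe part (e.g. "doesn't").
    # A deliberately simplified rule for illustration only.
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

tokens = tokenize("Patient denies chest pain; reports mild dizziness.")
```

Even this crude split is enough to turn a free-form clinical sentence into a sequence of units that downstream techniques can count, tag, and classify.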
Optical character recognition (OCR) empowers a machine to read a printed or handwritten text and convert it into a digital format (for instance, a PDF file) that can be used for analytical purposes. Moreover, OCR extracts texts or tables from unstructured data such as images, presenting them in a usable form.
After analyzing the information a text contains, this technique goes a step further, assigning labels or tags to texts of various lengths (even separate clauses).
Here, the NER algorithms help identify words that name particular objects or phenomena and place them within pre-defined categories.
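A toy dictionary-based pass can illustrate the idea. The mini-gazetteer below is invented for the example; real NER engines rely on statistical models trained on annotated corpora rather than fixed word lists.

```python
# Hypothetical mini-gazetteer mapping surface forms to categories.
GAZETTEER = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "headache": "SYMPTOM",
    "nausea": "SYMPTOM",
}

def tag_entities(text: str) -> list[tuple[str, str]]:
    # Look each lowercased token up in the gazetteer and keep the hits.
    entities = []
    for token in text.lower().replace(",", " ").split():
        label = GAZETTEER.get(token)
        if label:
            entities.append((token, label))
    return entities

ents = tag_entities("Patient reports headache, took ibuprofen")
```

The output pairs each recognized word with its pre-defined category, which is exactly the structure later steps (classification, coding, filtering) build on.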
By applying NER, text classification can be performed. Such categorization relies on words or phrases common to the processed documents and the semantic relationships between them.
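A bare-bones version of such categorization can be sketched as a keyword-overlap score. The category keyword sets below are invented stand-ins for the shared vocabulary a trained classifier would learn from the processed documents.

```python
# Invented keyword sets per category -- stand-ins for vocabulary
# a real classifier would learn from labeled documents.
CATEGORIES = {
    "cardiology": {"heart", "chest", "ecg", "arrhythmia"},
    "neurology": {"headache", "dizziness", "seizure", "migraine"},
}

def classify(text: str) -> str:
    words = set(text.lower().split())
    # Pick the category whose keyword set overlaps the text the most.
    return max(CATEGORIES, key=lambda c: len(CATEGORIES[c] & words))

label = classify("recurrent migraine with dizziness")
```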
Also known as opinion mining, this analysis allows experts to detect the sentiment underlying a text or statement.
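In its simplest form, opinion mining can be sketched as lexicon-based polarity scoring. The tiny polarity lexicon below is invented for illustration; production sentiment models are trained on large labeled corpora.

```python
# Tiny invented polarity lexicon: +1 for positive words, -1 for negative.
POLARITY = {"improved": 1, "comfortable": 1, "worse": -1, "pain": -1}

def sentiment(text: str) -> str:
    # Sum the polarity of every known word; unknown words score 0.
    score = sum(POLARITY.get(w, 0) for w in text.lower().split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

verdict = sentiment("patient feels comfortable and improved")
```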
All this may sound a bit too brainy, but in fact, people use NLP software almost every day: when we scan documents, interact with virtual assistants like Siri or Alexa “living” inside our smartphones, or explain a problem to a customer service chatbot.
As a standalone technology, NLP finds a wide scope of applications in many industries, such as retail, security, advertising, automotive, etc. However, its power increases manifold when it is combined with other IT products. As far as healthcare software is concerned, such a high-tech alliance is possible when NLP practices are married to Electronic Health Record (EHR) systems.
Long gone and almost forgotten are the times when being a doctor spelled huge amounts of paperwork. Equally, contemporary patients are spared the onerous task of lugging their bulky medical histories to every clinician and trying not to leave any of their contents behind in numerous examination rooms. Today, all patient data is kept in EHRs.
As an IT company with numerous healthcare software products in our portfolio, we at DICEUS see significant potential in enhancing existing EHR systems with cutting-edge NLP techniques.
Most clinical texts are written in free form, which is why they abound in acronyms and abbreviations and may contain spelling errors and typos, to say nothing of barely legible handwriting. Deciphering the former and correctly interpreting the latter is mission-critical for identifying disease symptoms, establishing a diagnosis, prescribing medications, and determining treatment timelines. NLP-fueled tools can eliminate such issues and provide medics with clear-cut and accurate patient data.
Since the patient encounter is the basic unit of arranging data in a garden-variety EHR, doctors often have trouble finding all the necessary information about a patient. NLP can help them out by breaking the interface into sections, so that words describing a patient’s concerns are entered into the PROBLEM section. The doctor flags such words (for instance, headache, dizziness, nausea, pain), and a special NER tool shows all occurrences of that complaint. This clinical assertion model can reveal how often an individual experiences certain symptoms and give the specialist a complete picture on which to base their diagnostic judgment.
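The flag-and-surface step can be sketched as an occurrence count over a patient’s encounter notes. Both the notes and the flagged complaint terms below are invented for illustration.

```python
from collections import Counter

# Hypothetical encounter notes for one patient.
NOTES = [
    "complains of headache and nausea",
    "headache persists, no nausea today",
    "reports dizziness and headache",
]

def count_complaints(notes: list[str], flagged: set[str]) -> Counter:
    # Tally how often each flagged complaint word appears across encounters.
    counts = Counter()
    for note in notes:
        for word in note.lower().replace(",", " ").split():
            if word in flagged:
                counts[word] += 1
    return counts

freq = count_complaints(NOTES, {"headache", "dizziness", "nausea", "pain"})
```

The resulting tallies are exactly the symptom-frequency picture the clinical assertion model gives the specialist.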
NER can also be instrumental in analyzing the keywords extracted from clinical data by assigning them to the problem, test, treatment, or other fields. This NLP algorithm also enables patient identification by filtering on drugs and dosages.
Another NLP tool helps streamline the chart review process. The specialized software uses keywords to sift through multiple medical reports and then summarizes and even visualizes the retrieved data, classifying it into relevant categories (such as chronic diseases, allergies, taken drugs, lab test results, procedures, etc.). More advanced NLP solutions employ speech-to-text dictation to facilitate searches of this kind.
ICD-10-CM is a comprehensive database that contains information about numerous conditions that are symptomatic of various diseases, serving as the ultimate reference point for doctors in diagnosing their patients’ cases. In the database, all illnesses and their symptoms are codified. NLP mechanisms can simplify diagnosing as well as searching for relevant statistics on a disease (like mortality rates or quality outcomes) by assigning the system’s codes to the data taken from a patient’s medical records.
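At its core, the code-assignment step is a mapping from extracted clinical terms to ICD-10-CM codes. The toy mapping below uses a few real ICD-10-CM category codes but is drastically simplified: production coding engines resolve terms to full codes with laterality and severity modifiers, not a flat lookup.

```python
# Toy mapping from extracted terms to ICD-10-CM category codes.
# Real coding engines resolve to full codes with modifiers.
ICD10 = {
    "headache": "R51",
    "type 2 diabetes": "E11",
    "asthma": "J45",
}

def assign_codes(extracted_terms: list[str]) -> list[str]:
    # Keep the codes for every term the mapping recognizes.
    return [ICD10[t] for t in extracted_terms if t in ICD10]

codes = assign_codes(["headache", "asthma"])
```

Once records carry codes instead of free text, pulling disease-level statistics becomes a straightforward database query.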
The term “phenotype” is used to describe a palpable (physical, physiological, biochemical, behavioral, or otherwise) display of a certain trait of an organism. Taking their combination into account, clinicians can group patients into classes to compare their cases and draw informed conclusions as to their diagnoses.
Doing this manually is quite a job, even if the data is structured. With unstructured data (such as charges, orders, follow-up appointments, and interactions), accurate phenotype analysis is achievable only by leveraging NLP techniques. Moreover, these provide access to pathology reports unavailable to traditional analytic solutions, thus immensely enriching the dataset submitted for thorough scrutiny.
Patients’ medical data belongs to the category of sensitive information that can cause reputational or financial damage if disclosed. Naturally, governments issue ordinances (in the USA, the Health Insurance Portability and Accountability Act, or HIPAA) aimed at safeguarding the anonymity of such data. This can be achieved through de-identification, that is, the removal of names, addresses, and phone numbers from open-access databases. NLP solutions can replace such personal data with semantic tags, which makes an EHR privacy-compliant.
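The replace-with-tags step can be sketched with simple pattern substitution. The two patterns below are deliberately minimal; production de-identification combines trained models with exhaustive rule sets covering dates, record numbers, addresses, and more.

```python
import re

# Simplified patterns for illustration -- real de-identification
# covers many more identifier types (dates, MRNs, addresses...).
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "[PHONE]"),
    (re.compile(r"\bDr\.\s+[A-Z][a-z]+\b"), "[NAME]"),
]

def deidentify(text: str) -> str:
    # Replace each matched identifier with its semantic tag.
    for pattern, tag in PATTERNS:
        text = pattern.sub(tag, text)
    return text

clean = deidentify("Seen by Dr. Smith, callback 555-123-4567")
```

The tagged output preserves the clinical content of the note while stripping the identifiers that HIPAA-style rules target.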
To find eligible candidates for participation in clinical trials, healthcare institutions must peruse a slew of clinical and personal patient data, which is a daunting task even when dealing with digitalized dossiers in an EHR. The ordeal can be turned into a no-sweat venture by employing the keyword search practiced in NLP techniques. The same mechanisms can be applied to choosing trial venues where the best combination of qualifying patients, sponsors, and research organizations can be obtained.
NLP solutions can be used to gauge the efficacy of a particular drug in terms of its intensity, duration, and frequency of effect. This is done by establishing connections between entities that NLP-powered software identifies based on the keywords and phrases they contain.
The healthcare industry makes use of many financial documents (such as contracts or insurance policies) that are related to the medical and personal data EHRs contain. NER mechanisms can help identify the various sums, organizations, individuals, dates, etc., mentioned in them, thus minimizing errors and reducing fraud attempts.
This has much to do with sentiment analysis, which involves not only EHR data but also social media texts posted by patients. Their combination, processed by NLP mechanisms, allows experts to determine the quality of patient experience, identify at-risk individuals, and even forestall undesirable developments (for instance, suicide attempts).
For all its numerous applications within EHRs, the NLP know-how still has a long way to go to overcome the obstacles that impede its across-the-board triumphant advent.
Like any novel technology, NLP has some limitations that should be borne in mind while introducing its mechanisms into EHR documentation processing.
Each natural language includes a range of sublanguages, or “lects” (dialects, sociolects, idiolects, etc.). Medical vernacular with its distinct vocabulary and peculiar lexical standards is one of the sociolects functioning within a concrete natural language. It takes a specifically “taught” NLP to figure out the meaning of texts written in “Medish”. And if its mechanisms have been trained to understand the language of, say, mass media (traditional or social) with its unique abbreviations and emoticons, they would have problems deciphering medical texts.
Moreover, “Medish” isn’t a homogeneous entity either. It has its own sublects so that the languages of clinical records and medical blogs may differ considerably. That is why acquiring boxed NLP solutions should be approached with utmost care. Otherwise, you may end up buying software developed for dentists’ clinical notes and be surprised to discover that it is inadequate to be employed for interpreting maternity ward documentation.
The best way to combat this difficulty is to commission custom software that a seasoned company with expertise in the field will tailor to your unique needs.
Natural language is extremely rich. Even within one lect, there are such phenomena as polysemy and homonymy, when one lexical form may have various (often unrelated) meanings. Plus, different forms may come to denote the same notion (aka synonymy). These kinds of linguistic variation make developing NLP in healthcare especially challenging, as misinterpreting a medical document may have the most gruesome consequences – both for the healthcare provider and for the patient.
Contemporary NLP systems have a tough row to hoe handling not only lexical issues but syntax as well. For instance, simpler NLP models can’t differentiate the subject from the object in sentences where word order is crucial. Thus, structures like “husband helped patient with medications” and “patient helped husband with medications” will seem identical to such a model because the semantic roles of agent and recipient are confused.
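The confusion is easy to demonstrate: under a bag-of-words view, which keeps only token counts, the two sentences are literally indistinguishable.

```python
from collections import Counter

s1 = "husband helped patient with medications"
s2 = "patient helped husband with medications"

# A bag-of-words model sees only token counts, so these two
# sentences -- with opposite agent/recipient roles -- look identical.
bag1, bag2 = Counter(s1.split()), Counter(s2.split())
same = (bag1 == bag2)
```

Here `same` comes out `True`: the word order, and with it the information about who helped whom, is discarded. Recovering such roles requires syntactic or semantic parsing rather than token counting.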
If an NLP solution is “trained” to understand words, it doesn’t mean it will be efficient at understanding texts. When strung together in connected sequences, words and their combinations may acquire different meanings depending on the context they are used in. Developers must be aware of this linguistic poser and create software able to understand not separate words but whole texts, considering context while interpreting their meaning.
When entering data into an EHR, people tend to use templates and shortcuts, which results in a plethora of templated information in medical dossiers. And this is a problem for NLP algorithms that identify sentences but not templates: such data will remain unprocessed.
Another problem related to the abundance of templated information is the so-called note bloat, when medical records contain more data than is absolutely necessary, thus confusing the NLP that works with them. And don’t forget about mistakes or outdated entries. In a word, if the data an EHR contains is of dubious quality, you can’t expect NLP to make a silk purse out of a sow’s ear.
EHR is a state-of-the-art technology that remarkably streamlines and facilitates the keeping of medical records by digitalizing all related documentation. Reinforced with NLP, an EHR gains an additional impetus, so that healthcare providers can make the most of the clinical data they work with (provided the possible weaknesses of employing NLP are mitigated). If they recruit a competent IT company to develop a custom NLP-powered EHR solution, they are sure to receive a first-class product that satisfies their particular requirements perfectly.