Upcoming webinars presented by John Snow Labs


WATCH LIVE: April 8th at 2pm EST

AI Model Governance in a High-Compliance Industry

Model governance defines a collection of best practices for data science: versioning, reproducibility, experiment tracking, automated CI/CD, and others. In a high-compliance setting, where the data used for training or inference contains protected health information (PHI) or similarly sensitive data, additional requirements apply, such as strong identity management, role-based access control, approval workflows, and a full audit trail.

This webinar summarizes requirements and best practices for establishing a high-productivity data science team within a high-compliance environment. It then demonstrates how these requirements can be met using John Snow Labs’ Healthcare AI Platform.


Sign up for the webinar


Ali Naqvi

Ali Naqvi is the lead product manager of the AI Platform at John Snow Labs. Ali has extensive experience building end-to-end data science platforms and solutions for the healthcare and life science industries, using modern technology stacks such as Kubernetes, TensorFlow, Spark, MLflow, Elastic, NiFi, and related tools. Ali has a Master’s degree in Molecular Science and over a decade of hands-on experience in software engineering and academic research.

Recorded on: March 18th at 2pm EST

Accurate De-Identification of Structured & Unstructured Medical Data at Scale

Recent advances in deep learning enable automated de-identification of medical data to approach the accuracy achievable via manual effort. This includes accurate detection & obfuscation of patient names, doctor names, locations, organizations, and dates from unstructured documents – or accurate detection of column names & values in structured tables. This webinar explains:

  1. What’s required to de-identify medical records under the US HIPAA privacy rule
  2. Typical de-identification use cases, for structured and unstructured data
  3. How to implement de-identification of these use cases using Spark NLP for Healthcare

After the webinar, you will understand how to de-identify data automatically, accurately, and at scale, for the most common scenarios.


Watch recording


Julio Bonis

Julio Bonis is a data scientist working on Spark NLP for Healthcare at John Snow Labs. Julio has broad experience in software development and the design of complex data products within the scope of Real World Evidence (RWE) and Natural Language Processing (NLP). He also has substantial clinical and management experience, including entrepreneurship and Medical Affairs. Julio is a medical doctor specialized in Family Medicine (registered GP) and has an Executive MBA from IESE, an MSc in Bioinformatics, and an MSc in Epidemiology.

Recorded on: February 26th at 2pm EST

State-of-the-art named entity recognition with BERT

Deep neural network models have recently achieved state-of-the-art performance in a variety of natural language processing (NLP) tasks. However, these gains rely on the availability of large amounts of annotated examples, without which state-of-the-art performance is rarely achievable. This is especially inconvenient for the many NLP fields where annotated examples are scarce, such as medical text.

Named entity recognition (NER) is one of the most important tasks for the development of more sophisticated NLP systems. In this webinar, we will walk you through how to train a custom NER model using BERT embeddings in Spark NLP, taking advantage of transfer learning to greatly reduce the amount of annotated text needed to achieve accurate results. After the webinar, you will be able to train your own NER models with your own data in Spark NLP.

Watch recording


Veysel Kocaman

Veysel Kocaman is a Senior Data Scientist and ML Engineer at John Snow Labs with a decade of industry experience. He is pursuing a PhD in Computer Science and lectures at Leiden University (NL), and he holds an MS degree in Operations Research from Penn State University. He is affiliated with Google as a Developer Expert in Machine Learning.