Omkar Panchal
Data Engineer | Healthcare Analytics | GenAI & LLMs | GCP • Python • DBT • Databricks | AI-900 Certified
Professional Summary
Data Engineer with 3+ years building scalable data platforms and analytics solutions in healthcare and regulated domains. Specializing in Google Cloud Platform, Python, SQL, DBT, Databricks, and BigQuery to design reliable ETL/ELT pipelines handling billions of records.
Current focus: healthcare analytics and Generative AI, with hands-on experience in LLM-based NLP pipelines, embeddings, RAG architectures, and ML workflows using MLflow. Published researcher in healthcare NLP and de-identification using large language models.
3+
Years Experience
1.2M
Patient Records
3B
Data Records
Core Expertise
Technical Skills & Specializations
Data Engineering
Python, SQL, DBT, ETL/ELT pipelines, BigQuery, Databricks
Cloud Platforms
Google Cloud Platform expertise, Docker containerization, scalable architecture
Healthcare Data
OMOP CDM, HL7 standards, clinical terminologies, Australian healthcare system
AI & ML
GenAI, LLMs, NLP pipelines, RAG architectures, MLflow, embeddings
Current Role: Data Engineer at CGD Health
May 2022 - Present
Hyderabad, India
Leading data engineering initiatives for healthcare analytics, managing 1.2 million sensitive patient records and up to 3 billion observational data records. Developing Python-based ETL pipelines using DBT on Google Cloud Platform.
01
Pipeline Development
Design and deploy Python ETL pipelines with DBT for efficient data operations
02
Data Management
Implement OMOP CDM concepts and maintain healthcare data warehouses
03
Analysis & Insights
Conduct exploratory analysis using SQL, Python, and R to support strategic decisions
04
Client Delivery
Generate insights, prepare reports, and present findings in client calls
Key Achievements & Responsibilities
Technical Excellence
  • Extensive work on Google Cloud Platform and BigQuery
  • Deep understanding of healthcare terminologies and interoperability standards
  • Solid proficiency in the HL7 messaging standard and electronic health data
  • Docker for containerizing web application projects
Project Management
  • JIRA, Confluence, GitHub, Microsoft 365, and Slack
  • Seamless integration of OMOP CDM concepts into operations
  • Effective client communication and high-quality solution delivery
  • Specialized knowledge of the Australian healthcare system
Early Career: Student Internship
Pantech ProLabs India
Aug 2021 - Jan 2022
ChatBot Development
Created a customized chatbot using Dialogflow
Computer Vision
Moving object detection, face detection and recognition using OpenCV
Deep Learning
Object detection, image classification, hand gesture and character recognition
Vehicle Tracking
Vehicle detection, tracking and license plate recognition project
NLP Applications
Title generation from paragraphs and speech recognition systems
Education & Academic Background
1
Master of Science - Bioinformatics
Guru Nanak Khalsa College (Autonomous) | November 2020 - April 2022
2
Bachelor of Science - Biotechnology
Guru Nanak Khalsa College (Autonomous) | July 2017 - November 2020
3
Higher Secondary Certificate - Science
TM Hinduja National Sarvodaya High School | July 2015 - May 2017
4
Secondary School Certificate
Vivekananda English High School
Research Publications
Published researcher contributing to healthcare NLP and de-identification using large language models, with a focus on protecting sensitive health information.
De-identification Using LLMs
Deidentification and Temporal Normalization of Electronic Health Record Notes Using Large Language Models: The 2023 SREDH/AI-Cup Competition
OpenDeID Pipeline Evaluation
Evaluation of OpenDeID Pipeline in the 2023 SREDH/AI-Cup Competition for Deidentification of Sensitive Health Information
LLM Applications in Healthcare
Leveraging large language models for the deidentification and temporal normalization of sensitive health data
Certifications & Professional Development
AWS for Beginners
Cloud computing fundamentals
Introduction to R
Statistical programming language
Complete SQL Bootcamp
From zero to hero comprehensive training
Chromatography Techniques
Six-day hands-on workshop: HPLC, GC, HPTLC, TLC & CC
Bioinformatics with Python
Computational biology applications
Get In Touch
Location: Shree Sai Shraddha Co-op Society, Building No. 7, Room 315, 3rd Floor, Maharashtra Nagar Mankhurd, Mumbai-400088
Phone: 8452027096
Languages
Hindi (Full Professional)
English (Professional Working)
Marathi (Native or Bilingual)