🧠 Patient Sentiment Analysis Using NLP & LLMs

This project analyzes open-ended patient reviews to predict sentiment (positive or negative) using Natural Language Processing (NLP) techniques and Large Language Models (LLMs).

🚀 Project Overview

Goal: Classify patient feedback based on sentiment
Dataset: 996 hospital reviews with labeled sentiment
Techniques used:
- Text cleaning and preprocessing (NLTK, regex)
- Exploratory Data Analysis (word clouds, word frequencies, review length)
- Feature extraction with TF-IDF
- Sentiment classification using:
  - Logistic Regression (baseline)
  - DistilBERT model from Hugging Face (pretrained, applied zero-shot with no fine-tuning on this dataset)
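
The baseline listed above (TF-IDF features fed into Logistic Regression) can be sketched as follows. This is a minimal illustration and not the notebook's actual code: the toy reviews, the 1/0 label encoding, and the hyperparameters (`ngram_range`, `max_iter`) are all assumptions chosen for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in data (hypothetical); the project trains on the cleaned CSV instead.
reviews = [
    "staff friendly doctor explained everything clearly",
    "clean rooms quick service excellent care",
    "waited two hours nobody answered rude reception",
    "billing errors long queue unorganised appointment",
]
labels = [1, 1, 0, 0]  # assumed encoding: 1 = positive, 0 = negative

# TF-IDF turns each review into a sparse, weighted bag-of-words vector;
# logistic regression then fits a linear classifier on those vectors.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(reviews, labels)

# Predict sentiment for two unseen (also hypothetical) reviews.
predictions = baseline.predict(
    ["long queue rude staff", "excellent doctor friendly care"]
)
print(predictions)
```

Wrapping both steps in one pipeline keeps the vectorizer fitted only on training data, which avoids leaking test vocabulary into the features.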

🗂️ Project Structure


patient-sentiment-healthcare/
├── data/
│   ├── dataset_hospital_reviews.csv              # Raw
│   └── dataset_hospital_reviews_cleaned.csv      # Processed
├── notebooks/
│   ├── 01_data_cleaning_and_eda.ipynb            # Data cleaning + EDA
│   └── 02_modeling_and_llm_comparison.ipynb      # Model training + LLM comparison
└── README.md

📊 Results Summary

Logistic Regression (TF-IDF)

Accuracy: 0.86
High precision on positive class
Poor recall on negative class
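
High accuracy together with poor recall on one class is easiest to see from per-class metrics. The sketch below computes them in plain Python on a hypothetical, imbalanced sample. The counts are invented for illustration, and this README does not state whether the real dataset is imbalanced.

```python
def per_class_report(y_true, y_pred, labels=("negative", "positive")):
    """Return overall accuracy plus precision and recall for each label."""
    pairs = list(zip(y_true, y_pred))
    report = {"accuracy": sum(t == p for t, p in pairs) / len(pairs)}
    for label in labels:
        true_pos = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)  # times the model said `label`
        actual = sum(t == label for t in y_true)     # times `label` was correct
        report[label] = {
            "precision": true_pos / predicted if predicted else 0.0,
            "recall": true_pos / actual if actual else 0.0,
        }
    return report


# Hypothetical sample: 8 positive reviews and 2 negative ones.
y_true = ["positive"] * 8 + ["negative"] * 2
# The model finds every positive review but misses one of the two negatives.
y_pred = ["positive"] * 8 + ["positive", "negative"]

report = per_class_report(y_true, y_pred)
print(report["accuracy"])            # 0.9
print(report["negative"]["recall"])  # 0.5
```

In this toy case a single missed negative review halves negative recall while accuracy stays at 0.9, which is why accuracy alone can hide weak performance on the less frequent class.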

distilBERT (LLM)

Accuracy: 0.78
Much better than Logistic Regression at identifying negative reviews
Balanced recall across classes
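
For reference, running a pretrained DistilBERT sentiment classifier without fine-tuning might look like the sketch below. The checkpoint name is an assumption (this README does not say which DistilBERT checkpoint was used), and `classify_reviews` is a hypothetical helper, not a function from the notebook.

```python
def classify_reviews(reviews, classifier=None):
    """Label each review with a pretrained sentiment classifier.

    `classifier` is any callable that takes a list of strings and returns a
    list of {"label": ..., "score": ...} dicts, as Hugging Face pipelines do.
    """
    if classifier is None:
        # Imported lazily so the helper can be tried with a stub classifier
        # when `transformers` is not installed.
        from transformers import pipeline

        # Assumed checkpoint: DistilBERT fine-tuned on SST-2.
        classifier = pipeline(
            "sentiment-analysis",
            model="distilbert-base-uncased-finetuned-sst-2-english",
        )
    # truncation=True trims reviews that exceed the model's maximum input length.
    results = classifier(list(reviews), truncation=True)
    return [(r["label"], round(r["score"], 3)) for r in results]


# Usage (downloads the model on first call):
# classify_reviews(["Waited an hour despite having an appointment."])
```

Passing the classifier in as an argument keeps the model download out of the function's core logic and makes it easy to swap in a different checkpoint for comparison.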

🧪 Example Review

"Wait hour despite appointment isn’t first time happened understanding manage appointment queue it’s random unorganised lot scope improve"

--> Detected as NEGATIVE by distilBERT
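
The review above reads as already preprocessed text: punctuation and common stopwords appear to have been removed. A rough sketch of that kind of cleaning is shown below. The raw sentence is an invented reconstruction, the stopword set is a tiny illustrative subset (the project uses NLTK, whose English list is much longer), and the lowercasing step is an assumption.

```python
import re

# Tiny illustrative stopword set (assumption); NLTK's English list is far larger.
STOPWORDS = {
    "i", "had", "to", "for", "an", "a", "the", "and", "this", "is", "was",
    "of", "my", "in", "on", "at", "that", "they", "there",
}


def clean_review(text):
    """Lowercase the text, strip punctuation and digits, and drop stopwords."""
    text = text.lower()
    # Keep letters, whitespace, and apostrophes (straight or curly) so that
    # contractions such as "isn't" stay in one piece.
    text = re.sub(r"[^a-z\s'’]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


raw = "I had to wait for an hour, despite an appointment!"  # hypothetical input
print(clean_review(raw))  # wait hour despite appointment
```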

🛠️ Tech Stack

Python, Pandas, Scikit-learn, NLTK, Matplotlib, Seaborn
Hugging Face Transformers (distilBERT)
Google Colab (for LLM execution)

📁 How to Run

Open 02_modeling_and_llm_comparison.ipynb in Google Colab
Mount your Google Drive and upload the cleaned dataset, dataset_hospital_reviews_cleaned.csv (or use the copy provided in data/)
Run the cells to explore, train, and evaluate both models
