AI & ML Projects

End-to-end ML/RAG work with reproducible runs: classification baselines, retrieval-augmented workflows, evaluation artifacts, and clear “what to review” paths.

Recommended review order

RAG Mini Chat: staged build + retrieval logs.
Next-Day Direction Classifier with RAG: time-split evaluation + model insights.
Fake News Learning with ML: baseline → improvements → final evaluation.
Finance Spending Coach: transaction analytics + budgeting + coach-style insights.
Sales Inventory Dashboard: dashboarding + KPI reporting.

RAG Mini Chat

Staged RAG build (single-doc → multi-doc → logging) with grounding and run artifacts so answers are debuggable.

Python, RAG, Grounding, Run logs

Chunking + retrieval (top-k) with reviewable outputs.
Grounded prompting to constrain answers to retrieved context.
Run logs: question → retrieved chunks → final answer.
Re-run flow: python ingest.py → python chat.py (adjust script names to match the repo).
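
The chunking, top-k retrieval, grounded prompt, and run-log steps above can be sketched as follows. This is a minimal standard-library illustration, not the project's code: the function names (chunk_text, retrieve, grounded_prompt, log_record) are hypothetical, and bag-of-words cosine similarity is an assumed stand-in for whatever embedding model the repo uses.

```python
import json
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into fixed-size word windows (one simple chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def bag_of_words(text):
    """Lowercased token counts; an assumed stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the question, best first."""
    q = bag_of_words(question)
    scored = [(cosine(q, bag_of_words(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def grounded_prompt(question, retrieved):
    """Build a prompt that restricts the answer to the retrieved context."""
    context = "\n".join(f"[{i + 1}] {c}" for i, (_, c) in enumerate(retrieved))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def log_record(question, retrieved, answer):
    """Serialize one run as a JSON line: question, retrieved chunks, final answer."""
    return json.dumps({
        "question": question,
        "retrieved": [{"score": round(s, 3), "chunk": c} for s, c in retrieved],
        "answer": answer,
    })


# Illustrative document text, not project data.
doc = (
    "Chunks are created by splitting the document into word windows. "
    "Retrieval ranks chunks by similarity to the question. "
    "Run logs store the question, the retrieved chunks, and the answer."
)
question = "How does retrieval rank chunks?"
chunks = chunk_text(doc, max_words=10)
hits = retrieve(question, chunks, k=2)
prompt = grounded_prompt(question, hits)
record = log_record(question, hits, answer="<model output goes here>")
```

Keeping the similarity score next to each chunk in the log record makes it easier to see whether a poor answer came from weak retrieval or from the generation step.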

Next-Day Direction Classifier with RAG

Direction classifier that augments features with retrieved context and logs training/evaluation outputs for traceability and debugging.

Python, Retrieval, Embeddings, Evaluation

Classifier trained and evaluated on a time-based split, with outputs saved for review.
Artifacts: training output, evaluation report, confusion matrix, and model insight views.
Re-run flow: python train.py → python eval.py (adjust script names to match the repo).
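
A time-split evaluation and the confusion matrix artifact can be sketched as below. This is a hedged, standard-library illustration under stated assumptions: the row layout (a "date" and a "label" key), the "up"/"down" label names, the 80/20 cut, and the helper names time_split and confusion_matrix are all hypothetical and not taken from the repo.

```python
from collections import Counter
from datetime import date


def time_split(rows, train_fraction=0.8):
    """Sort rows by date and cut once, so test rows are never earlier than training rows."""
    ordered = sorted(rows, key=lambda r: r["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def confusion_matrix(y_true, y_pred, labels=("down", "up")):
    """Count (true, predicted) pairs; rows are true labels, columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Illustrative toy rows in shuffled order, not project data.
rows = [
    {"date": date(2024, 1, d), "label": "up" if d % 2 else "down"}
    for d in (5, 2, 4, 1, 3)
]
train, test = time_split(rows, train_fraction=0.8)

# Made-up labels and predictions, only to show the matrix layout.
y_true = ["up", "down", "up", "up"]
y_pred = ["up", "up", "up", "down"]
matrix = confusion_matrix(y_true, y_pred)
```

Splitting by date instead of at random keeps later observations out of training, which matters when the target is a next-day outcome.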

Fake News Learning with ML

Three-stage NLP build: baseline pipeline → improved modeling → final evaluation and error review, with saved artifacts and reproducible runs. Models are trained on the Kaggle Fake News Detection dataset.

Python, NLP, scikit-learn, TF-IDF

Dataset: Kaggle — Fake News Detection.
Stage 1: text cleaning + TF-IDF + baseline classifier.
Stage 2: tuning/iterations (n-grams, regularization, model comparisons).
Stage 3: final metrics + confusion matrix + error review artifacts.
Re-run flow: python train.py → python evaluate.py (update to match repo).
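
The stages above can be sketched with a scikit-learn pipeline. This is a minimal example under assumptions, not the project's code: logistic regression is an assumed choice of baseline classifier, build_baseline is a hypothetical helper, and the texts and labels are made-up toy examples, not rows from the Kaggle dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.pipeline import Pipeline


def build_baseline(ngram_range=(1, 1), C=1.0):
    """TF-IDF features followed by a linear classifier.

    ngram_range and C are the Stage 2 tuning knobs (n-grams, regularization);
    a smaller C means stronger regularization.
    """
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english",
                                  ngram_range=ngram_range)),
        ("clf", LogisticRegression(C=C, max_iter=1000)),
    ])


# Made-up toy examples for illustration only (1 = fake, 0 = real).
texts = [
    "Miracle pill cures every disease overnight, doctors stunned",
    "City council approves budget for road repairs next year",
    "Secret moon base discovered, government hides shocking truth",
    "Local library extends weekend opening hours from March",
]
labels = [1, 0, 1, 0]

model = build_baseline(ngram_range=(1, 2), C=1.0)
model.fit(texts, labels)

# Predicting on the training texts only demonstrates the API;
# a real Stage 3 evaluation would use a held-out split.
preds = model.predict(texts)
matrix = confusion_matrix(labels, preds, labels=[0, 1])
```

Wrapping the vectorizer and classifier in one Pipeline means the TF-IDF vocabulary is fitted only on training data when the model is later cross-validated or tuned.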

Finance Spending Coach

Three-stage build: scoring API + Swagger (Stage 1), front-end + Docker image (Stage 2), and a Stage 3 AI coach that uses a local Markdown KB and embeddings (RAG) for grounded tips.

Python, FastAPI, scikit-learn, Docker, RAG / KB

Stage 1 – ML service: FastAPI + scikit-learn logistic regression that scores a single monthly spending profile and returns overspending risk + rule-based suggestions.
Stage 2 – Front-end: lightweight HTML/CSS UI that posts JSON to the API, visualizes risk level, and shows “coach-style” explanations on top of the score.
Stage 3 – Local KB & RAG: markdown budgeting notes are embedded, indexed on disk, and retrieved to ground additional advice in concrete best-practice snippets.
DevOps: Dockerized service with a one-command build/run flow and auto-generated Swagger docs for all stages.
Privacy & reproducibility: uses synthetic transactions by default so the project can be demoed safely in a portfolio showcase.
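
The Stage 1 idea of scoring one monthly profile and attaching rule-based suggestions can be sketched in plain Python. Everything specific here is an assumption for illustration: the feature names, the coefficients, the thresholds, and the score_profile function are hypothetical. A real service would load a fitted scikit-learn model and expose the function through a FastAPI route.

```python
import math

# Hypothetical coefficients for illustration only; a real model would learn
# these from data instead of hard-coding them.
INTERCEPT = -4.0
WEIGHTS = {"spend_to_income": 5.0, "discretionary_share": 2.0}


def score_profile(income, essentials, discretionary):
    """Return an overspending risk in [0, 1], a risk level, and rule-based suggestions."""
    if income <= 0:
        raise ValueError("income must be positive")
    total = essentials + discretionary
    features = {
        "spend_to_income": total / income,
        "discretionary_share": discretionary / total if total else 0.0,
    }
    # Logistic regression scoring: sigmoid of the weighted feature sum.
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())
    risk = 1.0 / (1.0 + math.exp(-z))

    # Simple rules layered on top of the score (thresholds are assumptions).
    suggestions = []
    if features["spend_to_income"] > 1.0:
        suggestions.append("Spending exceeds income; review the largest categories first.")
    if features["discretionary_share"] > 0.5:
        suggestions.append("Discretionary spending is over half of the total; set a monthly cap.")

    level = "high" if risk >= 0.66 else "medium" if risk >= 0.33 else "low"
    return {"risk": round(risk, 3), "level": level, "suggestions": suggestions}


# Synthetic profiles, in the spirit of the project's synthetic-by-default data.
steady = score_profile(income=3000, essentials=1500, discretionary=500)
stretched = score_profile(income=3000, essentials=1500, discretionary=2000)
```

Returning the score, the level, and the suggestions in one dictionary matches the JSON shape a front-end would need to show a risk badge with explanations.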

Sales Inventory Dashboard

End-to-end sales and inventory reporting on a synthetic retail CSV dataset built for learning: curated tables, KPI rollups, and trend views with filters for fast stakeholder-style review. Includes reproducible refresh steps and validation checks; no real customer data is used.

Python, Pandas, Dashboard, BI / Reporting

Synthetic CSV dataset (store, item, daily sales, and stock) designed for analysis and learning — no production data.
ETL/cleanup into curated tables (sales, inventory, product, calendar/time).
Metrics layer: KPIs (revenue, units, margin, stock coverage) plus trend + breakdown views.
Dashboard layer with filters and drilldowns via Streamlit.
Saved outputs: KPI summary tables, charts, and refresh/validation instructions for repeatable runs.
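
The metrics layer and validation checks can be sketched with pandas as follows. This is an illustration under assumptions, not the project's schema: the column names, the toy rows, and the helpers kpi_rollup and validate are hypothetical, and stock coverage is assumed to mean latest stock on hand divided by average daily units sold.

```python
import pandas as pd

# Toy rows for illustration; one row per store per day is assumed.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"]),
    "units": [10, 14, 5, 7],
    "unit_price": [2.0, 2.0, 3.0, 3.0],
    "unit_cost": [1.2, 1.2, 2.1, 2.1],
    "stock_on_hand": [60, 46, 30, 23],
})


def validate(df):
    """Basic refresh checks: no missing values and no negative quantities."""
    if df.isna().any().any():
        raise ValueError("missing values found")
    if (df[["units", "stock_on_hand"]] < 0).any().any():
        raise ValueError("negative quantities found")


def kpi_rollup(df):
    """Per-store revenue, units, margin, and stock coverage in days."""
    df = df.assign(
        revenue=df["units"] * df["unit_price"],
        margin=df["units"] * (df["unit_price"] - df["unit_cost"]),
    )
    # Most recent stock level per store.
    latest_stock = df.sort_values("date").groupby("store")["stock_on_hand"].last()
    kpis = df.groupby("store").agg(
        revenue=("revenue", "sum"),
        units=("units", "sum"),
        margin=("margin", "sum"),
        avg_daily_units=("units", "mean"),
    )
    # Assumed definition: days of cover = latest stock / average daily units.
    kpis["stock_coverage_days"] = latest_stock / kpis["avg_daily_units"]
    return kpis


validate(sales)
kpis = kpi_rollup(sales)
```

A table like kpis can be passed directly to a Streamlit view such as st.dataframe, with filters applied to the input frame before the rollup.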