Data Science · Machine Learning · Full-Stack

Manas Raman Acharya

Data Science Student Building AI-Powered Solutions

Transforming complex data into actionable insights through machine learning, statistical modeling, and full-stack development.

Manas Acharya, Data Science student at CU Boulder

Selected Work

Featured Projects

A mix of production systems, machine-learning research, and full-stack tools — built across coursework, internships, and side projects.

WhaleWatch

Early detection system for stocks about to move

Real-time SEC filing tracker that analyzes Schedule 13D/13G, Form 4, and 8-K filings to flag potential pre-move stock signals before they draw broad market attention.

Tracks filings from 500+ institutions

v2 backtest engine in progress

Python · Flask · PostgreSQL · SEC EDGAR API · Railway


Zenith

AI-powered platform for consistent personal growth

Motivational platform combining ancient wisdom with modern AI, featuring daily personalized coaching and progress tracking.

Personalized coaching from curated wisdom database

Next.js · FastAPI · PostgreSQL · Claude API · Pinecone


JobFlow

AI-powered resume tailoring for every application

Intelligent job application tool with an 8-step AI pipeline that generates ATS-optimized resumes and personalized cover letters from any job posting URL.

8-step pipeline with keyword matching, relevance scoring, and ATS optimization

React 19 · Vite · Claude Haiku 4.5 · Vercel · Serverless Functions


Hilton Invoice Code Finder

Offline invoice classification web app

Pro bono offline-first web app for a Hilton corporate manager. Matches free-text invoice descriptions to 93 GL codes across 10 departments using TF-IDF + cosine similarity in the browser. iPhone-optimized UI; localStorage persistence for use in the field.

JavaScript · HTML · CSS · TF-IDF

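A minimal sketch of the TF-IDF + cosine similarity matching idea, written in Python for readability (the app itself runs in browser JavaScript). The GL codes, descriptions, and function names below are hypothetical, and the smoothed IDF formula is one common variant, not necessarily the one the app uses.

```python
import math
import re
from collections import Counter

# Hypothetical GL codes and descriptions, for illustration only.
GL_CODES = {
    "6100": "guest room linen and laundry service",
    "6200": "kitchen food and beverage supplies",
    "6300": "elevator repair and building maintenance",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vector(tokens, idf):
    """Build a unit-length TF-IDF vector, ignoring out-of-vocabulary terms."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    """Compute smoothed IDF weights and one TF-IDF vector per GL code."""
    tokenized = {code: tokenize(text) for code, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized.values() for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {code: tfidf_vector(toks, idf) for code, toks in tokenized.items()}
    return idf, vectors


def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def best_match(query, idf, vectors):
    """Return the (code, score) pair with the highest cosine similarity."""
    q = tfidf_vector(tokenize(query), idf)
    scores = {code: cosine(q, v) for code, v in vectors.items()}
    return max(scores.items(), key=lambda kv: kv[1])


IDF, VECTORS = build_index(GL_CODES)
```

Calling `best_match("Laundry service for guest linen", IDF, VECTORS)` returns the code whose description shares the most heavily weighted terms with the invoice text, along with its similarity score.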

DALL-E + SAM Image Editing

Generative pipeline: text → image → mask → inpaint

Three-stage generative-image pipeline combining OpenAI DALL-E 3 for generation, Meta's Segment Anything Model for region selection, and DALL-E's edit endpoint for mask-based inpainting. Demonstrated on a fashion design concept.

1024×1024 generation + SAM segmentation + 3-variant inpainting

Python · PyTorch · OpenAI API · segment-anything · +1 more


Time-Series Retail Sales Forecasting

Daily retail sales forecasting + descriptive analytics

End-to-end pipeline on a year of daily POS data for a small retail client. Compared naive, seasonal-naive, ARIMA, Random Forest, and Gradient Boosting; explored the global-pool approach from Montero-Manso & Hyndman (2021). Paired with a descriptive sales report (day-of-week, seasonal, by-department) delivered to the client.

Gradient Boosting: MAE $494, sMAPE 13.0% — 24% MAE reduction vs ARIMA

Python · Scikit-learn · Pandas · ARIMA · +1 more

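For readers unfamiliar with the baseline and metrics named above, here is a small sketch of a seasonal-naive forecast plus MAE and sMAPE. The weekly season length and the particular sMAPE definition (mean of 2|a - f| / (|a| + |f|), as a percentage) are assumptions for illustration; the project may have used different settings.

```python
def seasonal_naive(history, horizon, season=7):
    """Forecast by repeating the last full season (assumed weekly for daily data)."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def mae(actual, forecast):
    """Mean absolute error, in the same units as the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE in percent; a term counts as zero when both values are zero."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        if denom:
            total += 2.0 * abs(a - f) / denom
    return 100.0 * total / len(actual)
```

Scoring every candidate model against the same hold-out window with both metrics is what makes a comparison such as "Gradient Boosting vs ARIMA" meaningful.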

La Liga Ranking

Team ranking from match-result averaging

Ranked all 20 teams of the 2020-21 La Liga season from 760 match records. Encoded wins/draws/losses (1, 0.5, 0), pivoted into a team-vs-team matrix, and computed per-opponent average performance. Team project with Jenisha Shrestha and Bipin Bisural.

Top 4 (Atlético, Real Madrid, Barcelona, Sevilla) — exact match to actual 2020-21 standings

Python · pandas · NumPy

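The ranking steps described above can be sketched with the standard library alone (the project itself used pandas and NumPy). The three teams and their results are made up, and averaging the per-opponent means into a single rating is an assumption about how the final score was aggregated.

```python
from collections import defaultdict

# Result encoding from the project description: win = 1, draw = 0.5, loss = 0.
POINTS = {"W": 1.0, "D": 0.5, "L": 0.0}

# Hypothetical records: (team, opponent, result from the team's perspective).
# Each match appears twice, once per side.
RECORDS = [
    ("A", "B", "W"), ("B", "A", "L"),
    ("A", "B", "D"), ("B", "A", "D"),
    ("A", "C", "W"), ("C", "A", "L"),
    ("B", "C", "W"), ("C", "B", "L"),
]


def rank_teams(records):
    """Rank teams by the mean of their per-opponent average scores."""
    # matrix[team][opponent] holds the list of encoded results.
    matrix = defaultdict(lambda: defaultdict(list))
    for team, opponent, result in records:
        matrix[team][opponent].append(POINTS[result])

    ratings = {}
    for team, row in matrix.items():
        per_opponent = [sum(scores) / len(scores) for scores in row.values()]
        ratings[team] = sum(per_opponent) / len(per_opponent)

    # Highest rating first.
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


RANKING = rank_teams(RECORDS)
```

Averaging per opponent first keeps a team's rating from being skewed when it has more recorded results against one opponent than another.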

Bike-Sharing Usage Patterns

Classification model for weekday demand

Weekday demand classification on Seoul's hourly bike-share data, comparing Logistic Regression, Bagging (Random Forest), and SVM. Bagging won.

4.87% misclassification rate (Bagging)

R · caret · randomForest · e1071


Heart Disease Prediction

Clinical prediction model

Cardiovascular risk assessment on the UCI Heart Disease dataset. Team project with Jenisha Shrestha; compared multiple classifiers, with Logistic Regression emerging as the most promising.

15.2% misclassification rate (Logistic Regression)

R · caret · randomForest · leaps


About

Data science, applied from problem to production.

I'm a junior studying Data Science at the University of Colorado Boulder, graduating May 2027. My work spans machine learning, statistical modeling, and full-stack AI development, building applications end-to-end from problem definition through production deployment.

Four production AI applications are currently live: WhaleWatch (real-time SEC filing intelligence), Zenith (Claude + Pinecone RAG motivational platform), JobFlow (an 8-step Claude pipeline for ATS-optimized resumes), and Hilton Invoice Code Finder (browser-side TF-IDF classification). Academic projects cover time-series forecasting (Gradient Boosting on retail POS data, 24% MAE reduction vs ARIMA), ensemble classification methods, and multi-modal generative pipelines combining DALL-E with Meta's Segment Anything Model.

Coursework includes Time Series Analysis (APPM-STAT 4720/5720), Machine Learning, Deep Learning, and Statistical Modeling. I hold an IBM SkillsBuild Enterprise Design Thinking Practitioner certification (December 2025) and am currently pursuing an additional data science certification while advancing my SQL skills. Languages: Nepali, Hindi, and English.

Tech Stack

Languages

Python (Proficient)
R (Proficient)
SQL (Intermediate)
Excel (Proficient)

Frameworks

Next.js
Flask
FastAPI
React

Data Tools

Tableau (Intermediate)
Power BI (Intermediate)

ML / AI

Scikit-learn
Claude API
Pinecone

Databases

PostgreSQL
SQLite

Contact

Let's Connect

Open to data science internship opportunities, research collaborations, and interesting problems.