alexattia/Data-Science-Projects

A thousand-star study notebook nobody asked for

A grab-bag of Kaggle entries, scrapers, and half-finished experiments that accidentally became a popular reference.

1.2k stars · Jupyter Notebook · Learning · ML Frameworks
Velocity (7d): +0.3 ★/day · Trend: steady

What it does This repo collects one developer’s solutions to HackerRank algorithm puzzles and Kaggle machine-learning competitions, plus a few side projects: bike-share demand forecasting, NYC taxi trip duration prediction, Amazon deforestation tracking from satellite imagery, IMDB rating prediction, a Selenium-based Twitter scraper, and a dlib-powered face recognizer that needs only 20 photos per person.

The interesting bit The Twitter parser is the most opinionated piece: it bypasses the official API’s two-week limit by driving PhantomJS with Selenium, then emails the owner when new machine-learning flash cards drop. That’s not data science; it’s personal automation with extra steps, and it’s the only project here with a clear user.
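The notification half of that workflow can be sketched separately from the scraping. This is a minimal illustration, not the repo's code: it assumes the scraper yields (tweet_id, text) pairs, and `find_new_tweets`, `build_alert`, the addresses, and the sample tweets are all hypothetical. It uses only the Python standard library.

```python
from email.message import EmailMessage


def find_new_tweets(scraped, seen_ids):
    """Return the scraped (tweet_id, text) pairs whose id is not in seen_ids."""
    return [(tid, text) for tid, text in scraped if tid not in seen_ids]


def build_alert(new_tweets, sender, recipient):
    """Build a plain-text email listing the new tweets (hypothetical format)."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(new_tweets)} new flash card tweet(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"{tid}: {text}" for tid, text in new_tweets))
    return msg


# Sample data is made up for illustration.
scraped = [
    ("101", "Flash card: bias vs. variance"),
    ("102", "Flash card: L2 regularization"),
]
seen_ids = {"101"}  # ids already reported in an earlier run

new_tweets = find_new_tweets(scraped, seen_ids)
alert = build_alert(new_tweets, "bot@example.com", "owner@example.com")
```

Sending is left out on purpose; with a reachable mail server, `smtplib.SMTP(host).send_message(alert)` would deliver the message.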

Key highlights

  • Kaggle entries span regression (bike sharing, taxi trips), computer vision (Amazon satellite labeling with Keras), and NLP-adjacent scraping (movie ratings)
  • Face recognition leans on the dlib C++ library with HOG features, not a from-scratch neural net
  • One project includes a French-language explanatory PDF, suggesting these were actual coursework or self-study artifacts
  • Soccer data project is explicitly aimless: “I don’t know currently what’s the aim”

Caveats

  • Several projects are described as incomplete or exploratory; the soccer parser has no stated goal
  • PhantomJS is deprecated; the Twitter scraper may need headless Chrome or similar to function today
  • No tests, no requirements files visible, no reproducibility infrastructure mentioned
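For the PhantomJS caveat, one possible replacement is headless Chrome driven through Selenium. The sketch below assumes Selenium 4, a local Chrome install, and a matching driver; `HEADLESS_ARGS`, `make_headless_chrome`, and `fetch_page_source` are hypothetical names, and nothing here is taken from the repo.

```python
# Chrome flags for running without a visible window (assumed settings).
HEADLESS_ARGS = ["--headless=new", "--window-size=1280,2000"]


def make_headless_chrome():
    """Return a headless Chrome WebDriver (needs selenium 4 and Chrome)."""
    # Imported lazily so this module still loads where selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


def fetch_page_source(url):
    """Load url in headless Chrome and return the rendered HTML."""
    driver = make_headless_chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always release the browser process, even if the page load fails.
        driver.quit()
```

Any scraping logic written against the old PhantomJS driver would still need its selectors checked against the current page markup.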

Verdict Worth a quick browse if you’re stuck on a classic Kaggle starter competition and want to see one person’s notebook style. Skip it if you need production-ready code or novel techniques: this is a learning diary, not a framework.
