Is data-science-portfolio open source?

Yes — sajal2692/data-science-portfolio is open source, released under the MIT license.

What language is data-science-portfolio written in?

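sajal2692/data-science-portfolio consists mainly of Jupyter notebooks, which is how GitHub classifies its primary language; the code inside them is Python, with additional work in R.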

How popular is data-science-portfolio?

sajal2692/data-science-portfolio has 1.2k stars on GitHub.

Where can I find data-science-portfolio?

sajal2692/data-science-portfolio is on GitHub at https://github.com/sajal2692/data-science-portfolio.

sajal2692/data-science-portfolio

A thousand-star recipe for the classic data-science portfolio

A curated collection of Jupyter notebooks and R Markdown files covering the standard ML curriculum, useful mostly as a reference for how to structure your own.

★ 1.2k stars · Jupyter Notebook · Learning · ML Frameworks

What it does

This repo is a personal portfolio of data science projects (machine learning, NLP, data analysis, and visualization) built in Jupyter notebooks and R Markdown. It covers the usual suspects: Boston housing prices, Titanic survival, MNIST digit recognition, sentiment analysis, and a smattering of Kaggle-style exploratory work. The author also maintains a separate site for prettier browsing.

The interesting bit

The portfolio is deliberately bilingual, splitting work between Python (scikit-learn, Keras, Pandas) and R (published via RPubs). That alone makes it a decent reference for anyone straddling both ecosystems. The “Disaster Message Classifier” stands out as a fuller-stack project, with an ETL pipeline, an ML pipeline, and a Flask web app with Plotly visualizations; most other entries are narrower notebook demos.

Key highlights

Covers supervised, unsupervised, reinforcement, and deep learning in one repo
Includes a cross-language information retrieval system (German queries, English documents)
R work is published externally at RPubs, not buried in the repo
“Micro Projects” section isolates single-algorithm walkthroughs (logistic regression, KNN, random forests)
A requirements.txt file is provided for local setup; the data is flagged as demonstration-only
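
For readers unfamiliar with the "single-algorithm walkthrough" format mentioned above, here is a minimal sketch of what such a micro project might look like, using k-nearest neighbours. It is not taken from the repository: the function name `knn_predict`, the toy data, and the choice of plain Python instead of scikit-learn are all assumptions made for illustration.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours, sorting by distance only.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up data: two well-separated clusters (hypothetical, for illustration).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

if __name__ == "__main__":
    print(knn_predict(points, labels, (1.1, 1.0)))
    print(knn_predict(points, labels, (5.1, 5.0)))
```

A notebook version of the same idea would typically swap the hand-written function for a library estimator and add a plot or two, but the structure (load data, fit one algorithm, inspect predictions) stays the same.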

Caveats

Several projects are explicitly labeled “very simple analysis” or “micro”—depth varies sharply
No visible tests, CI, or reproducibility infrastructure beyond a requirements file
Some R content lives entirely outside the repo; you’ll need to chase links

Verdict

Good for early-career data scientists figuring out how to organize a portfolio, or hiring managers wanting a quick scan of a candidate’s range. Skip it if you’re after production code or novel methods; this is show-and-tell, not a framework.
