RubensZimbres/Repo-2017

A 2017 time capsule: 40+ ML experiments in raw Python

Before Keras 2.0 and PyTorch took hold, a data scientist documented every technique he could get his hands on.

What it does

This is a sprawling collection of standalone Python scripts covering classic machine learning and deep learning circa 2017. You’ll find CNNs for MNIST, GANs, VAEs, ResNet and SqueezeNet implementations, plus a heavy tilt toward NLP: Doc2Vec, Word2Vec, LDA topic modeling, sentiment classifiers, and Twitter/Facebook scrapers. There are also time-series forecasts (ARIMA and neural-network models), ensemble methods, hyperparameter tuning via reinforcement learning, and several dimensionality reduction comparisons (t-SNE, PCA, LDA).

The interesting bit

The README reads like a personal lab notebook rather than a polished framework: every script is a self-contained experiment with its own dataset and goal. The author wasn’t building tools; he was stress-testing ideas across Keras 1.1.0, Theano, Lasagne, and scikit-learn at a time when Theano and Lasagne were still in everyday use.

Key highlights

  • 40+ distinct scripts, each tackling a specific technique or dataset (Iris, Boston housing, movie reviews, Wikipedia pages)
  • NLP-heavy: includes scraping, tokenization, lemmatization, word embeddings, topic modeling, and real-time Twitter sentiment analysis
  • Several neural architectures from the 2015–2017 era: ResNet-2, SqueezeNet, probabilistic NNs, and autoencoders for audio compression
  • Hyperparameter tuning via reinforcement learning — an unusual addition for a personal repo of this era
  • Direct weight visualization in Theano/Lasagne models, which was more manual then than now

Caveats

  • Pinned to Keras 1.1.0 and Theano/Lasagne — expect significant bit-rot; running these today will require deliberate environment archaeology
  • No tests, no package structure, no shared utilities; each script is copy-paste standalone
  • Some entries (“NLP Twitter Streaming”) are explicitly marked “under development”
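
A first step in that environment archaeology is simply finding out what is installed. The sketch below is one hedged way to do it with the standard library only; the `EXPECTED` table and the `check_environment` helper are illustrative names, not part of the repository, and only the Keras 1.1.0 pin comes from this write-up (the other packages have no stated version, so any installed version counts as present).

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: only the Keras pin (1.1.0) is stated in the write-up.
# None means "no specific version is known, any installed version will do".
EXPECTED = {
    "keras": "1.1.0",
    "theano": None,
    "lasagne": None,
    "scikit-learn": None,
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, is_acceptable)}."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        acceptable = installed is not None and (pinned is None or installed == pinned)
        report[name] = (installed, acceptable)
    return report


if __name__ == "__main__":
    for name, (installed, acceptable) in check_environment().items():
        status = "ok" if acceptable else "missing or version mismatch"
        print(f"{name}: installed={installed!r} ({status})")
```

Run inside a fresh virtual environment, this reports which of the four libraries are missing or at the wrong version before any script is attempted. It does not say whether those old releases can still be installed on a current Python interpreter.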

Verdict

Worth a browse if you’re teaching ML history, reviving a legacy model, or want to see how much the tooling has improved. Skip it if you need production-ready code or modern PyTorch/TensorFlow 2.x patterns; this is a museum, not a framework.
