deependersingla/deep_trader

When AlphaGo met Jesse Livermore: a trading bot's origin story

A 2016-vintage experiment in teaching reinforcement learning to "read the tape" on stock markets, frozen in time when its author started a real RL trading company.

1.5k stars · Python · Agents · Domain Apps · +0.4 ★/day over the last 7 days (trend: steady)

What it does

This repo uses TensorFlow to train DQN and policy-gradient agents to trade stocks. At each step of a fixed-length episode the agent chooses to hold, buy, or short, and it receives a single terminal reward: the final portfolio value minus transaction costs. The author wanted to test whether an agent could learn to “read tape”, that is, to interpret price action the way old-school traders did.
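To make the episode structure concrete, here is a minimal sketch of a terminal-reward trading environment. It is not the repo's code: the names `EpisodeEnv` and `run_episode`, the action encoding, the one-unit position size, and the flat per-trade fee are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical action encoding; the repo may number its actions differently.
HOLD, BUY, SHORT = 0, 1, 2


@dataclass
class EpisodeEnv:
    """Toy episodic environment: one unit long, one unit short, or flat."""
    prices: List[float]          # needs at least two prices
    cost_per_trade: float = 0.5  # assumed flat fee per position change
    t: int = 0
    position: int = 0            # +1 long, -1 short, 0 flat
    cash: float = 0.0
    costs: float = 0.0

    def step(self, action: int) -> Tuple[float, bool]:
        price = self.prices[self.t]
        # HOLD keeps the current position; BUY targets long, SHORT targets short.
        target = {HOLD: self.position, BUY: 1, SHORT: -1}[action]
        if target != self.position:
            self.cash -= (target - self.position) * price
            self.costs += self.cost_per_trade
            self.position = target
        self.t += 1
        done = self.t == len(self.prices) - 1
        reward = 0.0
        if done:
            # Terminal reward only: mark the open position to the last price,
            # then subtract the accumulated transaction costs.
            final_value = self.cash + self.position * self.prices[self.t]
            reward = final_value - self.costs
        return reward, done


def run_episode(prices: Sequence[float], actions: Sequence[int]) -> float:
    """Play the given actions and return the terminal reward."""
    env = EpisodeEnv(prices=list(prices))
    reward = 0.0
    for action in actions:
        reward, done = env.step(action)
        if done:
            break
    return reward
```

Under these assumptions, `run_episode([100.0, 102.0, 105.0], [BUY, HOLD])` buys at 100, holds to 105, and returns 5 minus the 0.5 fee, so 4.5. Every intermediate step returns a reward of zero, which is what makes credit assignment the hard part of this design.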

The interesting bit

The README doubles as a genuine development journal, complete with the pivot from Chainer to TensorFlow because “all the cool kids even DeepMind (the gods) have started using TensorFlow.” The author also muses on why CNNs might suit price data (small input changes shouldn’t trigger trades) before sensibly settling on a two-layer feed-forward network to avoid normalization headaches. It’s refreshingly unpolished — a snapshot of someone thinking out loud while the 2016 RL hype wave was still building.
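For readers who want a picture of what "a two-layer feed-forward network" over price data means, the sketch below runs a forward pass in plain Python. The layer sizes, ReLU activation, weight initialisation, and helper names are all assumptions; this write-up does not state the repo's actual hyperparameters, and the repo itself uses TensorFlow rather than hand-rolled lists.

```python
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Small random weights and zero biases (an assumed initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x: List[float], layer: Layer) -> List[float]:
    """Affine map: one dot product plus bias per output unit."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def forward(window: List[float], hidden: Layer, output: Layer) -> List[float]:
    """Map a window of inputs to one score per action (hold, buy, short)."""
    return dense(relu(dense(window, hidden)), output)


def greedy_action(scores: List[float]) -> int:
    """Index of the highest-scoring action."""
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
hidden_layer = init_layer(n_in=5, n_out=8, rng=rng)   # assumed sizes
output_layer = init_layer(n_in=8, n_out=3, rng=rng)   # three actions
scores = forward([0.01, -0.02, 0.00, 0.03, -0.01], hidden_layer, output_layer)
action = greedy_action(scores)
```

In a DQN setting the three outputs would be read as action values; in a policy-gradient setting they would be passed through a softmax to give action probabilities.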

Key highlights

  • Two working implementations: DQN (dqn_model.py) and policy gradients (pg_model.py)
  • Episodic training design: terminal reward only, avoiding the complexity of per-step reward calculation in trading
  • Includes Google Drive links to Nifty/NSE futures data for immediate reproduction
  • Extensive reading list: Sutton’s RL book, David Silver’s lectures, AlphaGo papers
  • Author now runs an actual RL trading company and has (understandably) abandoned support
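The terminal-reward design in the list above has a simple consequence for training: every earlier step has to be credited from the single reward at the end. A standard way to do that is to compute discounted returns backwards through the episode, sketched here. The function name and the discount factor are illustrative assumptions, not values taken from the repo.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for each step, computed backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Terminal reward only: zeros everywhere except the last step.
episode_rewards = [0.0, 0.0, 4.0]
targets = discounted_returns(episode_rewards, gamma=0.5)
```

With a reward of 4.0 at the last of three steps and `gamma=0.5`, the per-step returns are `[1.0, 2.0, 4.0]`: the earlier the action, the more its credit is discounted.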

Caveats

  • The author explicitly states: “Leave other directories, I am not working on them for now” — only tensor-reinforcement/ is current
  • No visible test results, performance metrics, or profitability claims in the README
  • Data dependencies sit behind external links that may rot: mostly Google Drive, with some pointing to 4shared
  • The “deep thoughts” journal and Google Doc suggest this was a learning project, not a finished system

Verdict

Worth a skim if you’re researching the evolution of retail RL-for-finance experiments, or if you want to see how someone reasoned through network architecture choices in 2016. Skip it if you need production-ready trading infrastructure or current maintenance — this is a time capsule, not a product.
