samre12/deep-trading-agent

Teaching a neural net to lose money on Bitcoin, faster

A Deep Q-Learning agent that trades Bitcoin using the DeepSense architecture, with Docker support and TensorBoard logging.

794 stars · Python · Domain Apps · Agents · ML Frameworks
Velocity (7d): +0.2 ★/day · Trend: steady

What it does

This project trains a reinforcement learning agent to trade Bitcoin on a per-minute basis. It uses Deep Q-Learning with three possible actions per trading unit: neutral, long, or short. The agent receives rewards based on its current position and learns to maximize accumulated returns. A Docker image is provided that pulls fresh Coinbase transaction data, preprocesses it, and spins up TensorBoard on port 6006.
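To make the action space and reward signal concrete, here is a minimal sketch in plain Python. The names `Action`, `step_reward`, and `q_target` are hypothetical, and the reward shown (position times the one-minute price change) is an assumption for illustration; the repository's wiki documents the actual reward function.

```python
from enum import IntEnum


class Action(IntEnum):
    """The three per-unit actions; the numeric encoding is an assumption."""
    SHORT = -1
    NEUTRAL = 0
    LONG = 1


def step_reward(action, price_now, price_next):
    """Assumed reward: signed position times the price change over one step."""
    return int(action) * (price_next - price_now)


def q_target(reward, next_q_values, gamma=0.99, terminal=False):
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)
```

A long position gains when the price rises, a short position gains when it falls, and a neutral position earns nothing. The Q-network is then trained to regress toward `q_target`.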

The interesting bit

The Q-function approximation uses DeepSense, an architecture originally designed for sensor fusion (think accelerometer + gyroscope data), adapted here for a single time series. The author also borrows the “unrealized PnL” reward concept from prior work and plans to add exponential decay weighting to stabilize learning, but that item is still unchecked on the todo list.
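Since the decayed reward is not implemented in the repository, the following is only one possible reading of "exponentially decayed weighted unrealized PnL". The function name, the forward-looking horizon, the normalization, and the `decay` parameter are all assumptions, not the author's design.

```python
def decayed_unrealized_pnl(position, entry_price, future_prices, decay=0.9):
    """Hypothetical reward: weighted average of unrealized PnL over a horizon.

    position      -- +1 for long, -1 for short, 0 for neutral
    entry_price   -- price at which the position is marked
    future_prices -- prices at the following steps (assumed horizon)
    decay         -- weight multiplier per step; nearer steps count more
    """
    weights = [decay ** k for k in range(len(future_prices))]
    total = sum(weights)
    if total == 0:
        return 0.0  # empty horizon: nothing to average
    pnl = [position * (p - entry_price) for p in future_prices]
    return sum(w * x for w, x in zip(weights, pnl)) / total
```

With `decay=1.0` this reduces to a plain mean of the unrealized PnL over the horizon. Smaller values put more weight on the nearest steps, which is one plausible way such weighting could smooth the reward signal.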

Key highlights

  • Preprocessing extracts 180-minute history windows from Coinbase per-minute data, discarding continuous price blocks too short to form a training episode
  • Docker image includes vim, screen, and auto-fetched Bitcoin price data at /deep-trading-agent/data/btc.csv
  • TensorFlow 1.1.0 implementation, adapted from existing DeepSense and DQN-tensorflow repos
  • Wiki documents dataset, architecture, and reward function details
  • Python 2.7 codebase (yes, really)
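The preprocessing step in the first highlight can be sketched roughly as follows. This is not the repository's code: the function names, the `(unix_timestamp, price)` row layout, and the `min_steps` threshold are assumptions for illustration, and the sketch targets Python 3 rather than the project's Python 2.7.

```python
HISTORY_LENGTH = 180  # minutes of history per window, as stated in the highlights


def split_into_blocks(rows, step=60):
    """Split sorted (unix_ts, price) rows into runs of consecutive minutes."""
    blocks, current = [], []
    for ts, price in rows:
        if current and ts - current[-1][0] != step:
            blocks.append(current)  # a gap ends the current block
            current = []
        current.append((ts, price))
    if current:
        blocks.append(current)
    return blocks


def usable_blocks(blocks, history_length=HISTORY_LENGTH, min_steps=1):
    """Keep blocks long enough for one history window plus `min_steps` steps."""
    return [b for b in blocks if len(b) >= history_length + min_steps]


def history_windows(block, history_length=HISTORY_LENGTH):
    """Yield every sliding window of `history_length` prices in a block."""
    prices = [p for _, p in block]
    for end in range(history_length, len(prices) + 1):
        yield prices[end - history_length:end]
```

Every gap in the minute series starts a new block, so scattered missing minutes fragment the data into many short runs. Under these assumptions, that is why a length filter can discard most of them.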

Caveats

  • Python 2.7 and TensorFlow 1.1.0 are frozen in amber; modern environments will need the Docker image or significant migration work
  • The “exponentially decayed weighted unrealized PnL” reward function, described as key to stabilizing learning, is listed as not yet implemented
  • Advanced preprocessing (gap-filling to increase usable training blocks) is also marked “to be implemented”
  • 588,000 raw blocks of continuous prices collapse to 887 usable blocks after filtering, suggesting the dataset is sparser than it first appears
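As an illustration of what the unimplemented gap-filling step might look like, the sketch below carries the last known price forward across short gaps so that neighbouring blocks merge. The function name, the forward-fill strategy, and the `max_gap_steps` limit are assumptions; the repository does not specify how gaps would be filled.

```python
def forward_fill_gaps(rows, step=60, max_gap_steps=5):
    """Fill short gaps in sorted (unix_ts, price) rows with the last price.

    Gaps longer than `max_gap_steps` missing rows are left untouched, so
    long outages still split the series into separate blocks.
    """
    filled = []
    for ts, price in rows:
        if filled:
            last_ts, last_price = filled[-1]
            missing = (ts - last_ts) // step - 1
            if 0 < missing <= max_gap_steps:
                for k in range(1, missing + 1):
                    # Synthetic row: repeat the last observed price.
                    filled.append((last_ts + k * step, last_price))
        filled.append((ts, price))
    return filled
```

Forward-filling invents flat prices, so a real implementation would need to weigh the extra training blocks against the distortion this introduces.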

Verdict

Worth a look if you’re studying reinforcement learning in financial time series and want a concrete, runnable baseline to dissect or modernize. Skip it if you need production-ready trading infrastructure or are allergic to legacy Python.
