GPT as a runtime: what could go wrong?
A thin Python wrapper that asks GPT-4 to execute functions described by type hints and docstrings, then hopes the response parses.

What it does
ai_functions.py is a single function that takes a function signature as a string, a description, and arguments, ships them to OpenAI’s API, and returns whatever the model generates. It’s essentially a prompt engineering wrapper with JSON parsing on the back end. The project is inspired by Ask Marvin.
The interesting bit
The honesty in the README. The author includes a failure table showing GPT-4 flunks basic geometry (area of a triangle) and GPT-3.5-turbo can’t even format fake people correctly. Most “AI function” demos sweep this under the rug; here it’s the headline limitation.
Key highlights
- One function:
ai_function(function_string, args, description, model="gpt-4") - Relies entirely on prompt structure—no code generation, no sandboxing, no validation
- Includes test suite with explicit pass/fail matrix across models
- API key stored in
keys.pyor environment variable (no key management beyond that) - ~937 stars, heavy ChatGPT/GPT-4 topic tagging
Caveats
- Mathematical precision is explicitly called out as broken; GPT-4 hallucinates float values
- No retry logic, no type enforcement on outputs, no timeout handling visible
- “Clone the repository” install instructions reference
YourUsername/SuperSimpleAIFunctions—a copy-paste error suggesting low maintenance
Verdict
Worth a look if you’re prototyping LLM-as-a-service patterns and want a minimal reference point. Skip it if you need determinism, math, or anything resembling a production function call.