bin123apple/AutoCoder
A code generation LLM based on CodeQwen and DeepSeek-Coder that scores 90.9% on HumanEval and includes an automatic code interpreter.

AutoCoder is a code generation model available in three sizes (33B, 7B, and 6.7B) that outperforms GPT-4 Turbo on the HumanEval benchmark. The model includes a code interpreter that can automatically install required Python packages and execute the generated code to verify that it works. Unlike other open-source code interpreters, it runs code only when the user explicitly requests verification, much as GPT-4 Turbo does.
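
The install-then-run behavior described above can be illustrated with a minimal Python sketch. This is not AutoCoder's actual implementation: the helper names (`find_missing_modules`, `run_code`, `verify`) and the `user_requested_run` flag are hypothetical, and the sketch assumes that an import name matches its PyPI package name, which is not always true.

```python
import ast
import importlib.util
import subprocess
import sys
from typing import List, Optional


def find_missing_modules(code: str) -> List[str]:
    """Return top-level imported module names that cannot be found locally."""
    names = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


def run_code(code: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Execute the code in a separate Python process and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def verify(
    code: str, user_requested_run: bool, install: bool = True
) -> Optional[subprocess.CompletedProcess]:
    """Install missing packages and run the code, but only on explicit request."""
    if not user_requested_run:
        return None  # the code is returned to the user without being executed
    if install:
        for name in find_missing_modules(code):
            # Assumption: the import name is also the PyPI package name.
            subprocess.run([sys.executable, "-m", "pip", "install", name], check=True)
    return run_code(code)


if __name__ == "__main__":
    result = verify("print(6 * 7)", user_requested_run=True)
    print(result.stdout.strip())
```

Running generated code in a subprocess keeps a crash or timeout from taking down the caller, but it is not a sandbox. Untrusted code should be executed in an isolated environment such as a container.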