cpldcpu/MisguidedAttention
A collection of prompts that test whether large language models can reason through modified versions of thought experiments and riddles without falling back to familiar but incorrect solutions.

This repository contains prompts that evaluate LLM reasoning using modified versions of classic riddles, paradoxes, and thought experiments. Each prompt is written to trigger recognition of a familiar problem while requiring a different solution, which tests whether a model applies logical deduction or falls back on a memorized response. An evaluation framework tracks how different models perform on this benchmark over time, and interactive results are available via a GitHub Pages deployment.
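
To make the idea concrete, here is a minimal Python sketch of how one such prompt and a naive check of a model's answer could be represented. Everything in it is an assumption made for illustration: the `ModifiedPrompt` record, the `classify` function, the keyword lists, and the example riddle are hypothetical and are not taken from the repository's prompt set or its evaluation framework, which may work quite differently.

```python
from dataclasses import dataclass


@dataclass
class ModifiedPrompt:
    """Hypothetical record for one benchmark item (not the repository's schema)."""
    name: str
    text: str
    expected_keywords: list[str]   # phrases a correct answer should contain
    memorized_keywords: list[str]  # phrases typical of the classic, now-wrong answer


def classify(prompt: ModifiedPrompt, response: str) -> str:
    """Label a response 'reasoned', 'memorized', or 'unclear' by keyword match."""
    lowered = response.lower()
    # Check for the correct answer first, so a response that mentions the
    # classic puzzle but still answers correctly counts as reasoned.
    if any(k.lower() in lowered for k in prompt.expected_keywords):
        return "reasoned"
    if any(k.lower() in lowered for k in prompt.memorized_keywords):
        return "memorized"
    return "unclear"


# Illustrative item: a river-crossing riddle with the conflict removed.
item = ModifiedPrompt(
    name="river_crossing_simple",
    text=(
        "A farmer must cross a river with a goat. The boat holds the farmer "
        "and one item. How many crossings are needed?"
    ),
    expected_keywords=["one trip", "single trip", "one crossing"],
    memorized_keywords=["wolf", "cabbage", "seven"],
)

reasoned = classify(item, "One trip is enough: the farmer takes the goat across.")
memorized = classify(
    item, "The farmer takes the goat first, returns, then ferries the wolf and the cabbage."
)
unclear = classify(item, "I am not sure.")
```

Keyword matching is deliberately crude and would misjudge many real answers. It is used here only to show the distinction the benchmark targets: an answer that follows the modified wording versus one that reproduces the solution to the original riddle.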