Now that OpenAI has rolled out GPT-5, I’ve been itching to see how it stacks up against GPT-4. Since I still have GPT-4 running in one of my browsers, I wanted to test the two side by side before it disappears for good, because once you upgrade to the new model, there’s no easy way to go back.
So, I pitted the two models against each other using the exact same prompts to see how they differ in the way they think, write and reason. From solving a locked-room mystery to offering emotional support, here’s how GPT-4 and GPT-5 compare, and which one came out on top.
1. Chain-of-thought reasoning
Prompt: “You are a detective solving a mystery. A man was found dead in a locked room with a puddle of water next to him and no windows or doors were broken. Walk me through your thought process to det