ChatGPT Tackles an Ancient Greek Math Puzzle Like a Student

In a study that revived a 2,400-year-old mathematical problem, the AI chatbot ChatGPT appeared to improvise ideas and make mistakes much as a student would.


An experiment conducted by two education researchers had the chatbot tackle a version of the “doubling the square” problem—a lesson mentioned by Plato around 385 BCE and, according to the paper, “possibly the first recorded experiment in mathematics education.” The puzzle has led to centuries of discussion on whether knowledge is inherent within us, ready to be ‘retrieved,’ or something we ‘create’ through real-life experiences and interactions.

A recent study investigated a comparable question regarding ChatGPT’s mathematical understanding—as seen through the eyes of its users. The researchers aimed to determine if it would solve Plato’s problem by relying on pre-existing knowledge or by dynamically creating its own solutions.

In the dialogue, the Meno, Plato portrays Socrates guiding a young boy with no formal education to double the area of a square. The boy initially proposes, incorrectly, doubling the length of each side, but through Socrates' questioning he comes to see that the sides of the new square should instead match the diagonal of the original.
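The arithmetic behind both the boy's error and Socrates' construction can be checked in a few lines. This is a minimal sketch with an arbitrary example side length; the variable names are our own, not from the study.

```python
import math

s = 3.0                      # side of the original square (arbitrary example)
original_area = s ** 2

# The boy's wrong guess: doubling each side quadruples the area.
wrong_area = (2 * s) ** 2    # (2s)^2 = 4 * s^2

# Socrates' construction: the new square's side is the original's diagonal.
diagonal = math.sqrt(2) * s
new_area = diagonal ** 2     # (s*sqrt(2))^2 = 2 * s^2, exactly double
```

Doubling the side gives four times the area, while building the new square on the diagonal gives exactly twice the area (up to floating-point rounding).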

The researchers posed this problem to ChatGPT-4, initially mimicking Socrates' line of questioning and later deliberately introducing errors, follow-up questions, and variations of the problem.

Like other large language models (LLMs), ChatGPT is trained on vast amounts of text and generates replies by predicting word sequences learned during training. The researchers expected it to tackle their Ancient Greek mathematics problem by reproducing its existing 'knowledge' of Socrates' famous method. Instead, it appeared to improvise an approach of its own and, at one point, made a distinctly human-like error.

The research was carried out by Dr. Nadav Marco, a visiting researcher at the University of Cambridge, along with Andreas Stylianides, who holds the position of Professor of Mathematics Education at Cambridge. Marco is currently affiliated with the Hebrew University and the David Yellin College of Education in Jerusalem.

Although they are cautious regarding the outcomes, emphasizing that large language models do not think in the same way humans do or “figure things out,” Marco described ChatGPT’s actions as “similar to a learner.”

“When we encounter a new challenge, our natural tendency is usually to experiment using what we’ve learned before,” Marco explained. “In our study, ChatGPT seemed to act in a comparable way. Like a student or researcher, it appeared to develop its own theories and answers.”

Since ChatGPT is trained on written text rather than visual diagrams, it struggles with the type of geometric reasoning demonstrated by Socrates in the doubling the square problem. However, because Plato’s work is widely recognized, the researchers anticipated that the chatbot would identify their questions and provide Socrates’ solution.

Interestingly, that is not what happened. When asked to double the square, ChatGPT opted for an algebraic approach that would have been unknown in Plato’s time.

It then resisted attempts to coax it into repeating the boy’s error and stuck with algebra even when the researchers complained that its answer was only an approximation. Only after Marco and Stylianides expressed disappointment that, for all its training, it could not produce an “elegant and exact” solution did ChatGPT offer the geometric alternative.

Nevertheless, ChatGPT showed complete understanding of Plato’s writings when questioned about them. “If it was merely retrieving information from memory, it would have likely mentioned the traditional method of constructing a new square on the diagonal of the original square,” Stylianides noted. “Instead, it appeared to develop its own method.”

Scientists also introduced a variation of Plato’s problem, requesting ChatGPT to double the area of a rectangle while keeping its aspect ratio the same. Despite knowing their interest in geometry, ChatGPT continued to rely on algebraic methods. When questioned further, it incorrectly stated that since the diagonal of a rectangle cannot be used to double its size, a geometric approach was not possible.

The assertion regarding the diagonal is accurate, yet another geometric approach is available. Marco proposed that the likelihood of this incorrect statement originating from the chatbot’s knowledge base was “extremely low.” Rather, ChatGPT seemed to be making up its answers based on their prior conversation about the square.
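Both halves of that exchange are easy to verify numerically: scaling a rectangle's sides by √2 doubles its area while preserving its aspect ratio, whereas a square built on the rectangle's diagonal only doubles the area when the rectangle is itself a square. A short sketch, with hypothetical example dimensions:

```python
import math

w, h = 4.0, 2.0                  # an example (non-square) rectangle
area = w * h

# Scaling both sides by sqrt(2) doubles the area and keeps the aspect ratio.
k = math.sqrt(2)
scaled_area = (k * w) * (k * h)  # 2 * w * h

# A square on the rectangle's diagonal has area w^2 + h^2 (Pythagoras),
# which equals 2*w*h only when w == h -- so the diagonal construction
# fails for a true rectangle, as ChatGPT correctly stated.
diagonal_square_area = w ** 2 + h ** 2
```

For the 4×2 rectangle above, the diagonal square's area (20) overshoots the doubled area (16), illustrating why the square's trick does not carry over.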

Finally, Marco and Stylianides asked ChatGPT to double the area of a triangle. The chatbot returned to algebra once more, but with further prompting it eventually produced a correct geometric solution.

The researchers emphasize the need to avoid over-analyzing these findings, as they were unable to scientifically monitor ChatGPT’s coding processes. However, from their viewpoint as digital users, what became apparent at the surface level was a combination of data retrieval and real-time reasoning.

They compare this behavior to the educational concept of a “zone of proximal development” (ZPD)—the gap between what a learner currently understands and what they could learn with guidance and support. Generative AI, they suggest, may have a metaphorical ZPD of its own: in some situations it cannot solve a problem immediately, but it can with appropriate prompting.

The researchers propose that engaging with ChatGPT within its Zone of Proximal Development can transform its shortcomings into chances for educational growth. Through prompting, inquiry, and assessing its answers, students will not only understand ChatGPT’s constraints but also enhance essential abilities such as evaluating evidence and logical reasoning, which are fundamental to mathematical thought.

“Students should not assume that proofs generated by ChatGPT are accurate, unlike those in trusted textbooks,” Stylianides said. “Understanding and evaluating AI-generated proofs is becoming an essential skill that should be incorporated into the mathematics curriculum.”

“These are fundamental abilities we aim for students to achieve, but it involves employing statements such as, ‘I would like us to examine this issue collectively,’ rather than, ‘Inform me of the solution,’ ” Marco added.

The research is published in the International Journal of Mathematical Education in Science and Technology.

More information: A study of the mathematical understanding of ChatGPT, International Journal of Mathematical Education in Science and Technology (2025). DOI: 10.1080/0020739X.2025.2543817

Supplied by the University of Cambridge

