We’re Entering Uncharted Territory for Math
Terence Tao, a mathematics professor at UCLA, is a real-life superintelligence. The “Mozart of Math,” as he is sometimes called, is widely considered the world’s greatest living mathematician. He has won numerous awards, including the equivalent of a Nobel Prize for mathematics, for his advances and proofs. Right now, AI is nowhere close to his level.
But technology companies are trying to get it there. Recent, attention-grabbing generations of AI—even the almighty ChatGPT—were not built to handle mathematical reasoning. They were instead focused on language: When you asked such a program to answer a basic question, it did not understand and execute an equation or formulate a proof, but instead presented an answer based on which words were likely to appear in sequence. For instance, the original ChatGPT can’t add or multiply, but has seen enough examples of algebra to solve x + 2 = 4: “To solve the equation x + 2 = 4, subtract 2 from both sides ...” Now, however, OpenAI is explicitly marketing a new line of “reasoning models,” known collectively as the o1 series, for their ability to problem-solve “much like a person” and work through complex mathematical and scientific tasks and queries. If these models are successful, they could represent a sea change for the slow, lonely work that Tao and his peers do.
After I saw Tao post his impressions of o1 online—he compared it to a “mediocre, but not completely incompetent” graduate student—I wanted to understand more about his views on the technology’s potential. In a Zoom call last week, he described a kind of AI-enabled, “industrial-scale mathematics” that has never been possible before: one in which AI, at least in the near future, is not a creative collaborator in its own right so much as a lubricant for mathematicians’ hypotheses and approaches. This new sort of math, which could unlock terra incognitae of knowledge, will remain human at its core, embracing how people and machines have very different strengths that should be thought of as complementary rather than competing.
This conversation has been edited for length and clarity.
Matteo Wong: What was your first experience with ChatGPT?
Terence Tao: I played with it pretty much as soon as it came out. I posed some difficult math problems, and it gave pretty silly results. It was coherent English, it mentioned the right words, but there was very little depth. Anything really advanced, the early GPTs were not impressive at all. They were good for fun things—like if you wanted to explain some mathematical topic as a poem or as a story for kids. Those are quite impressive.
Wong: OpenAI says o1 can “reason,” but you compared the model to “a mediocre, but not completely incompetent” graduate student.
Tao: That initial wording went viral, but it got misinterpreted. I wasn’t saying that this tool is equivalent to a graduate student in every single aspect of graduate study. I was interested in using these tools as research assistants. A research project has a lot of tedious steps: You may have an idea and you want to flesh out computations, but you have to do it by hand and work it all out.
Wong: So it’s a mediocre or incompetent research assistant.
Tao: Right, it’s the equivalent, in terms of serving as that kind of an assistant. But I do envision a future where you do research through a conversation with a chatbot. Say you have an idea, and the chatbot goes with it and fills out all the details.
It’s already happening in some other areas. AI famously conquered chess years ago, but chess is still thriving today, because it’s now possible for a reasonably good chess player to speculate what moves are good in what situations, and they can use the chess engines to check 20 moves ahead. I can see this sort of thing happening in mathematics eventually: You have a project and ask, “What if I try this approach?” And instead of spending hours and hours actually trying to make it work, you guide a GPT to do it for you.
With o1, you can kind of do this. I gave it a problem I knew how to solve, and I tried to guide the model. First I gave it a hint, and it ignored the hint and did something else, which didn’t work. When I explained this, it apologized and said, “Okay, I’ll do it your way.” And then it carried out my instructions reasonably well, and then it got stuck again, and I had to correct it again. The model never figured out the most clever steps. It could do all the routine things, but it was very unimaginative.
One key difference between graduate students and AI is that graduate students learn. You tell an AI its approach doesn’t work, it apologizes, it will maybe temporarily correct its course, but sometimes it just snaps back to the thing it tried before. And if you start a new session with AI, you go back to square one. I’m much more patient with graduate students because I know that even if a graduate student completely fails to solve a task, they have potential to learn and self-correct.
Wong: The way OpenAI describes it, o1 can recognize its mistakes, but you’re saying that’s not the same as sustained learning, which is what actually makes mistakes useful for humans.
Tao: Yes, humans have growth. These models are static—the feedback I give to GPT-4 might be used as 0.00001 percent of the training data for GPT-5. But that’s not really the same as with a student.
AI and humans have such different models for how they learn and solve problems—I think it’s better to think of AI as a complementary way to do tasks. For a lot of tasks, having both AIs and humans doing different things will be most promising.
Wong: You’ve also said previously that computer programs might transform mathematics and make it easier for humans to collaborate with one another. How so? And does generative AI have anything to contribute here?
Tao: Technically they aren’t classified as AI, but proof assistants are useful computer tools that check whether a mathematical argument is correct or not. They enable large-scale collaboration in mathematics. That’s a very recent development.
Math can be very fragile: If one step in a proof is wrong, the whole argument can collapse. If you make a collaborative project with 100 people, you break your proof into 100 pieces and everybody contributes one. But if they don’t coordinate with one another, the pieces might not fit properly. Because of this, it’s very rare to see more than five people on a single project.
With proof assistants, you don’t need to trust the people you’re working with, because the program gives you this 100 percent guarantee. Then you can do factory production–type, industrial-scale mathematics, which doesn’t really exist right now. One person focuses on just proving certain types of results, like a modern supply chain.
The problem is these programs are very fussy. You have to write your argument in a specialized language—you can’t just write it in English. AI may be able to do some translation from human language to the programs. Translating one language to another is almost exactly what large language models are designed to do. The dream is that you just have a conversation with a chatbot explaining your proof, and the chatbot would convert it into a proof-system language as you go.
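For a concrete sense of what these specialized languages look like, here is a minimal example in Lean, one such proof-assistant system (the choice of system and the trivial statement are illustrative, not drawn from the article):

```lean
-- A complete, machine-checked proof in a proof assistant's formal
-- language. The checker accepts the proof only if every step is
-- valid, which is the "100 percent guarantee" that lets
-- collaborators trust one another's pieces.
theorem add_comm_demo (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Writing even a routine argument this way is the fussy part Tao describes: the statement must be fully formal before the computer can check it, which is why translating from ordinary mathematical English is the bottleneck a chatbot might remove.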
Wong: So the chatbot isn’t a source of knowledge or ideas, but a way to interface.
Tao: Yes, it could be a really useful glue.
Wong: What are the sorts of problems that this might help solve?
Tao: The classic idea of math is that you pick some really hard problem, and then you have one or two people locked away in the attic for seven years just banging away at it. The types of problems you want to attack with AI are the opposite. The naive way you would use AI is to feed it the most difficult problem that we have in mathematics. I don’t think that’s going to be super successful, and also, we already have humans that are working on those problems.
The type of math that I’m most interested in is math that doesn’t really exist. The project that I launched just a few days ago is about an area of math called universal algebra, which is about whether certain mathematical statements or equations imply that other statements are true. The way people have studied this in the past is that they pick one or two equations and they study them to death, like how a craftsperson used to make one toy at a time, then work on the next one. Now we have factories; we can produce thousands of toys at a time. In my project, there’s a collection of about 4,000 equations, and the task is to find connections between them. Each is relatively easy, but there’s a million implications. There’s like 10 points of light, 10 equations among these thousands that have been studied reasonably well, and then there’s this whole terra incognita.
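To make “implication” concrete: in this setting it is a proof that any structure satisfying one equation must also satisfy another. A minimal sketch in Lean follows; the two equations and all names here are hypothetical stand-ins chosen for simplicity, not taken from the project’s actual list of 4,000:

```lean
-- A magma: a set equipped with one binary operation, with no
-- axioms assumed. Each of the project's equations is a law this
-- operation might satisfy.
structure Magma where
  carrier : Type
  op : carrier → carrier → carrier

-- Equation A: x ∘ y = y ∘ x (commutativity).
-- Equation B: x ∘ (y ∘ x) = (x ∘ y) ∘ x.
-- One "edge" in the implication graph: every magma satisfying A
-- also satisfies B, by commuting first the outer pair, then the
-- inner pair.
theorem A_implies_B (M : Magma)
    (hA : ∀ x y : M.carrier, M.op x y = M.op y x) :
    ∀ x y : M.carrier, M.op x (M.op y x) = M.op (M.op x y) x := by
  intro x y
  rw [hA x (M.op y x), hA y x]
```

Each such proof is small, but with thousands of equations there are millions of ordered pairs to settle, which is exactly the kind of broad, shallow workload suited to crowdsourcing with AI assistance.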
There are other fields where this transition has happened, like in genetics. It used to be that if you wanted to sequence a genome of an organism, this was an entire Ph.D. thesis. Now we have these gene-sequencing machines, and so geneticists are sequencing entire populations. You can do different types of genetics that way. Instead of narrow, deep mathematics, where an expert human works very hard on a narrow scope of problems, you could have broad, crowdsourced problems with lots of AI assistance that are maybe shallower, but at a much larger scale. And it could be a very complementary way of gaining mathematical insight.
Wong: It reminds me of how an AI program made by Google DeepMind, called AlphaFold, figured out how to predict the three-dimensional structure of proteins, which was for a long time something that had to be done one protein at a time.
Tao: Right, but that doesn’t mean protein science is obsolete. You have to change the problems you study. A hundred and fifty years ago, mathematicians’ primary usefulness was in solving partial differential equations. There are computer packages that do this automatically now. Six hundred years ago, mathematicians were building tables of sines and cosines, which were needed for navigation, but these can now be generated by computers in seconds.
I’m not super interested in duplicating the things that humans are already good at. It seems inefficient. I think at the frontier, we will always need humans and AI. They have complementary strengths. AI is very good at converting billions of pieces of data into one good answer. Humans are good at taking 10 observations and making really inspired guesses.