As we hurtle towards a future filled with artificial intelligence, many commentators are wondering aloud whether we're moving too fast. The tech giants, the researchers, and the investors all seem to be in a mad dash to develop the most advanced AI.

But have they considered the risks? That is what the worriers ask.

The question is not entirely moot, and rest assured that there are hundreds of incisive minds considering the dystopian possibilities - and ways to avoid them.

But the fact is that the future is unknown; the implications of this powerful new technology are as unimagined as social media was at the advent of the Internet.

There will be good and there will be bad, but there will be powerful artificial intelligence systems in our future and even more powerful AIs in the future of our grandchildren. It can’t be stopped, but it can be understood.

I spoke about this new technology with Ilya Sutskever, a co-founder of OpenAI, the not-for-profit AI research institute whose spinoffs are likely to be among the most profitable entities on earth.

My conversation with Ilya was shortly before the release of GPT-4, the latest iteration of OpenAI’s giant AI system, which has consumed billions of words of text - more than any one human could possibly read in a lifetime.

GPT stands for Generative Pre-trained Transformer, three important words in understanding this Homeric Polyphemus. Transformer is the name of the algorithm at the heart of the giant.

Pre-trained refers to the behemoth’s education with a massive corpus of text, teaching it the underlying patterns and relationships of language - in short, teaching it to understand the world.

Generative means that the AI can create new thoughts from this base of knowledge.
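
To make those three words concrete, here is a minimal sketch of a generative pre-trained transformer in action, using the small public GPT-2 model via the Hugging Face transformers library (my choice of model and library, not anything named in this article): generation is nothing more than repeatedly predicting the next token.

```python
# A minimal sketch: a pre-trained transformer generating text by
# repeatedly predicting the next token. GPT-2 stands in here as an
# assumption; any causal language model illustrates the same loop.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "GPT stands for"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate 20 more tokens, each one predicted from everything before it.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```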

AI has already taken over many aspects of our lives. But what's coming is far more advanced, far more powerful. We're moving into uncharted territory. And it's worth taking a moment to consider what that means.

But it’s also important not to overreact, not to withdraw like turtles from the bright sun now shining upon us. In Homer's epic poem "The Odyssey," the cyclops Polyphemus traps Odysseus and his crew in his cave, intending to eat them.

But Odysseus manages to blind the giant and escape. AI will not eat us.

Ilya Sutskever is a co-founder and chief scientist of OpenAI and one of the primary minds behind the large language model GPT-4 and its public progeny, ChatGPT, which, I don’t think it’s an exaggeration to say, is changing the world.

This isn’t the first time Ilya has changed the world. He was the main impetus for AlexNet, the convolutional neural network whose dramatic performance stunned the scientific community in 2012 and set off the deep learning revolution.

The following is an edited transcript of our conversation.

CRAIG: Ilya, I know you were born in Russia. What got you interested in computer science, if that was the initial impulse, or neuroscience, or whatever it was?

ILYA: Indeed, I was born in Russia. I grew up in Israel, and then as a teenager, my family immigrated to Canada. My parents say I was interested in AI from an early age. I also was very motivated by consciousness. I was very disturbed by it, and I was curious about things that could help me understand it better.

I started working with Geoff Hinton [one of the founders of deep learning, the kind of AI behind GPT-4, and a professor at the University of Toronto at the time] very early, when I was 17. We had moved to Canada, and I was immediately able to join the University of Toronto. I really wanted to do machine learning, because that seemed like the most important aspect of artificial intelligence that at the time was completely inaccessible.

That was 2003. We take it for granted that computers can learn, but in 2003, we took it for granted that computers can't learn. The biggest achievement of AI back then was Deep Blue, [IBM’s] chess-playing engine [which beat world champion Garry Kasparov in 1997].

But there, you have this game and you have this research, and you have this simple way of determining if one position is better than another. And it really did not feel like that could possibly be applicable to the real world because there was no learning. Learning was this big mystery. And I was really, really interested in learning. To my great luck, Geoff Hinton was a professor at the university, and we began working together almost right away.

So how does intelligence work at all? How can we make computers be even slightly intelligent? I had a very explicit intention to make a very small, but real contribution to AI. So, the motivation was, could I understand how intelligence works? And also make a contribution towards it? So that was my initial motivation. That was almost exactly 20 years ago.

In a nutshell, I had the realization that if you train a large and deep neural network on a big enough dataset that specifies some complicated task that people do, such as vision, then you will succeed necessarily. And the logic for it was irreducible; we know that the human brain can solve these tasks and can solve them quickly. And the human brain is just a neural network with slow neurons.

So, then we just need to take a smaller but related neural network and train it on the data. And the best neural network inside the computer will be related to the neural network that we have in our brains that performs this task.

CRAIG: In 2017, the "Attention Is All You Need" paper came out introducing self-attention and transformers. At what point did the GPT project start? Was there some intuition about transformers?

ILYA: So, for context, at OpenAI from the earliest days, we were exploring the idea that predicting the next thing is all you need. We were exploring it with the much more limited neural networks of the time, but the hope was that if you have a neural network that can predict the next word, it'll solve unsupervised learning. So back before the GPTs, unsupervised learning was considered to be the Holy Grail of machine learning.

Now it's been fully solved, and no one even talks about it, but it was a Holy Grail. It was very mysterious, and so we were exploring the idea. I was really excited about it, that predicting the next word well enough is going to give you unsupervised learning.

But our neural networks were not up for the task. We were using recurrent neural networks. When the transformer came out, literally as soon as the paper came out, literally the next day, it was clear to me, to us, that transformers addressed the limitations of recurrent neural networks, of learning long-term dependencies.

It's a technical thing. But we switched to transformers right away. And so, the very nascent GPT effort continued then with the transformer. It started to work better, and you make it bigger, and then you keep making it bigger.

And that's what eventually led to GPT-3 and essentially where we are today.
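
For readers curious about that "technical thing," here is a toy NumPy sketch of the scaled dot-product self-attention at the heart of the transformer paper; the shapes and random values are purely illustrative, not OpenAI's implementation. Because every position attends directly to every other position, long-range dependencies don't have to survive a step-by-step recurrent bottleneck.

```python
# Toy sketch of scaled dot-product self-attention, the mechanism from
# "Attention Is All You Need". Each output is a weighted mix of all
# value vectors, so any position can see any other position directly.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # pairwise similarity, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                    # 5 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8): each token mixes information from all 5 positions
```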

CRAIG: The limitation of large language models as they exist is that their knowledge is contained in the language that they're trained on. And most human knowledge, I think everyone agrees, is non-linguistic.

Their objective is to satisfy the statistical consistency of the prompt. They don't have an underlying understanding of the reality that language relates to. I asked ChatGPT about myself. It recognized that I'm a journalist, that I've worked at these various newspapers, but it went on and on about awards that I've never won. And it all read beautifully, but little of it connected to the underlying reality. Is there something that is being done to address that in your research going forward?

ILYA: How confident are we that these limitations that we see today will still be with us two years from now? I am not that confident. There is another comment I want to make about one part of the question, which is that these models just learn statistical regularities and therefore they don't really know what the nature of the world is.

I have a view that differs from this. In other words, I think that learning the statistical regularities is a far bigger deal than meets the eye.

Prediction is also a statistical phenomenon. Yet to predict you need to understand the underlying process that produced the data. You need to understand more and more about the world that produced the data.

As our generative models become extraordinarily good, they will have, I claim, a shocking degree of understanding of the world and many of its subtleties. It is the world as seen through the lens of text. It tries to learn more and more about the world through a projection of the world on the space of text as expressed by human beings on the internet.

But still, this text already expresses the world. And I'll give you an example, a recent example, which I think is really telling and fascinating. I've seen this really interesting interaction with [ChatGPT] where [ChatGPT] became combative and aggressive when the user told it that it thinks that Google is a better search engine than Bing.

What is a good way to think about this phenomenon? What does it mean? You can say, it's just predicting what people would do and people would do this, which is true. But maybe we are now reaching a point where the language of psychology is starting to be appropriated to understand the behavior of these neural networks.

Now let's talk about the limitations. It is indeed the case that these neural networks have a tendency to hallucinate. That's because a language model is great for learning about the world, but it is a little bit less great for producing good outputs. And there are various technical reasons for that. There are technical reasons why a language model is much better at learning about the world, learning incredible representations of ideas, of concepts, of people, of processes that exist, but its outputs aren't quite as good as one would hope, or rather as good as they could be.

ILYA: Which is why, for example, a system like ChatGPT, which is a language model, has an additional reinforcement learning training process. We call it Reinforcement Learning from Human Feedback.

We can say that in the pre-training process, you want to learn everything about the world. With reinforcement learning from human feedback, we care about the outputs. We say, anytime the output is inappropriate, don't do this again. Every time the output does not make sense, don't do this again.

And it learns quickly to produce good outputs. But that learning is at the level of the outputs, which is not the case during the language model pre-training process.
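
As a deliberately toy illustration of that "don't do this again" loop (not OpenAI's pipeline, which trains a reward model and uses policy-gradient methods such as PPO), here is a two-action REINFORCE sketch in which simulated human feedback steers a policy away from a bad output:

```python
# Toy reinforcement-learning-from-feedback loop: sample an output, get a
# reward from a simulated "human", nudge the policy toward rewarded
# outputs. Everything here is illustrative, not a real RLHF pipeline.
import numpy as np

rng = np.random.default_rng(0)
outputs = ["helpful answer", "made-up award list"]  # two candidate behaviors
logits = np.zeros(2)                                # policy over behaviors

def human_feedback(choice):
    # Simulated rater: reward the helpful answer, penalize the hallucination.
    return 1.0 if choice == 0 else -1.0

for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax policy
    choice = rng.choice(2, p=probs)
    reward = human_feedback(choice)
    grad = -probs                                   # REINFORCE: grad of
    grad[choice] += 1.0                             # log-prob of the choice
    logits = logits + 0.1 * reward * grad

final = np.exp(logits) / np.exp(logits).sum()
print(dict(zip(outputs, final.round(3))))           # mass shifts to "helpful"
```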

Now on the point of hallucinations, these models have a propensity for making stuff up from time to time, and that's something that greatly limits their usefulness.

But I'm quite hopeful that by simply improving this subsequent reinforcement-learning-from-human-feedback step, we can teach it not to hallucinate. Now you could say, is it really going to learn? My answer is, let's find out.

The way we do things today is that we hire people to teach our neural network to behave, to teach ChatGPT to behave. You just interact with it, and it sees from your reaction, it infers, oh, that's not what you wanted. You are not happy with its output.

Therefore, the output was not good, and it should do something differently next time. I think there is quite a high chance that this approach will be able to address hallucinations completely.

CRAIG: Yann LeCun [chief AI scientist at Facebook and another early pioneer of deep learning] believes that what's missing from large language models is an underlying, non-linguistic world model that the language model can refer to. I wanted to hear what you thought of that and whether you've explored that at all.

ILYA: I reviewed Yann LeCun's proposal and there are a number of ideas there, and they're expressed in different language and there are some maybe small differences from the current paradigm, but to my mind, they are not very significant.

The first claim is that it is desirable for a system to have multimodal understanding where it doesn't just know about the world from text.

And my comment on that will be that indeed multimodal understanding is desirable, because you learn more about the world, you learn more about people, you learn more about their condition, and so the system will be able to better understand the task it's supposed to solve, and the people, and what they want.

We have done quite a bit of work on that, most notably in the form of two major neural nets. One is called CLIP and one is called DALL-E. And both of them move towards this multimodal direction.

But I also want to say that I don't see the situation as a binary either-or, that if you don't have vision, if you don't understand the world visually or from video, then things will not work.

And I'd like to make the case for that. So, I think that some things are much easier to learn from images and diagrams and so on, but I claim that you can still learn them from text only, just more slowly. And I'll give you an example. Consider the notion of color.

Surely one cannot learn the notion of color from text only, and yet when you look at the embeddings — I need to make a small detour to explain the concept of an embedding. Every neural network represents words, sentences, concepts through representations, ‘embeddings,’ that are high-dimensional vectors.

And we can look at those high-dimensional vectors and see what's similar to what; how does the network see this concept or that concept? And so, we can look at the embeddings of colors and it knows that purple is more similar to blue than to red, and it knows that red is more similar to orange than purple. It knows all those things just from text. How can that be?
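
This kind of embedding geometry is easy to probe. The sketch below uses GPT-2's input embeddings as a stand-in (my assumption; no particular model is named here), and whether purple actually lands nearer blue than red depends on the embeddings you inspect; the point is the method:

```python
# Probing embedding geometry: compare cosine similarities between the
# vectors a text-only model assigns to color words. GPT-2 is assumed
# here purely for illustration; any accessible embedding table would do.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
emb = model.get_input_embeddings().weight   # vocabulary of high-dim vectors

def word_vector(word):
    ids = tokenizer(" " + word, return_tensors="pt")["input_ids"][0]
    return emb[ids].mean(dim=0)             # average if the word splits into pieces

def cosine(a, b):
    return torch.nn.functional.cosine_similarity(a, b, dim=0).item()

purple, blue, red = map(word_vector, ["purple", "blue", "red"])
print("purple~blue:", cosine(purple, blue))
print("purple~red: ", cosine(purple, red))
```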

If you have vision, the distinctions between colors just jump out at you. You immediately perceive them. Whereas with text, it takes you longer; maybe you know how to talk, and you already understand syntax and words and grammar, and only much later do you actually start to understand colors.

So, this will be my point about the necessity of multimodality: I claim it is not necessary, but it is most definitely useful. I think it's a good direction to pursue. I just don't see it in such stark either-or claims.

So, the proposal in [LeCun’s] paper makes a claim that one of the big challenges is predicting high-dimensional vectors which have uncertainty about them.

But one thing which I found surprising, or at least unacknowledged in the paper, is that the current autoregressive transformers already have that property.

I'll give you two examples. One is, given one page in a book, predict the next page in a book. There could be so many possible pages that follow. It's a very complicated, high-dimensional space, and they deal with it just fine. The same applies to images. These autoregressive transformers work perfectly on images.

For example, at OpenAI we've done work on iGPT. We just took a transformer, and we applied it to pixels, and it worked super well; it could generate images in very complicated and subtle ways. With DALL-E 1, same thing again.
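
To make the autoregressive-pixels idea concrete, here is a toy stand-in: raster-scan an image into a sequence and predict each pixel from the one before it. A bigram count model substitutes for the transformer; iGPT itself used a large transformer over quantized pixel sequences.

```python
# Toy illustration of the iGPT idea: treat an image as a 1-D sequence of
# pixel values and model it autoregressively, exactly like next-word
# prediction. Values and sizes are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 4, size=(8, 8))   # toy 8x8 image with 4 gray levels
seq = image.flatten()                     # raster-scan into a 1-D sequence

counts = np.ones((4, 4))                  # Laplace-smoothed next-pixel counts
for prev, nxt in zip(seq[:-1], seq[1:]):
    counts[prev, nxt] += 1
probs = counts / counts.sum(axis=1, keepdims=True)

out = list(seq[:8])                       # condition on the first row
for _ in range(56):                       # sample the remaining 56 pixels
    out.append(rng.choice(4, p=probs[out[-1]]))
print(np.array(out).reshape(8, 8))
```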

So, as for the part where I thought that the paper made a strong claim that current approaches can't deal with predicting high-dimensional distributions - I think they definitely can.

CRAIG: On this idea of having an army of human trainers that are working with ChatGPT or a large language model to guide it in effect with reinforcement learning, just intuitively, that doesn't sound like an efficient way of teaching a model about the underlying reality of its language.

ILYA: I don't agree with the phrasing of the question. I claim that our pre-trained models already know everything they need to know about the underlying reality. They already have this knowledge of language and also a great deal of knowledge about the processes that exist in the world that produce this language.

The thing that large generative models learn about their data — and in this case, large language models — are compressed representations of the real-world processes that produced this data, which means not only people and something about their thoughts, something about their feelings, but also something about the condition that people are in and the interactions that exist between them.

The different situations a person can be in. All of these are part of that compressed process that is represented by the neural net to produce the text. The better the language model, the better the generative model, the higher the fidelity, the better it captures this process.
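
The link between better prediction and compressed representation can be made tangible: a model that assigns the data higher probability needs fewer bits to encode it. A minimal sketch, with a character bigram model standing in for a language model and all numbers purely illustrative:

```python
# Prediction as compression: the better a model predicts each next
# character, the fewer bits it needs to encode the text. A character
# bigram model here stands in for a language model.
import math
from collections import Counter

text = "the better the language model the better it captures the process"
pairs = Counter(zip(text, text[1:]))   # next-character counts per context
ctx = Counter(text[:-1])               # context counts

bigram_bits = -sum(math.log2(pairs[(a, b)] / ctx[a])
                   for a, b in zip(text, text[1:]))
uniform_bits = (len(text) - 1) * math.log2(len(set(text)))
print(f"bigram model: {bigram_bits:.0f} bits, uniform: {uniform_bits:.0f} bits")
```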

Now, the army of teachers, as you phrase it, indeed, those teachers are also using AI assistance. Those teachers aren't on their own. They're working with our tools and the tools are doing the majority of the work. But you do need to have oversight; you need to have people reviewing the behavior because you want to eventually achieve a very high level of reliability.

There is indeed a lot of motivation to make it as efficient and as precise as possible so that the resulting language model will be as well behaved as possible.

ILYA: So yeah, there are these human teachers who are teaching the model desired behavior. And the degree to which they use AI systems is constantly increasing, so their own efficiency keeps increasing.

It's not unlike an education process: teaching it how to act well in the world.

We need to do additional training to make sure that the model knows that hallucination is not okay ever. And it's that reinforcement learning human teacher loop or some other variant that will teach it.

Something here should work. And we will find out pretty soon.

CRAIG: Where is this going? What research are you focused on right now?

ILYA: I can't talk in detail about the specific research that I'm working on, but I can mention some of the research in broad strokes. I'm very interested in making those models more reliable, more controllable, making them learn faster from less data, with fewer instructions. Making them so that indeed they don't hallucinate.

CRAIG: I heard you make a comment that we need faster processors to be able to scale further. It appears that there's no end in sight to the scaling of models, but the power required to train these models is reaching the limit, at least the socially accepted limit.

ILYA: I don't remember the exact comment that I made that you're referring to, but you always want faster processors. Of course, power keeps going up. Generally speaking, the cost is going up.

And the question that I would ask is not whether the cost is large, but whether the thing that we get out of paying this cost outweighs the cost. Maybe you pay all this cost, and you get nothing, then yeah, that's not worth it.

But if you get something very useful, something very valuable, something that can solve a lot of problems that we have, which we really want solved, then the cost can be justified.

CRAIG: You did talk at one point, I saw, about democracy and about the impact that AI can have on democracy.

People have talked to me about a day when conflicts that seem unresolvable could be settled: if you had enough data and a large enough model, you could train the model on the data, and it could come up with an optimal solution that would satisfy everybody.

Do you think about where this might lead in terms of helping humans manage society?

ILYA: It's such a big question because it's a much more future-looking question. I think that there are still many ways in which our models will become far more capable than they are right now.

It's unpredictable exactly how governments will use this technology as a source of advice of various kinds.

I think that to the question of democracy, one thing which I think could happen in the future is that because you have these neural nets and they're going to be so pervasive and they're going to be so impactful in society, we will find that it is desirable to have some kind of a democratic process where, let's say the citizens of a country provide some information to the neural net about how they'd like things to be. I could imagine that happening.

That can be a very high bandwidth form of democracy perhaps, where you get a lot more information out of each citizen and you aggregate it, specify how exactly we want such systems to act. Now it opens a whole lot of questions, but that's one thing that could happen in the future.

But what does it mean to analyze all the variables? Eventually there will be a choice you need to make where you say, these variables seem really important. I want to go deep. Because I can read a hundred books, or I can read a book very slowly and carefully and get more out of it. So, there will be some element of that. Also, I think it's probably fundamentally impossible to understand everything in some sense. Let's take some easier examples.

Anytime there is any kind of complicated situation in society, even in a company, even in a mid-size company, it's already beyond the comprehension of any single individual. And I think that if we build our AI systems the right way, I think AI could be incredibly helpful in pretty much any situation.
