My infant year as an AI researcher — Moving from physics to AI
Shortly after I left my Berkeley postdoc and joined Anthropic, I planned to write a short article, mostly as a note to myself, about my thought process behind leaving physics and joining AI research.
Yet I never found the time to write it down because of the intense work at Anthropic :) That changed last Friday (Sept. 19), when I resigned from Anthropic and got a week’s break before joining Google DeepMind.
I left physics mostly because I wanted to find a direction that offers more opportunities for young people. Theoretical physics is an amazing field for training: it is intellectually challenging and deep, and it requires techniques from a wide variety of fields, including math, computer science (e.g. complexity theory) and, of course, physics itself. Yet the field has been running out of experiments for many years. A field without experiments can be problematic in many ways: for example, it is hard to judge the importance of a theoretical work objectively, and it is hard to resolve disagreements or confusion through systematic experiments.
Then it mainly came down to AI or QC (quantum computing). Although I believe QC will become important in the future, my impression is that the bottleneck right now is mainly experimental platforms. Thus I chose AI, which is interestingly similar to physics research in the following way:
In some sense, it is similar to research on thermodynamics during the 17th century. Back then, people didn’t even know what heat was: in fact, people still believed in phlogiston theory. But this did not stop people from experimenting scientifically. For example, Boyle’s law gives the relationship between pressure and volume when temperature is fixed. Thus, by designing experiments systematically, people still learnt enough ‘laws’ to guide the invention and study of the heat engine, which changed the world.
From my naive point of view, the situation is similar for large-scale AI models. On one hand, we still don’t have a reliable theory or model describing the behavior of large neural networks. On the other hand, systematic research has started to teach us lots of valuable lessons, e.g. scaling laws. (And such systematic research is becoming an essential element of making steady progress at large scale.)
Even though I left Anthropic, I still view Ant as (one of) the best places for physicists (and maybe PhDs from other STEM backgrounds) to start their journey in AI research. I joined Anthropic on Oct. 1, 2024, when we were starting research for what was later called Claude 3.7 Sonnet. After being a physicist for many years, it was so exciting to see my research have an immediate impact on frontier model capability, and to witness how people’s way of interacting with AI changes as new capabilities emerge.
Yet I decided to leave for two main reasons:
~40% of the reason: I strongly disagree with the anti-China statements Anthropic has made, especially the recent public announcement in which China was called an “adversarial nation”. To be clear, I believe most people at Anthropic would disagree with such a statement, yet I don’t think there is a way for me to stay.
The remaining 60% is more complicated. Most of it involves internal Anthropic information, so I can’t share it.
Time to move on!
Relative to physics, AI moves insanely fast, and looking back I am surprised by how much has happened in the past year. It was a great honor to see Claude get better from 3.7 to 4.5, and I personally learnt a lot. Yet it is time to move on.
From a personal perspective, Anthropic was my first and only AI job, and I don’t want my experience and knowledge to be biased by a single lab. (Especially because core research teams no longer write papers nowadays.)
So Ant, it was good with you, but it is better without you :)