Mechanize builds sophisticated reinforcement learning environments to simulate realistic software engineering tasks (feature development, debugging, refactoring, reliability testing) for frontier AI labs. Our mission is to automate software engineering first, then all economically valuable work. We're growing quickly, working with leading AI labs, and backed by investors like Nat Friedman, Daniel Gross, Patrick Collison, and Jeff Dean. Featured in NYT and TechCrunch.
TLDR: the H100, lower precision, and other advances lead to a big jump in computational performance. We're in for a wild ride when the next generation of models is trained on 100x more compute in 2024 and 2025.
The paper's result on recent compute trends differs from the trend described by OpenAI. In particular, OpenAI finds a 3.5-month doubling time over the Deep Learning Era, whereas the paper finds a 6-month doubling time.
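To make the gap between the two estimates concrete, here is a minimal sketch that converts a doubling time into a growth factor. It assumes clean exponential growth at a fixed doubling time, which is a simplification; the function names are hypothetical and come from neither source.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Factor by which compute grows over `months` at a fixed doubling time."""
    return 2 ** (months / doubling_time_months)


def annual_growth_factor(doubling_time_months: float) -> float:
    """Growth factor over 12 months at a fixed doubling time."""
    return growth_factor(12, doubling_time_months)


# Compare the two doubling times quoted above over one year and over five years.
for doubling_time in (3.5, 6.0):
    per_year = annual_growth_factor(doubling_time)
    per_five_years = growth_factor(60, doubling_time)
    print(f"{doubling_time}-month doubling: "
          f"{per_year:.1f}x per year, {per_five_years:,.0f}x over five years")
```

Under these assumptions a 6-month doubling time is 4x per year, while a 3.5-month doubling time is roughly 10.8x per year, so the two estimates diverge by orders of magnitude within a few years.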
I think the Large-Scale Era does point to a new phenomenon that emerged pretty discontinuously, which is that there are now 'two lanes' in ML scaling. Prior to 2015, academia and industry trained models of roughly similar compute intensity. Since then, a small number of industry players have frequently trained models with 10-100x more compute than the typical researcher uses.
The thing is that the advent of deep learning was a very big change, in the sense that a general-purpose method appeared that let you throw computing power at many or most problems (with a bit of tuning, but still) and get results that you previously couldn't get (and when you did get results, you needed domain experts). No doubt there are changes within the trajectory of these escalating brute-force solutions. But relative changes within this paradigm seem fundamentally different from the initial advent of the paradigm.