Last week we released NanoGPT Slowrun, an open repo for data-efficient learning algorithms. The rules are simple: train on 100M tokens from FineWeb, use as much compute as you want, lowest validation loss wins. Improvements are submitted as PRs to the repo and merged if they lower val loss. The constraint is the inverse of speedruns like modded-nanogpt, which optimize wall-clock time. Those benchmarks have been hugely productive, but optimizing for speed filters out expensive ideas: heavy regularization, second-order optimizers, gradient descent alternatives. Slowrun is built for exactly those ideas.
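To make the setup concrete, here is a minimal sketch of what a Slowrun-style loop looks like: a fixed token budget revisited for many epochs, with the kind of heavy regularization (high dropout, large weight decay) that a wall-clock speedrun would filter out. Everything below is illustrative; the toy model, placeholder random data, and all hyperparameters are assumptions, not the repo's actual code.

```python
# Hypothetical sketch of the Slowrun constraint: the token budget is fixed,
# compute is not, so we revisit the same tokens many times and lean on
# regularization. Placeholder data and a toy model stand in for FineWeb/GPT.
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, SEQ_LEN, TOKEN_BUDGET = 256, 64, 100_000  # stand-in for 100M FineWeb tokens
train_tokens = torch.randint(VOCAB, (TOKEN_BUDGET,))       # placeholder data
val_tokens = torch.randint(VOCAB, (TOKEN_BUDGET // 10,))   # held-out placeholder

def batches(tokens, batch_size=32):
    """Yield (input, next-token target) pairs from a flat token stream."""
    n = (len(tokens) - 1) // SEQ_LEN
    for i in range(0, n - batch_size + 1, batch_size):
        idx = torch.arange(i, i + batch_size)[:, None] * SEQ_LEN + torch.arange(SEQ_LEN)
        yield tokens[idx], tokens[idx + 1]

model = nn.Sequential(                      # tiny stand-in for a GPT
    nn.Embedding(VOCAB, 128),
    nn.Dropout(0.3),                        # heavy dropout: cheap on data, costly on speed
    nn.Linear(128, VOCAB),
)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.5)  # large decay
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):                     # compute is unconstrained: many passes
    model.train()
    for x, y in batches(train_tokens):
        loss = loss_fn(model(x).flatten(0, 1), y.flatten())
        opt.zero_grad()
        loss.backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        val = sum(loss_fn(model(x).flatten(0, 1), y.flatten()).item()
                  for x, y in batches(val_tokens))
    print(f"epoch {epoch}: summed val loss {val:.3f}")  # lowest val loss wins
```

The point of the sketch is the shape of the problem, not the numbers: with the data frozen, the only levers left are the ones speedruns can't afford to pull.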