Since 2010, the amount of training compute used by the largest models has doubled every 6 months. Model size and the amount of training data have grown roughly in proportion to compute. - from Sevilla et al.
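
To make the doubling rate concrete, here is a minimal sketch that assumes clean exponential growth at a constant 6-month doubling time. The function name `growth_factor` and its parameters are illustrative and do not come from Sevilla et al.

```python
def growth_factor(years: float, doubling_time_years: float = 0.5) -> float:
    """Multiplicative growth after `years`, assuming a fixed doubling time.

    Assumption: growth is exactly exponential, which real trends only approximate.
    """
    return 2.0 ** (years / doubling_time_years)


# A 6-month doubling time is two doublings per year, i.e. 4x per year.
per_year = growth_factor(1)

# Ten years of that is 20 doublings.
per_decade = growth_factor(10)
```

Under this idealized assumption, compute grows 4x per year and about a million-fold (2^20) per decade.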