Google this week pushed back against claims by earlier research that large AI models can contribute significantly to carbon emissions. In a paper coauthored by Google AI chief scientist Jeff Dean, researchers at the company say that the choice of model, datacenter, and processor can reduce carbon footprint by up to 100 times and that “misunderstandings” about the model lifecycle contributed to “miscalculations” in impact estimates.
Carbon dioxide, methane, and nitrous oxide levels are at their highest in the last 800,000 years. Together with other drivers, these greenhouse gases likely catalyzed the global warming observed since the mid-20th century. Machine learning models are widely believed to contribute to this trend because they require substantial computational resources and energy: models are routinely trained for thousands of hours on specialized hardware accelerators in datacenters estimated to use 200 terawatt-hours per year. For comparison, the average U.S. home consumes about 10,000 kilowatt-hours per year, roughly one twenty-millionth of that total.
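The arithmetic behind that comparison can be checked directly; the two figures below are the ones cited in the text, not independent measurements.

```python
# Compare estimated annual datacenter electricity use (200 TWh) with the
# average U.S. home's annual consumption (~10,000 kWh), both as cited above.
datacenter_twh_per_year = 200
home_kwh_per_year = 10_000

datacenter_kwh = datacenter_twh_per_year * 1e9  # 1 TWh = 1e9 kWh
ratio = datacenter_kwh / home_kwh_per_year
print(f"{ratio:.0e}")  # 2e+07: roughly 20 million homes' worth of electricity
```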
This latest Google-led research, which was conducted with University of California, Berkeley researchers and focuses on natural language model training, defines the footprint of a model as a function of several variables. They include the choice of algorithm, the program that implements it, the number of processors that run the program, the speed and power of those processors, a datacenter’s efficiency in delivering power and cooling the processors, and the energy supply mix — for example, renewable, gas, or coal.
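The variables above combine multiplicatively, which is why the paper's claimed reductions compound. The sketch below illustrates that structure; the function name and example values are illustrative assumptions, not figures from the Google paper.

```python
# Hedged sketch of the footprint-as-a-function-of-variables framing:
# energy scales with training time, processor count, and per-processor
# power, then with datacenter overhead (PUE) and grid carbon intensity.

def training_co2e_kg(train_hours, num_processors, avg_power_watts,
                     pue, grid_kg_co2e_per_kwh):
    """Estimate training emissions in kg CO2e.

    energy (kWh) = hours * processors * average power per processor (kW),
    scaled by power usage effectiveness (PUE), then multiplied by the
    grid's carbon intensity (kg CO2e per kWh).
    """
    energy_kwh = train_hours * num_processors * (avg_power_watts / 1000.0)
    return energy_kwh * pue * grid_kg_co2e_per_kwh

# Illustrative comparison: the same workload in an efficient, low-carbon
# datacenter vs. an inefficient, coal-heavy one (all values assumed).
clean = training_co2e_kg(100, 64, 300, pue=1.1, grid_kg_co2e_per_kwh=0.08)
dirty = training_co2e_kg(100, 64, 300, pue=1.7, grid_kg_co2e_per_kwh=0.9)
print(round(dirty / clean, 1))  # 17.4: an order-of-magnitude spread from siting alone
```

Because each factor multiplies the others, independent improvements in hardware, datacenter efficiency, and energy mix can combine into the large overall reductions the paper claims.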
The coauthors argue that Google engineers often improve the quality of existing models rather than starting from scratch, which minimizes the environmental impact of training. For example, the paper suggests that Google's Evolved Transformer model, an improvement upon the Transformer, uses 1.6 times fewer floating-point operations (FLOPs) and takes 1.1 to 1.3 times less time to train. Another improvement, sparse activation, leads to 55 times less energy usage and reduces net carbon emissions by around 130 times compared with "dense" alternatives, according to the researchers.
The paper also makes the claim that Google’s custom AI processors, called tensor processing units (TPUs), enable energy savings in the cloud far greater than previous research has acknowledged. The average cloud datacenter is roughly twice as energy efficient as an enterprise datacenter, the coauthors posit, pointing to a recent paper in Science that found that global datacenter energy consumption increased by only 6% compared with 2010, despite computing capacity increasing by 550% over the same time period.
“Reviewers of early [research] suggested that … any tasks run in a green datacenter simply shift other work to dirtier datacenters, so there is no net gain,” the coauthors wrote. “It’s not true, but that speculation reveals many seemingly plausible but incorrect fallacies: datacenters are fully utilized, cloud centers can’t grow, renewable energy is fixed and can’t grow, Google … model training competes with other tasks in the datacenter, training must run in all datacenters, [and] there is no business reason to reduce carbon emissions.”
The coauthors evaluated the energy usage and carbon emissions of five recent large natural language processing models, using their own formulas for the calculations, and reported per-model estimates.
The thoroughness of the paper doesn't erase the conflict between Google's commercial interests and the viewpoints expressed in third-party research. Many of the models the company develops power customer-facing products, including Cloud Translation API and Natural Language API. Revenue from Google Cloud, Google's cloud division that includes its managed AI services, jumped nearly 46% year-over-year in Q1 2021 to $4.04 billion.
While the Google-led research disputes the severity of the trend, at least one study shows that the amount of compute used to train the largest models for natural language processing and other applications increased 300,000 times in 6 years, far outpacing Moore's law. The coauthors of a recent MIT study say this suggests that deep learning is approaching its computational limits. "We do not anticipate [meeting] the computational requirements implied by the targets … The hardware, environmental, and monetary costs would be prohibitive," the MIT coauthors said.
It's been established that impoverished groups are more likely to experience significant environment-related health issues, with one study out of Yale finding that low-income communities and those composed predominantly of minorities experienced higher exposure to air pollution than nearby white neighborhoods. A more recent study from the University of Illinois at Urbana-Champaign shows that Black Americans are subjected to more pollution from every source, including industry, agriculture, all manner of vehicles, construction, residential sources, and even emissions from restaurants.
Research coauthored by former Google AI ethicist Timnit Gebru notes that while some of the energy supplying datacenters comes from renewable or carbon credit-offset sources, the majority does not, and many energy grids around the world aren't carbon neutral. Moreover, renewable energy sources still carry environmental costs, Gebru and coauthors note, and datacenters with growing computation requirements divert green energy from other potential uses.
“When we perform a risk/benefit analyses of language technology, we must keep in mind how the risks and benefits are distributed, because they do not accrue to the same people,” Gebru and coauthors wrote. “Is it fair or just to ask, for example, that the residents of the Maldives (likely to be underwater by 2100) or the 800,000 people in Sudan affected by drastic floods pay the environmental price of training and deploying ever-larger English language models, when similar large-scale models aren’t being produced for Dhivehi or Sudanese Arabic?”
“When developing a new model, much of the research process involves training many model variants on a training set and performing inference on a small development set. In such a setting, more efficient training procedures can lead to greater savings,” scientists at the Allen Institute for AI, Carnegie Mellon University, and the University of Washington wrote in a recent paper. “[Increasing] the prevalence of ‘green AI’ [can be accomplished] by highlighting its benefits [and] advocating a standard measure of efficiency.”
© 2021 LeackStat.com