A growing body of research attempts to put a number on energy use and AI—even as the companies behind the most popular models keep their carbon emissions a secret.
A problem is that the information is not in the hands of the company selling the AI. The actual hardware is often owned by service providers and independent data centers.
They know exactly what the power consumption of that hardware is, though. This isn’t tough to figure out just because you use a cloud provider.
Well, I work at an AI hyperscaler. I can tell you how much my facility uses, and how much each rack uses, but I don’t have any way to determine what the customer is doing on that server, or even which servers a given customer is using. Is it being used heavily for queries? How many? Of what kind? We don’t know. We only know what the rack/row/pod/hall is consuming.
Also, does the network gear overhead count? How do you apportion that?
We have no visibility into the customer workload. Some of our customers use our systems for scientific research: drug research, etc. How do you tally that?
I’m not saying that it is impossible, just that if the customer won’t pay for that report, we’re not going to spend money to build the systems to produce it.
Do I agree? No. But I’m just a grunt.
I’m sure they can do the simple math: we pay for x power, we have y customers. x / y would be a rough but probably pretty accurate number if we are talking tens of thousands to millions of customers.
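For what it's worth, the x / y arithmetic is about this much code. This is only a sketch: the function name and the figures in the usage comment are made up for illustration, and an even split ignores the point raised above that customers run very different workloads.

```python
def average_kw_per_customer(total_facility_kw: float, customer_count: int) -> float:
    """Even split of facility power across customers (the x / y estimate).

    Hypothetical helper: it assumes every customer draws the same share,
    which is exactly the simplification the comment describes.
    """
    if customer_count <= 0:
        raise ValueError("customer_count must be positive")
    return total_facility_kw / customer_count


# Illustrative numbers only, not real facility data:
# average_kw_per_customer(1000.0, 50_000) gives the mean draw per customer in kW.
```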
You can produce a remarkably good estimate by looking at CPU and GPU utilization out of procfs and profiling the power use of a handful of similar machines under similar utilization and workloads.
Network is less than 5% of power use for non-GPU loads; probably less for GPU.
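A rough sketch of that utilization approach, under some loud assumptions: CPU utilization comes from two samples of the aggregate `cpu` line in `/proc/stat`, power is interpolated linearly between an idle and a peak wattage that you would have to measure on profiled machines, and network gear is folded in as a fixed share of the total. The function names, the linear model, and the default 5% share are illustrative choices, not anyone's production method. GPU utilization is left out here and would need its own data source.

```python
def parse_cpu_times(proc_stat_text: str) -> tuple[int, int]:
    """Return (busy_jiffies, total_jiffies) from the aggregate 'cpu' line."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Fields: user nice system idle iowait irq softirq steal.
            # Guest time is already counted inside user, so it is skipped.
            values = [int(v) for v in fields[1:9]]
            values += [0] * (8 - len(values))  # older kernels report fewer fields
            idle = values[3] + values[4]       # idle + iowait
            total = sum(values)
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_utilization(before: str, after: str) -> float:
    """Fraction of CPU time spent busy between two /proc/stat snapshots."""
    busy0, total0 = parse_cpu_times(before)
    busy1, total1 = parse_cpu_times(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("snapshots must be taken in order, some time apart")
    return (busy1 - busy0) / elapsed


def estimate_watts(utilization: float, idle_watts: float, peak_watts: float,
                   network_share: float = 0.05) -> float:
    """Linear power model plus a fixed network share of the total.

    idle_watts and peak_watts are assumed to come from profiling similar
    machines; network_share is the assumed fraction of total power used
    by network gear.
    """
    if not 0.0 <= network_share < 1.0:
        raise ValueError("network_share must be in [0, 1)")
    utilization = min(max(utilization, 0.0), 1.0)
    server_watts = idle_watts + utilization * (peak_watts - idle_watts)
    return server_watts / (1.0 - network_share)


# Usage on a Linux host (not run here): read /proc/stat, wait a few seconds,
# read it again, then pass both strings to cpu_utilization() and feed the
# result to estimate_watts() with your own measured idle and peak wattage.
```

The linear model is the simplest thing that could work; profiling at several utilization levels and fitting a curve would likely track real hardware more closely.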