this post was submitted on 19 Jun 2025
305 points (99.0% liked)

[–] [email protected] 7 points 1 week ago (1 children)

A problem is that the information is not in the hands of the company selling the AI. The actual hardware is often owned by service providers and independent data centers.

[–] [email protected] 8 points 1 week ago (1 children)

They know exactly what the power consumption of that hardware is, though. This isn't tough to figure out just because you use a cloud provider.

[–] [email protected] 4 points 1 week ago (2 children)

Well, I work at an AI hyperscaler. I can tell you how much my facility uses, and how much each rack uses, but don't have any way to determine what the customer is doing on that server. Or even which servers a given customer is using. Is it being used heavily for queries? How many? Of what kind? We don't know. Only what the rack/row/pod/hall is consuming.

Also, does the network gear overhead count? How do you apportion that?

We have no visibility into the customer workload. Some of our customers use our systems for scientific research. Drugs, etc. How do you tally that?

I'm not saying that it is impossible, just that if the customer won't pay for that report, we're not going to spend money to build the systems to produce it.

Do I agree? No. But I'm just a grunt.

[–] [email protected] 2 points 1 week ago (1 children)

You can produce a remarkably good estimate by reading CPU and GPU utilization out of procfs and profiling the power use of a handful of similar machines under similar utilization and workloads.

Network is less than 5% of power use for non-GPU loads; probably less for GPU.
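
For anyone who wants to see the shape of that estimate, here is a rough sketch of the CPU half only. It assumes a simple linear model between idle and peak power, which is my simplification and not something measured here; the wattage figures would have to come from profiling real machines. The function names are made up for illustration. GPU utilization would need a separate source, such as the vendor's tooling.

```python
def parse_cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal guest guest_nice
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Guest time is already counted inside user/nice, so sum only the first 8.
            total = sum(values[:8])
            return total - idle, total
    raise ValueError("no aggregate cpu line found")


def utilization(before_text, after_text):
    """Fraction of CPU time spent busy between two /proc/stat snapshots."""
    busy0, total0 = parse_cpu_times(before_text)
    busy1, total1 = parse_cpu_times(after_text)
    elapsed = total1 - total0
    return (busy1 - busy0) / elapsed if elapsed > 0 else 0.0


def estimate_watts(util, idle_watts, peak_watts):
    """Assumed linear model: idle draw plus a share of the idle-to-peak range."""
    return idle_watts + (peak_watts - idle_watts) * util


def read_proc_stat(path="/proc/stat"):
    """Take one snapshot on a Linux host; call twice with a sleep in between."""
    with open(path) as handle:
        return handle.read()
```

On a live Linux box you would call `read_proc_stat()` twice a few seconds apart, pass both snapshots to `utilization()`, and feed the result to `estimate_watts()` with idle and peak figures profiled from a comparable machine.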

[–] [email protected] 2 points 1 week ago (1 children)

Sure, you can do that at an aggregate level, but then how do you divide it by customer? And even then, some setups will be more efficient than others, so you'd only get that setup's usage.

And even if you do that and can narrow it down to a single user and a single prompt, you can still only roughly predict how long it will think and how long the response will be.

[–] [email protected] 1 points 1 week ago

By customer is easy: they're each renting specific resources. A fractional cloud instance (excepting the small burstable ones) is tied to specific CPUs and GPUs, and records of who rented which one, and when, are already being kept.

You might not be able to break out specific individual queries, but computing averages is completely straightforward.
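
A minimal sketch of that join, assuming hourly energy readings per resource and rental records with start and end hours. Both record layouts and all names here are hypothetical, not any provider's actual schema.

```python
from collections import defaultdict


def energy_by_customer(rentals, power_samples):
    """Attribute each hourly energy reading to whoever rented the resource then.

    rentals: list of (customer, resource, start_hour, end_hour), end exclusive.
    power_samples: list of (resource, hour, watt_hours).
    Returns {customer: watt_hours}; energy with no matching rental is keyed by None.
    """
    totals = defaultdict(float)
    for resource, hour, watt_hours in power_samples:
        owner = None
        for customer, rented, start, end in rentals:
            if rented == resource and start <= hour < end:
                owner = customer
                break
        totals[owner] += watt_hours
    return dict(totals)


def average_watts(rentals, power_samples):
    """Average draw per rented resource-hour for each customer."""
    energy = energy_by_customer(rentals, power_samples)
    hours = defaultdict(int)
    for customer, _resource, start, end in rentals:
        hours[customer] += end - start
    return {c: energy.get(c, 0.0) / h for c, h in hours.items() if h > 0}
```

The `None` bucket holds idle or unrented time, which someone still has to decide how to apportion, along with shared overhead like cooling and networking.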

[–] [email protected] 2 points 1 week ago

I'm sure they can do the simple math: we pay for x power, we have y customers. x / y would be a rough but probably pretty accurate number if we are talking tens of thousands to millions of customers.