
Dive into the electrifying world of AI-driven data centers with Eaton’s VP and chief architect, Joshua Buzzell, and power quality and multicircuit meters product manager, Karen Cheung. Discover what’s fueling the explosive growth of digital infrastructure, why “load bursting” is shaking up the grid, and how cutting-edge metering technology is helping data centers become better grid citizens. From sub-synchronous oscillations to edge-based detection, this episode is packed with insights that power innovation. Tune in to learn how data centers go from grid to chip – making them smarter, faster, and more resilient than ever.

Question 1: What’s driving the growth of data centers? 

Question 2: How have data centers changed over the last 1-2 years?

Question 3: What are the challenges that AI data centers face?

Question 4: How do you balance the power needed for the grid at AI data centers?

Question 5: What is load bursting in the context of AI data centers, and why is it significant?

Question 6: How can data centers address this risk?

Question 7: How does having meters help?

Question 8: Why does edge-based processing improve the detection and management of sub-synchronous events compared to other methods?

Question 9: What happens next? What are some target solutions?

Question 10: Where do we start with all of this? How does somebody implement this?

  • JP Buzzell

JP (Joshua) is the data center chief architect at Eaton. In this role, he works closely with Cyrille Brisson, Eaton’s data center segment leader, to influence and support the company’s global segment strategy. This includes overseeing the development of next generation reference designs and assisting customers with full data center design and deployment across all physical layer dimensions: power, cooling, network and software. He also works closely with Eaton’s product and functional support teams, ensuring that the right processes and offers are available globally to serve evolving customer needs, such as supporting and scaling AI growth, and developing carbon reduction strategies.

JP is one of the foremost thought leaders in the data center industry, especially on the topic of liquid cooling. His passion brought him to Eaton to develop liquid-cooling technology and continue to push the limits of what is possible, enabling the growth and economics of quantum, AI and data centers at scale.

JP’s data center experience began at Facebook (now Meta), where he focused on data center construction, commissioning and operations. He later joined Oracle Cloud Infrastructure (OCI) as a regional director of operations for data centers. While at OCI, he led whitespace design globally and was responsible for developing air-cooled and liquid-cooled standard designs. He also orchestrated the coordination and control of electrical, mechanical and network technologies to enable clusters to be scaled from a few hundred GPUs to hundreds of thousands.

JP spent the first part of his career in nuclear power. His experience spans hands-on technical maintenance work (including electrical, mechanical, controls and quality assurance) to supervising nuclear reactor operations. JP holds a Bachelor of Science in nuclear energy engineering technology (ABET) from Thomas Edison State University and a Master of Science in technology management from Georgetown University. He is based in Virginia, United States.

  • Karen Cheung    

Karen is a product manager for advanced meters at Eaton, with a robust background in new product development and introduction.  Karen has guided the development and launch of multiple new products, including Eaton’s flagship meter platform PXQ.

With over a decade of industry experience in metering and communications, Karen has developed a deep understanding of power quality monitoring and root cause analysis for critical power customers.  Her technical expertise and commitment to innovation establish her as a subject matter expert and thought leader in metering technology.

Karen holds a Bachelor of Science in Electrical Engineering from the University of Pittsburgh and an MBA from West Texas A&M University. 

Data center growth drivers    

Karen: Hey, JP, you ready to go this morning?

JP: Always. It's a fantastic Friday today.

Karen: Absolutely. All right. So let's start at the high level for our listeners tuning in. We've seen explosive growth in data centers recently, especially with the growing demand for AI. What can you tell us about the drivers behind this growth?

JP: Yeah. Other than the fact that we're running out of power, there are a lot of factors increasing that demand. First and foremost, everybody loves their cat videos online, but it's more than that. Globally, the demand for AI is growing, and AI is creating a significant base for more and more companies to get into the cloud to help enable their businesses.

And that is an economic driver. And although the United States is the largest market, the rest of the world has a much larger population, right? Interesting fact: India has the highest ChatGPT utilization, but only 4% market penetration. So even though we've seen this growth in the United States, we expect to see it outside the US as well.

Karen: That's incredible. So with all of this, do you feel like that's changed modern data center design and architecture?

JP: Well, I think it's changed a little bit. It's more of the same for a certain portion of the compute: you have the standard compute that's going to continue growing at a rapid rate. But there's a new thing called high-performance compute, which is this training algorithm that everybody's been hearing about, and that has been increasing the size of the cluster.

JP: A large cluster, maybe four or five years ago, might have been 1 to 5 MW, but now a large cluster is considered up to 500 MW. That's half a gigawatt. And we're getting up to 1.21 GW for the Avalon cluster.

Karen: 1.21, wow, that's something else. So with all of this change, the shift to high-compute clusters, what specific challenges are data centers facing today?

JP: Well, other than the fact that we need to find a flux capacitor, which is out of stock right now, we need to figure out how to build them fast enough. Typically it's a much more demanding cycle, and for that we have to have speed. Speed is the currency in the market, right? So say you delay a cluster that's only 20 MW in size, which is about 10,000 GPUs.

JP: If it's an H100 cluster, that can cost up to $3.4 million in lost revenue if you delay it a week. So speed really is the currency in the market. And power density is the other piece of it: instead of having maybe a 3 kW rack or a 17 kW rack, that rack is now 120 kW, with the potential to go all the way up to a megawatt a rack.

JP: So it's really higher density and the speed to get that cluster up.
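The delay figures quoted above can be turned into per-GPU numbers with a little arithmetic. The sketch below reads the cluster size as 20 megawatts, takes the 10,000 GPU count and the $3.4 million weekly figure as stated, and assumes an even split across GPUs; the variable names are illustrative only.

```python
# Figures as quoted in the conversation (cluster size read as 20 MW).
CLUSTER_POWER_MW = 20
GPU_COUNT = 10_000
LOST_REVENUE_PER_WEEK_USD = 3_400_000
HOURS_PER_WEEK = 7 * 24

# Facility power attributable to each GPU, assuming an even split.
power_per_gpu_kw = CLUSTER_POWER_MW * 1000 / GPU_COUNT

# Lost revenue per GPU for each week, and each hour, of delay.
lost_per_gpu_week_usd = LOST_REVENUE_PER_WEEK_USD / GPU_COUNT
lost_per_gpu_hour_usd = lost_per_gpu_week_usd / HOURS_PER_WEEK

print(f"Power per GPU:          {power_per_gpu_kw:.1f} kW")
print(f"Lost revenue per GPU:   ${lost_per_gpu_week_usd:.0f} per week")
print(f"Lost revenue per GPU-h: ${lost_per_gpu_hour_usd:.2f}")
```

On these assumptions, each GPU accounts for roughly 2 kW of facility power and about two dollars of revenue per hour, which is why a week of delay adds up so quickly.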

Karen: So all of this power density is changing how you actually architect and construct these data centers, yeah?

JP: Yeah, it's changing quite a bit. If you're talking about grid balance and looking into that, traditionally data centers have been where the eyeballs are, meaning the person or the group of people. I used to hear the phrase, "we're just going to build in an NFL city." Large metropolitan areas have the eyeballs, and that's based on latency.

JP: The next piece is where the data gravity is. That data gravity then plays into the network connection and then finally the grid connection. And we're struggling right now with that grid connection because we're running out of transmission capacity. The point of use and the point of generation have been separated for the longest time, using transmission.

JP: But now it takes 14 years to build a transmission line. That's a big concern. In addition to that, these data centers have gone from, like I mentioned before, 5 MW a cluster to 500 MW a cluster. That's the size of a small city spinning up and down all at once. So things like EnergyAware, and being able to take care of this load bursting, are a significant play.

Karen: Okay, so that's key. But oh, we should explain what load bursting means. Do you want to take that and tell our listeners what load bursting is and why it's significant?

JP: Yeah, so load bursting. Think about a cluster, even a 100-megawatt cluster. If you think back 20 years, or even five years ago, a mega cluster like that was 150 MW, but it was all different loads. It's like you had 150 people inside a concert hall, all playing a different song. But now we're all playing the same song at the same time.

JP: So we're ramping up and ramping down at the same time. You might be at 10%, or ten megawatts, and then go all the way up to 150 MW. Imagine the entire city of Richmond all got home at the same time and turned on their dishwashers at the same time, and you got a burst in power and a burst in water demand.

JP: Right? So typically the infrastructure that's been designed has been allowed to have disparate loads. We need to start thinking about how we can be good citizens by applying certain new technologies to combat that. One of the items is how to measure this, and I think you have some work on that. You want to talk through that?
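The concert hall analogy can be illustrated with a toy simulation. This is only a sketch under invented assumptions: 150 loads of 1 MW each, idling at 10% and cycling as a square wave, with either random or identical timing. None of these parameters come from the conversation beyond the 150 MW total and the 10% idle level.

```python
import random

def aggregate_load(phases, period=20, low_mw=0.1, high_mw=1.0, steps=200):
    """Total draw per time step for loads that cycle between low and high."""
    totals = []
    for t in range(steps):
        total = 0.0
        for phase in phases:
            # Each load is busy for the first half of its own cycle.
            busy = ((t + phase) % period) < period // 2
            total += high_mw if busy else low_mw
        totals.append(total)
    return totals

def swing(totals):
    """Difference between the highest and lowest aggregate draw."""
    return max(totals) - min(totals)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_loads = 150

# "Everyone playing a different song": each load has its own timing.
disparate = aggregate_load([rng.randrange(20) for _ in range(n_loads)])

# "Everyone playing the same song": all loads ramp together.
synchronized = aggregate_load([0] * n_loads)

print(f"Disparate swing:    {swing(disparate):.1f} MW")
print(f"Synchronized swing: {swing(synchronized):.1f} MW")
```

With independent timing the individual ramps largely cancel out, while the synchronized case swings across nearly the full range, which is the pattern the grid sees as a burst.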

Karen: Yeah, that's something that my team and I have been looking at. So we've talked a bit about the amount of power that's being used all at once. But in these cases, it's not just the quantity of power being used, it's also the power quality that's important. Like they say, it's quality over quantity, right? And running these GPUs for these data centers creates a very specific pattern.

Karen: And doing it at such a high volume is becoming concerning to some utilities. Because at these levels, this particular pattern has the ability to potentially impact generators. And, you know, at a worst case, create power disruptions and outages. So the first step to solving this problem is you have to know you have a problem. Right. That's what they say.

So you have to be able to monitor and detect before you can protect the data center, the grid, and the community around it, be a good grid citizen, and that sort of thing. And one of the ways that data centers can do that is to use the metering solutions that they're already using today, the ones they're specifying in the gray space and sometimes in their white space, to identify these problematic power patterns.

So there's a specific phenomenon, and there's a specific term that you'll hear associated with it: sub-synchronous oscillation. That's something that we would want to try to look for.

JP: Yeah, that sub-synchronous oscillation is kind of interesting. And the metering seems like it's really important. But how does metering help with that sub-synchronous oscillation?

Karen: So the way data centers are using meters today, they have them installed and they're using them to look at the power coming into their building, making sure that it's stable and of good quality. They're also looking at the interactions as that power flows through the building and through the data center. So what we want is to go beyond standard metering, which is usually looking at 60 Hz, your electrical fundamental.

Karen: And then above, where you talk about things like harmonics. You actually want to look below 60 Hz. That's your sub-synchronous range, and you want to try to diagnose down there. So you'll need something that samples and processes data very quickly and with high resolution. But one thing to watch out for as you're looking for a solution is a meter that relies a little too heavily on data streaming to solve this problem.

That's an approach that we don't recommend, because it leaves a lot up to the data center to receive the data and figure it out, which is challenging and not really helpful. Instead, what we recommend is an edge-based solution, which means that all the data is processed onboard the meter in real time, and the meter can act as guidance for you.
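As a rough illustration of looking below 60 Hz on the device itself, the sketch below scans the 1 to 59 Hz range of a one-second window with a single-bin discrete Fourier transform and reports only a small event when something stands out. It is not how any particular meter works: the sampling rate, threshold, test signal and function names are all assumptions made for the example.

```python
import cmath
import math

SAMPLE_RATE_HZ = 1920    # assumed: 32 samples per 60 Hz cycle
WINDOW_SAMPLES = 1920    # one-second window gives 1 Hz resolution
FUNDAMENTAL_HZ = 60
ALERT_THRESHOLD = 0.01   # assumed: fraction of the nominal amplitude

def bin_amplitude(samples, freq_hz):
    """Amplitude of one frequency component via a single-bin DFT."""
    acc = 0j
    for i, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
    return 2 * abs(acc) / len(samples)

def detect_subsynchronous(samples):
    """Return (freq_hz, amplitude) for the strongest component below the
    fundamental if it exceeds the threshold, otherwise None."""
    best_freq, best_amp = None, 0.0
    for freq in range(1, FUNDAMENTAL_HZ):
        amp = bin_amplitude(samples, freq)
        if amp > best_amp:
            best_freq, best_amp = freq, amp
    if best_amp >= ALERT_THRESHOLD:
        return best_freq, best_amp
    return None

def synth(sub_amp, sub_freq=12):
    """Synthetic waveform: a 60 Hz fundamental plus one lower component."""
    return [
        math.sin(2 * math.pi * FUNDAMENTAL_HZ * i / SAMPLE_RATE_HZ)
        + sub_amp * math.sin(2 * math.pi * sub_freq * i / SAMPLE_RATE_HZ)
        for i in range(WINDOW_SAMPLES)
    ]

# Only this small event, not the raw samples, would leave the device.
event = detect_subsynchronous(synth(0.05))
print("event:", event)
```

The point of the design is in the last lines: the window of raw samples stays on the device, and only a short "something needs your attention" result is passed on.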

JP: Oh, that's pretty cool. Wi-Fi is laggy, or so I've heard, so you want to do it at the edge and process there so you don't miss anything. And since the flux capacitor doesn't exist yet, or is out of stock, what happens next before it arrives? What are some of the solutions besides the flux capacitor that we can use to solve for this?

Karen: Yeah. So doing it at the edge makes a lot of sense because that helps your solutions be faster and more accurate. Just like you said, Wi-Fi is laggy. When you transfer data, that is always going to be a bottleneck for you. Even a really impressive streaming rate is a fraction of what a meter has and is looking at inside of itself.

Karen: So if you're directly in the device, you get to look at more data, you get a more precise result, and it's faster because the meter itself is processing. You're not sending data back and forth. The only thing that gets sent is, "Hey, there's something that needs your attention. There might be a problem, please check it out."

And that way it ends up being a lot less expensive from a total cost of ownership perspective for a data center, because they don't have to procure, install, and maintain all this additional computing to deal with gobs and gallons of data coming from a firehose. Now, that's not to say they can't still do streaming as an option, but now this becomes more of a backup.

So that way you can have your main and a failsafe, and you have the flexibility to do this as much or as little as you want, where you want. The world's your oyster, you know. So let me ask you this, JP. We've talked a lot about detection, and that's where we start.

But that's really only half the battle. So once we know that there's a problem, what next? Do you have any suggestions for solutions that a data center can target?

JP: Yeah, I think knowing the problem is the first step, right? That's half the battle. So now we have this metering technology, you have the ability to stop it, you don't have to worry about Wi-Fi streaming, and you're on the edge. But then how do you send that signal to something that can actually control it?

JP: There are a couple of different solutions that we can use. UPS systems and EnergyAware can mitigate that and sort of be a buffer for the grid. We can also use supercaps, because if we're using the UPS as a buffer, we might be degrading the battery technology, the batteries themselves. But it's still a viable solution as a stopgap, or if you don't think it's happening often.

JP: If you think it's going to happen a lot more, supercaps have better characteristics: they can charge and discharge much more rapidly, and they won't degrade over time as quickly. So you can conserve that CapEx inside of the battery itself. And then, using that EnergyAware system itself, you can manage the overall shape of the waveform.

So you can do peak shaving and waveform shaping, once again to protect that grid and be a good citizen.
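The buffering idea described here can be sketched as a simple peak-shaving loop: when the load exceeds a cap, a storage buffer covers the difference, and when the load is below the cap, the buffer recharges. This is a minimal model with assumed numbers (a 90 MW cap, a 0.05 MWh buffer, one-second steps) and ideal, lossless storage; it does not represent any specific UPS or supercapacitor product.

```python
def peak_shave(load_mw, grid_cap_mw, buffer_mwh, step_h):
    """Return the grid draw per step when a buffer caps the peaks.

    Assumes ideal storage: no losses and no power limit other than
    the energy currently stored.
    """
    stored = buffer_mwh  # start with a full buffer
    grid = []
    for load in load_mw:
        if load > grid_cap_mw:
            # Discharge to cover the excess, limited by stored energy.
            discharge = min(load - grid_cap_mw, stored / step_h)
            stored -= discharge * step_h
            grid.append(load - discharge)
        else:
            # Recharge using the headroom under the cap.
            charge = min(grid_cap_mw - load, (buffer_mwh - stored) / step_h)
            stored += charge * step_h
            grid.append(load + charge)
    return grid

# Assumed bursty profile: idle at 15 MW, bursting to 150 MW each second.
load_mw = [15, 150, 15, 150, 15, 150]
grid_mw = peak_shave(load_mw, grid_cap_mw=90, buffer_mwh=0.05, step_h=1 / 3600)

print("load swing:", max(load_mw) - min(load_mw), "MW")
print("grid swing:", round(max(grid_mw) - min(grid_mw), 1), "MW")
```

In this toy profile the grid never sees more than the cap, and the recharge between bursts fills in the valleys, so the swing seen by the grid is smaller than the swing of the load itself. The frequent charge and discharge cycles are also why the cycle life of the buffer matters, as noted in the comparison between batteries and supercaps.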

Karen: Okay, so UPSs, supercaps, things like EnergyAware, other options. It sounds like it can be really overwhelming for someone who isn't as familiar with the field. How would somebody go and implement all these different things that we've talked about?

JP: Yeah, I think, first of all, don't worry. Eaton's got a great ecosystem. If you think about it, we truly have technologies that go all the way from the grid down to the chip, so 500 kilovolts all the way down to one volt DC. We can actually support that whole system and ecosystem with our engineering.

JP: And then there are the services we have associated with that. We have engineering consultation services, and we have the technology to help mitigate that waveform and help a customer wherever they're at. So if you're a super sophisticated customer, like a hyperscaler, we've got technical expertise and we can meet you there. Or if you're like, "Hey, I just need a turnkey solution," we have solutions there as well.

So it's really a matter of helping you out wherever you are and meeting you in the middle. And those AC and DC solutions are pretty exciting because, as we mentioned before, we have UPSs, but Investees and direct DC grids are coming into play. We have that technology as well. One that just got publicly announced, and that I'm super excited about, is the Siemens Energy and Eaton solution: we've partnered with Siemens Energy to support the high-voltage to medium-voltage transition, adding a battery energy storage system there and onsite generation.

We're not going to have all the solutions. It's all about proper partnership for prosperity for all.

Karen: Gotcha. I think that will really help. Having some guidance to be able to meet the customer where they're at, I think that's useful. Well, awesome. Thanks for explaining that, and thanks for sitting down with me today, JP.

JP: Yeah, it was a blast. I always get to see Karen.

