Cisco Systems, Inc. (NASDAQ:CSCO) Bank Of America Securities Global A.I. Conference 2023 Call September 11, 2023 11:00 AM ET
Eyal Dagan – Executive Vice President, Common Hardware Group
Rakesh Chopra – Cisco Fellow, Common Hardware Group
Conference Call Participants
Tal Liani – Bank of America
Ladies and gentlemen, the program is about to begin. Reminder that you can submit questions at any time via the Ask Questions tab on the webcast page. At this time, it is my pleasure to turn the program over to your host, Tal Liani.
[Technical Difficulty] joining us. We have today — in the morning, you heard about contract manufacturers that make the products for the big cloud companies, then you heard from cybersecurity, and you heard from others.
Now, I’d like to welcome both Eyal Dagan and Rakesh Chopra to our conference. Eyal is Executive VP of the Common Hardware Group at Cisco. He is responsible for delivering silicon, optics and hardware across Cisco’s switching, routing, optical and IoT portfolios. Eyal has an extensive background in the industry; prior to joining Cisco, he was Co-Founder and CEO of Leaba Semiconductor, a semiconductor company that was acquired by Cisco in 2016. Rakesh is a Cisco Fellow in the Common Hardware Group and has been with Cisco since ’97. Rakesh runs a system architecture team that focuses on hardware platforms and also owns the business and customer engagement for selling Cisco Silicon One to external partners. So, thank you, Eyal and Rakesh, for joining us today.
And maybe before we start, I know that the distinguished IR team of Cisco has a forward-looking statement to read to us.
Unidentified Company Representative
Thanks, Tal. This webcast is educational in nature with no new financial information being given. We will be making forward-looking statements. The actual results may differ materially from those forward-looking statements and are subject to the risks and uncertainties found in our SEC documents, the 10-Q and the 10-K.
I’ll turn it over to Tal and Rakesh.
Great. So maybe, Rakesh, before you start, I’ll just set the stage. When we think about AI deployments, we think about the underlying infrastructure, and there are many questions around it. Will the infrastructure change? What about InfiniBand versus Ethernet? What does it mean for the investments of any company? Does it mean the investment needs to go up because the capacity and the data you need to deal with go up? We always say, and Cisco has been saying it for 20, 25 years, you need to have a good network. You always need to have a good network. At the bottom of it, if you want to deliver services, you have to have a great network. So, having Cisco today is very important because we want to discuss the network. We want to discuss the underlying demands for generative AI, as well as AI more broadly, as well as the different architectures and different deployment schemes, et cetera.
So with no further ado, I’ll pass it on to Rakesh. We’re going to have about a 20-minute presentation and then open it for Q&A. As always, send me the questions via the portal, please. Rakesh?
Awesome, thank you very much for the introduction, Tal. And as usual, you’ve sort of captured the entire essence of what we’re talking about today. So with that, why don’t we go ahead and jump directly into it. If we could advance the slide, that would be wonderful. One more, please.
So, I wanted to start off before we get into talking about the networking infrastructure to just sort of normalize this all. AI/ML is, of course, the buzzword of the day. It’s what everybody is out talking about. And for a company like Cisco, that’s actually quite an interesting proposition because at the end of the day, unlike many companies in the industry, Cisco is a very, very large company. We build a bunch of different types of products.
So, when we’re talking about AI/ML, I’d like to break it down into sort of two basic categories. The first is you can imagine that Cisco uses AI to improve our products and services that we offer to our customers. So, for example, using Desk Pro or Webex, there’s noise canceling that is sort of AI-powered. That right now is sort of filtering out the huge amount of construction noise that is happening directly by my ears. It does an amazing job. This is a really important piece of AI for Cisco, but that’s not really what I’m here to talk about today. I’m here to talk about the right side of this picture, which is we also sort of sell our products to enable others to build AI networks. And that’s again where we’re sort of focusing today.
So, if we can sort of advance to the next slide? When we think about different networking architectures, you can think about trying to understand the data center and various different sort of roles and responsibilities. And we’ve tended to focus on what’s called the front-end network. The front-end network is designed to take sort of general purpose compute x86 or ARM, interconnect them together through top-of-rack switches and spine switches, and they connect to the outside world via DCR or data center interconnect routing topologies to the wide area network.
This is really where Cisco has played historically, and this is really all powered by Ethernet. So, if I break that network down into two roles, there are roles within the network that are primarily switching. And you can imagine that silicon, systems and optics are technologies that Cisco sells into that area. Towards the upper end of this picture are the routing roles. Here again, we sell silicon, systems and optics.
Now, this area here is not what we consider the sort of AI/ML network. Clearly, this network needs to connect to AI/ML infrastructure. But when people typically talk about AI/ML networks, they’re actually talking about the bottom part of this picture, the back-end network. So what actually ends up getting created is, there’s a dedicated network that is built to connect AI and ML compute infrastructures. Some people might call these GPUs, some people might call them specialized compute, but it’s a network designed to allow these devices to communicate at very high bandwidths. That has historically been InfiniBand-based technologies. And what we strongly believe here at Cisco is that this technology will migrate towards Ethernet.
Now, we’ve seen this play out in the past already. So it used to be, for example, that storage was all done on back-end networks. But as technology evolved, storage moved to actually the front-end network with RoCE, or RDMA over Converged Ethernet, riding the bandwidth of the Ethernet infrastructure.
Now, what’s interesting about this back-end network for us here at Cisco is that this is actually a brand-new market opportunity for us. It’s a very high bandwidth network, a very critical network. And we at Cisco believe we’re very well positioned to sell silicon, systems and optics here. This again is an additive network TAM for us rather than replacing the front-end TAM.
So, if we can advance to the next slide, please? So, why do I make the claim that it is moving towards Ethernet? At the end of the day, it is simply because it is built today on InfiniBand for historical reasons. There’s infrastructure built for high-performance compute, which was based on a sole-source GPU, sole-source technology for switching and InfiniBand interconnect. As we all know, AI/ML is exploding.
There are multiple companies now building GPUs, whether that’s vendors like Intel or AMD, or end customers like Meta and Google, who are all public about the fact that they are building their own GPU infrastructure. As they do that, they will end up having to use somebody’s interconnect. Are they going to use a sole-source interconnect provided by somebody else, or are they going to use something like Ethernet, which is widely available? So, if you ask me, Ethernet is actually an inevitable answer in terms of this transition.
We could advance to the next slide. So, as we move towards Ethernet, the question becomes what’s interesting about Cisco? Why would somebody pick Cisco rather than somebody else for Ethernet-based technologies? At the end of the day, if I oversimplify it for a second, Cisco has the silicon technologies, we have the systems that we build around it, and we have the optics technology. So, we’re actually uniquely positioned in this market that we have all the key building blocks to enable an AI/ML based network. And I’ll go into more details in this as we get along.
If we could advance to the next slide? The other key point here is that it’s not just about technology, it’s about how we engage our customers. So, you’ve always been able to buy a full system from Cisco. So that’s the silicon, the hardware systems and the operating systems together shown on the right of this. What’s interesting is that back in December of 2019, we announced disaggregated business models, and so now customers can buy components directly from us, whether it’s silicon, gray optics or coherent optics, you can buy that equipment directly from Cisco, build your hardware, write your software on top of that. Or you could also buy white boxes where we build the hardware platforms on top of the silicon and you bring your own software. So again, another unique thing here about Cisco is that we have all the different business models allowing us to engage with our customers on the terms that they want to engage with us on.
So, if we can move to the next slide, please? Now, jumping into the silicon piece a bit, I want to talk about Cisco Silicon One. So, we announced Cisco Silicon One back in December of 2019. We made a big splash about it then, and we’ve been iterating on that technology ever since, coming back to the market twice a year or so with new advancements of the technology. At its heart, the Cisco Silicon One proposition is to erase the hard boundaries that exist in the network and focus on having one converged architecture that can be deployed across your network, across form factors. But we realized that convergence isn’t enough. We have to be the absolute best technology in every single one of these roles. So, regardless of whether it’s a top-of-rack switch or a core router, if you think about the key priorities for those individual roles, we want to be the best at each one of those things. We then take this converged architecture with incredible performance, and we offer it to our customers in multiple different business models, giving our customers sort of one network and one experience regardless of how they consume it.
If we could jump to the next slide, please? So, as we think about Cisco Silicon One, the way we sort of think about driving this innovation strategy is, first of all, we have to have differentiated products. We have to have the best products in the industry. The second thing is that convergence not only helps our customers deploy our technology, it actually allows us to leverage our investment and sort of double down in terms of driving innovation at a lower cost, allowing us to do further innovations. But also, how we build silicon today is very different than how we built it in the past. Cisco has always built silicon in what’s known as the ASIC model; that’s us doing the design and then working with a back-end partner to complete it. The other way that we ship products is we use third-party silicon providers, which, of course, has a margin stack on top of it. What we do now inside Cisco is we’re a true fabless semiconductor company. It’s what’s called a COT, or customer-owned tooling, model. That allows us to get our development cost down significantly. We own all of the IP. We drive our own road maps and our own technology transition points.
We then take this technology and we go to the market and we try and we win customers. Now, we have engaged very heavily in web scale over the recent years. And the reason that is, is because web scale drives technology transitions at the high end. And so, we have to engage with these customers to make sure that we have the best products, both for the web scalers, but also for the rest of the market. They also have an amazing ability to drive very large volume, and that maximizes our revenue set.
Now, we go to those customers and we sell them either silicon only, white boxes or full systems as we talked about before. We’re the only vendor in the industry who offers all of these business models. Taken together, this drives our volumes way up, which drives our cost way down, and then we can take that additional margin dollars and reinvest it into this innovation cycle and sort of continue the process forward.
If we can advance to the next slide, please? So, as we continue to sort of think about how this plays out, we have recently announced our Silicon One G200 device. Now, we’re very well aware that we’re not the only 51.2 terabit Ethernet switch on the market, but to reiterate what I said before, we are the best 51.2 terabit piece of switching silicon on the market, and it is built specifically to optimize AI/ML networks. We have a lot of technology in this device in terms of load balancing and link failure avoidance. We’ve managed to halve the latency of our devices. We’ve doubled the performance while keeping the power exactly the same. So, we are doing things in this industry that nobody else is doing.
And importantly enough, we’ve also announced to the industry that we’re building our own 112 gig SerDes. This is an incredibly important piece of IP. It takes a huge amount of effort to do, and we’re very, very proud of the performance. To the end customer, this means that you can build cheaper networks, specifically for AI. We have capabilities in this device to allow for long passive copper cables, linear-drive optics and co-packaged optics, all possible with the G200 SerDes that we’ve invented.
Now, if we could advance to the next slide? The other piece of G200, and this goes back to this notion of how efficient our Silicon One architecture is: everyone else is starting to throw things overboard in order to fit in the silicon die allocated based on technology. One of the things people are throwing overboard is Ethernet MACs. That might not sound very exciting or very interesting, but if you look at how to build an AI/ML network connecting up to 32,000 GPUs at 400 gig, you can do that with two layers of networking with Silicon One where others require three layers. Now, why does that matter? It matters simply because it requires 50% less optics, 40% fewer switches and a third fewer network layers, and that saves a megawatt for every one of these clusters. Very, very, very significant savings as a consequence of the efficiency of Cisco Silicon One.
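To make the layer-count argument concrete, here is a rough Python sketch of why switch radix determines how many tiers a folded-Clos fabric needs, and why dropping a tier roughly halves the inter-switch optics. The specific radix values (512 and 128) are illustrative assumptions, not Cisco's published figures, and the fat-tree capacity formula is a simplified model:

```python
def tiers_needed(radix: int, endpoints: int) -> int:
    """Minimum number of tiers of a folded-Clos (fat-tree) fabric built
    from radix-port switches that can attach `endpoints` devices.
    Simplified model: a t-tier fat tree supports up to 2 * (radix/2)**t
    endpoints."""
    tiers = 1
    while 2 * (radix // 2) ** tiers < endpoints:
        tiers += 1
    return tiers

def interconnect_optics(endpoints: int, tiers: int) -> int:
    """Count of inter-switch transceivers in a non-blocking fabric:
    each tier boundary above the endpoints carries a full bisection of
    links, and every link needs a transceiver at each end."""
    return endpoints * (tiers - 1) * 2

gpus = 32_000
# A hypothetical high-effective-radix chip reaches 32,000 endpoints in
# two tiers; a lower-radix chip needs three.
print(tiers_needed(512, gpus))   # 2
print(tiers_needed(128, gpus))   # 3
# Two tiers have one inter-switch boundary; three tiers have two, so
# the two-tier fabric needs half the optics.
print(interconnect_optics(gpus, 2) / interconnect_optics(gpus, 3))  # 0.5
```

This mirrors the 50%-less-optics claim above: each extra tier adds another full bisection of inter-switch links, each requiring optics at both ends.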
If we could jump to the next slide, please? Now, thinking about this one level more: our notion of convergence, but being best of breed, that we started outside of AI/ML applies very much to AI/ML as well. So, as the industry moves to Ethernet, what we’re seeing is that different customers make different value trade-offs. And what we can give our customers is that flexibility of choice because we have a converged architecture. Customers can use fully generic Ethernet, giving them ultimate compatibility in terms of which Ethernet switches they use within their network, Cisco or others, all fully interoperable.
The far end of that spectrum is number three, what we call fully scheduled Ethernet. This guarantees non-blocking network performance, giving you ultimate performance with a very low job completion time for AI/ML networks. And then, we also have the middle ground of enhanced Ethernet, which takes IP that we created for fully scheduled Ethernet and layers it on top of generic Ethernet, giving people a middle ground. So again, what we’re finding here is that our customers each want a different answer, and we’re here to meet the customers where they want to be.
We could advance again one more slide, please. Now, here’s just a summary slide to give you a sense of how we stack up against InfiniBand, and how Ethernet, enhanced Ethernet and fully scheduled Ethernet play out. InfiniBand was great for high-performance compute. It was great for single-tenancy, single-job performance. But as you think about AI/ML infrastructure, you’ve got to worry about multi-job performance, and you have to worry about the pace of bandwidth improvements.
As you move to the right in this picture, you’re getting better and better performance. And again, because we have a technology that can do Ethernet, enhanced Ethernet or fully scheduled Ethernet, we work with our customers in an open and honest way for them to understand how these things actually are alike and compare. Everybody else in the industry is picking one or two of these and trying to sort of convince customers that that’s the right answer.
If we could advance one more time, please? Now, all of this wouldn’t matter at the end of the day if the performance gains between Ethernet, enhanced Ethernet and fully scheduled Ethernet weren’t significant. If we were talking about a 1% or 5% or 10% mover, nobody would care. But what you end up seeing is that the difference in performance between fully scheduled Ethernet and generic Ethernet is incredibly significant. There’s a parameter called job completion time, or JCT, which measures the amount of time jobs take to complete on AI/ML infrastructure, and what you see is that fully scheduled Ethernet is about two times faster than generic Ethernet.
Now, what that means is you can complete your jobs quicker with the same network using fully scheduled Ethernet or you can build half the size network at the same job completion time. And again, what we’ve done is we’ve taken a bunch of those technologies, layered them on top of generic Ethernet for what we call enhanced Ethernet, and that gets about 1.5 times improvement over generic Ethernet. So again, very, very significant movement in performance of AI/ML workloads based on sort of Cisco technology.
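The arithmetic behind those two trade-offs (finish faster, or build a smaller network for the same JCT) can be written down directly. The 2x and 1.5x factors are the ones quoted above; the linear trade between speedup and network size is the simplifying assumption stated in the talk, and the function names are mine:

```python
# Relative speedups quoted in the talk (generic Ethernet = 1.0 baseline).
SPEEDUP = {"generic": 1.0, "enhanced": 1.5, "fully_scheduled": 2.0}

def job_completion_time(baseline_hours: float, fabric: str) -> float:
    """JCT for the same job run over the given fabric type."""
    return baseline_hours / SPEEDUP[fabric]

def equivalent_network_fraction(fabric: str) -> float:
    """Fraction of the generic-Ethernet network size needed to hit the
    same JCT, assuming speedup trades linearly for network size."""
    return 1.0 / SPEEDUP[fabric]

print(job_completion_time(10.0, "fully_scheduled"))   # 5.0 hours vs 10.0
print(equivalent_network_fraction("fully_scheduled"))  # 0.5 -> half the network
```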
And if we could advance one more, please? Now, one of the questions we get a lot is how have we been doing in the web scale business with Cisco Silicon One? What’s the impact of what we’ve done? Today, Cisco Silicon One is available in the Cisco 8000 with IOS XR. It’s available in our Catalyst 9500X and 9600X with the IOS XE operating system, and it’s available on Nexus via NX-OS, as well as multiple third-party hardware builds with multiple operating systems like SONiC, FBOSS and others.
Today, we’re happy to announce that we’ve actually penetrated five of the six global Tier 1 web scalers with Cisco Silicon One. Now some might assume that we’re talking about routing roles when we make that claim. What we’re talking about here is actually deployment within actual data centers. So these are some of the hardest customers to penetrate, some of the longest evaluation cycles, and we’ve had exceptional results based on Cisco Silicon One. It’s a real testament, I think, to what we’ve managed to do as an organization.
And then the second piece I wanted to highlight is, it’s very easy to talk about power efficiency yourself, which we do a lot with Cisco Silicon One. It’s much better to have an external customer write a reference about how much power they saved. There’s a pointer here to a press release that DT has announced: they managed to drop their power bill by 92% by adopting Cisco Silicon One. So again, these are not 5% or 10% movers. These are huge needle movers that we’re actually talking about.
And if we could advance the slide one more time? So, as we’ve been sort of evolving this, we are moving faster than anybody else in the industry. Since December 2019, we’ve come out with 14 different devices. It’s about a pace that’s 11 times faster than any other competitor in the network, and we’re continuing to push that forward because we have a converged architecture, because we’ve invested so much in this, we’re now just sort of enjoying the fruits of all of that work that we’ve done over the last eight years.
And one more time, please. So, at the end of the day, why do we end up winning with Cisco Silicon One? We end up winning because we have the right technology, the right investments, the right scale, the right cost points and the right business models. And if you think about us versus others, we’re the only company which has all of the technology, from silicon, hardware and optics to software. That allows us to innovate across all of these hard dividing lines to come up with optimal final solutions. And we’re also the only company that has all the business models, from silicon-only and white box to full system. It’s about meeting our customers where they want to be met and enabling them to be successful.
Last slide. Tal, over to you for questions.
Q – Tal Liani
Yes. Perfect. Thank you. I’m just going to start with a question that I got from the audience, and then I’ll go back to my questions. I’m just going to read from the screen: assume AI clusters use Ethernet. In an AI cluster, when scaling up the number of GPUs, does the cost of networking go up in a linear fashion, or higher or lower?
So, Eyal, do you want to take that one or do you want me to take that?
Please take it. I’ll do the calculation in my head.
That’s good. The way these networks are built out is what we call a Clos topology, that is, layers of networking that get aggregated by the layer above. If you’re scaling the network within layers, so let’s say, for example, you have a two-layer Clos network and you’re scaling it horizontally, that is a perfectly linear scale of cost, right? As you run out of the radix of the chip, you have to add a layer. That’s sort of what I was talking about a little bit before: because we have a higher-radix chip, we can build wider, flatter networks. As you add a layer, there’s a cost discontinuity which happens. And then from that point, it’s linear again.
So, you can think of it almost as a step function of increasing cost. If we compare it to InfiniBand, those are also built out of similar topologies, and so they have a similar step function. The difference being that Ethernet gives you better radix, higher port speeds and better cost per bit. And so, although they’re both step functions, they sit at different levels.
Eyal, do you want to add anything to that?
Just in terms of port calculation, okay, i.e., the number of switching ports or the optics: in most of the AI clusters, it’s about 1.5x, okay? You can think about what Rakesh just described: 1x for the first layer. Of course, you add more GPUs, you need to connect them, so that’s the 1x. And then you need another layer to connect those top-of-rack switches, and that gives you the 0.5x. So it’s more or less 1.5x. In computer-science terms, it’s linear.
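Eyal's 1x + 0.5x rule of thumb can be sketched directly in Python. This is just his back-of-envelope written down, not a real sizing tool, and the function name is mine:

```python
def port_estimate(num_gpus: int) -> dict:
    """Rough switch-port/optics count for a two-layer AI cluster, per
    the 1x + 0.5x rule of thumb: one leaf-switch port per GPU link,
    plus about half that again for the aggregation layer."""
    leaf = num_gpus          # 1x: first layer, one port per GPU
    spine = num_gpus // 2    # 0.5x: second layer connecting the leaves
    return {"leaf": leaf, "spine": spine, "total": leaf + spine}

print(port_estimate(1024))  # {'leaf': 1024, 'spine': 512, 'total': 1536}
# Doubling the GPUs doubles the ports: linear within a layer count,
# with step jumps only when an extra layer becomes necessary.
print(port_estimate(2048)["total"])  # 3072
```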
Another question is, is there a risk with the fabless business? Or what is the risk with the fabless business? Do you have diverse geographic and commercial partners for manufacturing of your silicon?
Eyal, go ahead.
Okay. Maybe I will take it. The question calls it correctly: as Rakesh said, we moved from an ASIC model to a fabless semiconductor business, and that gives us a lot of benefits. And it doesn’t matter if you work in an ASIC model or as a fabless semiconductor; eventually, there are fabs that manufacture your devices underneath. Those fabs today are mostly the TSMCs of the world, although we experiment with others as well, and TSMC has its own geographical distribution. We’re exposed to that, as is, I believe, most of the industry.
If I may add one thing here, Tal, I would like to relate to the fact that being a fabless semiconductor company is not an easy thing. Up until five, six years ago, as Rakesh said, Cisco was an ASIC producer, but it was not a fabless semiconductor company. So, for example, people ask us why we didn’t have those business models, silicon-only components, white box and full systems, 10 or 15 years ago, because that’s really what the web scalers wanted. It’s not simply a business-model decision. Cisco could have decided 10 or 15 years ago, or at any other time, to open that up. But to offer a semiconductor business model or a white box, it is not just a decision; you have to have the right technology. You have to have silicon that you build in a fabless way.
As Rakesh said, maybe I will just give you an example. Let’s say that a piece of silicon costs $1,000 out of TSMC. So if you work directly with TSMC, like the fabless guys, or like we are doing today, it costs you $1,000. If you work in an ASIC model, where you still define the chip and write the code, but you have a partner providing the IPs, some of them important like the SerDes, doing the back end and working with the fab, you will pay $2,000 for that service, more or less, okay? And if you buy off-the-shelf silicon that a merchant fabless vendor built, you’re going to pay $3,000 for that, okay?
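Using Eyal's illustrative $1,000 fab cost, the three sourcing models stack up as simple multipliers. The factors below are his "more or less" figures from the example above, not actual pricing, and the labels are mine:

```python
# Eyal's rough markup over raw fab cost for each silicon sourcing model.
MARKUP = {
    "cot_fabless": 1.0,  # customer-owned tooling: work directly with the fab
    "asic_model": 2.0,   # ASIC partner supplies IP/back-end and fronts the fab
    "merchant": 3.0,     # buy finished off-the-shelf merchant silicon
}

def silicon_cost(fab_cost: float, model: str) -> float:
    """Effective cost of a device under the given sourcing model."""
    return fab_cost * MARKUP[model]

for model in MARKUP:
    print(model, silicon_cost(1_000, model))
# cot_fabless 1000.0 / asic_model 2000.0 / merchant 3000.0
```

The point of the transformation he describes is moving Cisco from the 2x row to the 1x row, since web-scale customers buy at fabless cost structures.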
Now, when you go to those web customers, they are buying their silicon directly from fabless vendors. So, if Cisco would like to compete there, we have to compete with a cost structure of $1,000. We cannot be in the ASIC model, and we cannot, of course, resell off-the-shelf merchant silicon. So, we have to keep that in mind. And what we did, and it was a big transformation for Cisco in the last six, seven years, is we transformed the team and we built those muscles, those capabilities, and there are a lot of them, starting from the back end, the manufacturing, the testing, the IP. Cisco today, as Rakesh mentioned, is developing its own critical SerDes IP. We developed our own 112 gig SerDes. So it should be no surprise if we say that we are developing our own 224 gig SerDes, and it’s considered to be best in the industry. So that’s on that.
You touched on the issue of Ethernet versus InfiniBand. And the question is, what would drive the migration? In your view, Ethernet, or maybe certain types of Ethernet, eventually provides a solution that is as good or even better. What drives the change? Because right now, we have NVIDIA presenting later, and we know at least by report that right now, InfiniBand is the way to go for the back-end data centers. What drives the migration to Ethernet? How can Ethernet, or what you offer, replace what InfiniBand is providing today?
So, a few thoughts from my perspective. So I think you’re absolutely right, Tal, that if you look at the market today, and you look at the amount of back-end interconnect that is InfiniBand versus Ethernet, I think you’ll see a percentage that is quite heavily weighted towards InfiniBand rather than Ethernet. I think at the end of the day, it comes down a little bit to where the technology has grown up from, i.e., for a long time, high-performance compute was sort of the big thing that was being built in terms of disaggregated computing infrastructure. So you would take GPUs, you would connect them together with InfiniBand, you would use a large distributed compute structure to run, for example, a weather prediction algorithm on top of it to figure out where the next tornado is going to strike. As the sort of gold rush towards AI/ML has happened, these GPUs have been identified as being a key piece of technology in terms of solving AI/ML infrastructure. There are already at-scale deployments with InfiniBand for high-performance compute. It is actually the natural thing to just migrate a proven entity and try and scale it up. I think that’s what we’re sort of seeing today.
I think the question I would ask is maybe a slightly different one than you asked, Tal, which is if you forward project, why is the interface actually InfiniBand? What advantages does InfiniBand have over something like Ethernet or enhanced Ethernet or fully scheduled Ethernet that we talked about before? And I would contend that except for the backwards legacy of where it came from and being deployed, from a technology perspective, it actually struggles, i.e., it’s a sole-source technology.
And what does that really mean? Sole-source technology means slightly limited investment structure. It means pace of innovation comes down, which means that the radix, the bandwidth of the switching infrastructure comes down. And to the question that came in earlier about the cost linearity of the scale as the radix and the bandwidth of your chip shrinks, you need to add more layers of the network in order to connect all of those GPUs and you start magnifying any sort of cost differences.
Third is about resiliency, similar to the question that was asked before about fab technologies, which is there’s a single vendor providing the infrastructure and all of a sudden has to scale to the number of AI/ML-based deployments. Are you really going to risk all of that infrastructure on a single vendor? Or do you want multi-vendor to give yourself supply chain resiliency?
And fourth is what we talked about a little bit before, which is if you start having multiple people build GPUs, which we’re already seeing today, right, they’re no longer going to be tying themselves to InfiniBand-based interconnect. So this transition to Ethernet seems really quite inevitable to me. And actually, the fact that you see even vendors who build InfiniBand-based switches coming out with Ethernet AI interconnect alternatives is again a sign that that trend is happening. I think what we need to realize is that it’s not an overnight transition, right? It will take a few years before it becomes the sort of de facto standard for AI/ML interconnect. But from where I sit from a technology perspective, it seems quite clear.
Eyal, I don’t know if you want to add anything to that.
No, I think you said it well. If I try to summarize, in my mind I don’t see, we don’t see, being deep in the technology, any advantage of InfiniBand versus Ethernet. In fact, with the pace of innovation that is currently happening over Ethernet, Ethernet, I will contend, will surpass InfiniBand technology in terms of performance. Second, the ecosystem and the competitive nature of where we are: Rakesh just showed a 51.2T switch, and it comes as no surprise that, although it was just introduced this year, everybody is working already on the next generation. And I will contend again that that will be presented next year.
So, in one year, everybody is in a race to go from 51.2 to 102.4. And that’s not going to end there: while there is an application that consumes bandwidth, we love bandwidth, okay? There will be such a rapid pace that any technology with a single source will find it very hard to compete. And you can see the cost structure we live in today when you compare the two. The competitive landscape is just too hard to beat.
Got it. Who is the customer for the infrastructure equipment for AI? Is it just the cloud titans? Or are we seeing large enterprises, telecoms and service providers also deploying AI infrastructure?
It’s a great question, Tal. Several thoughts on that one. So, the short answer is, I think you’ll see AI/ML-based deployments across the board, from large-scale enterprises to service providers to web scale or web titans or hyperscalers depending on the terminology. But I think part of the problem that we all struggle with is AI/ML is such a high-level term, and means many different things.
So, I’ll sort of break it down one level more. I think if you look at the largest training infrastructure, the things which are required to train something like a ChatGPT-style model, the amount of infrastructure necessary to train a model like that aligns well, in terms of cost to deploy, cost to maintain, facilities to manage, that type of thing, with the large-scale hyperscalers or web scalers. So the largest training models, I think, will live there. I think there’ll be a lot of inference which happens there as well, and smaller training models, multi-tenancy; people will use it as a paid service to run that infrastructure in hyperscale.
But that’s not to say that if we look, for example, at enterprise data centers or service provider data centers, they’re not going to do smaller-scale training, or retrain models that they acquire, for their specific use cases and infrastructure — and also lots of inference. Now, I think one of the other interesting points that plays into Cisco’s strength here is, if you think about what each one of those customers ends up needing, it’s a slightly different mixture for each.
So, a hyperscaler might just want silicon and optics; as we talked about before, we offer that as a components business. They might want white box hardware where they write their own OS on top of it; we offer that infrastructure. If they want to buy full systems with an OS, whether it’s our own OS or an open source operating system like SONiC, we offer that as well. They’re almost certainly going to do at least their own orchestration software on top of that.
As you move down towards enterprise campus and service provider, I think they’re going to want more of a canned solution. So, I suspect most of them are not buying silicon-only, but full system infrastructure, along with network orchestration software to manage that full solution, right? They want something that’s easier to deploy, in a self-contained fashion.
Got it. My next questions [Technical Difficulty] they’re connected, so I’m going to ask them together. Historically, many years ago, Cisco didn’t have a good position with hyperscalers. But recently, you announced some $500 million in orders from AI. So first, what changed? How did you manage to change your position? And then, can you dig a little deeper into the $500 million — not from a numbers perspective, but just to understand its composition? What kind of applications, and just more information about where Cisco is positioned now with hyperscalers?
Eyal, do you want to take it or do you want me to take that?
I can start and then you can add. First, let’s start with the $500 million. The composition is both silicon and Cisco 8000, and it’s being deployed directly for AI clusters. So it’s not the front-end that is serving the AI cluster; this is really the back-end AI clusters where the GPUs are connected. That’s number one.
Second, I will say this is just the start. Let’s be completely honest about what we see in the market. The guys that are really starting are the hyperscalers [indiscernible]. What they’re buying today, in 90% of the cases, is an NVIDIA cluster with everything in it — with InfiniBand, with optics and with the GPUs from NVIDIA. And there’s a saying that everybody talks about AI, but the people actually making money are NVIDIA.
But why are we optimistic? Because on top of those people who are buying the closed, out-of-the-box clusters, all the big AI clusters — sorry, hyperscalers are piloting, and some of them are really in initial deployments of clusters where the fabric and the networking is Ethernet, okay? So, we certainly see them experimenting. We are part of it, okay? The $500 million is part of that. But the real deployments for Ethernet, we believe, are going to start in ’24 and ’25, and we’re going to pick up from there. And that’s why we believe, per the previous discussion, that 75% or 80% of the market in three to four years will be Ethernet.
Now, similarly, the same thing will start happening in enterprises and service providers. Actually, people asked me three months ago how much time it’s going to take for them to catch up on AI training and all that, and I said it’s going to take probably a year or two. I was a little bit surprised, in the last two weeks, to see already deals with banks and service providers, okay, buying out-of-the-box clusters — smaller ones — that they’re going to train on their data. And based on that pace, I believe they will open it up and move to Ethernet probably much faster than what we expected. That’s on that.
Rakesh, do you have anything to add?
No, I think you covered the second part of the question first. I just want to roll back to Tal’s first part of the question, which is what changed, and how all of a sudden is Cisco relevant to the hyperscalers? And again, it all plays on top of each other; none of these things are in isolation. But I think it’s similar to what we talked about before, which is that Cisco, to some degree — again, if we’re honest with ourselves — did miss the transition a little bit in terms of web scalers’ bandwidth growth and the desire to move to a components-based model, right? What we started investing in, back in the 2016 to 2018 timeframe, was getting the right set of technologies inside of Cisco — whether that is silicon with Cisco Silicon One, or optics, gray or coherent optics — making sure that we have the right technology that a web scaler would want to consume. That was case one.
Case two is, once we have that right technology, we have to get the cost points down to the point where we can engage them on the terms that they want for a components-only business. That goes back to the COT model that we talked about before — buying wafers directly, removing margin stacking, and getting our costs down. Once we have the right technology and the right cost points, it allows Cisco to begin offering these business models of silicon-only, white box, or systems. We go and engage these customers to talk with them about our technology. We offer flexible business models. And what we’re finding at the end of the day, in our engagement with these customers, is a real appetite for alternatives, and the fact that Cisco Silicon One really gives them something that nobody else can.
Now, why do I say that? These customers end up building their own operating systems. They end up building their own hardware. If they adopt Cisco Silicon One, they adopt a technology that can be deployed everywhere in the network. Everybody else is coming at them with point solutions, so they do all of this work and then they deploy it in one location; then they have to do all this work again and deploy it in another location. We offer them this notion that once you adopt Cisco Silicon One, you can deploy it across your entire network — and we have actual hyperscalers who, from the top-of-rack switch all the way through their data center network, through their WAN, and across their edges, are based on Cisco Silicon One end-to-end. That is huge leverage when you think about trying to deploy things at scale with easy maintenance and lower cost. Everything about a web scaler and their desires aligns with the value proposition that we have with Cisco Silicon One. And that’s why it’s so appealing, I think, to that customer base.
And what’s fascinating about offering all of these business models is that as we engage these customers, we work with them to expose what the business models are, and they pick the set that is right for them and the deployments they’re thinking about. But what we’re finding is migration up and down. So they might start with silicon-only for one location, buy white boxes from us at another location, and buy full systems from us in a third location. And conversely, they might start with full systems in one location and end at silicon-only, right? So, we’re seeing this migration: once they get used to Cisco Silicon One and understand that it really does what we say it’s going to do, they get quite excited about it and start finding ways to put it in other portions of their network. So I think that’s really how we become relevant to them. And because it’s not just silicon — it’s silicon and optics and systems — we can have a conversation with them that other companies, frankly, can’t.
Got it. I’ve got two questions on the same topic, so I’m just going to combine them. The question is about the competitive landscape for Silicon One — with Broadcom, with AMD, even smaller companies that are out there. How, and on what, are you competing, basically?
Eyal, do you want to take it or you want me to take it?
No, I can take it. Let’s just look at the marketplace. With Silicon One, we are competing for the networking for AI. There are two other off-the-shelf fabless semiconductor guys providing that solution: Broadcom and Marvell. NVIDIA/Mellanox are offering something hybrid — they are offering a box, not directly silicon, in most of the cases. And that’s the landscape. AMD doesn’t have the network; for us, they are a partner, and I believe we are a partner for them as well. They are providing the GPUs. Similarly Intel, okay? So our competitive landscape is Broadcom, Marvell, and NVIDIA to some degree. That’s it.
Great. Another question that came up: when you think about the specs — say, Broadcom’s Tomahawk 5 versus Silicon One — can you speak to how it stacks up against the competition for the use cases of the cloud titans?
Yeah. As I sort of mentioned before, we’re very well aware that there’s other 51.2 terabit silicon on the market, right? You mentioned Broadcom; there are others as well. So there are multiple people vying for the same business at end customers. If I think about what our value add is — why do we win? — it breaks out into a few different things. One is the converged architecture that we already talked about: once you adopt the Silicon One architecture, you can deploy it in multiple different places in your network. So there’s a huge benefit to our end customers in the efficiency of deploying technology. That’s one statement.
The second is, I think most people, if not everyone else, in the 51.2 terabit realm are what I would call struggling a little bit to fit in all of the complexities of high-bandwidth silicon, and are starting to drop things overboard. So, we talked a bit before about the radix of the chip; I think that’s something we have in Silicon One that is unique. It goes back to how efficient our underlying architecture is. The fact that it’s so incredibly efficient allows us the space in silicon to do other things. The fact that it’s so efficient also allows us to have very, very good power efficiency. And if you think about what that means in deploying networks at scale, that is a huge lever. If you ask me, I actually think that power is the fundamental limit of networking as a whole, as an industry. We took power efficiency to heart when we came up with the Silicon One architecture, and it goes into every decision that we make in the silicon.
The other piece — and again, it goes a little bit back to the efficiency vector — is that we have a very flexible, capable 51.2 terabit device. We have programmability while still being incredibly efficient. We’ve taken latency as a high-order bit in the G200 architecture, and with the latency of our 51.2 terabit device, we believe we have an industry-leading latency offering with the Silicon One G200. And that matters when you think about inference clusters. It matters when you think about determinism for large inference clusters.
And finally, there’s how we design our silicon. Because we’re also a systems company, the way we think about the problem is quite different from a standalone silicon company: we optimize for the final deployment and the final product, not just the silicon. What that actually means is that if somebody takes our piece of silicon and somebody else’s piece of silicon and builds a box around each, and if we pretend for a second that those two chips have identical specs, by the time you look at the full system, you’ll end up with a lower-cost, more power-efficient final solution with ours. And again, when you think about the scale of the networks being built for AI/ML, it’s incredibly impactful if you can end up with a cheaper and more efficient final solution.
Eyal, anything else you want to add to that?
Let me add a different perspective. If you take a step back, Cisco four or five years ago had close to zero market share inside the data center. Yes, we were selling systems to connect data centers, but inside the data center, to connect the servers — not to talk about AI — we didn’t have the right ingredients. Rakesh mentioned five out of six. It’s gradually growing, and for us, it’s all additive. Add to that the AI clusters, which are a new opportunity, and currently we are well positioned because we have transformed into a fabless semiconductor company in some respects — with white box, with systems — and we believe we can capture that as well. That’s going to add more to the growth engine that we currently have.
And one last point, because people ask me about the competitive landscape. Cisco is in a unique position because of the different models I mentioned: under COT, the silicon is $1,000; under an ASIC model or off-the-shelf merchant silicon, it’s $2,000. If you sell silicon, of course, you cannot do it with an ASIC model or off-the-shelf merchant silicon, because the web scalers, if the chip costs $1,000 to make, expect to pay something less than $2,000 for it. But even with a white box, okay — the silicon content of a white box is about 25% to 40%. So, if you work with an ASIC model or a merchant, your penalty is so big that I doubt you’re going to win any mid-term or long-term deal based on that. And if you look at systems, again, same economics, okay? Anyone that’s going to sell systems, mid-term and long-term, is going to see their gross margin impacted. That’s why I believe we are in a unique position for the mid-term and the long term.
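[Editor's note] Eyal's margin-stacking point can be sketched with the rough numbers from this answer — $1,000 silicon cost under COT versus $2,000 as ASIC/merchant parts, and silicon at 25% to 40% of a white box's cost. All figures are illustrative examples from the discussion, not Cisco data:

```python
# Rough sketch of the margin-stacking argument from the discussion.
# Assumed illustrative numbers: silicon costs ~$1,000 under a COT
# (customer-owned tooling) model but ~$2,000 bought as ASIC/merchant
# parts, and silicon is ~25-40% of a white box's total cost.

def box_penalty(cot_silicon, merchant_silicon, silicon_share):
    """Relative cost penalty of building the same white box with
    merchant-priced silicon instead of COT-priced silicon."""
    cot_box = cot_silicon / silicon_share          # box cost with COT silicon
    non_silicon = cot_box - cot_silicon            # everything that isn't the chip
    merchant_box = non_silicon + merchant_silicon  # same box, pricier chip
    return (merchant_box - cot_box) / cot_box

for share in (0.25, 0.40):
    pct = box_penalty(1_000, 2_000, share)
    print(f"silicon at {share:.0%} of box cost -> {pct:.0%} cost penalty")
```

With these numbers, a merchant-silicon builder's box costs 25% to 40% more than the COT builder's — the penalty Eyal argues makes it hard to win mid- and long-term deals.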
Great. Eyal and Rakesh, I feel like we just started to warm up, but we ran out of time. Thank you so much for the presentation — great presentation. I learned a lot, and thanks very much for the Q&A session. And for the audience, if I didn’t answer your question, please send it to me in an e-mail. If I don’t know the answer, I’ll ask the Cisco crew to help me with the answer. Thanks so much. Have a great day.
Thank you. Appreciate the time. It was fun. Bye-bye.