11 Comments

The role of Grace is worth watching, indeed. How is its memory going to be used? It is much cheaper than HBM per GB, so it makes sense for Nvidia to load it up, but it will be interesting to see whether AI workloads get creative with this slower but local tier. Or is it just glorified IO buffer space, plus room for Grace to run system management code and accumulate metrics?

But for sure, there don't seem to be any "classic" CPUs or servers in the GTC rendering of the 32,000 GPU cluster. Maybe a few hidden in some of the racks.

Thanks, great article. Any idea re ARM's royalty per Grace CPU? Tae mentioned $100+ per Cobalt 100 CPU but that's a 128-core and MSFT is buying a subsystem from ARM (2x royalty rate). Assume 74-core CPU that is not a subsystem would attract a lower royalty? Appreciate Nvidia is taking V2 cores vs. N2 for MSFT. Would still think MSFT pays more on a per core basis vs. Nvidia. Any pushback?

Very interesting question. The chiplet MSFT is buying (CSS N2) has higher dollar content for ARM, but I don't think it counts as royalty. I believe ARM pays TSMC to fab the chiplet, as in a normal chip-sale arrangement (not sure on this).

V2 has much higher die area and "value/quality" compared to N2. It could go either way honestly. The CSS chiplet commands a lot more value even though the N2 CPU cores themselves are less valuable than V2.

One idea I want to bring up is incentives. Microsoft/Cobalt and Amazon/Graviton have incentives that are opposed to ARM (LTD). They want to make a cheap chip for internal-use only and thus would like the fake/theoretical ASP to be low. Nvidia is selling Grace horizontally and has incentive to sell it at the highest price possible, aligning them with ARM (LTD). ASP for Grace has hard data. Cobalt and Graviton ASP is some made-up number from imbalanced negotiations.

Thanks, yes I missed your last point and that's definitely valid. Was wondering how much Grace would cost (guess HSD $k at most?). Unclear if CPU pricing + V2 will outweigh lower core count and Nvidia not buying CSS from ARM's perspective ($ per CPU). But it's probably closer than I'd originally thought.

I usually try to calculate using gross-margin and die size. (public) https://twitter.com/Locuza_/status/1663217786812878848?lang=en

It seems to be near the reticle limit with no extra CPU cores for yield. Some rough estimates... I get ~474 USD/die and let's bias that to say 500 USD for safety. Package, LPDDR, and PCB are not counted for ARM's cut. Safe to assume Nvidia is targeting 70-75% gross margins, so slightly lower than GPU GM. $1.6K to $2K Grace ASP at $500 COGS from TSMC?
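The margin arithmetic above can be sketched in a few lines of Python. This is only an illustration of the back-of-envelope step, assuming ASP = COGS / (1 - gross margin) and using the $500 die cost and 70-75% margin range from the comment; the function name `asp_from_cogs` is made up for this example.

```python
def asp_from_cogs(cogs_usd: float, gross_margin: float) -> float:
    """Selling price implied by a unit cost and a target gross margin.

    Gross margin is (ASP - COGS) / ASP, so ASP = COGS / (1 - margin).
    """
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return cogs_usd / (1.0 - gross_margin)


# Inputs taken from the comment: ~$500 die cost, 70-75% gross margin.
# Package, LPDDR, and PCB are deliberately left out, as in the comment.
DIE_COGS_USD = 500.0

for margin in (0.70, 0.75):
    asp = asp_from_cogs(DIE_COGS_USD, margin)
    print(f"GM {margin:.0%}: implied die-only ASP of about ${asp:,.0f}")
```

At 70% the implied price lands a little under $1.7K and at 75% it is $2K, which is where the range quoted above comes from.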

Very interesting thanks. Can't argue with the logic, assuming you have the right cost/mm2 at N4P for Nvidia. My guess of HSD $k at best was based on datacentre CPUs mentioned in Nvidia's benchmarking exercises in their white papers e.g. Milan 7763 (EPYC), 2 socket Xeon Platinum 8480+ or 2 socket EPYC 9654 (prices quoted online could be too high, not sure how much scaled buyers pay!). Slide 2 here https://resources.nvidia.com/en-us-grace-cpu/data-center-datasheet

Yea enterprise volume pricing is varied and difficult to pin down. Public leaks of wafer prices + quarterly GM + public die-shots are how I guesstimate.

I used public TSMC wafer pricing rumors + an online yield calculator. Can redo the exercise with AMD EPYC 7763. Intel Xeon 8480 (Sapphire Rapids) has a public die size but nothing on wafers. Chiplets make it take a bit longer and I'd probably need to make a diagram to explain it well. Can make a simple estimation of the Intel wafer cost and cite public sources as references.

This is a fun idea. Will look into it and write something up. Rather busy ATM. Appreciate the interesting discussion.
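For readers who want to try the wafer-price-plus-yield exercise described above, here is a minimal Python sketch. It is not the calculator used in this thread: it assumes a common gross-dies-per-wafer approximation and a simple Poisson yield model, and every input value (wafer price, die area, defect density) is a placeholder chosen for illustration, not a real TSMC or Nvidia figure.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Approximate whole dies per wafer, with a correction for edge loss."""
    radius = wafer_diameter_mm / 2.0
    wafer_area_term = math.pi * radius ** 2 / die_area_mm2
    edge_loss_term = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return math.floor(wafer_area_term - edge_loss_term)


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under a Poisson defect model."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-die_area_cm2 * defects_per_cm2)


def cost_per_good_die(wafer_price_usd: float, die_area_mm2: float,
                      defects_per_cm2: float) -> float:
    """Wafer price spread over the expected number of good dies."""
    good_dies = gross_dies_per_wafer(die_area_mm2) * poisson_yield(
        die_area_mm2, defects_per_cm2)
    return wafer_price_usd / good_dies


# Placeholder inputs only; swap in your own wafer price, die size, and
# defect density from whatever public sources you trust.
example = cost_per_good_die(wafer_price_usd=10_000.0,
                            die_area_mm2=100.0,
                            defects_per_cm2=0.1)
print(f"Cost per good die with placeholder inputs: ${example:.2f}")
```

Larger dies are hit twice by this model: fewer candidates fit on the wafer, and each one is more likely to contain a defect, which is why a near-reticle-limit die without spare cores is expensive.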

Thanks for your insights, looking forward to the article

Yeah, pretty much agree.

What metrics did you use to rank CPUs? Qualcomm Oryon isn’t even out yet

Overall quality and relative market success. At best, Oryon gains 1-2% share in 2024 and moves to 3-5% in 2025. This is a 300M unit/year market with ~$125 ASP. Die size implies much better margins compared to the QCT average, but it won't really move the needle.

AMD Strix is expected to be very good based on the leaks.

Also, the Qualcomm event last year was using Linux benchmarks in the fine print.
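To put the share figures from this reply into dollars, here is a quick Python sketch. It only multiplies out the numbers quoted above (300M units/year, ~$125 ASP, 1-5% share) and assumes the new entrant sells at the market-average ASP, which is a simplification; the helper name `share_revenue_usd` is invented for this example.

```python
MARKET_UNITS_PER_YEAR = 300_000_000  # from the comment: ~300M units/year
MARKET_ASP_USD = 125.0               # from the comment: ~$125 ASP


def share_revenue_usd(share: float) -> float:
    """Annual revenue for a given unit share, assuming market-average ASP."""
    return MARKET_UNITS_PER_YEAR * MARKET_ASP_USD * share


# Share scenarios quoted in the comment for 2024 and 2025.
for label, low, high in (("2024", 0.01, 0.02), ("2025", 0.03, 0.05)):
    low_rev = share_revenue_usd(low) / 1e9
    high_rev = share_revenue_usd(high) / 1e9
    print(f"{label}: {low:.0%}-{high:.0%} share is roughly "
          f"${low_rev:.2f}B to ${high_rev:.2f}B per year")
```

Even the top of the 2025 range stays under $2B a year on these inputs, which is the sense in which it "won't really move the needle".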
