PCIe 4.0

As the first commerical x86 server CPU supporting PCIe 4.0, the I/O capabilities of second generation EPYC servers are top of the class. One PCIe 4.0 x16 offers up to 32 GB/s in both direction, so each socket offers up to 256 GB/s in both directions, for a full 128 PCIe 4.0 lanes per CPU. 

Each CPU has 8 x16 PCIe 4.0  links available which can be split up among up to 8 devices per PCIe root, as shown above. There is also full PCIe peer-to-peer support both within a single socket and across sockets.

With the previous generation, in order to enable a dual socket configuration, 64 PCIe lanes from each CPU were used to link them together. For EPYC, AMD still allows for 64 PCIe lanes to be used, but these are PCIe 4.0 lanes now. There is also another feature that AMD has here - socket-to-socket IF link bandwidth management - which allows OEM partners to design dual-socket systems with less socket-to-socket bandwidth and more PCIe lanes if needed. 

We also learned that there are in fact 129 PCIe 4.0 lanes on each CPU. On each CPU there is one extra PCIe lane for the BMC (the chip that controls the server). Considering we are living in the age of AI acceleration, the EPYC 7002 servers will be great as hosts for quite a few GPUs or TPUs. Density has never looked so fun.

Zen 2 and Rome: SMILE For Performance The BIG LIST of Rome CPUs: Core Counts and Frequencies
Comments Locked

180 Comments

View All Comments

  • wrkingclass_hero - Sunday, August 11, 2019 - link

    What does AMD have to do to get a Gold or Platinum recommendation?
  • oRAirwolf - Thursday, August 15, 2019 - link

    This is a good question
  • imaskar - Sunday, August 11, 2019 - link

    Single thread performance is very important for those who lives in cloud. A quick example: suppose I provision 2 core/4gig vm (this is of course hyperthreads). And on AWS I have a choice - m5 and m5a, where AMD is cheaper. What do I sacrifice? Not really throughput, because you don't run your prod workloads at 100% CPU. But there is the latency. If those cores clocked lower, I would get the same amount of responses, but slower. And since in microservice world you have a chain of calls, you get this decrease 10 times. Is it worth it?
    That was the case for 1st gen EPYC. Would 2nd gen have latency parity?
  • notashill - Sunday, August 11, 2019 - link

    It's hard to say until the cloud instances actually launch.

    The current m5a instances are using a custom SKU which is clocked at 2.5GHz max boost.

    Rome's IPC is ~15% higher and clock speeds are all around higher so single threaded performance should be quite a bit better, but ultimately the exact numbers will depend on which SKUs the cloud vendors decide to use and how high they clock.
  • duploxxx - Tuesday, August 13, 2019 - link

    did you actually ever work with hypervisors?

    there are other things than raw clock speed.... its all about scheduling and when there are more cores / socket available the scheduling is more relaxed, less ready time..... EPYC generation 1 is already awesome for hypervisor way better choice than most Intel counter parts for sure if you look at socket cost... but than again I am probably talking to a typical retard ****
  • JoeBraga - Wednesday, August 14, 2019 - link

    Can you Explain better? But the license isn't bought by the quantity of coresor Per socket?
  • imaskar - Wednesday, August 14, 2019 - link

    He probably talks about VmWare, which is licensed per socket, not per core. So with EPYC gen2 you need twice less licenses for the same cloud capacity (assuming cores are equal).
  • JoeBraga - Wednesday, August 14, 2019 - link

    Now I understood
  • imaskar - Wednesday, August 14, 2019 - link

    Rather than calling others retards, you could first dig a little deeper into an issue. No, I don't work with hypervisors directly, I'm from the other side. I write software and I want good latency (not insane one like for HFT, but still a good one). Because for throughput we could just spin one more instance. You can't buy latency horizontally.
    I'm not taking numbers out of the blue. There is a benchmark for AMD instances vs Intel instances on AWS. I'm not sure if we are allowed to post links to other resources here. Put this string into Google and you will surely find it: "A Look At The AMD EPYC Performance On The Amazon EC2 Cloud". Despite this article being very enthusiastic about those instances, you can really see that per core performance on Intel is better, meaning better latencies for web apps.
    I will probably write my own set of benchmarks, because that one seems to completely ignore web servers. I am very enthusiastic about AMD instances, but they are definitely not a no-brainer.
  • quadibloc - Tuesday, August 13, 2019 - link

    The new Ryzen chips compete well with what Intel is currently producing. But while they doubled AVX 2 support, so as to match what Intel has, Ice Lake will double that - as has been known for some time. So if this is what AMD thought would be competitive with Ice Lake, as Forrest Norrod said, AMD was not trying hard enough - and they're just lucky Ice Lake was late. AMD's position relative to Intel with its previous generations of Ryzens seems to be the limit of their ambitions. Combine that with Intel reacting to its current issues, and it looks to me that AMD will have to rethink some aspects of its strategy to avoid Intel being ahead when it comes time for next year's chips from both companies.

Log in

Don't have an account? Sign up now