04:54PM EDT - Hot Chips has gone virtual this year! Lots of talks on lots of products, including Tiger Lake, Xe, POWER10, Xbox Series X, TPUv3, and a special Raja Koduri Keynote. Stay tuned at AnandTech for our live blogs as we commentate on each talk. Intel recently had its own Architecture Day 2020, with Raja Koduri and other Intel specialists disclosing details about process and products. It will be interesting to see if Raja discusses anything akin to roadmaps in this keynote.

04:58PM EDT - Raja M. Koduri, Senior Vice President, Chief Architect, and General Manager of Architecture, Graphics, and Software, Intel

05:01PM EDT - The title of the talk is 'No Transistor Left Behind'. Raja has had it on a t-shirt at a number of events

05:04PM EDT - 'Raja has spent his career enhancing accelerated compute in the technology industry, across graphics, vector compute, consoles, and semi-custom designs'

05:05PM EDT - First, paying tribute to Frances Allen, who recently passed away

05:08PM EDT - The balance of software abstraction and performance hardware execution is the boundary that Frances worked on and we still work on today

05:09PM EDT - A little of 20 years, Intel senior architects (hardware and software) got together to discuss heterogeneity in Intel's roadmap and software roadmaps

05:09PM EDT - They all knew of each other, but many of them were meeting each other for the first time

05:10PM EDT - That discussion is where the phrase 'No Transistor Left Behind' comes from

05:10PM EDT - David Blythe is Xe senior architect

05:11PM EDT - The role of hardware/software in our lives

05:11PM EDT - COVID has shown how vital decades of tech progress have become

05:11PM EDT - Technology has led disruptions

05:12PM EDT - Predicting the future is tough, but we expect to see 100 billion devices - intelligent computing

05:12PM EDT - Accessing data and compute from anywhere - exascale for everyone

05:12PM EDT - like electricity

05:12PM EDT - 10x growth opportunity for the industry

05:14PM EDT - A balance of performance vs general purpose

05:14PM EDT - Leveraging data to build intelligence - data that isn't analyzed isn't useful

05:14PM EDT - We need more capacity and more bandwidth at every level

05:14PM EDT - We need bandwidth to achieve exponential growth

05:15PM EDT - Gaps between what we have for memory today for AI vs what we need

05:15PM EDT - We need superhuman-style computing

05:15PM EDT - Now Moore's Law

05:16PM EDT - People have predicted the end of Moore's Law for decades

05:16PM EDT - Moore's Law is how we've built the last two eras of computing

05:16PM EDT - It has been harder and harder to deliver the required metrics

05:16PM EDT - But it's definitely not over yet

05:17PM EDT - There is plenty of room at the top

05:17PM EDT - Software helps us to get there as much as hardware does

05:17PM EDT - Python vs AVX512

05:17PM EDT - Over 100x perf on the same CPU with software updates

05:17PM EDT - New AI workloads allow vector optimization opportunities that weren't there before

05:18PM EDT - Transistor scaling though isn't helping as much as it used to

05:18PM EDT - Whatever we call Moore's Law in the modern age, we believe transistor density can go up 50x easily

05:18PM EDT - 3x in FinFET itself

05:19PM EDT - 2x in Nanowire

05:19PM EDT - Stacked nanowires for another 3x

05:19PM EDT - This is where regular pitch scaling might stop

05:19PM EDT - then wafer-to-wafer stacking for 2x

05:19PM EDT - Then die on wafer stacking for 2x
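As a quick sanity check on the 50x figure, the per-step multipliers quoted above can be compounded. This is only a sketch: it assumes the five gains are independent and simply multiply, which the talk notes here do not confirm, and the step labels are shorthand taken from this live blog.

```python
from math import prod

# Density multipliers as quoted in the talk notes, in the order listed.
density_steps = [
    ("FinFET", 3),
    ("nanowire", 2),
    ("stacked nanowires", 3),
    ("wafer-to-wafer stacking", 2),
    ("die-on-wafer stacking", 2),
]


def compound_gain(steps):
    """Multiply the per-step gains (assumption: each step stacks on the last)."""
    return prod(factor for _, factor in steps)


total = compound_gain(density_steps)
print(f"Compound density gain: {total}x")  # prints "Compound density gain: 72x"
```

Under that assumption the steps compound to 72x, which would be consistent with the '50x easily' claim even if some steps deliver less than quoted.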

05:19PM EDT - All of this is happening in labs across the world

05:19PM EDT - The vision will play out over the next decade or more

05:20PM EDT - Heat dissipation is a challenge too

05:20PM EDT - Room at voltage, capacity scaling, new packaging, frequency scaling, new architectures

05:21PM EDT - Also packaging - the future of Foveros is hybrid bonding

05:21PM EDT - Simpler interconnects with lower capacitance and lower power

05:21PM EDT - Stacked SRAM test chip recently taped out

05:22PM EDT - Significant investment allows Intel to drastically adjust its view on next gen packaging for end-user product

05:22PM EDT - Now memory hierarchy

05:23PM EDT - (the dreaded pyramid of Optane)

05:23PM EDT - And the inverse next gen pyramid

05:23PM EDT - Need 10x improvement across the board

05:24PM EDT - Brainstormed next gen requirements with Tim Sweeney for a next gen MMO

05:24PM EDT - Support 1000s of users or more at once with Hardware and Software

05:24PM EDT - But also make it general purpose and accessible to everyone

05:25PM EDT - First, this is how hardware companies think:

05:25PM EDT - This is the concept we were thinking

05:25PM EDT - 25 cores per CPU - with density, go up 100x - 4x boards, then racks for 1 million cores

05:26PM EDT - It's all about the interconnect!
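The core-count arithmetic can be sketched as follows. This is one possible reading of the slide, not a confirmed breakdown: it assumes '100x' is a per-CPU density multiplier and '4x boards' means four CPUs per board. The function and parameter names are hypothetical.

```python
def cores_per_board(cores_per_cpu, density_gain, cpus_per_board):
    """Cores on one board after applying the assumed density multiplier."""
    return cores_per_cpu * density_gain * cpus_per_board


def boards_needed(target_cores, cores_per_cpu, density_gain, cpus_per_board):
    """Boards required to reach target_cores, rounded up."""
    per_board = cores_per_board(cores_per_cpu, density_gain, cpus_per_board)
    return -(-target_cores // per_board)  # ceiling division


# Figures from the talk notes; the interpretation of each is an assumption.
per_board = cores_per_board(25, 100, 4)
boards = boards_needed(1_000_000, 25, 100, 4)
print(per_board, boards)  # prints "10000 100"
```

Read this way, a million cores means on the order of a hundred boards spread across racks, which is why the interconnect between them becomes the deciding factor.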

05:26PM EDT - Now software

05:26PM EDT - The grumpy person reminds Raja of Jim Keller

05:27PM EDT - This contract between hardware/software is what matters

05:27PM EDT - All about ISA + OS developers

05:27PM EDT - It's all about performance and generality

05:28PM EDT - Rich software stack on x86 today

05:28PM EDT - The more abstraction, the more developers

05:28PM EDT - Abstractions are very leaky

05:29PM EDT - It's a Sisyphean effort

05:29PM EDT - What are the hardware/software contracts of the future?

05:29PM EDT - x86, Arm, RISC-V, AI, GPU, Memory, Network

05:30PM EDT - Intel is adding heterogeneity in the CPU socket

05:30PM EDT - Beware of beyond Cooper Lake

05:31PM EDT - 3-5 years to see adoption of new hetero ISA extensions

05:31PM EDT - That's a broad software ecosystem statement

05:31PM EDT - The key to this is to give developers performance at every level

05:32PM EDT - Ninja developers at the low level can offer non-linear improvements higher up the stack

05:32PM EDT - Any abstraction needs to be scalable - open and accessible to all. Have to retain productivity at all levels while also maintaining perf

05:32PM EDT - Misconception that Python isn't used for performance

05:33PM EDT - Ninja programmers are rare, but very important for performance

05:33PM EDT - Important to support ninjas

05:33PM EDT - Scaling across every product

05:33PM EDT - Level sub-zero

05:34PM EDT - OneAPI

05:34PM EDT - Still early days

05:34PM EDT - OneAPI beta available on Intel Dev Cloud

05:36PM EDT - Scale from sensors to edge to cloud

05:36PM EDT - Where we will be in 2021

05:36PM EDT - milliwatts to Megawatts

05:37PM EDT - XeHP GPU !

05:37PM EDT - 1000x in compute by 2025

05:38PM EDT - Exascale for everyone

05:38PM EDT - Now time for Q&A

05:39PM EDT - Actually a few more comments first

05:45PM EDT - More complex hardware in the future

05:45PM EDT - Now Q&A

05:48PM EDT - Q: Integration between CPU and GPU A: We've been doing it for a long time in the PC space, what hasn't been done yet is in the DC and at scale. The key is figuring out the programming model that scales - at the moment we see them as scalar/vector/matrix and it's all about combining them and building the programming model. Physical integration is also key, at high performance.

05:48PM EDT - Q: Does Intel plan to open source the Xe dGPU code, as with Gen11, or will it be a closed stack? A: We are pretty active in Linux open source. Xe drivers in Linux will be Open Source.

05:51PM EDT - Q: Does ISA matter in a future of accelerators? A: Great Question. It's the central thesis of the talk. DSA - do you need an ISA, or not? My thesis is that ISA is important for the general purpose, for the mass install base - architectural impact based on that hardware software contract. Lots of us have worked on DSAs, but when you talk generality, today, ISA still matters. If you move the contract up the stack, is that in the form of an ISA, how does it look? It's a trillion dollar question, I'm not proposing that I have an answer, but my talk is that we are working on it, and we will share what we find through OneAPI, and in some ways it's a call to action for the whole community. It has to cover the whole industry, not just one vendor or architecture.

05:53PM EDT - Q: Security HW vs SW, direction in industry vs academia? A: Great Question. If I had more time, I could have spent more of it on security and Intel's vision! It's super important. The surface area we are generating over all these layers of hierarchy - the security attack surface area is growing more than exponentially. It's scary! It's a big call to action for the community too. Security is getting harder as we move forward, not easier. The architecture opportunities and simplifications, in both HW and SW, are daunting.

05:56PM EDT - Q: ML revolution - libraries or GP stack? A: Great question. We already have special paths in TF and PyTorch - the inner loops have been phenomenally accelerated in the last few years. As we do the analysis in the workload, we are seeing the bottlenecks shifting around as we optimize. The algorithm rate of change is quite high - with the community and our customers, I have a lot of conversations about generality of future approaches. It's a whack-a-mole. Right now the need for a general software stack is potent and there are lots of discussions that are active. Is there an API that develops a better scalable contract? It's hard to tell.

06:00PM EDT - Q: Other approaches of wide purpose compute like OpenCL haven't succeeded. What makes OneAPI work? A: At one point there were abstractions of the GPU hardware, even with all the limitations of a decade ago. I don't believe OpenCL really took a step back to look at the overall compute problem. If you go back more than two decades, in the work on languages for high performance computing systems there are a lot of golden nuggets and answers that sit in those infrastructures. That's one of the things we look at. Personally I also have been a big fan of the abstraction in Apple's Grand Central Dispatch. I know Swift and the concurrency models in Swift have made some amazing progress, then there's Apache Spark too. If you look at those models in those software frameworks, there is something there for us as a hardware community to pay attention to. I won't say OpenCL is a great example for covering all forms of parallelism (dense, sparse, async, task), and memory heterogeneity is a big deal - how do we cover that? That's a harder problem in my opinion.

06:01PM EDT - Q: About mem power efficiency how do you see 10x required BW/power scaling? A: The opportunity I see is that we have to get compute closer to memory (or memory closer to compute). As I alluded to, we're doing some interesting things with new products, like Rambo cache which we've announced. If you look at both capacity and latency at every level, you do see that 10x opportunity. 10x doesn't seem too hard, but memory is hard, because memory isn't just a hardware/tech problem, it's also a big business problem!

06:04PM EDT - That's the end of the Q&A. Now onto the third session. First up, a RISC-V talk

19 Comments

  • LordSojar - Monday, August 17, 2020 - link

    Oh goody, we can watch him drive Intel's GPU dreams into the ground, and probably whatever last shred of hope they have for their failing business. When your company is run by MBAs and the guy who brought AMD 8 years of GPU drought, one can expect nothing less.

    I normally would be excited, but anything Intel says at this point is vaporware until it actually shows up, 2-3 years late and 30% too expensive.
  • JayNor - Monday, August 17, 2020 - link

    I believe David Blythe is credited as chief architect for Xe GPUs.
  • Dehjomz - Monday, August 17, 2020 - link

    Dont be such a Debbie downer. Intel innovation is back. So get used to it. The next few years will be exciting. I’m looking forward to seeing what AMD, Arm, Intel, and NVIDIA will do.
  • FreckledTrout - Monday, August 17, 2020 - link

    "Intel innovation is back." Lets not go that far either. Its coming back but it inst back not until they fix the manufacturing side of the house.
  • albertmamama - Monday, August 17, 2020 - link

    Don't count them out yet, but time is running out.
  • Smell This - Wednesday, August 19, 2020 - link


    Smells like the same ol' HD Graphics to me . . . they have tried to 'pretty it up' and added the double ring bus.

    It is still a 10+ year design with roots going all the way back to GMA and the i740 discreet!
  • WaltC - Tuesday, August 18, 2020 - link

    Interesting how everything is "the future" as opposed to "now, now, now"...;) Seems to me Raja's had plenty of time and resources to market something now. Talk about a "presentation of Xe that wasn't"...;) In those last two images of him, Raja seems a tad disgruntled. Frying pan into the fire, eh, Raja?
  • FreckledTrout - Monday, August 17, 2020 - link

    The saying 'No Transistor Left Behind' is taken from no child left behind. The irony is that the no child left behind act made so that the said child had to pass a standardized test or they would be um well left behind. I find the irony of this statement and Intels manufacturing woes to be pretty damn funny.
  • frbeckenbauer - Monday, August 17, 2020 - link

    No X left behind is much older than the No child left behind act
  • tipoo - Monday, August 17, 2020 - link

    No child left behind is based off much older phrases, so I doubt that. No man left behind etc, are all much longer standing than that program.