GCN 1.2: Geometry Performance & Color Compression

Instruction sets aside, Radeon R9 285 is first and foremost a graphics and gaming product, so let’s talk about what GCN 1.2 brings to the table for those use cases.

Through successive generations of GPU architectures AMD has been iterating on and improving their geometry hardware, both at the base level and in the case of geometry generated through tessellation. This has alternated between widening the geometry frontends and optimizing the underlying hardware, with the most recent update coming in the GCN 1.1 based Hawaii, which increased AMD’s geometry processor count at the high end to 4 processors and implemented some buffering enhancements.

For Tonga AMD is bringing that 4-wide geometry frontend from Hawaii, which like Hawaii immediately doubles upon Tahiti’s 2-wide geometry frontend. Not stopping there however, AMD is also implementing a new round of optimizations to further improve performance. GCN 1.2’s geometry frontend includes improved vertex reuse (for better performance with small triangles) and improved work distribution between the geometry frontends to better allocate workloads between them.

At the highest level Hawaii and Tonga should be tied for geometry throughput at equivalent clockspeeds, or roughly 2x faster than Tahiti. However in practice due to these optimizations Tonga’s geometry frontend is actually faster than Hawaii’s in at least some cases, as our testing has discovered.

Comparing the R9 290 (Hawaii), R9 285 (Tonga), and R9 280 (Tahiti) in TessMark at various tessellation factors, we have found that while Tonga trails Hawaii at low tessellation factors – and oddly enough even Tahiti – at high tessellation factors the tables are turned. With x32 and x64 tessellation, the Tonga based R9 285 outperforms both cards in this raw tessellation test, and at x64 in particular completely blows away Hawaii, coming close to doubling its tessellation performance.

At the x64 tessellation factor we see the R9 285 spit out 134fps, or equivalent to roughly 1.47B polygons/second. This is as compared to 79fps (869M Polys/sec) for the R9 290, and 68fps (748M Polys/sec) for the R9 280. One of the things we noted when initially reviewing the R9 290 series was that AMD’s tessellation performance didn’t pick up much in our standard tessellation benchmark (Tessmark at x64) despite the doubling of geometry processors, and it looks like AMD has finally resolved that with GCN 1.2’s efficiency improvements. As this is a test with a ton of small triangles, it looks like we’ve hit a great case for the vertex reuse optimizations.

Meanwhile AMD’s other GCN 1.2 graphics-centric optimization comes at the opposite end of the rendering pipeline, where the ROPs and memory controllers lie. As we mentioned towards the start of this article, one of the notable changes between the R9 280 and R9 285 is that the latter utilizes a smaller 256-bit memory bus versus the R9 280’s larger 384-bit memory bus, and as a result has around 27% less memory bandwidth than the R9 280. Under most circumstances such a substantial loss in memory bandwidth would result in a significant performance hit, so for AMD to succeed Tahiti with a smaller memory bus, they needed a way to be able to offset that performance loss.

The end result is that GCN 1.2 introduces a new color compression method for its ROPs, to reduce the amount of memory bandwidth required for frame buffer operations. Color compression itself is relatively old – AMD has had color compression in some form for almost 10 years now – however GCN 1.2 iterates on this idea with a color compression method AMD is calling “lossless delta color compression.”

Since AMD is only meeting us half-way here we don’t know much more about what this does. Though the fact that they’re calling it delta compression implies that AMD has implemented a further layer of compression that works off of the changes (deltas) in frame buffers, on top of the discrete compression of the framebuffer. In this case this would not be unlike modern video compression codecs, which between keyframes will encode just the differences to reduce bandwidth requirements (though in AMD’s case in a lossless manner).

AMD’s own metrics call for a 40% gain in memory bandwidth efficiency, and if that is the average case it would more than make up for the loss of memory bandwidth from working on a narrower memory bus. We’ll see how this plays out over our individual games over the coming pages, but it’s worth noting that even our most memory bandwidth-sensitive games hold up well compared to the R9 280, never losing anywhere near the amount of performance that such a memory bandwidth reduction would imply (if they lose performance at all).

Tonga’s Microarchitecture – What We’re Calling GCN 1.2 GCN 1.2 – Image & Video Processing
Comments Locked


View All Comments

  • chizow - Thursday, September 11, 2014 - link

    No issues with Boost once you slap a third party cooler and blow away rated TDP, sure :D

    But just as I said, AMD's rated specs were bogus, in reality we see that:

    1) the 285 is actually slower than the 280 it replaces, even in highly overclocked factory configurations (original point about theoretical performance, debunked)
    2) the TDP advantages of the 285 go away, at 190W target TDP AMD trades performance for efficiency, just as I stated. Increasing performance through better cooling results in higher TDP, lower efficiency, to the point it is negligible compared to the 280.

    Its obvious AMD wanted the 285 to look good on paper, saying hey look, its only 190W TDP, when in actual shipping configurations (which also make it look better due to factory OCs and 3rd party coolers), it draws power closer to the 250W 280 and barely matches its performance levels.

    In the end one has to wonder why AMD bothered. Sure its cheaper for them to make, but this part is a downgrade for anyone who bought a Tahiti-based card in the last 3 years (yes, its nearly 3 years old already!).
  • bwat47 - Thursday, September 11, 2014 - link

    I think the main reason for this card, was to bring things up to par when it comes to features. the 280 (and 280x), were rebadged older high end cards (7950, 7970), and whole these offered great value when it came to raw performance, they are lacking features that the rest of the 200 series will support, such as freesync and trueaudio, and this feature discrepancy was confusing to customers. It makes sense to introduce a new card that brings things up to par when it comes to features. I bet they will release a 285x to replace the 280x as well.
  • chizow - Thursday, September 11, 2014 - link

    Marginal features alone aren't enough to justify this card's cost, especially in the case of FreeSync which still isn't out of proof-of-concept phase, and TrueAudio, which is unofficially vaporware status.

    If AMD released this card last year at Hawaii/Bonaire's launch at this price point OR released it nearly 12 months late now at a *REDUCED* price point, it would make more sense. But releasing it now at a significant premium (+20%, the 280 is selling for $220, the 285 for $260) compared to the nearly 3 year old ASIC it struggles to match makes no sense. There is no progress there, and I think the market will agree there.

    If it isn't obvious now that this card can't compete in the market, it will become painfully obvious when Nvidia launches their new high-end Maxwell parts as expected next month. The 980 and 970 will drive down the price on the 780, 290/X and 770, but the real 285 killer will be the 960 which will most likely be priced in this $250-300 range while offering better than 770/280X performance.
  • Alexvrb - Thursday, September 11, 2014 - link

    You just can't admit to being wrong. It maintains boost fine - end of story. That's what I disagreed with you on in the first place. No boost issues - the 290 series had thermal problems. Slapping a different cooler doesn't raise TDP, it just removes obstacles towards reaching that TDP. With an inadequate cooler, you're getting temp-throttled before you ever reach rated TDP. Ask Ryan, he'll set you straight.

    On top of this, depending on which model 285 you test, some of them eat significantly less power than an equivalent 280. Not all board partners did an equal job. Look at different reviews of different models and you'll see different results. Also, performance is better than I figured it would be, and in most cases it is slightly faster than 280. Which again, I never figured would happen and never claimed.
  • chizow - Friday, September 12, 2014 - link

    Who cares what you disagreed with? The point you took issue was a corollary to the point I was making, which turned out to be true about the theoreticals being misstated and inaccurate when basing any conclusion about the 285 being faster than the 280.

    As we have seen:

    1) The 285 barely reaches parity but in doing so, it requires a significant overclock which forces it to blow past its rated 190W TDP and actually draws closer to the 250W TDP of the 280.

    2) It requires a 3rd party cooler similar to the one that was also necessary in keeping the 290/X temps in check in order to achieve its Boost clocks.

    As for Ryan setting me straight, lmao, again, his temp tests and subtext already prove me to be correct:

    "Note that even under FurMark, our worst case (and generally unrealistic) test, the card only falls by less than 20Mhz to 900MHz sustained."

    So it does indeed throttle down to 900MHz even with the cap taken off its 190W rated TDP and a more efficient cooler. *IF* it was limited to a 190W hard TDP target *OR* it was forced to use the stock reference cooler, it is highly likely it would indeed have problems maintaining its Boost, just as I stated. AMD's reference specs trade performance for efficiency, once performance is increased that efficiency is reduced and its TDP increases.
  • Alexvrb - Friday, September 12, 2014 - link

    Look, I get it, you're an Nvidia fanboy. But at least you admitted you were wrong, in your own way, finally. It sustains boost fine. Furmark makes a lot of cards throttle - including Maxwell! Whoops! Should we start saying that Maxwell can't hold boost because it throttles in Furmark? No, because that would be idiotic. I think Maxwell is a great design.

    However, so is Tonga. Read THG's review of the 285. Not only does it slightly edge out the 280 on average performance, but it uses substantially less power. Like, 40W less. I'm not sure what's Sapphire (the card reviewed here) is doing wrong exactly - the Gigabyte Windforce OC is fairly miserly and has similar clocks.
  • chizow - Saturday, September 13, 2014 - link

    LOL Nvidia fanboy, that's rich coming from the Captain of the AMD Turd-polishing Patrol. :D

    I didn't admit I was wrong, because my statement to any non-idiot was never dependent on maintaining Boost in the first place, but I am glad to see not only is 285 generally slower than the 280 without significant overclocks, it still has trouble maintaining Boost despite higher TDP than the rated 190W and a better than reference cooler.

    You could certainly say Maxwell doesn't hold boost because it throttles in Furmark, but that would prove once and for all you really have no clue what you are talking about since every Nvidia card they introduced since they invented Boost has no problems whatsoever hitting their rated Boost speeds even with the stock reference blower designs. The difference of course, is that Nvidia takes a conservative approach to their Boost ratings that they know all their cards can hit under all conditions, unlike AMD which takes the "good luck/cherry picked" approach (see: R290/290X launch debacle).

    And finally about other reviews lol, for every review that says the 285 is better than the 280 in performance and power consumption, there is at least 1 more that echo the sentiments of this one. The 285 barely reaches parity and doesn't consume meaningfully less power in doing so. But keep polishing that turd! This is an ASIC only a true AMD fanboy could love some 3 years after the launch of the chip it is set to replace.
  • chizow - Saturday, September 13, 2014 - link

    Oh and just to prove I can admit when I am wrong, you are right, Maxwell did throttle and fail to meet its Boost speeds for Furmark, but these are clearly artificially imposed driver limitations as Maxwell showed it can easily OC to 1300MHz and beyond:


    Regardless, any comparisons of this chip to Maxwell are laughable, Maxwell introduced same performance at nearly 50% reduction in TDP or inversely, nearly double the performance at the same TDP all at a significantly reduced price point on the same process node.

    What does Tonga bring us? 95-105% of R9 280's performance at 90% TDP and 120% of the price almost 3 years later? Who would be happy with this level of progress?

    Nvidia is set to introduce their performance midrange GM104-based cards next week, do you think ANYONE is going to draw parellels between those cards and Tonga? We already know what Maxwell is capable of and it set the bar extremely high, so if GTX 970 and 980 come anywhere close to those increases in performance and efficiency, this part is going to look even worst than it does now.
  • Alexvrb - Saturday, September 13, 2014 - link

    You were wrong about it being unable to hold boost, you claimed that GCN 1.1 can't hold boost despite clear evidence to the contrary. Silly. Then you were wrong about Maxwell and Furmark - though you kind of admitted you were wrong.

    Regarding that being a "driver limitation" you can clock a GPU to the moon, and it's fine until it gets a heavy load. However most users won't even know they're being throttled. I had this same discussion YEARS ago with a Pentium 4 guy. You can overclock all you want - when you load it up heavy, it's a whole new game. In that case the user never noticed until I showed him his real clocks while running a game.

    Tonga averages a few % higher performance, dumps less heat into your case, and uses less power. Aside from this Sapphire Dual X, most 285 cards seem to use quite a bit less power, run cool and quiet. With all that being said, I think the 280 and 290 series carry a much better value in AMD's lineup. I'm certainly not a fanboy, you're much closer to claiming that title. I've actually used mostly Nvidia cards over the years. I've also used graphics from 3dfx, PowerVR, and various integrated solutions. My favorite cards over the years were a trusty Kyro II and a GeForce 3 vanilla, which was passively cooled until I got ahold of it. Ah those were the days.
  • chizow - Monday, September 15, 2014 - link

    No, I said being a GCN 1.1 part meant it was *more likely* to not meet its Boost targets, thus overstating its theoretical performance relative to the 280. And based on the GCN 1.1 parts we had already seen, this is true, it was MORE LIKELY to not hit its Boost targets due to AMD's ambiguous and non-guaranteed Boost speeds. None of this disproved my original point that the 285's theoreticals were best-case and the 280's were worst-case and as these reviews have shown, the 280 would be faster than the 285 in stock configurations. It took an overclocked part with 3rd party cooling and higher TDP (closer to the 280) for it to reach relative parity with the 280.

    Tonga BARELY uses any less power and in some cases, uses more, is on par with the part it replaces, and costs MORE than the predecessor part it replaces. What do you think would happen if Nvidia tried to do the same later this week with their new Maxwell parts? It would be a complete and utter disaster.

    Stop trying to put lipstick on a pig, if you are indeed as unbiased as you say you are you can admit there is almost no progress at all with this part and it simply isn't worth defending. Yes I favor Nvidia parts but I have used a variety in the past as well including a few highly touted ATI/AMD parts like the 9700pro and 5850. I actually favored 3dfx for a long time until they became uncompetitive and eventually bankrupt, but now I prefer Nvidia because much of their enthusiast/gamer spirit lives on and it shows in their products.

Log in

Don't have an account? Sign up now