Intel's Core i9 and Skylake-X parts deliver up to 18 cores on the desktop

Intel has been accused of incrementalism in the CPU world: the idea that the company is holding back in order to increase profits and stretch out upgrades. Whether that's true is sort of irrelevant, though if it is, I can only wonder what would have happened to AMD in the past decade. The harsh reality of microprocessor design is that it becomes increasingly difficult to improve performance with each succeeding generation. I discussed some of the reasons behind this in our Processors 101 overview of pipelining and superscalar architectures. Today, Intel takes the shrink-wrap off its latest and greatest collection of CPUs, the Skylake-X and Kaby Lake-X line of enthusiast processors, all of which will run on the Basin Falls platform, which includes the X299 chipset and socket LGA2066.

Since the Extreme Edition of the Pentium 4 first showed up in 2003 (aka the Extremely Expensive P4), there have only been a few instances where Intel's best was topped by AMD. The 2003-2006 era was the last time AMD held the pole position. The launch of the Core 2 back in 2006 put Intel back on top, and it has stayed there—that's eleven years now, though I still have an old Core 2 CPU kicking around the house! After the inauguration of the Core era of Intel CPUs, the best AMD has been able to do is to win in a few specific benchmarks. But all that changed with Ryzen earlier this year.

AMD's Ryzen can offer a legitimate performance alternative to Intel's Core i5 and Core i7. No longer are we talking about the value prospect or why the FX series isn't that bad. Ryzen is actually good—and AMD has plans for Ryzen to scale all the way to 32-core/64-thread per socket on its Epyc server parts. Dropping down a notch, Ryzen Threadripper will deliver up to 16-core/32-thread on AMD's new HEDT (High-End Desktop) platform, X399. That's a true threat to Intel's existing X99 products, and it might have even surpassed Skylake-X as originally planned. The result is that Intel is ready to pull out its big guns, with new Core i9 branding.

I was briefed on Skylake-X and Kaby Lake-X last week, as a preview of what Intel is showing at Computex. There have also been leaks aplenty going around, with the Core i9 name leaking weeks ahead of the official reveal. But what Intel didn't let out, or perhaps hadn't quite planned as far in advance, is that Core i9 is going to be much more than just a couple of extra cores relative to the existing Broadwell-E i7-6950X. The leaks mentioned up to 12-core parts, but that wasn't the whole lineup: there will be Core i9 parts with up to 18 cores/36 threads. Yowza!

Intel claims to have held these details close to its chest to prevent additional leaks…but I'm inclined to think it has more to do with AMD's reveal of the 16-core/32-thread Ryzen Threadripper brand. Intel initially showed us slides that topped out with 12-core processors, even though the presenters knew that up to 18-core would be coming—the slides just "weren't ready in time," suggesting some changes late in the game. A few days later, we received an updated slide deck…

…but even now, Intel hasn't fully detailed the specs for the Skylake-X parts. We know pricing, we know core counts and cache sizes, but clock speeds are still a bit fuzzy. Again, this suggests some relatively recent changes to the product stack to keep Intel ahead of AMD on all fronts.

As with the previous LGA2011-based Core i7 processors, all the new i5, i7, and i9 parts will be fully multiplier unlocked. That means we'll be able to try to push clockspeeds higher than the official spec, but I suspect we'll see maximum all-core clocks on the 18-core part drop quite a bit from what we'll achieve with the 8/10/12-core parts. To offset this, Intel has Turbo Boost Max 3.0, which has received some improvements since the Broadwell-E launch.

Turbo Max 3.0 initially allowed the system to determine which CPU core was 'best' (meaning, able to hit the highest stable clockspeed with the lowest power draw). Working with a CPU driver, under single-threaded workloads, the CPU could then exceed the normal Turbo Boost 2.0 frequency on that core, with up to 4.0GHz available on the i7-6950X and i7-6900K. In practice, it was a lot more problematic, as you needed to enable the feature in your motherboard BIOS (where it wasn't always fully supported), and then you needed to install and run Intel's custom CPU driver on Windows. And since it was only for a single core, there were plenty of situations where it didn't do much at all.

The full 18-core Skylake-X dieshot.

With Skylake-X (note that the feature isn't available on Kaby Lake-X or the 6-core parts), Turbo Max 3.0 support is ready in Windows 10, and the motherboards have all been built around the feature, so it should just work. Perhaps more importantly, Intel has improved Turbo Max 3.0 to support maximum clocks on up to two CPU cores, which increases the number of applications that can see the added clockspeed. But the real kicker is likely to be the implementation on the CPUs with 10 cores or more.

The 10-core i9-7900X has a base clock of 3.3GHz, Turbo Boost of 4.3GHz, and Turbo Max of 4.5GHz. We don't have official clocks for the 12/14/16/18-core parts, but I wouldn't be surprised if Intel has a Turbo Max clock of 4.5GHz across the range. That would be excellent news, because if I'm right the 18-core part will have significantly lower clockspeeds when all/most of the CPU cores are loaded—even 4.0GHz might be asking a lot. This will allow potential buyers to pick a CPU based on core counts rather than maximum clockspeed.

Related to this is that fine-tuning of clockspeeds is one of the advantages that Intel maintains over AMD's Ryzen. You can set the maximum clockspeed based on how many cores are loaded with Intel's CPUs, while on Ryzen, if you overclock any of the cores, you basically overclock all of the cores—and to the same level. It's only at stock clocks that Ryzen varies the CPU clock based on how many cores are active.

Intel has offered varying turbo states for years, and that's not changing with Skylake-X. You can still target higher clocks with fewer active cores, and ideally that means we'll see similar 2/4/6-core maximum stable overclocks (I'm hoping for close to 5GHz with two cores), and then as the number of active cores increases we can dial in lower clocks to keep power and thermals in check. Intel also has an AVX multiplier offset, as AVX workloads can really push the CPU power and thermal load to the limit.
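To make the idea of per-core-count turbo states concrete, here's a rough sketch of how such a table might be consulted. The multipliers, the tier boundaries, and the AVX offset below are all made-up illustrative values (not Intel specifications), and `target_clock_mhz` is a hypothetical helper, not a real tuning API.

```python
# Hypothetical turbo table: "up to N active cores" -> multiplier of a 100MHz base clock.
# All values are illustrative assumptions, not Intel specifications.
BCLK_MHZ = 100
TURBO_RATIOS = {2: 48, 4: 46, 6: 45, 10: 43}
AVX_OFFSET = 3  # assumed number of multiplier bins dropped under AVX load


def target_clock_mhz(active_cores, avx=False):
    """Return the target clock in MHz for a given number of active cores."""
    # Pick the first tier whose core limit covers the active core count.
    for limit in sorted(TURBO_RATIOS):
        if active_cores <= limit:
            ratio = TURBO_RATIOS[limit]
            break
    else:
        # More cores than the largest tier: fall back to the lowest ratio.
        ratio = TURBO_RATIOS[max(TURBO_RATIOS)]
    if avx:
        ratio -= AVX_OFFSET
    return ratio * BCLK_MHZ


print(target_clock_mhz(2))             # 4800
print(target_clock_mhz(10, avx=True))  # 4000
```

With these assumed numbers, a lightly threaded load runs at 4.8GHz while an all-core AVX load backs off to 4.0GHz, which is the kind of trade-off that keeps power and thermals in check.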

Something to note is that while Skylake-X and LGA2066 (aka socket R4) will be compatible with existing LGA2011 heatsinks, Intel is changing its base recommendation for CPU cooling. Previous enthusiast CPUs have included boxed air coolers, but for Skylake-X Intel will have a liquid cooling solution. That doesn't mean you can't run Skylake-X with air cooling, though as in the past you'll want to be careful with overclocking as the cooler needs to be able to dissipate a lot of heat.

How much heat? LGA2011-v3 CPUs had an official maximum TDP of 140W, and the 6/8/10-core Skylake-X parts have that same 140W TDP. Kaby Lake-X 4-core parts drop the TDP to 112W, but the 12-core model and above increase the TDP to 165W. And that's at stock clocks, as overclocking could easily reach CPU loads of 250W or more. Basically, overclocking will often hit thermal/power limits rather than silicon limits when you're running this many cores, so better cooling can have a tangible impact on clockspeed.

Intel claims the reason it will offer an 18-core desktop part is that the 10-core Broadwell-E was received "really well" in the enthusiast community. I'm not sure I buy that, at all, considering the Steam Hardware Survey shows a market share of 0.01 percent for 10-core systems (and only 0.26 percent for 8-core). There's still the trouble of figuring out what the heck you're going to do that actually needs 10-core or 12-core, let alone 18-core. 4K video editing is an obvious use case, but even that can be done with an 8-core chip (at about half the speed). Content creation in general, including software development and 3D modeling, can leverage the power of additional cores, and VR in the future could put the extra power to use. But right now? 18-core is way more than most of us actually need.

AMD is facing a bit of the same problem with the Ryzen 7 parts—what do we do with 8-core/16-thread as gamers? Outside of content creation, the best gaming use is going to be multitasking. Now you can run a full-quality 4K video encode in the background, using half of the available CPU cores, and you'll still have four to eight cores available to run whatever game you're playing. Twitch streaming with high quality CPU-based video encoding is sort of like this (though the 'high quality' aspect is debatable). But the reality is that most use cases for many-core processors end up being professional applications.

What's clear is that Intel wants to ensure its enthusiast brand CPUs stay at the top of the pecking order, and the company is willing to shave some cash off its margins to do so. This is what competition is all about.

Intel's enthusiast platform has been coasting for a while now—Gulftown had 6-core/12-thread way back in 2010, and Sandy Bridge-E in 2011 followed by Ivy Bridge-E in 2013 both kept the enthusiast parts (the X79 platform) at a maximum of 6-core/12-thread. Haswell-E in 2014 finally brought out an 8-core part on the X99 platform, and Broadwell-E in 2016 increased that to 10-core. In seven years, Intel has increased 'enthusiast' core counts by 67 percent.

While that's duly impressive, Intel's Xeon server/workstation parts have offered even more cores during the same timeframe. SNB-EP had 8-core (or up to 10-core with Westmere-EX), IVB-EX had 15-core, HSW-EP/EX offered up to 18-core, and finally BDW-EX delivered up to 24-core. In other words, at every single level, Intel has been holding back on the consumer parts (while charging huge premiums on the Xeon server parts). And granted, for many workloads—gaming in particular—there isn't much need to go beyond 8-core (or even 4-core). But AMD's Ryzen Threadripper will have 16 cores, so Intel sees AMD's bet and raises by two. Instead of a 20 percent generational improvement in core counts, we're getting 80 percent more cores.
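For anyone who wants to check the percentages, the arithmetic is simple enough to sketch. The helper name `pct_increase` is my own; the core counts are the ones quoted above (6 to 10 over seven years, then 10 to 12 as the "expected" step versus 10 to 18 as the actual one).

```python
def pct_increase(old, new):
    """Percentage increase from old to new, rounded to the nearest whole percent."""
    return round((new - old) * 100 / old)


print(pct_increase(6, 10))   # 67: Gulftown (6-core) to Broadwell-E (10-core)
print(pct_increase(10, 12))  # 20: the step the early leaks suggested
print(pct_increase(10, 18))  # 80: Broadwell-E (10-core) to the 18-core Core i9
```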

Pricing is obviously a strong deterrent on most of these parts, but competition helps a lot there as well. When Broadwell-E launched, Intel had no competition at all beyond Core i5, and it could basically do whatever it wanted. The result was that the 8-core i7-5960X that cost $1000 was superseded by the 8-core i7-6900K, which performed a bit better but also had a higher $1089 MSRP. But the real kick in the teeth was the i7-6950X priced at $1723—it was as expensive as competing Xeon CPUs.

Skylake-X and Kaby Lake-X are still high-end processors, but the 10-core i9-7900X part is at least priced roughly the same as the previous generation 8-core parts. Having the 8-core i7-7820X for $599 is also nice, though it's still nearly twice the price of AMD's Ryzen 7 1700. Note also that only the Core i9 parts (i9-7900X and above) will support 44 PCIe Gen3 lanes from the CPU, with the Core i7 SKL-X parts providing 'only' 28 lanes. The 4-core Kaby Lake-X parts take that a step further, with only 16 PCIe lanes and a dual-channel memory interface.

The i5-7640X does offer the interesting prospect of buying into Intel's X299 platform with a CPU that's no more expensive than an i5-7600K, and then you can upgrade in the future to a bunch more cores if you need it. You also lose out on the integrated graphics aspect of the Skylake/Kaby Lake processors, which means no Quick Sync, but I suspect most PC enthusiasts will be happy to make that trade, especially since ditching the graphics cores paves the way for some other changes.

At an architectural level, Skylake-X/Kaby Lake-X should be roughly the same as the mainstream parts that are already on the market. Just to recap, one of the big changes going from Haswell/Broadwell (4th/5th Gen Core) to Skylake/Kaby Lake (6th/7th Gen Core) was a significant overhaul of the CPU pipeline and dispatch elements. All of Intel's CPUs from Nehalem (the original Core i3/i5/i7 architecture) through Haswell used a 4-wide scheduling pipeline. That means they could fetch, decode, and dispatch up to four instructions per clock cycle. Skylake increased this to a 6-wide architecture, adding more execution units (ALUs) in the process.

Skylake-X inherits that same wider architecture, which on the LGA115x parts allowed Skylake to beat Haswell by around 15-20 percent at the same clockspeed. But with no graphics present on Skylake-X, Intel has room to add in other features, and it has decided the best approach is to quadruple the L2 cache from 256kB in Skylake to 1024kB in Skylake-X. At the same time, L3 cache sizes have been decreased, but the L3 cache is now a non-inclusive cache. That means the L3 cache can be devoted completely to data that gets evicted from the L2 cache.
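One way to see why the smaller L3 isn't necessarily a step backward is to count how much unique data the L2 and L3 caches can hold together. The sketch below is a simplification: it assumes an inclusive L3 duplicates everything in L2, while a non-inclusive L3 can (at best) hold entirely different data. The L3 sizes are not from this article; I'm assuming the published figures of 25MB for the i7-6950X and 13.75MB for the i9-7900X, and the function name is hypothetical.

```python
def max_unique_cache_kb(cores, l2_kb_per_core, l3_kb, inclusive):
    """Upper bound on unique data held across L2 + L3, in kB (simplified model)."""
    l2_total = cores * l2_kb_per_core
    if inclusive:
        # Every L2 line is also present in L3, so L2 adds no unique capacity.
        return l3_kb
    # Non-inclusive: in the best case, L2 and L3 hold disjoint data.
    return l2_total + l3_kb


# Assumed L3 sizes from public spec sheets (not stated in the text above).
broadwell_e = max_unique_cache_kb(10, 256, 25 * 1024, inclusive=True)
skylake_x = max_unique_cache_kb(10, 1024, 14080, inclusive=False)  # 13.75MB L3

print(broadwell_e / 1024)  # 25.0
print(skylake_x / 1024)    # 23.75
```

Under those assumptions the two 10-core chips end up with roughly similar total capacity, but Skylake-X keeps far more of it in the faster per-core L2.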

Intel says this rebalancing of the cache hierarchy improves overall performance, but it's difficult to say how much it will change things. Larger L2 cache sizes could mean higher latencies for the L2 cache, or they may not. Intel has traditionally been aggressive on the cache side of things, and with a wider architecture than earlier cores, the larger L2 cache could yield a significant performance boost, perhaps another 5-10 percent faster than a 256kB L2 implementation.

All of the changes to the CPU side of things are great to see, but the chipset is also getting some new features. Specifically, the X299 chipset will have an additional 24 PCIe Gen3 lanes—though like Z170/Z270, the DMI 3.0 link to the CPU is equal to four PCIe lanes, so anything running off the chipset will have to share that bandwidth. The socket also gets a change, moving from 2011 pins to 2066 pins, with an official name of socket R4, though LGA2066 will likely be the more common name.
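To put the shared DMI link in perspective, here's a back-of-the-envelope bandwidth calculation. It assumes standard PCIe Gen3 signaling (8 GT/s per lane with 128b/130b encoding) and ignores protocol overhead, so real-world throughput will be lower. The constant and variable names are mine.

```python
# PCIe Gen3: 8 gigatransfers/s per lane, 128b/130b encoding, 8 bits per byte.
GEN3_GTPS = 8
ENCODING = 128 / 130

per_lane_gbps = GEN3_GTPS * ENCODING / 8   # GB/s per lane, one direction
dmi_gbps = 4 * per_lane_gbps               # DMI 3.0 is equivalent to a x4 link
chipset_gbps = 24 * per_lane_gbps          # if all 24 chipset lanes ran flat out

print(round(dmi_gbps, 2))      # 3.94
print(round(chipset_gbps, 2))  # 23.63
print(24 // 4)                 # 6 (chipset lanes per DMI lane)
```

In other words, the chipset lanes are oversubscribed 6:1 against the uplink, which is fine for a mix of USB, SATA, and networking but means a pair of fast NVMe drives on the chipset could saturate the link.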

DDR4 memory support continues to be the standard, which is the same as the X99 platform. The good news is that officially Intel supports up to DDR4-2666, but even Haswell-E systems routinely hit memory speeds well in excess of DDR4-3200, so unofficially memory speeds probably won't change as much. DDR4 pricing is also much better than when Haswell-E launched back in 2014, where a quad-channel 16GB kit of DDR4-2133 cost around $250, and enthusiast grade DDR4-3200 kits could easily land closer to $1000. DDR4 prices have risen since last summer (thanks to increased use by mainstream platforms, along with smartphones and tablets), but at least 4x4GB kits of DDR4-3200 can now be found for under $150.
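For a sense of what the memory speeds mean in practice, theoretical peak bandwidth is just transfers per second times 8 bytes per 64-bit channel times the number of channels. This sketch uses the nominal DDR4 ratings and a helper name of my own; sustained bandwidth in real applications will be noticeably lower.

```python
def peak_bandwidth_gbs(mt_per_s, channels):
    """Theoretical peak memory bandwidth in GB/s for 64-bit (8-byte) channels."""
    return round(mt_per_s * 8 * channels / 1000, 1)


print(peak_bandwidth_gbs(2133, 4))  # 68.3: quad-channel DDR4-2133
print(peak_bandwidth_gbs(2666, 4))  # 85.3: quad-channel DDR4-2666
print(peak_bandwidth_gbs(3200, 4))  # 102.4: quad-channel DDR4-3200
print(peak_bandwidth_gbs(2666, 2))  # 42.7: dual-channel DDR4-2666
```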

Another platform change is that X299 will support Intel's Optane Memory initiative, which leverages Intel's high performance, low latency 3D XPoint Technology to provide a fast SSD cache for your hard drive. I reviewed the tech for Maximum PC magazine last month, but came away wanting something more. For the price of a 32GB Optane module and 1TB hard drive, you can instead just get a 512GB-class traditional SSD and get what I feel is a better overall experience.

Optane Memory still has potential, but I'd be more interested in a 64GB cache that works across all hard drives rather than only caching the C: drive. Plus, the first time you access a program (or access a program after not using it for a while), you're back to HDD performance. Basically, I want large Optane SSDs for consumers, not small Optane caches.

Putting it all together

Ultimately, Skylake-X, Kaby Lake-X, and the LGA2066 platform are about performance. Intel now has to compete against a revived AMD, thanks to Ryzen 7 and Ryzen Threadripper, which means we'll actually see somewhat affordable pricing—key word: somewhat.

I'm far less interested in Kaby Lake-X than in Skylake-X, which means the entry point for the new platform starts at $389 for the i7-7800X. That's a 6-core/12-thread part clocked at 3.5-4.0GHz. Compared to the previous generation i7-6800K, which is basically what it replaces, the i7-7800X increases clockspeeds by up to 11 percent. Conservatively, I expect the microarchitecture updates and larger L2 cache to improve performance by another 15-20 percent, which means generationally the i7-7800X should be 25-35 percent faster, and another 5 percent faster on top of that relative to the earlier i7-5820K. If you're sitting on a Haswell-E system, or an older IVB-E/SNB-E build, that should be enough to catch your interest.

The next step up is the i7-7820X, an 8-core/16-thread part clocked at 3.6-4.3GHz (Turbo Max 3.0 allows 4.5GHz). Relative to the i7-6900K, clockspeed is up to 16 percent higher, and again architectural improvements mean overall it should be more like 35-40 percent faster. For the slightly older i7-5960X, add on another five percent or so. Even better, where the previous 8-core parts were priced at $1000 or more, the i7-7820X will 'only' cost $599. Thanks, AMD! Note that AMD's Ryzen 7 chips offer the same core counts, though architectural differences mean that clock for clock Ryzen is still generally a bit slower, and overclocking may favor Intel's parts by 800MHz or more.

Last on the parts where we have specs right now, the i9-7900X is the first of the Core i9 family, with 10-core/20-thread running at 3.3-4.3GHz (4.5GHz TM3). Relative to the i7-6950X, Intel has boosted clocks by up to 23 percent, so potentially we may see performance up to 50 percent faster. Occupying the $999 price point, this is clearly a CPU destined for extreme builds…but if you're willing to consider the 7900X, the other Core i9 parts certainly warrant a look.
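The generational estimates above come from compounding the clockspeed gain with my guess at the architectural gain. Here's that math as a quick sketch; the 15-20 percent architecture range is my own estimate from earlier, not an Intel figure, and `speedup_range` is just an illustrative helper.

```python
def speedup_range(clock_gain, arch_low=0.15, arch_high=0.20):
    """Compound a clockspeed gain with an assumed architectural gain range.

    Returns (low, high) as whole percentages.
    """
    low = (1 + clock_gain) * (1 + arch_low) - 1
    high = (1 + clock_gain) * (1 + arch_high) - 1
    return round(low * 100), round(high * 100)


print(speedup_range(0.11))  # (28, 33): i7-7800X vs. i7-6800K
print(speedup_range(0.16))  # (33, 39): i7-7820X vs. i7-6900K
print(speedup_range(0.23))  # (41, 48): i9-7900X vs. i7-6950X
```

These are best-case figures, since they assume every workload scales perfectly with both clockspeed and the architectural changes.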

Moving up the scale, every additional $200-$300 nets you two more CPU cores. Base clocks will probably drop by 100-200MHz at each level, which means the i9-7980XE will probably have some insane clock range of 2.6-4.3GHz, but enthusiasts can probably get 3.8GHz on all 18 cores with sufficient cooling. Under the right workloads (basically, stuff that can actually exploit all 36 threads), I expect the i9-7980XE to come close to doubling the performance of the current i7-6950X. Video editors, rejoice!

As one final tease, Intel says 8th Generation Core is on track for early next year. We didn't receive any real details, but I suspect this refers to Coffee Lake and the 6-core offerings coming to LGA1151.

Intel should officially launch the first salvo of Skylake-X and Kaby Lake-X parts in late June, at which time we'll hopefully have the full specs and performance analysis. Intel has indicated the various Basin Falls chips will roll out in phases, which means we may not see 12-core and above until a bit later this summer (within a few weeks), so stay tuned for more details. Meanwhile, AMD's Threadripper will be playing counterpoint to Core i9. There's no word on AMD pricing yet, but it's shaping up to be a massive throwdown for high-end processors this summer. If you've been longing for more cores at affordable prices, the wait is finally over.