More Details on Overclocking Nehalem

Rob Williams

Editor-in-Chief
Staff member
Moderator
From our front-page news:
At an Intel briefing today, we learned a lot more about what to expect when benchmarking Nehalem, and much of it answered the exact questions that have been lingering in my mind for some time. If you read our overview on Nehalem already, then you know the benefits of the tri-channel memory, but what about overclocking?

First and foremost, while the 'ideal' memory configuration for a high-end Yorkfield is 2x1GB DDR3-1600, the ideal solution for Nehalem will be 3x1GB DDR3-1066. That seems weak, but if you read my article last night (and I do recommend it), then you'd know that it's far from being the weak link here. It effectively removes any potential memory bottleneck, and in most regards, I/O becomes the new bottleneck (though not one you'd really see when RAIDing multiple SSDs!).
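To put rough numbers on why 3x DDR3-1066 is not as weak as it sounds, here is a minimal sketch of the theoretical peak bandwidth for both configurations. It assumes a 64-bit (8-byte) bus per channel, one module per channel, and that the DDR3 rating is the transfer rate in MT/s. The function name is made up for illustration, and these are module-side paper figures, not measured throughput.

```python
def peak_bandwidth_gb_s(channels: int, transfers_mt_s: float, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in decimal GB/s.

    Assumes every channel is bus_bytes wide (8 bytes = 64 bits) and
    runs at the rated transfer rate with no overhead.
    """
    return channels * transfers_mt_s * bus_bytes / 1000.0


# Dual-channel DDR3-1600 (the Yorkfield configuration from the post)
dual_1600 = peak_bandwidth_gb_s(channels=2, transfers_mt_s=1600)

# Tri-channel DDR3-1066 (the Nehalem configuration from the post)
triple_1066 = peak_bandwidth_gb_s(channels=3, transfers_mt_s=1066)

print(f"2x DDR3-1600: {dual_1600:.2f} GB/s")
print(f"3x DDR3-1066: {triple_1066:.2f} GB/s")
```

On paper the two land within a fraction of a percent of each other, so the third channel roughly makes up for the lower per-module speed.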

How will you overclock memory on Nehalem, or the CPU for that matter? Well, I'll admit I still don't totally understand how memory is overclocked, or how the frequency is even calculated, but Intel stresses that the sky's the limit. The chipset and CPU shouldn't be the weak link; rather, it would be the modules themselves.

Going beyond DDR3-2000 speeds should be entirely possible. You might run into odd issues related to strange dividers, though they likely won't be visible, and overall performance really wouldn't reflect them. That's something we'll specifically have to test once the chip hits the lab.

intel_nehalem_die_shot_062408.jpg


Contrary to what I mentioned in yesterday's article (oops), Turbo Mode -does- have something to do with CPU overclocking, but it's a bit odd to explain. Turbo Mode will not be activated in the traditional sense during an overclock, but in the BIOS, there will be a Turbo Mode setting whose figure you can raise to increase the overclock. Raising it will supposedly be an unlimited affair. I'm still unsure what exactly that number is going to be based on, but I can definitely say that it has nothing to do with the QPI.

QPI is another thing. It can be overclocked, but Intel strongly recommends not adjusting the 133MHz figure, and as far as I'm aware, motherboard vendors are asked to make it clear that adjusting it is dangerous. Even Intel themselves are unsure of what could happen with a highly overclocked QPI over time, but the results are apparently not representative of an ideal system.

There are still a lot of questions hovering around overclocking on Nehalem, and they won't likely be fully answered without real hands-on time with a machine. What I can state with extreme confidence is that Nehalem will be highly overclockable, and enthusiast overclockers will have little to complain about. I've seen ES Extreme Edition samples running at 4.0GHz on a modest air cooler, and I feel rather confident that production samples will act similarly.
 

Kougar

Techgage Staff
Staff member
Turbo Mode is based around the chip's overall TDP. So, say some cores are idle and get shut down, this increases the TDP margin significantly. If the user has a single-threaded program running, the remaining core can overclock itself up to the point it reaches the original design TDP envelope, but doesn't exceed it. So in effect truthfully nothing has changed and the TDP/overall power draw stays the same.
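The TDP-headroom idea described above can be sketched as a toy model. Everything numeric here is a made-up assumption for illustration (the 130W envelope, 30W per busy core, 8W per frequency step), the function names are hypothetical, and uncore power is ignored. Real hardware presumably also caps how many steps it will take; this only shows the budget arithmetic.

```python
def turbo_headroom_watts(tdp_watts: float, core_watts: list[float]) -> float:
    """Power budget left under the TDP envelope.

    core_watts holds the current draw of each core; a core that has
    been shut down contributes 0.
    """
    return tdp_watts - sum(core_watts)


def max_turbo_steps(tdp_watts: float, core_watts: list[float],
                    watts_per_step: float) -> int:
    """Whole frequency steps that fit in the remaining budget.

    watts_per_step is an assumed cost of raising the busy core(s)
    by one step; it is not a figure from Intel.
    """
    headroom = turbo_headroom_watts(tdp_watts, core_watts)
    if headroom <= 0:
        return 0
    return int(headroom // watts_per_step)


# All four cores busy: almost no margin left under the envelope.
all_busy = max_turbo_steps(130.0, [30.0, 30.0, 30.0, 30.0], watts_per_step=8.0)

# Single-threaded load with three cores shut down: lots of margin.
one_busy = max_turbo_steps(130.0, [30.0, 0.0, 0.0, 0.0], watts_per_step=8.0)

print(f"steps available with 4 busy cores: {all_busy}")
print(f"steps available with 1 busy core:  {one_busy}")
```

The point of the sketch is only that gating idle cores frees budget the remaining core can spend, while total draw stays at or below the same envelope.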

I think the overclocking question isn't how will Nehalem scale, but how to overclock it. Were those 4GHz Nehalem chips Extreme editions? ;)

From the gist of what you are saying about not adjusting the QPI figure, that just leaves the CPU Multiplier.

Oh, slightly off topic now. I found a SINGLE Ultra Low Latency DDR3 kit. Kingston makes (Or I should say made, it was discontinued) a DDR3-1375MHz CAS 5 kit. 5-7-5-15 to be precise. The motherboards reviewers tested it out on did not even offer timing settings below 5-7-5-15.
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
Turbo Mode is based around the chip's overall TDP. So, say some cores are idle and get shut down, this increases the TDP margin significantly. If the user has a single-threaded program running, the remaining core can overclock itself up to the point it reaches the original design TDP envelope, but doesn't exceed it. So in effect truthfully nothing has changed and the TDP/overall power draw stays the same.

You will also be able to change the TDP limit inside the BIOS in order to keep the overclock successful. Part of the new power scheme is improved throttling. If you set a 180W TDP in the BIOS and your overclock begins to exceed it, the CPU will begin to throttle considerably, which will be indicated by a red LED around the CPU socket.
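A rough sketch of what throttling against a BIOS-set TDP limit could look like: step the multiplier down until the estimated power fits. The 180W limit and 133MHz base come from the posts above; the linear power model (0.05W per MHz), the starting multiplier of 30, and the floor of 12 are purely hypothetical, and the real mechanism inside the chip is not described in this thread.

```python
from typing import Callable


def linear_power_model(freq_mhz: float) -> float:
    """Hypothetical power estimate: 0.05 W per MHz (not a real figure)."""
    return freq_mhz * 0.05


def throttle_to_limit(multiplier: int, base_mhz: float, tdp_limit_watts: float,
                      estimate_watts: Callable[[float], float],
                      min_multiplier: int = 12) -> int:
    """Lower the multiplier one step at a time until the estimated
    power is within the limit, or the (assumed) floor is reached."""
    while (multiplier > min_multiplier
           and estimate_watts(multiplier * base_mhz) > tdp_limit_watts):
        multiplier -= 1
    return multiplier


# 30 x 133 MHz is roughly the 4.0GHz overclock mentioned in the thread.
throttled = throttle_to_limit(30, 133.0, 180.0, linear_power_model)
unthrottled = throttle_to_limit(30, 133.0, 250.0, linear_power_model)

print(f"180W limit -> multiplier {throttled} ({throttled * 133.0:.0f} MHz)")
print(f"250W limit -> multiplier {unthrottled} ({unthrottled * 133.0:.0f} MHz)")
```

Raising the limit in the BIOS is what lets the overclock hold its full speed under load; with too low a limit, the chip backs off instead.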

Although I don't believe the tool will become available from Intel, they showed the throttling in real time. The tool showed all eight threads running at differing speeds as a process runs. It's a very quick process... some threads might stay at around 3.2GHz, but most of the time the CPU would max out at the 4.0GHz overclock.

I think the overclocking question isn't how will Nehalem scale, but how to overclock it. Were those 4GHz Nehalem chips Extreme editions? ;)

It's hard to say. Will the multiplier be limited? Perhaps. Will the Turbo Mode be? Probably not. I think overclocking non-EE Nehalem CPUs will be incredibly similar to overclocking today's non-EE Yorkfield/Penryn chips.

From the gist of what you are saying about not adjusting the QPI figure, that just leaves the CPU Multiplier.

I couldn't figure this out, but no, I don't think it has anything to do with the multiplier. There will be something else that will be touched, but not the 133MHz figure. For whatever reason, the QPI does improve performance in some regards, but I'm still unsure how exactly. Increasing the QPI will apparently make considerable differences in 3DMark Vantage's CPU test, but whoever said that wasn't referring to the 133MHz figure.

Oh, slightly off topic now. I found a SINGLE Ultra Low Latency DDR3 kit. Kingston makes (Or I should say made, it was discontinued) a DDR3-1375MHz CAS 5 kit. 5-7-5-15 to be precise. The motherboards reviewers tested it out on did not even offer timing settings below 5-7-5-15.

Don't kid yourself... there will be a far harder push for lower-latency kits, because it's going to be needed for the sake of competition. If raw frequency isn't going to be highly drool-worthy as it always has been up to this point, then memory vendors are going to demand better chips from the likes of Qimonda, Samsung and others.

By the time Nehalem comes out, I wouldn't be surprised to see some other DDR3-1333 5-x-x-15 kits available.
 

Kougar

Techgage Staff
Staff member
Hey Rob :)

I was saying that because the QPI frequency is multiplied by the CPU multiplier, which gives the final CPU clockspeed. So for overclocking we would need to overclock the QPI link, which is similar to how AMD's HTT link overclocking works.
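The relationship described above, final clock = 133MHz base figure x CPU multiplier, is simple enough to sketch. The multiplier of 24 is an assumption picked to land near 3.2GHz, not a confirmed Nehalem spec, and the function names are made up for illustration.

```python
def cpu_clock_mhz(base_mhz: float, multiplier: float) -> float:
    """Final CPU clock as base frequency times multiplier."""
    return base_mhz * multiplier


def required_base_mhz(target_mhz: float, multiplier: float) -> float:
    """Base frequency needed to hit a target clock at a fixed multiplier."""
    return target_mhz / multiplier


stock = cpu_clock_mhz(133, 24)            # assumed stock multiplier
via_multiplier = cpu_clock_mhz(133, 30)   # raise the multiplier, leave the base alone
via_base = required_base_mhz(4000, 24)    # locked multiplier, raise the base instead

print(f"stock: {stock:.0f} MHz")
print(f"multiplier route: {via_multiplier:.0f} MHz")
print(f"base needed for 4000 MHz at x24: {via_base:.1f} MHz")
```

If the multiplier turns out to be locked on non-Extreme chips, the second route is the only one left, which is why the warning about the 133MHz figure matters.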

No one anywhere has stated this "Turbo Mode" would scale as far as 4GHz... If that turns out to be the case then I suspect the majority of users will stick to that, I'd certainly be considering it myself! :)

One site reported not only can users enable/disable Turbo Mode from within the BIOS, but they can also set the TDP envelope manually which you've confirmed... at least within reason according to the cooling capabilities of the system build. Glad I use water!

As each individual core controls its own power gating and has its own dedicated PLL chip, as ya said they probably use something internal to the CPU to control Turbo Mode frequencies. Likely so motherboard makers can't use it to unlock carte blanche overclocking. :D

Again, I'm sure you're correct about ultra low latency DDR3 coming out, but when and at what price is my concern. Those Kingston modules were $800 for a 2GB kit.

Honestly I didn't realize CAS 5 was possible. That was the only CAS 5 memory Kingston had and also the only such kit in existence. Very few offer even CAS 6 DDR3-1333 kits.
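For a sense of what those CAS numbers mean in absolute terms, here is a minimal sketch converting CAS cycles to nanoseconds. It assumes the DDR3 rating is the transfer rate in MT/s, so the I/O clock is half of that and one cycle lasts 2000 / rate nanoseconds. The function name is made up, and the kits listed are just the speed and CAS combinations mentioned in this thread.

```python
def cas_latency_ns(cas_cycles: int, data_rate_mt_s: float) -> float:
    """CAS latency in nanoseconds.

    DDR transfers twice per clock, so the clock in MHz is
    data_rate / 2 and one cycle takes 2000 / data_rate ns.
    """
    return cas_cycles * 2000.0 / data_rate_mt_s


kits = [
    ("DDR3-1375 CAS 5", 5, 1375),  # the Kingston kit
    ("DDR3-1333 CAS 6", 6, 1333),  # the rare CAS 6 kits
    ("DDR3-1333 CAS 5", 5, 1333),  # the kits Rob expects by launch
]

for name, cas, rate in kits:
    print(f"{name}: {cas_latency_ns(cas, rate):.2f} ns")
```

Dropping from CAS 6 to CAS 5 at about the same speed shaves roughly 1.5 ns off first-word latency, which is the sort of gain vendors could compete on if raw frequency stops being the headline number.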

As I recall, Elpida chips were good for lower timings and lower voltages but would not clock beyond 1500MHz. They got superseded by more Micron chips, which just like the D9's need volts and high frequencies but suck at tight timings.
 

Kougar

Techgage Staff
Staff member
Okay, I'm somewhat floored. Intel did away with the domino dynamic logic design and went back to the 80's, returning to a static CMOS transistor design. I had no idea what domino logic was, except that it allowed them to scale frequencies at the expense of power.

Long technical story short, in order to maintain current frequencies they had to create something new: the PCU. This is a Power Control Unit that is a little over 1 million transistors in size and gives the CPU itself the ability to be self-aware. The PCU has its own embedded firmware and is designed to measure inputs from sensors all over the CPU on voltage, frequency, and temperature, and it gives the OS something to talk to for power schemes, or even, theoretically, allows direct real-time monitoring of this info by the user.

The PCU makes the CPU aware of its entire operating environment, and is what actually allows Turbo Mode to be possible. And as Anandtech mentioned, this self-awareness allows the CPU to decide when to drop to lower power states, rather than when the OS thinks it should. Anyone who has experience with XP and Vista's idea of power schemes knows why this is a smart idea! I can't wait for someone to design software to poll the PCU for all of that data, so exact CPU temps, volts, and frequencies can be known. Or, while not very likely to happen, even to manually set power states, voltages, Turbo Mode settings, etc...
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
Good thoughts! Intel was questioned about why they'd dedicate another million transistors to something like that, but they feel extremely confident that it's one of the smartest things that has been added to Nehalem.
 

Kougar

Techgage Staff
Staff member
Well, it was a smart idea yes, but from the way I read it the PCU was a requirement to ensure the silicon could maintain current 3GHz frequencies. Changing from domino dynamic logic gates back to the static CMOS gates used in late 80's microchips would have impacted the attainable frequencies otherwise.

And, I think this is exactly why: 4GHz at 1.7 volts? That tells you something like this has changed: http://www.pcgameshardware.com/aid,...d_producer_explains_details_for_overclocking/ Hopefully overclocking on final boards+silicon won't be nearly as drastic... but I suspect the "free lunch" people have enjoyed with FSB overclocking is at an end.
 