Intel Core i7 Performance Preview

Rob Williams

Editor-in-Chief
Staff member
Moderator
Finally, almost two years after first learning about Nehalem, the launch date is almost upon us. To help whet some appetites, read our in-depth look at i7 performance and see what's en route!

With Core i7's launch due in just a few weeks, there's no better time than right now to take a hard look at its performance, which is what we're taking care of today. In addition to our usual performance comparisons with last-gen CPUs, we're also taking an in-depth look at both QPI and HyperThreading performance, and some of our results may surprise you.

You can read the full preview here, and discuss it here.
 

Kougar

Techgage Staff
Staff member
Thank you for the article! I was happy to see the gaming issue looked at in comparison to the QX9770 here, and using real world game settings. 2-5FPS is nothing a tiny OC on the GPU won't fix, but it is interesting to see Core i7 is being bottlenecked somewhere... Some of the graphs almost make it look like the slower 920/940 QPI bus might be behind that? CPU performance and memory performance certainly are not the bottlenecks, at any rate.

Regarding your overclocking, 4GHz appears to be typical, especially the borderline stability you were experiencing in the review. Take a look at those power consumption figures with the stock/overclock, as it seems Core i7 is a power sink. Madshrimps tested it out on a phase change cooler, the heat load at 4GHz with 1.4v was even giving that thing problems. -32C idle to 0C load and still not 100% stable, despite using low voltages.

Thanks a bunch for getting those die surface area sizes, where'd you find those at? Fewer transistors but still a larger die size; I was assuming that would happen but was not sure. The reason is that Intel changed how they built the cache transistors (increasing the surface footprint to lower the voltages needed) and, to make an assumption, I believe also because they switched from dynamic logic gating to static CMOS logic gating.
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
I agree, a small difference in overall FPS won't make much of a real-world difference (especially given how powerful our GPUs are nowadays), but the fact that there is a decrease at all concerns me. I'll be testing out gaming performance more this week, with a variety of different titles, and see if the performance decreases are common.

I've heard rumors that HyperThreading may be to blame for some of the games, and that seems somewhat reasonable, but I did a quick test with HT disabled in CoD4 and still experienced the same FPS. This is definitely an issue that I want to spend more time on, though.

Kougar said:
Some of the graphs almost make it look like the slower 920/940 QPI bus might be behind that?

I'm not sure exactly what you are referring to, but I'm pretty confident the QPI has little to do with gaming performance. Though I haven't tested it out directly yet, our two pages in the review that cover QPI and HT performance show pretty much zero difference when the QPI was decreased. I assume it's going to be the same for gaming, but again, that's something I'll test out later this week.

Kougar said:
4GHz appears to be typical, especially the borderline stability you were experiencing in the review. Take a look at those power consumption figures with the stock/overclock, as it seems Core i7 is a power sink.

I saw that Madshrimps article last night and was glad to see it wasn't just me. It's kind of upsetting though. To be honest, even though I like to see a good overclock, this issue doesn't bother me that much, because by itself, even the Core i7 920 managed to outperform the QX9770 in some cases. That's a lot of power, and it almost makes you wonder if overclocking is really that necessary.

I don't think Intel deliberately hampered overclocking potential, but it comes with the territory of this new architecture. There is so much more at play here than with Core 2, and not all of the components like to be overclocked like the FSB used to. Now we have to deal with the Base Clock, Uncore Clock, QPI, et cetera... many more factors that need to be taken into consideration.

The fact that Madshrimps couldn't get stable at all much past 4.0GHz on PHASE pretty much says it all, I'm afraid. It doesn't look like I got a dud CPU, after all. I am still a little frustrated that Intel had a 4.0GHz running way back in June, on a quiet air cooler. I remember taking a look at that and being floored, because the air cooler was SO quiet, I thought the PC was turned off before they showed me the inside.

How they managed to hit 4.0GHz, I'm unsure. Disabling HT -does- help a little bit, but I clearly remember them having it on during this particular test, and they were not only letting a CPU-Z screenshot sit there either... they were manipulating photos on the fly. It's really frustrating!

Kougar said:
Thanks a bunch for getting those die surface area sizes, where'd you find those at?

Believe it or not... Intel ;-)

As for your theory on surface area, it seems logical to me. I spent a good half-hour trying to just figure that out, and really couldn't. The die itself seems to be a bit wider overall than Core 2 Quad, but I don't think that by itself would magically increase the die size.
 

Kougar

Techgage Staff
Staff member
Exactly what you said... it is not that 5-12 FPS really matters very much, but that there is a decrease at all isn't good. The motherboards cost a fortune, the CPUs cost a hefty amount, and most users are going to need to buy 3GB or 6GB DDR3 kits... for a CPU that is lackluster in games, doesn't OC well without a CPU multiplier unlock, and offers only steady gains over the QX9770 in most tasks.... it's just silly.

Buying a $180 quad and OCing it to 3.6-3.8GHz, without having to upgrade anything else, for the same or better performance in most tasks makes much more sense, unfortunately. I don't think I'm going to buy into Nehalem. As much as I want a true SLI/Xfire motherboard, I'm not willing to fork out for an entire new computer to get a $300 board, when the board I MIGHT have done it for costs $400. It'll be interesting to see what board prices do after Xmas.

Rob Williams said:
I've heard rumors that HyperThreading may be to blame for some of the games, and that seems somewhat reasonable, but I did a quick test with HT disabled in CoD4 and still experienced the same FPS. This is definitely an issue that I want to spend more time on, though.

My theory is wrong, it's actually the L2 cache to blame. Remember, games typically show performance gains with Core 2 cache improvements or size increases... suddenly the L2 cache is downsized from 6MB/12MB to 256KB per core, and the data is moved to a much slower L3 cache... this was supposedly a contributing factor with AMD's Phenom gaming performance as well.

Rob Williams said:
I don't think Intel deliberately hampered overclocking potential, but it comes with the territory of this new architecture.

I fully agree. Steps for overclocking are the same, just the names have changed for the most part, with some new variables thrown in.

About the die surface areas, I don't remember where I read it. It might have been Real World Technologies where it was mentioned that Intel increased the (physical) cache cell dimensions in order to allow a lower voltage for data retention. The Atom CPU does work this way. And Intel DID change from dynamic logic circuits to static CMOS logic circuits, something that's older tech but allows the CPU to run cooler on lower voltages at the sacrifice of higher frequency potential. I just wasn't sure whether that was specifically why the die surface area was larger despite the lower transistor count.
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
Personally, I don't think the slight lack in gaming performance would keep me away from upgrading to Nehalem, but it all depends on personal needs. I love gaming, but I do a fair amount of photo manipulation, multi-media encoding, et cetera, and if there is a processor out there that can storm through these tasks faster, I'd be all for it.

Overclocking, like you said, is another thing though. I don't tend to run insane overclocks, but it's definitely a cheaper route than Core i7 at this point in time. It's too bad... Intel definitely didn't manage to please everyone with this launch like they pulled off with Core 2.

As for L2 Cache, that makes a lot of sense... thanks for pointing it out. I'm actually curious as to why Intel decided upon adding an L3 cache, as the previous solution seemed to be working fine. It might improve speed to some degree, but other general enhancements, along with HT, seem to be far outweighing whatever L3 could have brought to the table. If I'm completely overlooking something, feel free to call me out on it.
 

Kougar

Techgage Staff
Staff member
Rob Williams said:
As for L2 Cache, that makes a lot of sense... thanks for pointing it out. I'm actually curious as to why Intel decided upon adding an L3 cache, as the previous solution seemed to be working fine. It might improve speed to some degree, but other general enhancements, along with HT, seem to be far outweighing whatever L3 could have brought to the table. If I'm completely overlooking something, feel free to call me out on it.

Well, I'm just trying to better understand the CPU and its capabilities and limitations. So knowing a good deal about it, even if not the specific inner workings, lets me do that. This just happens to be precisely the issue constraining Intel's server performance. :cool:

Just to set the foundation, ya well know the previous Intel solution was to build two dual-core dies for a quad, each with its own 4MB or 6MB L2 cache. This meant that for every quad-core CPU, the FSB became saddled with handling cache coherency traffic. Say, for example, a 3 or 4 thread task was being processed. Several of the cores will likely need to constantly read or write data to the L2, and both L2 banks would need to poll the RAM for the same dataset. If core 1 modified the data in its L2 and core 4 needed that data to continue calculating, core 4 would need to check its own L2, then send a request to check whether the same data in the other L2 cache was "newer" or modified. If core 1 did have the newest timestamp on that data, core 4 needs to copy it back into its own L2 cache so it can finish whatever it was computing. That time delay with cores waiting on each other is what was costing Intel big time.

For a single desktop quad-core CPU, the overhead and time penalties involved would be negligible. But for the server market, imagine 4 quad-cores, with two individual dies on each dual-die package... 4 sockets, 16 cores. That's pretty typical, if a bit small compared to some configs. But that would make 8 individual L2 caches that must not only poll memory, but poll each other for partially worked-on datasets, every time one core changes the data but doesn't send it to a bank of system memory yet. You see where I'm going with this... ;) Doing all of that over the FSB just made an inefficient system all the worse. The same server setup with 4 sockets and 16 cores would only have 4 L3 caches; that's halving the number of caches that must maintain coherency traffic with each other and system memory.
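To put rough numbers on that 8-versus-4 comparison, here is a minimal Python sketch. It assumes, purely for illustration, that coherency work grows with the number of distinct cache pairs that might have to exchange a modified line; real snooping over a shared FSB is broadcast-based, so treat this as a simplification. The function names are made up for the example.

```python
def shared_caches(sockets: int, caches_per_socket: int) -> int:
    """Number of separate last-level caches that must stay coherent."""
    return sockets * caches_per_socket


def cache_pairs(n: int) -> int:
    """Distinct cache-to-cache pairs (n choose 2); a simplified proxy for coherency work."""
    return n * (n - 1) // 2


# Core 2 style: two dual-core dies per package, each die with its own L2.
core2_caches = shared_caches(sockets=4, caches_per_socket=2)
# Nehalem style: one shared L3 per package.
nehalem_caches = shared_caches(sockets=4, caches_per_socket=1)

print("Core 2:  caches =", core2_caches, " pairs =", cache_pairs(core2_caches))
print("Nehalem: caches =", nehalem_caches, " pairs =", cache_pairs(nehalem_caches))
```

Under that assumption, halving the cache count cuts the number of pairs by more than half, which is one way to see why the dual-die approach scales poorly as sockets are added.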

That is why AMD built a NUMA L3 cache based Quad. Nehalem is designed to fix those issues for Intel, and those issues were exactly why AMD is still doing well in the server market. Core 2 is more powerful but all the additional overhead and waiting for one core to get the data it needs from another die hobbled it, giving AMD's Opteron the time it needed to finish crunching while Core 2 was busy idling. It's why I'm rather curious to see some official/valid server/HPC benchmarks of Nehalem, because that's where the largest performance gains should be seen.

Anyway, it just means Nehalem was built to be another "Core 2"... but not for desktop users. If they did it correctly it's Intel's Core 2 for the server market.
 

Kougar

Techgage Staff
Staff member
Ran across this... puts some numbers to what I was feeling about prices. I'm going to wait until prices drop a bit and I can snag some deals... see how things look by Black Friday.

X58 Boards and CPUs are already all over ebay, a few at close to retail prices.

So, here is the math:

any SB750 mainboard = $120
Phenom X4 9950 = $165
8GB DDR2-800 = $92
------------------------------------
total = $377

MSI P45 Neo-F = $95
Q6600 = $190
8GB DDR2-800 = $92
------------------------------------
total = $377

MSI P45 Neo-F = $95
Q9550 = $319
8GB DDR2-800 = $92
------------------------------------
total = $506

MSI Eclipse = $400
i7 920 = $320 (which is debatable, since it isn't on the market yet)
6GB DDR3-1333 = $250
------------------------------------
total = $970

MSI Platinum = $295
i7 920 = $320
6GB DDR3-1066 = $140
------------------------------------
total = $755

Sorry, but i7 is not a feasible investment at this point of time.

http://www.xtremesystems.org/forums/showthread.php?t=206624&page=8
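For anyone who wants to redo the math with their own local prices, a small Python sketch along these lines would work. The component prices are the ones listed in the quoted post; the build labels and the `build_totals` helper are just names picked for this example.

```python
# Component prices in USD as listed above: [motherboard, CPU, memory].
builds = {
    "Phenom X4 9950 / SB750": [120, 165, 92],
    "Q6600 / MSI P45 Neo-F": [95, 190, 92],
    "Q9550 / MSI P45 Neo-F": [95, 319, 92],
    "i7 920 / MSI Eclipse": [400, 320, 250],
    "i7 920 / MSI Platinum": [295, 320, 140],
}


def build_totals(parts_by_build):
    """Sum the listed component prices for each build."""
    return {name: sum(parts) for name, parts in parts_by_build.items()}


totals = build_totals(builds)
baseline = totals["Q6600 / MSI P45 Neo-F"]

for name, total in totals.items():
    # Show each total and how far it sits above or below the Q6600 build.
    print(f"{name}: ${total} ({total - baseline:+d} vs the Q6600 build)")
```

Swapping in current prices only means editing the `builds` dictionary.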
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
Thanks for all the insight on the caches Kougar... you've obviously put a lot of time into pondering over it. In the future, I'd like to be able to test server-specific applications like you mentioned, but I think it's a little ways off. I have a difficult enough time getting the work done that I need to get done already ;-)

As for the pricing information, it definitely helps put things into perspective. But there's nothing mentioned there that's surprising. It's a new architecture and requires brand-new motherboards as well, so there's obviously going to be a high cost of entry. All we can hope is that the prices go down rapidly.
 

Kougar

Techgage Staff
Staff member
Can't take the credit for it, just is stuff I pick up from reading everything... I only remember it well because it was exactly the reason AMD didn't fall off the end of the boat after Core 2 launched, it kept them competitive.

When I was mentioning the server benches I was just speaking in general. Getting some useful benches takes exorbitant amounts of work and 4-socket servers. I know you more than had your hands full over the past couple weeks, too!

It's the third whammy of needing to buy DDR3 that just makes Nehalem a rough sell. Even though these new triple kits are the most reasonably priced DDR3 yet, the boards are anything but reasonable. I'm already devising a scheme though; it's amazing what a little cash back from Microsoft's Live program can do to the bottom line if you want to build a new computer... ;)
 

Kougar

Techgage Staff
Staff member
Hey Rob, had a chance to read over any of the other reviews yet? Or still digging through that pile of work now that you can see the light at the end of the tunnel? ;) To keep it short, TR and BT both seem to have no issues reaching a stable 200MHz Base Clock, but anything beyond that seems to be akin to climbing a vertical brick wall... but that is all the 920 needs to make 4GHz. I'm still digesting the info, but if you were interested in some OC pointers they certainly have 'em!

Believe it or not I'm just now reading AT's article; they're confirming the switch from domino logic to static CMOS logic is the reason behind the larger die despite the decrease in transistor count. It's really interesting, because the power savings aren't that great except when pushing the clockspeeds... Link

Is it wrong that I can't get the need for a Core i7 system out of my head because I want to OC one to 4GHz? I think I need therapy or something...
 

Unregistered

Guest
it is a lot easier to overclock i7.

wonder why the bus is kept at 133? try 200 and use a multiplier of 20. even the 920 can get to 4GHz easily. check out this link. you can use google to translate the posts.

http://forum.coolaler.com/forumdisplay.php?f=300

On game performance, you should try a 260x2 or 4870x2. i am afraid the bottleneck is not i7.
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
Unregistered said:
wonder why the bus is kept at 133? try 200 and use a multiplier of 20. even the 920 can get to 4GHz easily. check out this link. you can use google to translate the posts.

http://forum.coolaler.com/forumdisplay.php?f=300

On game performance, you should try a 260x2 or 4870x2. i am afraid the bottleneck is not i7.

The bus is kept at 133MHz because that's the safe limit for the chipset. You shouldn't assume that 4GHz is an easy overclock, because out of the three performance motherboards we've used so far, only one was able to hit 200MHz reliably. Even then, there is no way that 4.0GHz stable is going to be a common overclock on the 920. I'd like to see it stable once, period.

Overclocking isn't horrible on i7 like was once expected, but it's a LOT more difficult to achieve a good one, and the temperatures are one of the biggest reasons that CPUs can't go even higher. Even at 3.6GHz with this room temp, I'm lucky to not go over 100°C on one of the cores.
 

Cobra26

E.M.I.
Hi folks,
First of all, a nice review.

I still have some questions regarding the 920 and 940, because I'm planning on buying a new rig with Core i7.

So I need some expert advice.

I'm not interested in overclocking (overclocking a 920 is so power hungry, which I do not want), so everything runs at stock speeds.

I use my PC for games, movies, a lot of WinRAR usage, decoding, converting formats to other formats, video/photo editing, and sometimes drawing with graphics programs.

What do you recommend? Is a 920 at stock speeds more than enough for me, or should I choose the 940, which has a faster clock speed and gives more FPS in games? Is it worth the price tag? I always use a PC for an average of 3 to 4 years, except the video card.

Can someone help me with some good advice?
Thanx in advance.
 

Kougar

Techgage Staff
Staff member
Hey Cobra. If you are sure you will be using that system for 4 years then you might enjoy the 940 more in the longer run. But otherwise I would just say save yourself the $, the 940 is not worth twice the price for a 266MHz speed bump.

Especially since an extremely small overclock would net you the same results at relatively the same power consumption. You'd spend more on the 940 than you would on the difference in your electric bill to make a 920 into a 940.
 

Cobra26

E.M.I.
Kougar said:
Hey Cobra. If you are sure you will be using that system for 4 years then you might enjoy the 940 more in the longer run. But otherwise I would just say save yourself the $, the 940 is not worth twice the price for a 266MHz speed bump.

Especially since an extremely small overclock would net you the same results at relatively the same power consumption. You'd spend more on the 940 than you would on the difference in your electric bill to make a 920 into a 940.

Thanks for your reply, Kougar.

What kind of small overclock do you have in mind, going to 3GHz? And what about the power consumption?

I did some research on overclocking a 920, and what I found was pretty high power consumption when you overclock it, although that was at 3.6GHz (I'll guess an overclock to 3GHz would still be power hungry, but I might be wrong). At load it went off the "chart", as shown in the link:

http://www.bit-tech.net/hardware/2008/11/06/overclocking-intel-core-i7-920/14

Any help is appreciated thanks.
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
I've achieved 3.6GHz on near-stock voltages on our i7-920, so anything less than that should be a relative breeze. I'd go for 3.2GHz, perhaps, which would match the i7-965. It would require a little more voltage depending on the motherboard, in order to achieve a Base Clock of 160MHz, but it shouldn't be that difficult. For an even more reasonable overclock, you could just do 150MHz BCLK and achieve 3.00GHz, which is still a great clockspeed.
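As a quick illustration of how those Base Clock figures map to core speed, here is a small Python sketch. It assumes the i7-920's stock 20x multiplier stays fixed; the `core_clock_ghz` helper is just a name chosen for this example.

```python
def core_clock_ghz(bclk_mhz: float, multiplier: int = 20) -> float:
    """Core clock in GHz = Base Clock (MHz) x CPU multiplier / 1000."""
    return bclk_mhz * multiplier / 1000


# Stock BCLK plus the overclocks mentioned in this thread.
for bclk in (133.33, 150, 160, 200):
    print(f"BCLK {bclk} MHz x 20 = {core_clock_ghz(bclk):.2f} GHz")
```

With the multiplier locked, the Base Clock is the only lever, which is why 200MHz is the figure everyone quotes for a 4GHz 920.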

For the most part the power consumption increase would be rather low... I'd expect a maximum of a 10W increase, which is almost moot since the CPU has a 130W TDP to begin with.
 

Cobra26

E.M.I.
Thank you very much, Rob Williams.

I'll go for the 3.0GHz overclock with the Core i7 920. You're right about the power consumption increase of around 10W. I did some research on the power scaling when you overclock a 920 from 3GHz to 3.8GHz, although the increase is (very) high once you go past 3.6GHz.

I still have one question regarding the i7 940. If it's so easy to overclock a 920 to 3GHz, and that yields the same performance as the 940 (I think) with little increase in power consumption, then why is the i7 940 (2.93GHz) so much more expensive? Is it made from better quality parts? There has to be a reason for the higher price tag, so which is it?
 

Kougar

Techgage Staff
Staff member
Rob beat me to it... that was mostly my point, if you ONLY overclock to the 940's 2.93GHz speed then the additional power consumed over ~4 years is not going to add up to the $266 price difference you pay upfront. ;)
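A back-of-the-envelope version of that comparison, as a Python sketch. The 10W figure is Rob's estimate from earlier in the thread and the $266 is the price gap quoted above; the electricity rate of $0.12 per kWh and the 24/7 usage are assumptions picked for illustration, so plug in your own numbers.

```python
def extra_energy_cost(extra_watts: float, years: float,
                      price_per_kwh: float, hours_per_day: float = 24.0) -> float:
    """Cost in dollars of drawing `extra_watts` more power over the given period."""
    kwh = extra_watts / 1000 * hours_per_day * 365 * years
    return kwh * price_per_kwh


PRICE_GAP = 266        # dollars, 940 vs 920 as quoted in this thread
ASSUMED_RATE = 0.12    # dollars per kWh (assumption, varies by region)

cost = extra_energy_cost(extra_watts=10, years=4, price_per_kwh=ASSUMED_RATE)
print(f"Extra electricity over 4 years: ${cost:.2f}")
print(f"Price gap paid up front:        ${PRICE_GAP}")
```

Even with the machine running around the clock, the extra electricity stays well below the price gap under these assumptions.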

Most Intel CPUs will become extremely power hungry as you clock them above 3.2GHz... My Q6600 will pull 200W+ of power by itself at 3.6GHz, probably more but I don't have the means to measure CPU-only power consumption.

Power saving technology turned off:
CPU-only load (whole system, measured from the wall)

2.4GHz ~235 watts 1.20v
3.6GHz ~475 watts 1.45v

It's no wonder I use watercooling... :D And why I most often run with my CPU at 3.2GHz at a much lower voltage.
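Those figures line up roughly with the usual rule of thumb that CMOS dynamic power scales with frequency times voltage squared (P ≈ C·V²·f). A minimal Python sketch, using only the clocks, voltages, and wall readings listed above; since the wall numbers include the rest of the system and PSU losses, the comparison is only indicative.

```python
def dynamic_power_ratio(f0_ghz: float, v0: float, f1_ghz: float, v1: float) -> float:
    """Predicted dynamic power ratio from P ~ C * V^2 * f (capacitance cancels out)."""
    return (f1_ghz / f0_ghz) * (v1 / v0) ** 2


# Q6600 at stock versus overclocked, per the readings above.
predicted = dynamic_power_ratio(2.4, 1.20, 3.6, 1.45)
measured = 475 / 235  # whole-system draw at the wall, not CPU-only

print(f"Predicted CPU power ratio:  {predicted:.2f}x")
print(f"Measured wall power ratio:  {measured:.2f}x")
```

The voltage term is squared, so the jump from 1.20v to 1.45v costs more than the 50% clock increase does, which is why backing off to 3.2GHz at a lower voltage saves so much.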
 

Rob Williams

Editor-in-Chief
Staff member
Moderator
Kougar said:
Most Intel CPUs will become extremely power hungry as you clock them above 3.2GHz... My Q6600 will pull 200W+ of power by itself at 3.6GHz, probably more but I don't have the means to measure CPU-only power consumption.

Wow man, I had no idea that they could draw quite that much power. Good to know.
 