| Reviews and Articles Discussion for Techgage content is located here. Only staff can create topics, but everyone is welcome to post. |
#1 |
|
Editor-in-Chief
Join Date: Jan 2005
Location: Atlantic Canada
Posts: 13,231
|
Intel's Larrabee architecture has been on the mind of many enthusiasts for the past few months, but sadly, Intel hasn't released any specific performance data today. What they have revealed are the base mechanics of the architecture and other tasty tidbits to whet our appetites.
Intel today takes a portion of the veil off their upcoming Larrabee architecture, so we can better understand its implementation, how it differs from a typical GPU, why it benefits from taking the 'many cores' route, its performance scaling and of course, what else it has in store. You can read the full article here!
__________________
Intel Core i7-990X EE @ 3.43GHz, GIGABYTE X58A-UD5, Kingston 12GB DDR3-1333, NVIDIA GeForce GTX 680 2GB Kingston HyperX 3K 240GB SSD, WD VR 1TB, WD 2TB, Seagate 2TB, LG BD-ROM, ASUS DVD-RW, Corsair 1000HX, Corsair H60 Cooler Corsair 800D, Dell 2408WFP 24", ASUS Xonar Essence STX, Gentoo (KDE 4.10, 3.7 Kernel) "Take care to get what you like, or you will be forced to like what you get!" - H.P. Baxxter <Toad772> I don't always drink alcohol, but when I do, I take it too far.
|
|
|
|
|
|
#2 | |||
|
Techgage Staff
Join Date: Mar 2008
Location: Texas
Posts: 2,638
|
Just wow. IF the hardware is even close to delivering, then this is going to be the future. No doubt about it: fully programmable supercomputing cards.
GPUs got people's attention with their supercomputing ability for specialized tasks. NVIDIA's CUDA especially: it languished as a marketing slide and demo video for more than a year, then almost overnight turned into a major deal. Universities everywhere have built single computers with multiple GPUs and now get better performance than the supercomputing clusters they used previously, at much lower prices. Folding@home has exploded: NVIDIA and EVGA GPU-only teams appeared out of the blue, climbed past folding teams that had been folding since F@H's inception 8 years ago, and single-handedly upset the rankings. It made the PS3 look mediocre, despite teams of PS3 folders doing the same just prior to CUDA launching with F@H support.
Long paragraph short, GPUs got people to realize CPUs are no longer the way to go for raw performance. With a fully programmable "GPU" set to debut, there will not be a single hurdle left for people to begin porting applications over. Fully programmable in one of the better-known coding languages, this is the start of really big things.
Rob, because you mentioned SIGGRAPH in your article, I found this tidbit hidden in the Larrabee wiki. They give sources for it as well. Quote:
Now, some of my thoughts. A 1024-bit bus is rather insane... no wonder they are having difficulties with the PCB. Just look at what adding a third memory channel to Nehalem did to X58 motherboards: manufacturers have to add two extra PCB layers if they wish to make use of it with six banks of DDR3 memory. Fudzilla (not the most reliable, I know) has this article out about how Larrabee is currently a 12-layer, 300W card. That seems reasonable considering Larrabee isn't going to launch for another year; cutting a card down to production size comes last in the engineering process. But one only needs to look at ATI's first 512-bit ring-bus memory GPU, R600, to get an idea of what I am thinking about here. If that was not scary enough, look at NVIDIA's first 512-bit memory bus GPU, the GTX 280. Larrabee is going to be a huge "multi-core" die regardless of whether it launches on a 45nm or 32nm process, and the card complexity is still going to dwarf the GTX 280. The performance may certainly be worth it. I can't wait.
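To put the bus-width comparison in numbers, here is a minimal C++ sketch that converts a bus width into bytes moved per clock and a theoretical peak bandwidth. The function names are made up for illustration, and the clock rate passed in is purely hypothetical: nothing in this thread states what clock Larrabee's bus runs at, and real-world throughput would be lower than this peak figure.

```cpp
#include <cassert>
#include <cstdint>

// Bytes transferred per clock by a bus of the given width in bits.
// Assumes the width is a whole number of bytes (true for 512 and 1024).
inline std::uint64_t bytes_per_clock(std::uint64_t bus_width_bits) {
    assert(bus_width_bits % 8 == 0);
    return bus_width_bits / 8;
}

// Theoretical peak bandwidth in bytes per second.
// clock_hz is a hypothetical input, not a published Larrabee figure.
inline std::uint64_t peak_bytes_per_second(std::uint64_t bus_width_bits,
                                           std::uint64_t clock_hz) {
    return bytes_per_clock(bus_width_bits) * clock_hz;
}
```

A 1024-bit bus moves 128 bytes per clock against 64 for a 512-bit bus, so at an assumed 1 GHz the peak would be 128 GB/s versus 64 GB/s. The doubling in wires is what drives the die area and PCB layer concerns above.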
__________________
Last edited by Kougar; 08-04-2008 at 05:55 AM. |
|||
|
|
|
|
|
#3 | ||
|
Techgage Staff
Join Date: Mar 2008
Location: Texas
Posts: 2,638
|
I can't resist a double-post. I apologize if this is considered hijacking, since technically this section is for discussing TechGage articles...
After reading TechGage's article I went over to Anandtech, since I particularly enjoy Anand Lal Shimpi's hardware and design dissection articles. Above, I was saying that a 1024-bit ring bus is going to mean an incredibly large amount of die area by itself, so I was already wondering if Larrabee would make the GT200 core look small. Partway into his article I got to thinking about the actual number of cores Larrabee would likely launch with for this reason... Ironically (and partly why I enjoy Anand's articles so much), he explores this idea on page six. Surprisingly, it is more cores than I was expecting, but as he isn't factoring in other things like the 1024-bit ring bus that y'all mention, I think 64 cores is likely going to be the absolute max, and it could very well be fewer. Then again, with Intel making chips the size of Itanium (~596mm^2 at 90nm), which is significantly larger than GT200, it's possible. Larrabee is going to be a beast.
__________________
|
||
|
|
|
|
|
#4 |
|
Guest Poster
Posts: n/a
|
I am wondering why Intel is trying to extend their x86 architecture into the GPGPU arena. It has too many components that a GPGPU doesn't need.
They already have a key component to build a better GPGPU than Nvidia or AMD. One word: Itanium. It might be worth looking into. It may need some reworking, but Intel is just sitting on it and messing around trying to extend x86 beyond its reach. Rictor |
|
|
|
#5 | |
|
Guest Poster
Posts: n/a
|
Quote:
I think I see how x86 can be extended into the GPGPU (general-purpose computing on graphics processing units) arena by combining VLIW (Very Long Instruction Word) and SIMD (Single Instruction, Multiple Data). Note: might want to come up with a better name than GPGPU; it's a bit of a misnomer now. Maybe General Purpose Media Processing Unit (GPMPU).
I don't think I was totally off base, but Intel may want to create an entirely separate processing unit, like the FPU from the days of the 386. It would be composed primarily of multiple SIMD units. I'm unclear if Intel should offload all SIMD units from the CPU... come to think of it... I'm wondering... I don't know... maybe they could look into combining the SIMD units on multi-core CPUs into a single unit first, starting with the dual cores and moving up.
Why is this important? Multiple reasons. First, to test whether it can improve SIMD performance without incurring too much complexity. Second, to see whether coordinating SIMD utilization (combining SIMD units) is possible and whether the increase in performance is sufficient to justify the number of design changes... might try software to coordinate the SIMD units first to see what, if any, design changes are required... observing the software in operation would point to the changes in processor design needed to optimize SIMD utilization. It would be the first stage towards eventually moving SIMD functions off onto a dedicated chip for media processing, with the CPU issuing VLIW to the GPMPU to maximize performance. I could maybe come up with more reasons for doing it this way, but I would end up rambling on and on and on...
Ah, hyperthreading, but that can come later... then move up to ultrathreading (more than 2 threads per processing unit)... hyperthreading, ultrathreading, it's just register renaming mostly, though... maybe I should talk to AMD first... ah-ha... oh no... I had another idea, but it escaped me for the moment. I once told a visiting Intel engineer Merced was a dud too.
I just didn't know you could use VLIW to issue multiple SIMD to the GPMPU, at the time. I feel like such a dud, I guess I'll go by that name now. Dud, formerly Rictor of Gothic Terror |
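The "try software to coordinate the SIMD units first" idea in the post above can be approximated today with ordinary threads. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, it uses standard C++ `std::thread` rather than anything Intel-specific, and it relies on the compiler auto-vectorizing the inner loop onto each core's SIMD unit, which depends on compiler and flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// y[i] = a * x[i] + y[i] over [begin, end). A plain loop like this is the kind
// of code a compiler can typically vectorize onto one core's SIMD unit.
inline void saxpy_range(float a, const float* x, float* y,
                        std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Software "coordination": give each worker thread (ideally one per core)
// its own contiguous slice, so the slices never overlap.
inline void saxpy_parallel(float a, const std::vector<float>& x,
                           std::vector<float>& y, unsigned workers) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    workers = std::max(1u, workers);
    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no work left for further threads
        pool.emplace_back(saxpy_range, a, x.data(), y.data(), begin, end);
    }
    for (auto& t : pool) t.join();  // wait for every slice to finish
}
```

Calling `saxpy_parallel(3.0f, x, y, 4)` on a quad core would give each core a quarter of the array. Whether this beats a single thread depends on array size and memory bandwidth, which is exactly the kind of measurement the post suggests making before changing the hardware.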
|
|
|
|
#6 | |||
|
Techgage Staff
Join Date: Mar 2008
Location: Texas
Posts: 2,638
|
To chime in, Intel's goal with Larrabee has been to offer GPU-level computational performance while letting users code in C++ or other languages that already target x86, which you cannot do on a modern GPU.
I'll just borrow a quote from Tim Sweeney, the guy behind the Unreal Engine: Quote:
__________________
|
|||
|
|
|
|
|
#7 | |
|
Guest Poster
Posts: n/a
|
Quote:
Tim Sweeney is just saying Larrabee looks like a good concept (on paper); actually working out the details into a workable system design is much more difficult, especially if you are taking the wrong approach. Intel canceling their first attempt to take Larrabee from concept to product shows they are still working out the details, even though the concept looks solid. Sadly, they already have most of the technology and patents to move ahead of the pack, but it will require, as one of their people once put it, "a paradigm shift".
Nvidia does offer GPGPU programmability using C (or C++) for their products through their proprietary CUDA language, and will probably support more open GPGPU languages like OpenCL and Microsoft's DirectCompute. Nvidia's purchase of Ageia, the company behind PhysX, precipitated the whole move towards general-purpose GPU computing and is their key advantage in pioneering GPGPU. Intel may want to look at that model: the physics engine as an add-on to the computer system.
The possibility I see is Intel trimming down the CPU to a minimum of 2 cores with hyperthreading or ultrathreading (>2 threads per core), a memory and I/O hub, and huge amounts of cache (trace, L1, L2 and maybe even more), while the "physics engine" might consist of various concepts, designs and technology used in SIMD and Merced. The simplest model would be a dual-core CPU (maybe with hyperthreading) issuing VLIW to an Itanium-like architecture composed primarily of SIMD units. But I'm sure the engineers at Intel will be able to come up with something better. The CPU would handle the scheduling (I/O, thread ordering, memory access, etc.) and management, while the Itanium-like physics engine is freed to do the number-crunching. The concept harkens back to the days of having a separate FPU and CPU.
The reason for trimming down the CPU is that there may not be a whole lot of gain from going to quad or 6 cores over 2 cores, especially if the 2 cores can do hyperthreading or ultrathreading, but the option to add additional cores is there. My experience leads me to believe that 2 cores is more than sufficient if you can offload some of the computing to the GPGPU (or, as I like to call it, the General Purpose Media Processing Unit or GPMPU; physics engine is another term you could use). Most processors are working out of their cache rather than main memory, and going with a larger cache might prove more advantageous than more cores: lower power, fewer cache misses, a large cache to feed the physics engine. Of course it's just one approach, and Intel just needs to take a look at what they have towards building a physics engine.
One of the problems in designing systems is that when some key technologies reach a certain level of maturity, the design of the whole system needs to evolve to move forward. A paradigm shift is required. Whole avenues of design improvement open up, but people often get locked into one way of looking at something or doing something and they're unable to see the whole world of possibilities that's piled up around them. l1h4x0r |
|
|
|
|
#9 |
|
Guest Poster
Posts: n/a
|
I think Intel is taking the wrong approach, trying to build a design that competes directly with Nvidia's or AMD's GPGPU. Nvidia GPUs are designed primarily as compute shaders moving towards a "PhysX (physics) engine" GPGPU.
Intel might want to take a different approach... they should be building from their position of strength, the CPU. I'm thinking they should take a look at the SIMD unit in the CPU and move towards the concept of a "media processing unit", then maybe move it out of the CPU and push towards a "physics engine" design. Try to develop a complementary design to Nvidia's GPU (explained later).
If they really want a leading-edge GPGPU, they should just buy Nvidia outright. Of course, that could be good and bad. It would save them a considerable amount of time and they would have a graphics processor to compete with AMD/ATI. But by buying Nvidia they may end up missing an opportunity to revolutionize the CPU industry, and run afoul of antitrust laws. GPU and CPU make up about 90% of the computer, and even if they did buy Nvidia, they would end up with antitrust fines without even trying. Just look at Microsoft.... and the recent Intel antitrust fine involving Dell... Dell favors Intel and for that Intel got fined. Dell didn't even bother with AMD until AMD finally got a solid design win with the Athlon. Intel just rewarded Dell for being a faithful customer and bam, Intel gets hit with a fine.
Complementary design to Nvidia's GPU: In order for Nvidia to use PhysX, some of the GPU resources have to be allocated away from shader operation... It's an opportunity for Intel to push Nvidia back to their area of expertise, 3D graphics. And... this is important... use it as a starting point to build their own PhysX engine. If you have multiple GPUs, Nvidia even makes the option available to dedicate one of the GPUs to PhysX instead of rendering video. That makes no sense to me; it defeats the whole purpose of having SLI. Wouldn't it be much better if the CPU (let's say a quad core) could be organized to utilize all four SIMD units to do PhysX instead, and free up the GPU to do what it's supposed to do... shading/rendering?
Maybe I don't understand SIMD all that well, or the limitations in attempting this feat... but it might be worth a look, it can't hurt.
What-if scenario... The University of Antwerp already built a supercomputer using 13 Nvidia GPUs (SLI not required; SLI is only necessary for rendering 3D graphics to a display in SLI mode) for about 6000EUR, and it beats their previous supercomputer (a cluster) soundly (about 4x according to the report). Intel should be seriously worried... WHAT IF... Nvidia starts selling a complete solution... replacing the Intel processor with their own processor, the Ion. First the supercomputing world, then it would start to trickle into the home... The CPU doesn't need to be super fast, just very good at managing/scheduling I/O... The backup plan for Intel is... if Nvidia's lead becomes too big... design a better CPU for managing/scheduling I/O... Intel will have to do this anyway whenever they get around to building their own PhysX engine... which is why I think they will move the SIMD units out of the CPU... Intel has a lot more experience with system design than anybody... PCI, chipset, cache, CPU, process technology, etc... unfortunately, Nvidia has been branching out into chipset and CPU design too.
Nvidia is a bigger threat than AMD/ATI combined... laugh... OK, you can stop laughing... AMD/ATI, they can coordinate from both ends. That's the good part. The bad part: their resources will be stretched thin, and being able to accommodate the other isn't necessarily a good thing... an engineer might be able to tell you why. A home supercomputer might be a strong selling point... to the point of making Intel's "conventional" processors irrelevant, however fast they may be. Nvidia may have inadvertently wandered into an area where Intel could be excelling... supercomputing... I feel Nvidia's lead is not insurmountable, though. The plus side for Intel: they already have the pieces for building their own PhysX engine, a better PhysX engine.
Better than Nvidia's, since a large chunk of Nvidia's transistor real estate is used for graphics, and Intel can build a pure PhysX engine, read: dedicate more transistors to the PhysX engine. OK, now you know what I have in mind... Once Nvidia figures this out, if they haven't already... they may opt to build a pure PhysX engine. The kicker: they might have the patents to the PhysX engine... and the other shoe... they can't patent physics. So there you go, don't build a PhysX engine... maybe a multimedia processing engine or whatever you want to call it. Patents... feh.
Intel may want to take another look at Merced technology and combine it with SIMD. Hopefully, it's not a wild goose chase. Technically, I'm not an engineer... so if it does work, it's not my fault... =)
+-----------------------------------+
|---------------VLIW----------------|
|===================================|
|    SIMD    SIMD    SIMD    SIMD   |
|===================================|
|===================================|
|===================================|
|===================================|
+-----------------------------------+
What would you end up with... how about a world-class supercomputer for well under $3000 for the home. Well, technically... today's personal computers are probably faster than most of the supercomputers built less than a decade ago. Today's PC technology is outpacing today's supercomputers, and tomorrow's supercomputers will be personal home computers. It's been done... 'nuff said |
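The VLIW-over-SIMD diagram in the post above can be read as one long instruction word carrying a separate operation for each of four SIMD units. The following C++ sketch is a toy software model of that reading, not a description of any real Intel or Itanium design. Every name in it (`Slot`, `Bundle`, `issue`, the four-register file per unit) is a hypothetical choice made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;  // one 4-lane SIMD register

enum class Op { Nop, Add, Mul };

// One slot of the long instruction word: the operation for a single SIMD unit.
struct Slot {
    Op op;
    std::size_t dst, lhs, rhs;  // register indices local to that unit
};

constexpr std::size_t kUnits = 4;  // four SIMD units, as in the diagram
constexpr std::size_t kRegs = 4;   // assumed register count per unit

using Bundle = std::array<Slot, kUnits>;  // the "very long instruction word"
using RegisterFile = std::array<std::array<Vec4, kRegs>, kUnits>;

// Issue one bundle: each unit performs its own slot's operation on all lanes.
// Real hardware would run the units in parallel; this model loops over them.
inline void issue(const Bundle& bundle, RegisterFile& regs) {
    for (std::size_t u = 0; u < kUnits; ++u) {
        const Slot& s = bundle[u];
        if (s.op == Op::Nop) continue;  // this unit idles for the bundle
        assert(s.dst < kRegs && s.lhs < kRegs && s.rhs < kRegs);
        Vec4 result{};
        for (std::size_t lane = 0; lane < result.size(); ++lane) {
            const float a = regs[u][s.lhs][lane];
            const float b = regs[u][s.rhs][lane];
            result[lane] = (s.op == Op::Add) ? a + b : a * b;
        }
        regs[u][s.dst] = result;
    }
}
```

In this model a single `issue` call can keep all four units busy at once, which is the appeal of the idea. The catch, as with Itanium, is that something (the compiler or the CPU doing the scheduling) has to find four independent operations to fill every bundle, or units sit idle on `Nop` slots.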
|
|
|
#10 | ||||||||
|
Editor-in-Chief
Join Date: Jan 2005
Location: Atlantic Canada
Posts: 13,231
|
Quote:
Quote:
Quote:
Quote:
Intel doesn't have a "Stream" or a "CUDA", but they do have IA, which developers are already familiar with, and as such, Larrabee would be able to offer great performance for OpenCL and perhaps others with perfect C/C++ support, because it's not on a different architecture. I'm also not too sure that Intel would be building Larrabee with a lot of what makes an x86 a desktop chip... I'd expect a lot to be tweaked and altered in order to fit as many of these cores into a single chip for graphics use. Quote:
Quote:
Quote:
Quote:
|
||||||||
|
|
|