Intel's new Nehalem-EX CPUs rock servers with eight cores, 16 threads, infinite sex appeal
What's that, you have an array of six-core CPUs in your rack? That is so last year. You're going to feel pretty foolish when all the cool admins start popping eight-core chips up in their closets this fall. That's the number on offer in Intel's latest, the Nehalem-EX. It's an evolution of the architecture that some of you may be spinning in your Core i7 machines, but boosted to support up to 16 threads and 24MB of cache. 2.3 billion transistors make the magic happen here, and Intel is pledging a nine-times improvement in memory bandwidth over the Xeon 7400. Chips are set to start hitting sockets sometime later this year, and while nobody's talking prices, staying hip in the enterprise server CPU crowd doesn't come cheap.

Apple would love to use these as an excuse to bump up the price of the Mac Pro.
wait so better technology in their machines is bad?!?
Yes, it'll increase the already critical smug levels.
wrong. This won't work in a Mac Pro.
You're right, they'd have to change the motherboard. (another excuse)
1) Nehalem-EX are for 4 and 8 socket enterprise servers, not workstations of any kind.
2) If Apple offered an Xserve variant with Nehalem-EX, it would hardly be to just "increase prices"; they would be targeting a completely different segment of the industry... And yes, 4-socket server chips have always been far more expensive because of the usually better technology and far lower volume than mid-level server chips.
*COUGH* It's a joke *COUGH*
Ok, maybe I am just really stupid but I must ask...
Why is there a core 0 ?
because programmers count from zero
Just be thankful it's not Core000, Core001, Core010, Core011, Core100, Core101, Core110 and Core111. ;)
Correction: engineers count from zero; programmers can count from whatever their programming language allows.
So I'm betting the new Venus SPARC processor from Fujitsu still stomps the living daylights out of this Nehalem at 8 cores with 64 simultaneous threads. Also, I'm betting this isn't gonna use only 45 watts like the Venus. As a server processor, Nehalem fails hard.
Refs: http://www.theregister.co.uk/2009/05/13/fujitsu_venus_sparc64/
http://www.engadget.com/2009/05/15/fujitsus-supercomputer-ready-venus-cpu-said-to-be-worlds-fast/
I'll use C as an example of this, although I am not making any sort of claims as to the origin or history of "counting from zero" - just giving a practical example that gives some sense to the practice.
In C, an array is referenced as address[offset]. Arrays aren't too complicated; they're just contiguous blocks of memory that contain multiple values, all right next to each other. The "address" portion will always be the same for any specific array, and the memory location of the element you're referencing is computed as:
address + ( offset * size_of_item )
Now the C compiler knows the type you've given the array, so it takes care of multiplying the offset by the size of one item. So, to make things simple, let's presume you have an array of 1-byte elements. The address points directly to the first element in the array. So if you want to retrieve that element, you want to retrieve the byte located at "address". As the equation above makes pretty clear, element number "1" is going to be [address + ( 1 * 1 ) ] == address + 1. Not what you're looking for! Element 0, however, fits the bill: [address + ( 0 * 1) ] == address.
So oftentimes in programming counting starts at 0, because 0 means "the thing at the reference" and 1 means "the thing 1 unit away from the reference".
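For anyone who wants to poke at this, here is a minimal sketch of the same arithmetic done by hand. The helper names `element_address` and `read_int` are made up for illustration (they are not standard library functions), and the cast to `unsigned char *` is just the usual way to do byte-level pointer arithmetic in C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes address + (offset * size_of_item) by hand,
   which is what the compiler does behind the scenes for base[offset]. */
const void *element_address(const void *base, size_t offset, size_t size_of_item)
{
    assert(base != NULL);
    /* unsigned char is one byte, so this pointer arithmetic counts in bytes */
    return (const unsigned char *)base + offset * size_of_item;
}

/* Hypothetical helper: reads element `offset` of an int array
   using the manual address computation above. */
int read_int(const int *base, size_t offset)
{
    return *(const int *)element_address(base, offset, sizeof(int));
}
```

With an array `int a[3] = {10, 20, 30};`, calling `read_int(a, 0)` reads the value sitting exactly at the array's address, and `read_int(a, 1)` reads the one a single item further along, which is the same thing `a[0]` and `a[1]` do.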
"Zero stones; ZERO CRATES!!!!!"
@cb88
"stomp the living daylights out"? Not quite, buddy. It's a great processor, but the Nehalem-EX is a beast. Anyways, the "Venus" is really for a different market segment that the Itanium competes in. Not to mention a Venus box will be far more expensive than an equivalent Nehalem-EX. Let's wait for the benchmarks before spilling out the hyperbole.
so... how does this compare to the cell processor?
Kinda the way a .44 Magnum compares to a .38 special.
Cell is still better for some applications, like video encoding/editing for example, but it's not built for just anything, unlike an Intel CPU.
(I'm saying Cell is faster based on a comparison with the Core i7 CPU, where Cell was much, much faster at video editing.)
Anyway, I bet we're gonna see these in laptops in 2013 lol, and our desktops will have 24 cores or something by that time.
It's hard to compare CPUs when they are not in the same category. The Cell CPU is a hybrid; it's basically a GPU with CPU capabilities, and is designed for parallel calculations, which are mainly used for graphics rendering and physics calculation/simulation. The Intel CPU, on the other hand, is designed for serial calculation (in parallel x8) and has lots of other integrated functions that are more multi-application friendly.
In layman's terms, it's like comparing a speedboat to a container ship. The Cell CPU is the speedboat, moving very little weight but very fast, and the Intel is the container ship, moving lots of weight slowly.
Wrong question!
The correct question is: what do we think we need - Nehalem or Atom?
Do we need 8 or 10 cores or even 100 cores - or do we need to play YouTube, check blogs, write emails, pay bills, etc.?
Hey, is it because of the recession or something else that the automobile industry doesn't do something similar? Imagine a car with 8 engines! Wow, sounds cool.
The Cell is a different beast. It is composed of a single "controller" core that delegates tasks to a bunch of simple SIMD cores. For workloads made up of processing-intensive, lightly branching code that is easy to do in parallel, the Cell is great. The new "PowerXCell 8i" version running at 3.2 GHz can put out 102 GFLOPS double precision! For comparison, the Nvidia Tesla C1060 (aka GTX 285) can only do 78 in double precision!
But the Cell is more like a vector coprocessor in that it doesn't do general computing tasks well, including managing dozens of separate serial tasks at once, highly interconnected parallel code where the cores have to talk to each other constantly, or highly branching code. Those workloads are exactly what the Nehalem-EX is meant for.
That said, despite not being designed for it, the 8-core, 16-thread Nehalem with SSE4.2 SIMD should be able to put out some serious parallel floating point numbers, perhaps even surpassing the new Cell, all while being a monster as a more general-purpose processor for commercial workload servers.
Actually most of us Admins are just trying to save our jobs. We are willing to use old tech if it means more headcount.
Are you a single core cpu?
"We are willing to use old tech if it means more headcount"
You should reintroduce typewriters, paper filing, etc so you can increase the headcount. Full employment FTW!
I'm hoping my technical manager puts in a capital request for one of these. Our dual quad Xeons are starting to show their age. =P
So here's the million dollar question - would Vista or Win7 actually use, or allow the use of, all 8 cores properly? And if they do, what programs currently can take advantage of all cores? Correct me if I'm wrong, but games (which for the most part are the kinda system pushing software that potentially could benefit the most from this) still mostly utilize only 1 core, maybe 2, but I'm 99.99% sure no games take advantage of 4 cores, let alone 8.
So, in short, who the heck needs this??
In a nutshell, server farms, e.g. Google, Amazon, Ebay, Financial Exchanges.
When programming for multiple cores, it typically doesn't matter if you have two or twenty-two; the logic is the same (although the architecture may vary slightly, due to optimizations of thread count and available cores). To think about it another way, many programs have multiple threads. In a game, these might maintain communications with other players, maintain the player's position in their virtual world, etc. - all doing their own thing - and you may have dozens of these threads running. A simplistic view is that each core can service one thread (and an HT core can service two), so the more cores you have available, the more processing power you have.
It's important to note, however, that not all programs can be broken down into distinct parallel "chunks", so having more cores doesn't necessarily mean faster - it all depends on the program.
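That caveat is usually put in numbers with Amdahl's law: if only a fraction p of a program's run time can be spread across n cores, the best possible speedup is 1 / ((1 - p) + p / n). Here is a rough sketch of that formula; the function name `amdahl_speedup` is made up for illustration, and it ignores real-world overheads like thread synchronization and memory contention.

```c
#include <assert.h>

/* Amdahl's law: upper bound on speedup when only part of a program
   can run in parallel.
   parallel_fraction: share of run time that parallelizes (0.0 to 1.0)
   cores:             number of cores available (at least 1) */
double amdahl_speedup(double parallel_fraction, int cores)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(cores >= 1);
    /* the serial part stays as-is; only the parallel part is divided up */
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores);
}
```

For example, a program that is only half parallelizable tops out at roughly 1.78x on 8 cores, while a perfectly parallel one gets the full 8x.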
I code medical imaging systems, and one such system services several million MRI, CAT scan, etc studies per year. I run between 30 and 60 threads at any given time; a CPU like this would be killer.
^^^ like they said, there are a lot of special applications.
All of them combined are still a small percentage of computer users, but they do very important work (except animators, they're more pretentious than useful).
These are server chips, not desktop chips. And servers will use as many cores as you can give them. Server operating systems are designed to use as many CPUs as possible. In fact, Windows Server 2008 can handle 64 cores, and Windows Server 2008 R2 can handle 256 cores.
http://www.infoworld.com/t/hardware/microsoft-windows-server-2008-r2-support-256-cpu-cores-750
Does it really matter? I bet your existing desktop is more than fast enough for games or whatever :) These high-end chips are meant for seriously parallel work.
We've got some Sun Fire T5220s that have 1 processor, 8 cores, 4 threads per core (so 32 threads total). Plenty of server apps are still single-threaded, but if you can run multiple instances of them you're set (think: the way Apache spawns new processes to handle each request).
In Snow Leopard, Apple will be introducing Grand Central, which (they claim) will make all apps multi-core aware. I guess the idea is that it presents a bunch of cores as a single fast processor by breaking up tasks at the OS level, rather than requiring programmers to write apps that are MP-aware.
It will be interesting to see if Grand Central is just kind of a hack that speeds up unaware apps, but not as fast as an app properly written to be multithreaded... or if it actually gives us advantages that simply writing an app for multiple cores does not.
People still run Windows on servers?
(Yeah, okay, some people do, but even so, Windows is a pretty faulty assumption to begin a sentence about a server CPU with.)
Leaving aside platform (and as far as I'm aware Windows doesn't have any limitation on processor cores - once you get beyond one core, handling three or 23 is logically no different), most server applications fall into one of two categories:
a) Use way more threads than you will ever have processors (e.g. 200+)
or
b) Use a tunable number of threads that you can set to match/be appropriate for your processor count.
There are some server applications which might not be able to take full advantage of more than, say, four cores (I kind of just wrote one, actually) but these are relatively rare.
--sam
"So, in short, who the heck needs this??"
Currently, servers and data crunchers. For the rest of us, programming paradigms and APIs are being refined to support better parallelization. Games are just starting to take advantage of allowing multiple threads, and newer well-written game engines scale almost linearly with the number of cores and threads provided.
Eventually, having twice the number of cores will have about the same performance impact as having twice the clock speed.
I googled; apparently Windows is currently limited to 64 cores. The Unix variants handle more (usually 255).
So you can have at least four of these chips, even in a Windows box...
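The "at least four" figure falls out of simple division. A quick sketch, assuming the OS limit counts logical processors (hardware threads) rather than physical cores; that is an assumption on my part, and `max_chips` is a made-up name for illustration.

```c
#include <assert.h>

/* How many identical chips fit under an OS processor limit.
   Assumption: os_limit counts logical processors (hardware threads). */
int max_chips(int os_limit, int cores_per_chip, int threads_per_core)
{
    assert(os_limit > 0 && cores_per_chip > 0 && threads_per_core > 0);
    /* integer division: a partially usable chip does not count */
    return os_limit / (cores_per_chip * threads_per_core);
}
```

With a limit of 64 and 8 cores at 2 threads each, `max_chips(64, 8, 2)` gives 4. If the limit counted physical cores instead (pass 1 for `threads_per_core`), it would give 8, so "at least four" holds either way.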
HAHAH... This chip is for monster enterprise servers, not gaming dorks. And yes, commercial workloads like OLTP, OLAP, large databases, etc can easily take advantage of all 64 cores in an 8-socket machine.
There's Turbo Boost.. but no Super Pursuit Mode? lame.
Coming soon to a server near you: 32 cores, 256GB of RAM and a sh!tload of VMs...
IBM already has linkable 5U boxes that are 4-way quads with up to 128GB of DDR3. You can link up to 4 of them together as one physical machine: 64 physical cores and 512GB of DDR3 RAM. They were showing them off at the 2008 Server/SQL/VS launch events. Retool the boards for these chips and it's a VM technogasm.
Question is, does it unfold proteins?
I'm shocked and disappointed no one's mentioned Crysis. But I like your take on the joke.