AMD's FireStream 9250: first processor to top 1 Teraflop
AMD's second-generation FireStream 9250 just broke the single-precision teraflop barrier at the International Supercomputing Conference in Germany. The proc takes advantage of AMD's GPU expertise to augment the processing power of your rig's CPU with an additional 8 gigaflops per watt of processing from this 150 watt processor. Developers say that's a 55x performance bump compared to crunching financial analysis code, for example, on a CPU alone. The FireStream 9250 fits into a single PCI slot and includes double-precision floating point hardware performing at more than 200 gigaflops. The processor and supporting SDK are due for release in Q3 for $999.
Update: According to TGDaily, the 9250 features ATI's upcoming RV770 GPU at its core -- the foundation of future 4800-series graphics cards. So four cards set up in CrossFire X should be capable of offering your next gaming rig an additional 5 Teraflops or power. You know, in theory.
Read -- press release
Read -- TGDaily

Does this mean that future PCs will be using PCI slots for more processing? I am not too sure what this actually means for the average PC user (games/photo/video, etc.), so could someone please explain it to me?
The article is slightly misworded, they don't augment your processor, they're peripheral processing power that would be used by programs that researchers would write specifically to run on those cards. They could also be used in distributed computing setups. The average end user software company isn't going to write code for these, because I doubt very many end-users are going to have these. The future for personal computers is in smaller computers (laptops), and multi-core/thread processors that perform everything--that's what I think.
"The average end user software company isn't going to write code for these, because I doubt very many end-users are going to have these."
My understanding is that's precisely what AMD is trying to achieve: integrate the GPU and CPU, and make it all cheap and affordable.
Transistors are cheap, anyway. It doesn't have to be a high-end GPU, just a specialized and sufficiently fast one. The majority of the market is cheap 2D cards (with rudimentary 3D) anyway.
"The future for personal computers is in smaller computers (laptops), and multi-core/thread processors that perform everything--that's what I think."
You're saying literally what AMD says: multi-core processors. But the cores would be different: some of them generic CPU cores, some specialized GPU cores.
Computers:
a series of cores that run on a series of tubes
an additional 5 teraflops, or power? choices, choices...
WOW!!
Is it time to believe in AMD again?
I know I should not say this but... I think this will run Crysis. LOL
Yeah, it should add a couple of fps, definitely ;)
> Is it time to believe in AMD again?
As long as nVidia and Intel quarrel over every little matter, AMD/ATI has a chance.
Does that mean my PS3 has the power of 2 of these?
The PS3 processor is a completely different architecture and, honestly, not as powerful as the specs lead you to believe.
No. Not even 25% as powerful. The PS3 basically has a 7800GTX 256MB clocked at 7800GTX 512MB speeds. It doesn't even remotely come close to a single 3870 card, which is 3 times as powerful. The 3870x2 is ideally 80% more powerful than a single 3870. The RV770 core is twice as powerful as a 3870x2.
So multiply that chain of power to see how weak your PS3 is compared to a modern PC.
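To make that chain concrete, here is a minimal Python sketch that multiplies the ratios quoted above. The three factors are the commenter's own rough claims, not benchmark results, and the variable names are made up for illustration.

```python
import math

# Relative-performance factors as claimed in the comment above.
# These are rough forum estimates, not measured benchmarks.
claimed_factors = [
    ("HD 3870 vs. PS3 GPU", 3.0),
    ("HD 3870 X2 vs. HD 3870", 1.8),   # "80% more powerful"
    ("RV770 vs. HD 3870 X2", 2.0),
]

# Chaining the ratios gives the implied RV770-to-PS3-GPU ratio.
rv770_vs_ps3 = math.prod(factor for _, factor in claimed_factors)

print(f"Implied RV770 vs. PS3 GPU: about {rv770_vs_ps3:.1f}x")
print(f"PS3 GPU as a share of one RV770: about {100 / rv770_vs_ps3:.0f}%")
```

If those claims hold, the chain works out to roughly 10.8x, which would put the PS3's GPU at around 9% of a single RV770.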
I think he meant the Cell processor. Isn't it said that it can reach 2 teraflops?
LMAO, you wish. Even then, what are you gonna run? Pfft.
@nerdtalker
Well, what would you run on this chip, either? The article just said it will come with an SDK, so you'll have to program stuff for it as well. Not much uses the PS3's power right now, but Folding@home does, and games are slowly getting used to it. But the 4800s sound tempting for PC gaming, if they use this chip for the GPU.
IBM rates the full Cell chip at 192 GFLOPS peak.
The PS3 uses only 7 of Cell's 8 secondary processors, and games get access to just 6 of them.
Ok I am going to do the "imagine a beowulf cluster" thing. So you get a thousand of these, and fit them one per $1000 machine.
That means for the cost of $2,000,000 you can build a petaflop beowulf style cluster?
Ok let's throw in $1,000,000 for all the networking type equipment.
Then another 1 million for maintenance.
And $100,000 for electricity. Another $100,000 for rent.
That means you can have a single precision petaflop capable machine built for "only" $4.2 million.
If you want petaflop double precision .. that's gonna cost only around $19 million.
How much was the IBM Roadrunner machine? It's gotten up to a petaflop on LINPACK, right?
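For anyone who wants to check the back-of-the-envelope numbers above, here is a rough Python sketch. The card price and per-card throughput come from the article; the $1,000 host and the overhead figures are the commenter's guesses, and the function and constant names are hypothetical. It uses peak numbers only and ignores interconnect losses and real-world efficiency.

```python
import math

CARD_PRICE = 1_000           # USD per FireStream 9250 (from the article)
HOST_PRICE = 1_000           # USD per host machine (commenter's assumption)
CARD_GFLOPS_SP = 1_000       # single precision: about 1 teraflop per card
CARD_GFLOPS_DP = 200         # double precision: "more than 200 gigaflops"
PETAFLOP_IN_GFLOPS = 1_000_000

# Overheads as guessed in the comment, in USD.
OVERHEADS = {
    "networking": 1_000_000,
    "maintenance": 1_000_000,
    "electricity": 100_000,
    "rent": 100_000,
}

def cards_needed(card_gflops):
    """Cards required to reach one petaflop of peak throughput."""
    return math.ceil(PETAFLOP_IN_GFLOPS / card_gflops)

def hardware_cost(cards, cards_per_host=1):
    """Cost of the cards plus the host machines they sit in."""
    hosts = math.ceil(cards / cards_per_host)
    return cards * CARD_PRICE + hosts * HOST_PRICE

sp_cards = cards_needed(CARD_GFLOPS_SP)
sp_total = hardware_cost(sp_cards) + sum(OVERHEADS.values())

dp_cards = cards_needed(CARD_GFLOPS_DP)
dp_hardware = hardware_cost(dp_cards)

print(f"Single precision: {sp_cards} cards, ${sp_total:,} total")
print(f"Double precision: {dp_cards} cards, ${dp_hardware:,} hardware only")
```

The single-precision total reproduces the $4.2 million figure. For double precision the hardware alone comes to $10 million for 5,000 cards, so the $19 million estimate implies about $9 million of overheads; how those were scaled is not stated in the comment.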
Why not put four of them in each $1k machine, giving you 4 teraflops per $5k machine?
250 of them, and you'll have a petaflop for $1.25 million.
That'll cut the maintenance and networking by at least half, so half a million each, and leaving the rest the same gives a total of $2.45 million.
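The denser layout can be checked the same way. This is a small Python sketch under the reply's own assumptions: four $1,000 cards per $1,000 host, networking and maintenance halved, electricity and rent unchanged. Whether four 150 watt cards would actually fit in one such machine is not addressed in the thread and is simply assumed here.

```python
CARDS_PER_HOST = 4
CARD_PRICE = 1_000                  # USD, from the article
HOST_PRICE = 1_000                  # USD, commenter's assumption
TOTAL_CARDS = 1_000                 # about 1 teraflop each, so 1 petaflop peak

hosts = TOTAL_CARDS // CARDS_PER_HOST
cost_per_host = CARDS_PER_HOST * CARD_PRICE + HOST_PRICE   # the "$5k machine"
hardware = hosts * cost_per_host

# Overheads: networking and maintenance halved, the rest as before.
networking = 1_000_000 // 2
maintenance = 1_000_000 // 2
electricity = 100_000
rent = 100_000

total = hardware + networking + maintenance + electricity + rent
print(f"{hosts} hosts, ${hardware:,} hardware, ${total:,} total")
```

This reproduces the figures in the reply: 250 hosts, $1.25 million in hardware, and $2.45 million overall.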
Peter-> Are you an accountant?
lol. Anyway, I think AMD is taking advantage of its video department to compete better with Nvidia and Intel at the same time, which is pretty sweet. As much as I think that's pretty cool of AMD, I still want Intel to be the best at processing or whatever they are up to (for selfish reasons). If I were more impartial I'd be rooting for AMD.
Anyway, way to go. If it wasn't for AMD we would be a couple of years behind on today's technology (since Intel is always working to be better than AMD). I guess I wouldn't mind getting a machine with a couple of these in it :P he he..
They are NOT competing with Intel. These GPGPUs are NOT true general purpose processors. You CANNOT execute anything on this card except code written specifically for it. The range of instructions and calculations is SEVERELY limited to small, simple maths. You have to break down your calculations to this level to use the GPU for calculation.
You still need a true general purpose CPU from Intel or AMD.
But will the processor itself be a mega-epic-flop?
These are going to get shat on by Nehalem and the Nvidia 9800's
I see OpenCL's involvement in this...
or not...
maybe you need to actually read the press release.
In keeping with its open systems philosophy, AMD has also joined the Khronos Compute Working Group. This working group’s goals include developing industry standards for data parallel programming and working with proposed specifications like OpenCL. The OpenCL specification can help provide developers with an easy path to development across multiple platforms.
“An open industry standard programming specification will help drive broad-based support for stream computing technology in mainstream applications,” said Rick Bergman, senior vice president and general manager, Graphics Product Group, AMD. “We believe that OpenCL is a step in the right direction and we fully support this effort. AMD intends to ensure that the AMD Stream SDK rapidly evolves to comply with open industry standards as they emerge.”
Good to hear. Now we just need an OpenCL frontend for nVidia's CUDA systems as well, and I can actually start using it.
niiiiice
Where are those guys that always ask about Crysis now? (No, really, stay outta here.)
That price point should get lots of attention from small start-up companies, and hopefully a more discounted price for colleges and universities trying to find a way to do more with their budgets.
Interested to find out what the SDK documentation says.
1 Teraflop?
What would I do with that? Watch every episode of Family Guy---at once?
GIGIDY GIGIDY!!!
lol ouch intel did that hurt your nuts ?
Wait we are forgetting one thing... AMD needs to get a factory to actually mass produce these efficiently and reliably. Until then, I'll keep my fingers crossed.
Oh yea... and like someone mentioned above, these processors are more limited than your general purpose CPUs. Still fun to think about raw computing power though :D
TSMC, dork.
Pretty sweet. :)
No one ever reviews ASUS Tablets - R1E
* Decent Sized Screen for a laptop
* 2.4GHz / 4GB RAM (I put a 200+ 7200 drive in mine)
* comes with extra battery, Drive Enclosure and LightScribe DVD/Multi-Drive
* To dream for case (nicely packaged, too)
and - for less than most 'business' class models!
You can borrow mine when you pry it from my cold dead fingers!
(db)
Sorry, I wrote that comment last week, for a completely different review - it just now popped up here????
gophigure!
(db)
great!!! so what is a teraflop, exactly?
is this competing with the Toshiba SpursEngine????
yeah what the hell is a teraflop
is it a THz with a cooler name?
and it was leading me to believe that it was a CPU because AMD doesn't make GPUs, their ATI brand does
i would rather have this fast a proc than a graphics proc that fast
and for the guy who built a $4.2 million petaflop system, where does peta come in
i thought it was....
Kilo
Mega
Giga
Tera
Exa
Yoda
i was wrong
i need to go figure out what the hell a teraflop is
FLOPS stands for floating-point operations per second. Thus this chip is capable of a trillion floating-point operations per second. This is different from Hz or gigahertz, which is just the clock frequency of the CPU.
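To show how FLOPS and clock speed relate, here is a minimal Python sketch of the usual peak-throughput estimate: execution units times floating-point operations per unit per cycle times clock frequency. The figures plugged in (800 stream processors, 625 MHz, one multiply-add counted as 2 operations per cycle) are assumptions for illustration and are not stated in the article.

```python
def peak_flops(units, flops_per_unit_per_cycle, clock_hz):
    """Theoretical peak: every unit busy on every cycle."""
    return units * flops_per_unit_per_cycle * clock_hz

# Assumed figures for an RV770-class chip (not from the article).
STREAM_PROCESSORS = 800
FLOPS_PER_CYCLE = 2          # one multiply-add counted as two operations
CLOCK_HZ = 625e6             # 625 MHz

peak = peak_flops(STREAM_PROCESSORS, FLOPS_PER_CYCLE, CLOCK_HZ)
print(f"Clock: {CLOCK_HZ / 1e9:.3f} GHz")
print(f"Peak:  {peak / 1e12:.1f} teraflops")
```

The point of the sketch is that a chip clocked well under 1 GHz can still reach a teraflop of peak throughput, because FLOPS counts the work done across all units per second and the clock only counts cycles.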
Alright kids! It's basic math time!!!
So... let's take HDDs for example. If you have a gigabyte, you have 1 million bytes. Now, If you add terabytes to the equation, you get 1000 gigabytes. 1000GB=1TB=1 Billion bytes.
So, in that case, if you have 1 teraflop, you get 1 Billion, er, flops.
Good luck figuring out what a flop is. I dunno any better than you do. I'm just good at using a calculator. ;)
tera, peta, exa
$1,000,000 for the networking? For $150,000 you can get a Cisco 6509 with a Sup 720 and 288 free ports... OK, I'll build it for $850,000... ah, the cables are very expensive :)
Basically what this card will allow is for you to use your graphics cards to crunch numbers. Think about that for a minute. All your computer does is crunch numbers; however, this will mainly be geared toward those tasks that you turn on and your computer just crunches numbers by itself without input from you.
This chip should increase the speed you defrag, scan for viruses, burn DVD/CD's, and other process intensive tasks.
"This chip should increase the speed you defrag, scan for viruses, burn DVD/CD's, and other process intensive tasks."
Those are all tasks that are not (yet, SSDs can change that) limited by CPU/processing power. Defragmentation and virus scanning are limited by HDD read and write speed, and burning a CD/DVD is limited by the speed of your optical drive. These chips can be used for converting video and audio files into other formats/codecs. One of these chips would (optimally) be around 10 times as fast as Intel's fastest processor for those kinds of tasks.
Other uses are, for example, Photoshop, weather forecast calculations, data processing after seismic scans (for mapping oil fields or earthquake-prone areas, for example), MRI scan data processing and so on...
Good point, but in defense of me letting the stupid show, when I said burn DVD/CDs I was including the process of encoding (which would be helped greatly and is what I was mainly focused on).
Sorry, no more posting at 3 a.m. after sex.
@Some Kid:
A FLOP is a measure of performance, I'll just quote Wiki for you:
"In computing, FLOPS (or flops or flop/s) is an acronym meaning FLoating point Operations Per Second. The FLOPS is a measure of a computer's performance, especially in fields of scientific calculations that make heavy use of floating point calculations, similar to instructions per second."
So one of these cards is capable of 1 TeraFLOPS single precision (precision here is just a measure of accuracy); peak performance for double precision is around 200 GigaFLOPS. Just for some comparison, the Cell chip in the PS3 has a peak of around 200 GigaFLOPS single precision and 20 GigaFLOPS double precision. The GeForce 8800GTX, released at around the same time as the PS3, does around 500 GigaFLOPS single precision, and the Core 2 Extreme QX9770 has a peak of about 100 GigaFLOPS. That's just to give you an idea of what kind of numbers we are talking about.
@MastrCake:
Your numbers are a little off, kilo=1000, mega=1000 000, giga=1000 000 000 (1 billion, not 1 million) and 1 Tera is 1 Trillion (10^12). So you're off by about 3 zeros :P, but 3 times zero is still nothing :P.
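The corrected prefixes are easy to check with a few lines of Python. This is only an illustrative sketch; the dictionary and helper names are made up, and it covers decimal SI prefixes only (not the 1024-based units sometimes used for storage).

```python
# Decimal SI prefixes expressed as powers of ten.
SI_EXPONENTS = {
    "kilo": 3,
    "mega": 6,
    "giga": 9,
    "tera": 12,
    "peta": 15,
    "exa": 18,
}

def to_base_units(value, prefix):
    """Convert e.g. (200, 'giga') to the plain number 200 * 10**9."""
    return value * 10 ** SI_EXPONENTS[prefix]

one_teraflops = to_base_units(1, "tera")
dp_peak = to_base_units(200, "giga")     # double-precision figure from the article

print(f"1 tera = {one_teraflops:,}")
print("1000 giga == 1 tera:", to_base_units(1000, "giga") == one_teraflops)
print("1000 tera == 1 peta:", to_base_units(1000, "tera") == to_base_units(1, "peta"))
print("Single- to double-precision ratio:", one_teraflops // dp_peak)
```

So 1 teraflop is a trillion (10^12) operations per second, 1,000 of these cards give a petaflop of peak single precision, and the card's single-precision peak is about five times its double-precision peak.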
Am I correct in assuming that this card would be a valuable asset to such things as weather forecasting, or other data-intensive works?
Alexander, that depends.
This card is effectively trying to compete with IBM's Cell Broadband Architecture PCI cards.
Small scale developers may only have access to a few of these cards, as may small businesses who run specialized data services.
The bulk, however, will be run in large companies, slotted into server racks with AMD Opteron processors. They will function as really fast coprocessors, accelerating calculations of specific types, usually the most data intensive and the ones that are best suited for conversion to the FireStream architecture.
Weather forecasting data centers are absolutely humongous, and 150 watts for a teraflop is a lot as far as I am aware. I believe IBM's Cell platform is far, far cheaper to run. Since power usage is the main concern for data centers, not just because it's pricey but also because it's difficult to find locations where the power grid is powerful enough that it won't burn out if you hook up 200 computers, I'm not sure if this is really all that interesting by itself.
Furthermore, IBM is well established in academia and businesses and their Cell Broadband Engine Architecture is old, and therefore has a much broader developer base - so I reckon this release is "for show" and for AMD partners.
Of course, it could also be that this architecture is awesome and easy to develop for, in which case this release is worth somewhat more.
What we can use this for, unless we're high performance computing geeks, is to say "yay, the next ATI gfx card will r00l!".
Finally. It CAN play Crysis. Doomsday is upon us
I am blown away by this 5 TFlops! The 100th-ranked supercomputer has 19 TFlops with 2048 cores and power consumption measured in hundreds of kilowatts, and it was built this year. Such a waste of money. Did you
This article isn't entirely right: the FireStream 9250 needs a PCIe 2.0 slot, not a PCI slot.