SharkyForums.Com - Print: Itanium - any comments?

    Itanium - any comments?
    By Arcadian November 07, 2000, 09:16 PM

    Yes, it's me, Arcadian, and everybody is starting to realize by now that servers are my favorite kind of computer. That's why I want to open the floor to comments about Intel's new IA-64 architecture and their upcoming Itanium chip.

    What do you guys think of Itanium? Are there any major worries or drawbacks that you foresee? Do you think it will be successful, in spite of being late? What do you think of Intel pushing an instruction set that is completely new? Anyone out there with confidence in Itanium? Does anyone out there not know what Itanium is?

    We can discuss anything here, including what people think of AMD's Sledgehammer, and whether it will compete with the Itanium processor. I happen to be the kind of person who has a lot of confidence in Itanium, but I would love a discussion that can prove me wrong. If anyone has any technical specifications they can list, or ideas of what may be included in the architecture, please speak up! This should be an interesting conversation.

    By Humus November 07, 2000, 11:34 PM

    Well, I think the Itanium is hot ... well, literally, with its 320 million transistors ...
    But actually, I feel like Itanium will not be a hit, while I do believe in the IA-64 architecture. The main disadvantage of the Itanium will be the lack of applications. This will probably change with time, but it took something like 10 years to move over to 32-bit ...

    About the Sledgehammer, well, I was kinda disappointed when I first heard of it. I hoped that they would follow the IA-64 route, but they probably realized the 64-bit need a little too late. They just didn't have the time to do a completely new design.

    By Down8 November 08, 2000, 01:42 AM

    I really want an Itanium. Or, rather, any IA-64 processor. If the P4 doesn't kill Intel, I would probably go with that over the Sledgehammer [but when I next upgrade it will be to a T'bird].

    It will take some years to completely transfer over to 64-bit, but Win9x still had compatible code to make use of 16-bit stuff, so I expect that Whistler will be similar for 32-bit.

    -bZj

    By jtshaw November 08, 2000, 10:44 AM

    Intel has a lot of power over companies out there. I know HP and Intel worked hard getting lots of Linux stuff ported to IA-64 (now basically anything that runs on Linux runs on IA-64...though it was supposedly very easy to do). MS apparently has a Windows OS that will run on IA-64 now too. With some pressure from MS and Intel, people will fold and start porting their software over to IA-64. It won't happen overnight, but it will probably happen sooner than many may think. You have to take into account that most software packages don't need to be ported to IA-64, because when it first comes around it will be for expensive servers only. This will give non-server software designers plenty of time to port their stuff if IA-64 is to ever become mainstream. Anyway, I think the major hurdle will be Intel actually getting the Itanium out there....if I am not mistaken it is almost pushing a whole year late....

    By jaywallen November 08, 2000, 11:24 AM

    What do I think of the "Itanium"? I hope the marketing guys who think of the names for these things get dope-slapped. I'm gettin' tired of made-up names on cars, CPUs and children. Maybe they should call this thing the "Unobtainium"!

    :P

    Regards,
    Jim

    Actually, I'm looking forward to the chip and the architecture on which it stands.

    By Arcadian November 08, 2000, 01:18 PM

    jtshaw, I have heard that there are already over 400 apps compiled natively for IA-64. This includes a lot of scientific apps as well.

    One professor from a university (I forget which one) runs scientific programs and said that there was a routine he used to run on the supercomputers of two years ago that took a day to complete and get data. On a 4-processor Itanium system (he has one of the pilot release systems), this same task took less than 1/2 hour. Granted, computers have matured over the last two years, but to get a 48x improvement over a supercomputer seemed impressive to me!

    I think there will be a lot of software apps available for Itanium by launch (I think I read somewhere that launch was in Q1 2001). I also think Itanium will give UltraSPARC III a run for its money.

    Also, this comment is for jaywallen. What parts of the architecture do you look forward to the most?

    By jtshaw November 08, 2000, 02:02 PM

    I certainly don't doubt that; I know basically everything running on Linux or BSD runs on IA-64. I don't know much about what has been ported Windows-wise, but I know the OS has, and I would assume anything developed for servers or by universities would be the first to go over. I think the Itanium will be a mid- to high-end server wrecking ball when Intel gets them out the door. UltraSPARC III is going to get a run for its money, I believe.

    quote:Originally posted by Arcadian:
    jtshaw, I have heard that there are already over 400 apps compiled natively for IA-64. This includes a lot of scientific apps as well.

    One professor from a university (I forget which one) runs scientific programs and said that there was a routine he used to run on the supercomputers of two years ago that took a day to complete and get data. On a 4-processor Itanium system (he has one of the pilot release systems), this same task took less than 1/2 hour. Granted, computers have matured over the last two years, but to get a 48x improvement over a supercomputer seemed impressive to me!

    I think there will be a lot of software apps available for Itanium by launch (I think I read somewhere that launch was in Q1 2001). I also think Itanium will give UltraSPARC III a run for its money.

    Also, this comment is for jaywallen. What parts of the architecture do you look forward to the most?

    By Moridin November 08, 2000, 03:19 PM

    My biggest concern about Itanium is the high transistor count and large power requirements. While these are not really problems in themselves, they may be indicative of underlying problems; in particular, they indicate a very complex design.

    I would not normally worry about a complex design, but Itanium's EPIC architecture is based on VLIW, and the primary goal of VLIW is to simplify the chip design. This is accomplished by using the compiler to find parallelism rather than dedicated hardware built into the chip.

    I wonder why the chip is so complex. IMHO EPIC should have resulted in a simple, small, fast chip, not a large, complex, slow (in MHz) chip. The other concern is that, now that complexity has been shifted from hardware into the compiler, the compiler becomes much more difficult to write.
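
    To make the "compiler finds the parallelism" idea concrete, here is a minimal C sketch (hypothetical code, not from Intel or any compiler vendor). The three adds in the first function have no data dependences, so a VLIW/EPIC compiler can schedule them into the same issue group; the second function is a serial chain that no amount of static scheduling can speed up.

    /* Hypothetical illustration of compiler-found parallelism. */
    int independent(int a, int b, int c, int d, int e, int f)
    {
        int x = a + b;      /* these three adds are independent, so the    */
        int y = c + d;      /* compiler can place them in one issue group  */
        int z = e + f;
        return x + y + z;   /* only this final add has to wait             */
    }

    int dependent(int a, int b, int c, int d)
    {
        int x = a + b;      /* each statement needs the previous result,   */
        x = x + c;          /* so the chain is inherently serial no matter */
        x = x + d;          /* how wide the machine is                     */
        return x;
    }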

    The rise of RISC architectures in the late '80s and early '90s came in large part from the recognition that careful selection of the ISA could simplify both the compiler and the micro-architecture of the processor. This resulted in better implementations of both and yielded better performance than CISC processors, even though the CISC processors had more powerful instruction sets.

    Since then Intel has done an amazing job of keeping up with and even passing most RISC architectures. The cost was that both chips and compilers were larger and harder to design than for a RISC chip of equivalent power.

    It remains to be seen whether the complexity of Itanium is the result of this being the first IA-64 processor, or if it indicates Intel made some fundamental mistakes when it defined the IA-64 ISA.

    By Moridin November 08, 2000, 03:39 PM

    A few more points.

    I like the idea of allowing the compiler to do a lot of the work of finding parallelism, but I don't think this allows you to completely eliminate OOOE, which is what Intel has tried with Itanium. OOOE allows you to deal with some problems that VLIW/EPIC cannot handle on its own.

    You do not know at compile time whether your data is in cache or main memory. You can prefetch and hope it is there by the time you need it, but this will not always work, and if it fails the processor may sit idle for hundreds of cycles. OOOE, on the other hand, can deal with the problem dynamically when it occurs.
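
    A minimal C sketch (hypothetical code) of that point: whether *p is resident in cache cannot be known when this function is compiled. An in-order machine stalls at the line that uses the loaded value for the full miss latency; an out-of-order core can keep chewing through the independent loop while the load is outstanding.

    /* Hypothetical illustration: a load of unknown latency plus independent work. */
    long sum_with_pending_load(const long *p, const long *v, int n)
    {
        long loaded = *p;            /* may hit L1 or may miss all the way to DRAM  */
        long acc = 0;
        for (int i = 0; i < n; i++)  /* independent work: never touches 'loaded'    */
            acc += v[i];
        return acc + loaded;         /* the use: this is where an in-order stall bites */
    }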

    The second problem associated with the lack of OOOE is that OOOE resources are the same as the resources required by SMT. I can't see IA-64 doing SMT without adding OOOE as well. This would add complexity and defeat the VLIW/EPIC concept of eliminating complexity by eliminating OOOE.

    In the long run though the combination of explicit parallelism with an out of order core could be very powerful, if the IA-64 ISA can be made to do OOOE.

    The other problem I have is the support for IA-32. I'd be willing to bet that a lot of the complexity of Itanium comes from IA-32 support. Intel would have been much better off not supporting it.

    By Moridin November 08, 2000, 03:56 PM

    quote:Originally posted by Humus:
    Well, I think the Itanium is hot ... well, literally, with its 320 million transistors ...
    But actually, I feel like Itanium will not be a hit, while I do believe in the IA-64 architecture. The main disadvantage of the Itanium will be the lack of applications. This will probably change with time, but it took something like 10 years to move over to 32-bit ...

    About the Sledgehammer, well, I was kinda disappointed when I first heard of it. I hoped that they would follow the IA-64 route, but they probably realized the 64-bit need a little too late. They just didn't have the time to do a completely new design.

    IA-64 will be a server/workstation chip for quite some time and may never move down into the desktop. As such it doesn't need all that much in the way of software support. As long as Oracle, SGI's graphic app (what is it called again? MAYA?) and a few other key apps are ported IA-64 will be ok in this regard.

    As far as OS support goes I'm sure HP will port HP-UX. That along with Dynamo will give IA-64 access to almost all current server and workstation applications.


    By Moridin November 08, 2000, 04:08 PM

    Ok, I'll try for four posts in a row.

    I just wanted to add that in the high-end server market a powerful processor is not an absolute requirement. Sun's USII processors have performed well below most of their competitors over the last 5 years, yet Sun has been very successful in that time. At the same time the Alpha, which is the fastest processor available, has done poorly.

    In this market the apps lend themselves to multiple processors, so the power of the individual processors is less important than the multi-processor implementation, i.e. the bus, memory system, chipset and OS. If you do these things well you can actually increase your sales, since the customer buys more processors to do the same job and the fastest system is the one with the most processors.

    By JabberJaw November 08, 2000, 05:37 PM

    Sorry, what's EPIC? VLIW? OOOE? Thanks

    By Arcadian November 08, 2000, 06:04 PM

    quote:Originally posted by Moridin:
    My biggest concern about Itanium is the high transistor count and large power requirements. While these are not really problems in themselves, they may be indicative of underlying problems; in particular, they indicate a very complex design.

    As long as Itanium can be air cooled, it doesn't matter how large it is, or how high the power requirements go. In the high end server world, it's trivial to get more fans and larger power supplies.

    In fact, I'm sure the general mood during design was to get high performance, even if you have to brute-force it. In other words, if there is a method of getting 5% more performance, but it increases the core temperature a few degrees, do it anyway.

    From what I understand about McKinley, a smaller die size and lower power were more of a factor in its design. McKinley was also designed fairly independently from Merced (Itanium), so there were many differences in implementation. The biggest difference was that Merced had to come out first, so a small die with lower power was less of an issue.

    quote:Originally posted by Moridin:
    I would not normally worry about a complex design, but Itanium's EPIC architecture is based on VLIW, and the primary goal of VLIW is to simplify the chip design. This is accomplished by using the compiler to find parallelism rather than dedicated hardware built into the chip.

    This is true for VLIW, but not for EPIC. VLIW is very compiler intensive, while EPIC is a mix between compiler and hardware assisted optimizations.

    quote:Originally posted by Moridin:
    I wonder why the chip is so complex. IMHO EPIC should have resulted in a simple, small, fast chip, not a large, complex, slow (in MHz) chip. The other concern is that, now that complexity has been shifted from hardware into the compiler, the compiler becomes much more difficult to write.

    Again, complexity came from trying to get the chip to debut quicker. McKinley will have a much smaller core, AFAIK.

    It's also true that the compiler is tougher to write, but as I said above, there is hardware that continues to optimize the code.

    quote:Originally posted by Moridin:
    The rise of RISC architectures in the late '80s and early '90s came in large part from the recognition that careful selection of the ISA could simplify both the compiler and the micro-architecture of the processor. This resulted in better implementations of both and yielded better performance than CISC processors, even though the CISC processors had more powerful instruction sets.

    Since then Intel has done an amazing job of keeping up with and even passing most RISC architectures. The cost was that both chips and compilers were larger and harder to design than for a RISC chip of equivalent power.

    Good point, and an excellent observation. A chip's performance can oftentimes be in the implementation rather than the architecture.

    quote:Originally posted by Moridin:
    It remains to be seen whether the complexity of Itanium is the result of this being the first IA-64 processor, or if it indicates Intel made some fundamental mistakes when it defined the IA-64 ISA.

    The IA-64 ISA began its design in the early '90s. It was created by some of the brightest minds in the industry. Chances are that the IA-64 ISA is pretty sound. The complexity of the die itself is not likely to affect performance, in my opinion. Sure, an optimized layout would be preferable, and we'll likely see that with McKinley.

    By Arcadian November 08, 2000, 06:14 PM

    quote:Originally posted by JabberJaw:
    Sorry, what's EPIC? VLIW? OOOE? Thanks

    VLIW = Very Long Instruction Word. It is an Instruction Set Architecture (ISA), just like RISC and CISC are ISAs, but the difference is that the compiler can take a bundle of instructions and align them so that they are more "friendly" for the processor. This way, there are certain optimizations that can be done to the CPU that can really improve performance. However, it does assume that you have a perfect compiler, and those are hard to make. Transmeta's Crusoe chip is a VLIW processor.

    EPIC = Explicitly Parallel Instruction Computing. This is an ISA based on VLIW, but it takes it to another level of practicality. Instead of depending 100% on the compiler, the processor also has hardware to optimize code. This gives the best of both worlds of VLIW and CISC. IA-64 is the first EPIC ISA, and we will have to see if it was worth it.

    OOOE = Out Of Order Execution. In the old days, instructions were fed to processors In Order. This is ok, but sometimes the processor has to wait for data to load from memory. An Out Of Order computer can execute other instructions while waiting for the first one to receive data. The P6 and K7 architectures both use OOOE, but VIA's Cyrix III does not. The Itanium is In Order, but the theory is that the compiler will order the instructions in a better way than an Out Of Order engine. Fortunately for EPIC, if the compiler does not, the CPU can still optimize further (this is limited, however). I'm eagerly waiting to see how the Itanium does.
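
    If it helps, here is a small C sketch (hypothetical code) of the "let the compiler do the ordering" idea. Both functions compute the same thing; in the second, the independent multiply has been hoisted between the load and its use, which is exactly the kind of static reordering an in-order EPIC chip relies on the compiler for, and which an OOOE chip would do for itself in hardware.

    /* Hypothetical illustration of static instruction scheduling. */
    int naive(const int *p, int a, int b)
    {
        int x = *p;     /* possibly long-latency load                       */
        int y = x + 1;  /* uses the load immediately: stalls an in-order CPU */
        int z = a * b;  /* independent work, issued too late to help        */
        return y + z;
    }

    int scheduled(const int *p, int a, int b)
    {
        int x = *p;     /* start the load early                             */
        int z = a * b;  /* independent work fills the load latency          */
        int y = x + 1;  /* by now the load has (hopefully) completed        */
        return y + z;
    }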

    Hope this answers your questions.

    By Bash November 09, 2000, 12:03 AM

    Concerning Itanium complexity...

    I guess that from what I've heard the Itanium shouldn't be an excessively complex chip. This is primarily due to its EPIC architecture -- because the instruction scheduling is done in the compiler, the Itanium doesn't need to include the excessively complicated instruction scheduling circuitry. I can't remember the number any more, but I think the P3 instruction scheduler takes a sizeable percentage of the chip area.

    So the question of the day is where all the transistors in the Itanium get used. One candidate is the register file. I believe the Itanium has 128 usable registers. What's that, about 10x the number of registers available in the x86 architecture? Also, I've heard the first-generation Itanium maintains backward compatibility with x86 binaries by having some kind of instruction translation hardware built in -- this must take a bit of space too. I suppose if they have any free transistors after that, we can just pray for more functional units.


    -Bash

    By Conrad Song November 09, 2000, 01:02 AM

    http://developer.intel.com/design/ia-64/microarch_ovw/index.htm

    Somehow I don't know of any server microprocessor out there today that ISN'T excessively complex. Itanium is a beast. Not the same animal as POWER4, but a beast just the same.

    By Arcadian November 09, 2000, 02:05 AM

    quote:Originally posted by Bash:
    Concerning Itanium complexity...

    I guess that from what I've heard the Itanium shouldn't be an excessively complex chip. This is primarily due to its EPIC architecture -- because the instruction scheduling is done in the compiler, the Itanium doesn't need to include the excessively complicated instruction scheduling circuitry. I can't remember the number any more, but I think the P3 instruction scheduler takes a sizeable percentage of the chip area.

    So the question of the day is where all the transistors in the Itanium get used. One candidate is the register file. I believe the Itanium has 128 usable registers. What's that, about 10x the number of registers available in the x86 architecture? Also, I've heard the first-generation Itanium maintains backward compatibility with x86 binaries by having some kind of instruction translation hardware built in -- this must take a bit of space too. I suppose if they have any free transistors after that, we can just pray for more functional units.

    -Bash

    Thanks for responding, Bash.

    I have two comments. First, it's hard to compare the number of registers in IA-64 vs IA-32, because IA-32 has a good amount of renaming registers. Just how many is something that Intel chooses not to disclose. However, I bet it's around 100.

    Second, the Itanium has IA-32 compatibility through the use of an entire IA-32 core right on the die. Yes, the die combines everything in IA-64 PLUS an IA-32 core for compatibility. The Itanium can switch between the two using an instruction that 64-bit Windows knows how to use.

    By Humus November 09, 2000, 02:36 PM

    quote:Originally posted by Arcadian:
    Thanks for responding, Bash.

    I have two comments. First, it's hard to compare the number of registers in IA-64 vs IA-32, because IA-32 has a good amount of renaming registers. Just how many is something that Intel chooses not to disclose. However, I bet it's around 100.

    Second, the Itanium has IA-32 compatibility through the use of an entire IA-32 core right on the die. Yes, the die combines everything in IA-64 PLUS an IA-32 core for compatibility. The Itanium can switch between the two using an instruction that 64-bit Windows knows how to use.

    Around 100??? You kidding? I'd bet it's below 20.
    I don't have exact numbers, but I think the Athlon only has 17 renaming registers, for the simple reason that more registers were useless. I mean, how many false dependencies can occur, and how long do you think it takes to get them resolved?

    By Bash November 09, 2000, 03:36 PM

    quote:Originally posted by Arcadian:
    Thanks for responding, Bash.

    I have two comments. First, it's hard to compare the number of registers in IA-64 vs IA-32, because IA-32 has a good amount of renaming registers. Just how many is something that Intel chooses not to disclose. However, I bet it's around 100...


    I know that the core of the P3 has a great deal of registers available for renaming, but there's a big difference between renaming registers and actual user-accessible registers. When you compile code for the x86, you have to continually load variables from memory because you don't have nearly enough registers to store all of them. With 128 user-accessible registers you'll be able to compile code much more efficiently. Of course, this is a necessity for the Itanium, since the compiler is doing all of the instruction scheduling too.
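
    A rough C sketch (hypothetical code) of that register-pressure point: all eight running sums below are live across the loop. Compiled for IA-32, with only 8 general registers (and some of those needed for the pointer, the index and the stack), several of the sums end up spilled to the stack and reloaded every iteration; with 128 user-accessible registers they can all simply live in registers for the whole loop.

    /* Hypothetical illustration of register pressure and spilling. */
    void eight_running_sums(const int *v, int n, int out[8])
    {
        int s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0, s5 = 0, s6 = 0, s7 = 0;

        for (int i = 0; i + 8 <= n; i += 8) {
            s0 += v[i];     s1 += v[i + 1];   /* eight accumulators live at once:    */
            s2 += v[i + 2]; s3 += v[i + 3];   /* more values than IA-32 has GPRs,    */
            s4 += v[i + 4]; s5 += v[i + 5];   /* so some must spill to memory there  */
            s6 += v[i + 6]; s7 += v[i + 7];
        }
        out[0] = s0; out[1] = s1; out[2] = s2; out[3] = s3;
        out[4] = s4; out[5] = s5; out[6] = s6; out[7] = s7;
    }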

    Oh about the transistor count in the 3rd or 4th message of this thread...320 MILLION??? Is that counting a 16meg L2 cache or something? I bet the core is around 32 million, which is approx. the same as the current P3 / Athlon.


    -Bash

    By jaywallen November 09, 2000, 04:07 PM

    quote:Also, this comment is for jaywallen. What parts of the architecture do you look forward to the most?

    Sorry, I was so busy being a smartass that I forgot to actually say something about the architecture.

    I'm a pragmatist, and I like the new architecture for what it promises me. My background is in physics with a specialty in medical imaging. So the thing I look forward to is finally having a platform for mere mortals that has the memory bandwidth and addressability, as well as the pure raw hunk, to do things like, say, Fourier transform-based high-resolution motion-corrected image reconstructions in near-real-time. Present-day solutions to image reconstruction needs are barely able to provide us with useful resolution and specificity in the output.

    Furthermore, a great deal of the detector event discrimination chores must be handled by the detection apparatus itself. The application of discrimination algorithms at the detectors makes the detection apparatus less efficient, but, worse yet, presupposes that the twits who designed the algorithms are not causing the apparatus to discard useful information! I'd rather grab all of the interaction events I can detect in raw form and analyze them, applying discrimination techniques, where multiple alternative algorithms can be imposed, at the back end.

    This means that the only absolute limitation for future analysis of medical images will be the actual absolute limitations of the detectors used in the imaging procedures. And subsequently developed analytic techniques will be applicable to the saved raw data, meaning that revised analyses and readings of those analyses may yield important new data from studies performed previously. As a scientist, I hate discarding data before it has been analyzed just because I was too stupid to be able to analyze it properly at the moment it was gathered!

    Regards,
    Jim

    By Arcadian November 09, 2000, 09:07 PM

    quote:Originally posted by Humus:
    Around 100??? You kidding? I'd bet it's below 20.
    I don't have exact numbers, but I think the Athlon only has 17 renaming registers, for the simple reason that more registers were useless. I mean, how many false dependencies can occur, and how long do you think it takes to get them resolved?

    Well, beyond the renaming registers, there are registers for each pipeline and lots of hidden registers for many purposes. Overall, I'd say the P6 architecture has well over 100 registers. However, it's interesting that only 8 are user-accessible.

    Also, this is for Bash: I just wanted to remind you that I mentioned registers because we were comparing die sizes. I realize that user-accessible registers mean more for performance. Thanks for the clarification, though.

    By Arcadian November 09, 2000, 09:15 PM

    quote:Originally posted by jaywallen:
    My background is in physics with a specialty in medical imaging. So the thing I look forward to is finally having a platform for mere mortals that has the memory bandwidth and addressability, as well as the pure raw hunk, to do things like, say, Fourier transform-based high-resolution motion-corrected image reconstructions in near-real-time.

    Regards,
    Jim

    Thanks for the detail, Jim. I think you'll find that Itanium has the raw floating-point power to match anything else out there at similar clock speeds, including Alpha. I hear that the floating-point engine can calculate 8 single-precision or 4 double-precision floating-point numbers at the same time. Compare this to SSE2, which can only do 4 single-precision or 2 double-precision floating-point calculations simultaneously. And SSE2 is not available for every program, while IA-64 will be taken advantage of from the beginning.
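
    To make the SSE2 half of that comparison concrete, here is a minimal C sketch using the standard SSE2 intrinsics (hypothetical example code, not from any particular program). One 128-bit XMM register holds 4 single-precision or 2 double-precision values, so one packed add performs 4 or 2 additions at once, which is where the "4 single / 2 double" figure comes from.

    #include <emmintrin.h>   /* SSE2 intrinsics */

    void add4_single(const float *a, const float *b, float *out)
    {
        __m128 va = _mm_loadu_ps(a);             /* 4 floats per register     */
        __m128 vb = _mm_loadu_ps(b);
        _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* 4 adds in one instruction */
    }

    void add2_double(const double *a, const double *b, double *out)
    {
        __m128d va = _mm_loadu_pd(a);            /* 2 doubles per register    */
        __m128d vb = _mm_loadu_pd(b);
        _mm_storeu_pd(out, _mm_add_pd(va, vb));  /* 2 adds in one instruction */
    }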

    Also, the Itanium's chipset, called the 460GX, uses memory interleaving to provide very high bandwidth. I think it will be more than enough to feed plenty of data to the power-hungry Itaniums. I'd really like to see some Itanium chips in action.
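
    For what it's worth, here is a rough C sketch of what memory interleaving means (the line size and bank count are made-up parameters, not the actual 460GX design): consecutive cache lines are spread across banks, so a streaming access keeps several banks busy at once instead of waiting out one bank's full access time for every line.

    #include <stdint.h>

    #define LINE_BYTES 64   /* assumed cache-line size  */
    #define NUM_BANKS   4   /* assumed 4-way interleave */

    /* Hypothetical bank-selection function: the low-order bits of the line
     * index pick the bank, so line N and line N+1 land in different banks
     * and their accesses can overlap in time. */
    static unsigned bank_of(uintptr_t addr)
    {
        return (unsigned)((addr / LINE_BYTES) % NUM_BANKS);
    }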

    By Humus November 10, 2000, 12:31 AM

    quote:Originally posted by Arcadian:
    Well, beyond the renaming registers, there are registers for each pipeline and lots of hidden registers for many purposes. Overall, I'd say the P6 architecture has well over 100 registers. However, it's interesting that only 8 are user-accessible.

    Sure, if you count all registers it's gonna be a huge amount ... but you said you thought there were over 100 renaming registers, which is very unlikely.

    By Arcadian November 10, 2000, 01:29 AM

    quote:Originally posted by Humus:
    Sure, if you count all registers it's gonna be a huge amount ... but you said you thought there were over 100 renaming registers, which is very unlikely.

    With so many instructions that can be simultaneously in flight on the Pentium III, I would be surprised if there weren't a lot of renaming registers to support all the dependencies. But, since Intel isn't revealing how many they use, it's probably senseless to argue.

    So Humus, do you have any other comments on Itanium?

    By Humus November 10, 2000, 07:58 AM

    quote:Originally posted by Arcadian:
    With so many instructions that can be simultaneously in flight on the Pentium III, I would be surprised if there weren't a lot of renaming registers to support all the dependencies. But, since Intel isn't revealing how many they use, it's probably senseless to argue.

    So Humus, do you have any other comments on Itanium?

    There are a lot of instructions in flight in the Athlon too, and from what I understand there's no gain in having more than 17 renaming registers, and I doubt the situation is significantly different on the P3 ...
    But it's not an important issue ...

    About the Itanium, one kickass feature of this little processor is register indexing (or whatever it's called). For the first time you can put small arrays into registers, which is pretty cool if you ask me ..

    By Conrad Song November 10, 2000, 10:23 AM

    quote:Originally posted by Bash:

    ...When you compile code for the x86, you have to continually load variables from memory because you don't have nearly enough registers to store all of them. With 128 user-accessible registers you'll be able to compile code much more efficiently. Of course, this is a necessity for the Itanium, since the compiler is doing all of the instruction scheduling too.

    -Bash

    The passing and retrieval of subroutine parameters are all load/store instructions. Passing by register is extremely difficult to do because of the few general registers in IA-32, and it is difficult to support in a precompiled object or library. In summary, passing by register on IA-32 is nearly non-existent.

    In IA-64, the "default" passing model is the register stack. IA-64 instructions are provided to allocate and free register-stack space, which automatically fills/spills to the stack and rotates as needed. Well, if you think that fills/spills are loads/stores, you're right. But if you analyze the stack-frame level of object-oriented code in particular, you'll find that it stays within 1-2 levels of the current frame a high percentage of the time. And because most methods are not parameter-heavy, chances are that spills/fills are infrequent. So effectively, registers 32-127 become a register cache for the run-time stack.

    Better still, because this is the only parameter passing model, these benefits are gained across precompiled objects and libraries without special treatment. Big win.
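
    A small C sketch of why that works (hypothetical code; the mechanics are only as described above, not a full model of the chip): each of these calls gets a fresh register frame out of registers 32-127, so as long as the call depth bounces around within a couple of levels, arguments and locals never touch memory. Only when the register stack overflows does the hardware spill older frames to the backing store, filling them back in on return.

    /* Hypothetical call chain: with a register stack, the parameters and
     * return values below stay in register frames rather than being
     * pushed to a memory stack as they would be on IA-32. */
    static int leaf(int x, int y)   { return x * y; }
    static int middle(int a, int b) { return leaf(a + 1, b + 1) + a; }
    int outer(int a, int b, int c)  { return middle(a, b) + middle(b, c); }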

    By mosier November 10, 2000, 12:36 PM

    quote:Originally posted by Arcadian:
    jtshaw, I have heard that there are already over 400 apps compiled natively for IA-64. This includes a lot of scientific apps as well.

    One professor from a university (I forget which one) runs scientific programs and said that there was a routine he used to run on the supercomputers of two years ago that took a day to complete and get data. On a 4-processor Itanium system (he has one of the pilot release systems), this same task took less than 1/2 hour. Granted, computers have matured over the last two years, but to get a 48x improvement over a supercomputer seemed impressive to me!

    I think there will be a lot of software apps available for Itanium by launch (I think I read somewhere that launch was in Q1 2001). I also think Itanium will give UltraSPARC III a run for its money.

    Also, this comment is for jaywallen. What parts of the architecture do you look forward to the most?


    I know a Prof (more of a research Ph.D.) that is running the "pilot" release now. It is with some PHAST technology via P&G. Personally, I would say the computers are unbelievable when doing projects that can actually utilize the multi-processor IA-64. I have used the computer for different apps, and a program developed for rendering the images for the consortium runs probably 10 times as fast as it did before. Now the bottleneck is the hardware that the program is rendering for. The inability to run the old apps is going to be the downfall for the near future, but scrapping the old and starting anew can only be a good thing.

    Now if that weakest link in hardware could be taken care of....

    By Arcadian November 10, 2000, 01:29 PM

    quote:Originally posted by mosier:
    I know a Prof (more of a research Ph.D.) that is running the "pilot" release now. It is with some PHAST technology via P&G. Personally, I would say the computers are unbelievable when doing projects that can actually utilize the multi-processor IA-64. I have used the computer for different apps, and a program developed for rendering the images for the consortium runs probably 10 times as fast as it did before. Now the bottleneck is the hardware that the program is rendering for. The inability to run the old apps is going to be the downfall for the near future, but scrapping the old and starting anew can only be a good thing.

    Now if that weakest link in hardware could be taken care of....

    Wow, thanks for the info, Mosier. It's nice to hear that rendering software gets so much of an improvement. I have several questions, though.

    1) What system were you using previously that ran 10 times slower?

    2) Which bottleneck in hardware were you referring to?

    3) What rendering program were you using before, and which are you using now? I am curious what Itanium does and does not support.

    4) What is PHAST from P&G?

    5) How many Itanium processors is this professor running?

    Thanks again... I appreciate any comments.

    By Moridin November 10, 2000, 07:56 PM

    quote:Originally posted by Humus:
    There are a lot of instructions in flight in the Athlon too, and from what I understand there's no gain in having more than 17 renaming registers, and I doubt the situation is significantly different on the P3 ...
    But it's not an important issue ...

    The number of rename registers is directly related to the number of pipeline stages. The more stages, the more rename registers you need. The Athlon has 72 and the P4 has 128. I don't know how many the PIII has, but I suspect it is about the same as the Athlon, since the working parts of the pipelines are about the same length.

    By Moridin November 10, 2000, 08:05 PM

    quote:Originally posted by Conrad Song:
    http://developer.intel.com/design/ia-64/microarch_ovw/index.htm

    Somehow I don't know of any server microprocessor out there today that ISN'T excessively complex. Itanium is a beast. Not the same animal as POWER4, but a beast just the same.


    The only reason I am concerned with Itanium complexity is that EPIC should have resulted in a simpler design, not a more complex one. Rumors are that McKinley is half the size of Itanium and has more on-chip cache as well. So most likely this is the result of a poor first design for IA-64, but I want to see something more definite before I decide.

    No kidding, the POWER4 is a beast. I bet each POWER4 module consumes 1,500 W or more. A fully configured 64-processor POWER4 system could consume 10 kW or more.

    By Humus November 10, 2000, 10:41 PM

    quote:Originally posted by Moridin:
    The number of rename registers is directly related to the number of pipeline stages. The more stages, the more rename registers you need. The Athlon has 72 and the P4 has 128. I don't know how many the PIII has, but I suspect it is about the same as the Athlon, since the working parts of the pipelines are about the same length.

    Where did you get that information from?
    I was unable to find any information about how many renaming registers there are on the Athlon on AMD's site.

    By Moridin November 11, 2000, 11:10 AM

    quote:Originally posted by Humus:
    Where did you get that information from?
    I was unable to find any information about how many renaming registers there are on the Athlon on AMD's site.

    Whoops, I may have been thinking of the number of instructions in the instruction control unit/reservation stations. (That will teach me to post when I am drinking.) Sorry about that.

    I have seen numbers in this range before for the number of rename registers, but I can't remember where just now. I know you need a lot more than 10 to 20 registers; 10 to 20 sets of registers (80 to 160) makes more sense to me. You may need up to 1 set of registers for each instruction currently in one of the pipelines.

    By Humus November 12, 2000, 01:05 PM

    I'll start by admitting that I'm not sure how the renaming actually works, and I don't have an Athlon pipeline diagram in front of me right now, but I don't believe in 100 renaming registers.
    Register renaming is only used to avoid pipeline stalls on false dependencies. First of all, there aren't too many of those (OK, not uncommon, but they make up perhaps less than 15% (wild guess) of the code). Anyway, we may want to avoid stalling the pipeline even in the worst-case scenario, which would be something like this:

    ADD EAX, EBX
    ADD EBX, ECX
    ADD ECX, EDX
    ADD EDX, ESI
    ADD ESI, EDI
    ADD EDI, ESP
    ADD ESP, EBP
    ADD EBP, EAX
    ...
    etc.

    My feeling is that you have a bunch of renaming registers, and when a write-after-read occurs you need to assign the register to a new renaming register for the write operation, so that reading and writing that register can occur at the same time. Once a register is renamed, the old register is free to be used for subsequent renaming operations. The worst-case scenario is that all registers would need to be renamed, like in my example above with a CPU that could execute 7 instructions in parallel (the last instruction has a true dependency on the first). In this case you need 15 renaming registers. Add to that the flags register + 1 for renaming it, and you have 17 registers.

    By nukefault November 13, 2000, 09:09 PM

    I'm guessing the IA64 transition will be a lot like the Mac PPC transition way back when. Personally, I think if Intel can do a decent job with IA64, as opposed to the early days of the P3, it will be an awesome setup that blows the doors off AMD in the high-end server market.

    On the other hand, this is Intel, and so I'd take everything surrounding IA64 with a hefty helping of salt.

    Also, Intel does tend to overcharge for their chips and I'm confident that AMD will retain its price/performance leadership. Clawhammer looks way more feasible for the average high end user, and Sledgehammer should do well in the high end market as well because it won't have the transition problems of IA64.

    By Arcadian November 13, 2000, 09:52 PM

    quote:Originally posted by nukefault:
    Also, Intel does tend to overcharge for their chips and I'm confident that AMD will retain its price/performance leadership. Clawhammer looks way more feasible for the average high end user, and Sledgehammer should do well in the high end market as well because it won't have the transition problems of IA64.

    No transition? Sledgehammer will have 10x the transition that Itanium will have. First, ask yourself what Itanium needs. The first answer I expect is software. Now think about the transition Sledgehammer needs. If you stare back at me with a blank look, I'm going to assume you don't know, so let's talk infrastructure.

    * Will AMD have rack-optimized servers ready from OEMs?
    * Will motherboard manufacturers include 64-bit or 66MHz PCI in their 760MP motherboards?
    * Will AMD push forward with power and disk redundancies, as well as hot-plug?
    * Will AMD's systems scale well, and include ample memory upgradability?
    * Will AMD have support from major server OEMs like IBM, HP, NEC, or Compaq?
    * Will AMD include ECC support for memories and caches?
    * Will AMD be able to guarantee stability to increase uptime and reliability?
    * Will AMD include server management features to increase serviceability?

    These are all transition items necessary for Sledgehammer's acceptance. AMD is not known for being a good supplier of any of them, so I have a feeling Sledgehammer will have a tough time breaking through the low end.

    But I'm glad you mentioned this, since it gave me a chance to explain about server infrastructure. Thanks for the post, nukefault.

    By Humus November 14, 2000, 11:35 AM

    I think the Sledgehammer may have a hard time in the server area, but I think it may actually be rather popular in desktops if the price is competitive.

    By Arcadian November 14, 2000, 05:06 PM

    quote:Originally posted by Humus:
    I think the Sledgehammer may have a hard time in the server area, but I think it may actually be rather popular in desktops if the price is competitive.

    Clawhammer will be the processor for desktop systems, and I can safely say that you will definitely NOT see Sledgehammer for desktops. AMD plans on using a ccNUMA architecture for the Sledgehammer platform, and their 4-way and 8-way systems will be very expensive. I figure $10,000-$20,000 for an entry-level system (this is still better price/performance than Intel's 4-way systems, which go from $20,000-$40,000 for "light" configurations). Clawhammer will probably go into $2,000-$3,000 systems (the performance desktop segment), so if you are looking for a K8, that should be the one you choose.

