  • SharkyForums.Com - Print: Pentium 4 pros and cons

    Pentium 4 pros and cons
    By Arcadian October 31, 2000, 12:16 AM

    We all know a good amount of what will be included in the Pentium 4 architecture. Intel has made a great deal of information available about this. However, one thing we do not have is any reliable performance numbers. So, I wanted to get some hard-core speculation regarding the Pentium 4. This thread has really died down this past week, so I wanted to resuscitate it a little. Come on and help me out.

    What I'm looking for are people's opinions on which aspects of the Pentium 4 will likely be fairly positive and which will be fairly negative. What do you think has been overrated to the point of marketing hype, and what do you think needs to be discussed more? Do you think the Pentium 4 will surprise everybody, or disappoint? I welcome opinions, but would like it more if you could keep this technical, so we can respond to ideas.

    Here are some aspects that could warrant discussion:

    - 400MHz Front Side Bus
    - Double Pumped Execution Units
    - Lack of Dual Processing at Launch
    - More than Double the Heat of Pentium III
    - Rambus Memory
    - High Bandwidth Caches
    - 20-stage Pipeline / Branch Misprediction
    - High Clock Frequencies
    - Higher Clock Frequencies in the Future
    - Excellent Branch Predictor
    - Small L1 Cache
    - Trace Cache
    - Large # of Instructions in Flight
    - New PC Specification / Large Heat Sinks
    - Low Expectations (on Websites)
    - Rumors on the Web
    - i850 Delays
    - Price Issues
    - Much More!

    Wow... this is a lot to talk about. Hopefully, we get some good discussions going.

    By gamigin October 31, 2000, 06:40 AM

    Rambus memory? What for? It's expensive, and is it really that good?
    Double the heat, eh... gotta get a hell of a fan, I guess... or just stick with AMD.

    By jtshaw October 31, 2000, 11:01 AM

    I know a lot of people have already counted the P4 out. There has been lots of speculation that it isn't going to be as fast as the Athlon or even the PIII on a MHz-to-MHz basis, and people seem to have fun talking about how crappy a product it is. This all might be true, but it could just as easily be wrong.
    There are a few things I see in the P4 that lead me to believe Intel understands what new processors need. One is the 400MHz (or 100MHz "Quad Pumped") bus. AMD seemed to catch on a little earlier with the Duron and the Athlon that bus speed is an important factor (maybe a lesson they learned back when the PII beat them to 100MHz and won the performance title). Intel took a nice jump from 133MHz all the way to 400MHz. Another interesting thing I heard is that Intel is working on a DDR chipset for the P4 and is trying to figure out a way to break the contract with Rambus.
    RDRAM is definitely something to talk about with the P4, because even if Intel does create a DDR chipset we won't see it for a while. From what I gather, in terms of the PIII, RDRAM had no real benefit because the bus couldn't match the high RAM speed. It is possible that with the P4, RDRAM could actually cause a performance increase because of the P4's added bus bandwidth. Troubles with RDRAM aren't looking to go away, and it will still probably be rather expensive and hard to get hold of.
    Another major feature of the P4 I think is worth discussing is the 20-stage pipeline with "Branch Misprediction". This seems to be the heart of the P4's high clock speeds, but it could also be the reason why the P4 doesn't appear to be all that much better than the Athlon or PIII at this time. It could be that software needs to be optimized to take full advantage of this. The longer pipeline also means that if bad information gets into the pipeline, it will take longer to flush it out...
    The double pumped execution units are another feature that I think could be excellent for performance if code is properly optimized to take advantage of it.
    One thing many people on these forums, as well as elsewhere, are going to gripe about is the price of the P4. It will be expensive compared to AMD parts. Intel has a pricing strategy that involves supply and demand. Intel hasn't needed to lower the price of their processors as much as AMD has because they are still selling more of them. AMD is making money for the first time ever, but they are still making far less than Intel. Businesses are still more likely to go with Intel parts because they have been buying them for years. I am not saying this is right, but for now it is the way things are.
    I have been blabbing on for too long already... to end this for now, I think that the P4 has some features which look like they could be really great for performance. There are also plenty of questions in my mind about the ability of the P4 to really perform. I can't wait to see it come out so we can run it into the ground and see just how well it runs.

    By LiquidGoop October 31, 2000, 11:34 PM

    One thing that you overlooked, so I just had to bring it up, was the fact that the P4 will be TWICE as big as the P3. This means that Intel will only be able to produce half the number of CPUs on the same wafer. However, Intel has said that supply will not be a problem. I think everyone remembers the P3 1.13GHz.. err.. incident. I'm not trying to bash Intel; the P4 has a lot going for it, especially the 400MHz bus. The 20-stage pipeline might hurt it somewhat, but that lost ground might easily be made up by the excellent branch predictor. Hard to tell at this point. It looks like some pros and cons to even out, coincidentally the title of the topic.. The small L1 cache looks like a bad decision, IMO. Seeing the high clock speeds, you'd think that it would have more L1 cache.

    Anybody have any Mustang specs we could compare the P4 to? Wear an asbestos-coated jacket tho, we know how things can heat up in here.

    By Phoenix November 01, 2000, 03:52 AM

    I'm interested to see how the 400MHz FSB helps memory performance, as I have heard arguments from both sides. I don't really think that a lack of dual processing at launch will hurt them too much, as long as AMD doesn't have a good dual setup that performs better than the Intel line. The only large market for dual, or larger, setups is servers for business, which Intel will still have control over, even if AMD does come out with something better. I'm wondering if the small cache will be a good idea, even if the cache is faster. Would a small amount of cache have a bigger hit on performance if it was used in a server? If so, maybe this is why they haven't come up with dual support yet. Price could hurt it, and low expectations could really help the P4 if it turns out good; the public loves a good surprise.

    By Arcadian November 01, 2000, 01:04 PM

    quote:Originally posted by Phoenix:
    I'm interested to see how the 400MHz FSB helps memory performance, as I have heard arguments from both sides. I don't really think that a lack of dual processing at launch will hurt them too much, as long as AMD doesn't have a good dual setup that performs better than the Intel line. The only large market for dual, or larger, setups is servers for business, which Intel will still have control over, even if AMD does come out with something better. I'm wondering if the small cache will be a good idea, even if the cache is faster. Would a small amount of cache have a bigger hit on performance if it was used in a server? If so, maybe this is why they haven't come up with dual support yet. Price could hurt it, and low expectations could really help the P4 if it turns out good; the public loves a good surprise.

    I may have some insight regarding your cache size concerns. What I have heard (from various sources) is that the 8KB d-cache is sized to allow for scalability into >2.0GHz ranges. This will not apply at first, but remember that Intel would like the design to be around for approximately 5 years, like the P6 architecture.

    In terms of servers, there will be a version of the Pentium 4 called Foster (which will probably be named Pentium 4 Xeon). Like the Pentium III Xeon, there will probably be small cache versions and large cache versions, the latter of which will be used in departmental/enterprise servers. The large cache will not be L2, but rather L3, and will probably come in sizes ranging from 1MB to 3MB. From what I know, this product will come later next year, so as to not conflict with the 900MHz Pentium III Xeon (with 1MB or 2MB of cache) due out in a couple of months.

    The server market should really be heating up this next year between Sun's UltraSPARC III and Intel's Pentium 4 Xeon. Personally, I think the Pentium 4 Xeon (still not certain on this name?) will kick the pants off UltraSPARC III.

    By jtshaw November 01, 2000, 05:51 PM

    A quick comment about the Pentium 4 Foster/Xeon vs. the UltraSPARC III. If the P4 Xeon is kin to the P4 like the P3 Xeon is to the P3, then it will probably crush the UltraSPARC III on many levels... Have you ever tried to get upgraded hardware for a SPARCstation? PAIN IN THE BUTT!

    By -= HaX0r =- November 02, 2000, 02:58 AM

    Just go with AMD from now on, Intel is losing their strong hold on the market.

    By Phoenix November 02, 2000, 03:16 AM

    quote:Originally posted by Arcadian:
    I may have some insight regarding your cache size concerns. What I have heard (from various sources) is that the 8KB d-cache is sized to allow for scalability into >2.0GHz ranges. This will not apply at first, but remember that Intel would like the design to be around for approximately 5 years, like the P6 architecture.

    In terms of servers, there will be a version of the Pentium 4 called Foster (which will probably be named Pentium 4 Xeon). Like the Pentium III Xeon, there will probably be small cache versions and large cache versions, the latter of which will be used in departmental/enterprise servers. The large cache will not be L2, but rather L3, and will probably come in sizes ranging from 1MB to 3MB. From what I know, this product will come later next year, so as to not conflict with the 900MHz Pentium III Xeon (with 1MB or 2MB of cache) due out in a couple of months.

    The server market should really be heating up this next year between Sun's UltraSPARC III and Intel's Pentium 4 Xeon. Personally, I think the Pentium 4 Xeon (still not certain on this name?) will kick the pants off UltraSPARC III.

    Thanks for the info, Arcadian. I have been playing with some Sun SPARCs lately. At the university I attend, some of the members of the Linux User's Group set up a Beowulf cluster of Sun SPARC 5s. Not the fastest CPUs in the world, but they were free.

    By Flip November 02, 2000, 04:19 AM

    The Pentium 4
    New Architecture?
    New Instruction Sets?
    Quad-pumped 100MHz FSB?

    Let's think about this whole thing logically, so we can really see if this new chip will do more for us than a Pentium 3.

    First of all, the p4 implements hyper-pipelining, which means that it has a 20-stage pipeline, whereas the p3 has only a 10-stage pipeline. This hyper-pipeline opens the possibility of higher clock frequencies, but the deeper pipe also brings more room for error. Let's think: if the p3 has 10 stages, that means that the ALU is predicting instructions and executing them 10 steps down the line. In simple terms, if there were only two instructions, 1 and 0, there would be 1024 different paths down that pipe (2^10), and the algorithm's job is to try very hard to speculate which path out of 1024 to take. Now, that being said, if the p4 has 20 stages, do the math for two instructions, and that would be over 1.04 million paths (2^20). So these new and improved algorithms can now predict with the same accuracy which path to take out of 1.04 million? I think not, no matter how many super-geniuses they had working on the new algorithms. That is not to say that this new 20-stage pipeline is all that bad. Intel has also implemented Advanced Dynamic Execution, which can keep up to 126 instructions in flight and helps keep mispredictions to a minimum. Another way that Intel intends to solve this problem is the variation of L1 cache called trace cache, which stores decoded x86 instructions in the flow in which they are to be processed. Also, with the 256K of Advanced Transfer Cache running at full speed with a 256-bit path to the double-clocked ALUs, this processor will not have much time to rest, and in turn will generate MUCH MORE heat!
    About SSE2: of course they had to throw some new instruction sets in, but as of right now there is no software coded to take advantage of these new instructions, so we will talk about how good they are when we have a chance to see them in action. To conclude, there is no clear answer as to which is better (p3 v. p4). All we can do now is sit back, relax, wait until they are released, and let the benchmarks be the judge.

    Flip
    (as in the chip)

    By Arcadian November 02, 2000, 11:14 AM

    This is for Flip.

    I'm sorry, but you have the wrong idea of how a pipeline works. If I weren't so busy right now, I'd explain it to you. Maybe I'll have time later. I just wanted to clear that up, because your relation of a pipeline to branch prediction is not correct. Thanks for the response, though... I'll try to get back to you later.

    By Arcadian November 02, 2000, 11:16 AM

    quote:Originally posted by -= HaX0r =-:
    Just go with AMD from now on, Intel is losing their strong hold on the market.

    HaX0r, the idea here is to get some discussion going, not to close the argument with a one line opinion. If you think AMD has a stronger hold on the market, can you please explain why?

    By jtshaw November 02, 2000, 11:58 AM

    Surprisingly, Intel isn't really losing anything. They are still making more money and, I believe, even selling more chips than AMD. Not to say AMD chips are crap, they aren't, but Intel is still the proven solution for businesses, as has been stated time and time again on these forums. I would hardly say Intel's position is in trouble at this time.

    quote:Originally posted by -= HaX0r =-:
    Just go with AMD from now on, Intel is losing their strong hold on the market.

    By Flip November 02, 2000, 12:16 PM

    Arcadian
    Sorry, I didn't mean to sound like an idiot. I wasn't completely sure how it works; I just inferred a lot from what I had read. If you could point me to a good article explaining the 20-stage pipeline, it would be much appreciated, or even better, if you gave me an explanation, I'd be very grateful!

    Flip

    p.s. no hard feelings? I'm pretty new at the deep down technical inner workings of the microchip.

    By zombor November 02, 2000, 01:30 PM

    Hmmmm... I can't remember offhand how big the PIII pipeline is, but isn't 20 more than double what the P3 had? If this is true, it would take the proc twice as long to process an instruction, but the clock speed would be able to be doubled, right? Then, if the chips debut at 1.4 and 1.5GHz, the "raw processing power" would be that of an effective 700-800MHz P3 (assuming there aren't any P4-optimized instructions being used). If I'm wrong, tell me, because at this point I'm seriously doubting this chip as a contender for games and such.

    By Arcadian November 02, 2000, 02:18 PM

    OK... it looks like people here need a little explanation of how a pipeline works.

    The common analogy is to think about washing your clothes. Doing this chore requires several steps. First you have to put clothes in the washer, and wait 30 minutes for the load to complete. Then you have to put the clothes in the dryer, and wait about 1 hour for them to dry. Then you have to sit down and fold them and put them away, which takes about 15 minutes. Every load of laundry that you do in this kind of example takes about 1 hour and 45 minutes.

    However, consider an alternate example. Instead of waiting 1 hour for the dryer to complete its cycle, say you put another load in the washer, so that both things can go on at the same time. In addition, when it comes time to fold and put away the clothes, say you put additional loads in the washer and dryer. This way, everything is happening at the same time. It is much more efficient. Let's pretend you have as many loads of laundry as a processor has chunks of data (a lot of loads!). If this were the case, it would take you 1 hour for every load, instead of 1 hour and 45 minutes, because the dryer is the limiting factor. No matter how fast you fold your clothes and put them away, you still have to wait the hour for the dryer to finish. This is PIPELINING, but it's UNBALANCED. In this example, you have 3 pipeline stages: the washer, the dryer, and the folding. Let's take this one step further.

    Suppose you didn't like waiting an hour for the dryer and half an hour for the washer, so you bought new machines. Now you have 2 washers and 4 dryers, each spaced in time so that they finish at 15-minute intervals. Remember it takes you 15 minutes to finish folding and putting away your clothes, so if you have clothes being finished by at least one washer and one dryer every 15 minutes, then you have peak efficiency, and can get one load of laundry put away every 15 minutes. This is like having a 7-stage pipeline. It gives a 7x reduction in time per load compared with not pipelining at all (15 minutes instead of 1 hour and 45 minutes)! This is called BALANCED PIPELINING.

    I am speculating here, but in the Pentium 4, it may have been that the ALU was like the dryer in the above example. It was the limiting factor in an otherwise balanced pipeline. By doubling the clock on the ALU unit, you have effectively made twice as many of them, and it's just like buying extra dryers.

    Thus, if you were to take only 1 packet of data (just like one load of laundry), it would take you 20 clocks to get from one side of the pipe to the other. However, since processors have much more data than you have loads of laundry, there are usually 20 pieces of data, one in each of the pipeline stages. Thus, when the pipeline is operating at top efficiency, one piece of data will pop out on each of the Pentium 4's 1.5GHz clocks.

    The problem, however, is that the pipe isn't always full. The analogy is that you accidentally left a pen in your pants pocket and did the laundry. You only notice your mistake when you get to the point where you fold your clothes. Now all your clothes have turned blue, and you have to wash them all over again. It's time to take all the clothes out of all the machines and start all over again. (Fortunately, you have bleach to get out the stains.)

    In the Pentium 4, the ink pen is the same as a branch misprediction. Like forgetting about your pen, it is rare, but it sure takes a lot of time to reverse the mistake. Fortunately, the Pentium 4 has an excellent branch predictor, so it's like searching your pockets for pens before putting the clothes in the wash. Maybe you'll catch all the pens in your pants pockets, but you might miss a couple in your shirt pocket. OK... here the analogy starts to fall apart, but you get the idea.

    The 20-stage pipeline allows each stage to take the shortest amount of time, just like the above example only takes 15 minutes to do a load of laundry. This allows for much higher clock speeds. But every time a branch is mispredicted, for example, it takes a long time to fix everything, and performance will slow down. Overall, though, Intel is hoping that high clock speeds will eventually counteract the long pipeline penalty, and you will still get a faster processor.

    Hope this helps you guys.
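The laundry numbers in the post above can be checked with a few lines of Python. This is only a sketch of the analogy, not a model of any real processor: the stage times come from the post, while the function names and the idea of dividing a stage's time by the number of machines working on it are assumptions made for illustration.

```python
from fractions import Fraction

# Stage durations in minutes, taken from the laundry example.
WASH, DRY, FOLD = 30, 60, 15


def unpipelined_minutes_per_load(stages):
    """With no overlap, every load pays the sum of all stage times."""
    return sum(stages)


def pipelined_minutes_per_load(stages, units):
    """Steady-state interval between finished loads.

    stages: time each stage needs for one load
    units:  how many machines (or people) serve that stage in parallel
    The slowest effective stage sets the pace of the whole pipeline.
    """
    return max(Fraction(t, n) for t, n in zip(stages, units))


stages = (WASH, DRY, FOLD)
no_pipeline = unpipelined_minutes_per_load(stages)          # one load at a time
unbalanced = pipelined_minutes_per_load(stages, (1, 1, 1))  # dryer is the bottleneck
balanced = pipelined_minutes_per_load(stages, (2, 4, 1))    # 2 washers, 4 dryers

print(no_pipeline, unbalanced, balanced)
print("speedup of balanced pipeline:", no_pipeline / balanced)
```

The 2 washers, 4 dryers, and 1 folder add up to the 7 stages mentioned in the post, which is why the balanced case comes out 7 times faster per load than doing one load start to finish.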

    By Flip November 02, 2000, 03:26 PM

    Arcadian,
    You sure have a way with analogies. That is a great explanation, thanks for spending the time to show us the light.

    Flip

    By Moridin November 02, 2000, 04:02 PM

    quote:Originally posted by Flip:
    Arcadian
    Sorry, I didn't mean to sound like an idiot. I wasn't completely sure how it works; I just inferred a lot from what I had read. If you could point me to a good article explaining the 20-stage pipeline, it would be much appreciated, or even better, if you gave me an explanation, I'd be very grateful!

    Flip

    p.s. no hard feelings? I'm pretty new at the deep down technical inner workings of the microchip.


    Instead of re-inventing the wheel and writing a long post on pipelining, I'll give you a link to a discussion thread from a few months back on the topic. It starts off as a G4 discussion but turns into one of the best Q&As on pipelining I have seen.
    http://arstechnica.infopop.net/OpenTopic/page?q=Y&a=tpc&s=50009562&f=77909774&m=224092771

    By Moridin November 02, 2000, 04:56 PM

    Fun topic.

    - 400MHz Front Side Bus

    Nice improvement, but still well below what RISC workstations will be using. Not that x86 has ever had the memory bandwidth of RISC workstations, but it may be losing ground. The Alpha EV7, for example, will have 4 times the memory bandwidth of the P4.
    This is probably an absolute minimum for a new processor given growing gap between processor and memory speed.

    - Double Pumped Execution Units

    I love this idea. If it scales, it could be the most significant part of the architecture. It doubles your throughput and eliminates Read After Write (RAW) dependencies for the instructions it applies to.

    - Lack of Dual Processing at Launch

    Not a big factor in my mind. Multi-processor is primarily used for servers and most IT shops will want to give the chip a while to stabilize before trusting anything important to it. PIII Xeons would likely be preferred to the P4 for some time yet.

    - More than Double the Heat of Pentium III

    Heat is not a concern for the end user. The chip either works or does not. It is up to the system builder to ensure an appropriate thermal solution. The large die size should help this somewhat, since it keeps the W/cm^2 about the same as the PIII. You won't see the P4 in notebooks any time soon though.

    - Rambus Memory

    DDR memory will be an option at some point. We don't know what will happen to the price of RDRAM if the P4 increases demand, but with Intel rebates current prices are competitive.

    - High Bandwidth Caches

    No kidding, the bandwidth of the caches is high. The L1 D cache bandwidth is twice that of the PIII, while the L2 bandwidth is more than the processor can use. The L2 can deliver 256 bits per clock, while the L1 D cache is only 128 bits per clock and the (single) decoder is only 32 bits per clock. To add to the mystery (to me anyway), the L1 D cache is single-ported, and the load/store unit is not double pumped but the AGU is. This may mean that the extra-wide data path from the L1 gives something like the effect of a dual-ported L1.
    I'd be interested in any comment about this.

    - 20-stage Pipeline / Branch Misprediction

    A lot has been said about the longer pipeline lowering IPC. It may or may not, given the effect of better branch prediction and new branch hint instructions.
    I'll go out on a limb and say that even if IPC is lowered it won't hurt overall performance, and here is why I think that. If you take two balanced pipelines, one 11 stages and the other 22, and both do the same amount of work (same amount of logic), the 22-stage pipeline should clock twice as high. Let's say that the 11-stage pipeline can reach 1 GHz; then the 22-stage one should reach 2 GHz. Let's also assume that the branch penalty is 10 and 20 cycles respectively.

    If you work out the amount of time each pipeline is stalled by a branch mispredict, it comes to exactly the same value (10 ns). Both will sit idle for the same length of time on a mispredict, but the longer pipeline will be faster the rest of the time and therefore complete any given job more quickly. This only applies if the pipelines are balanced.

    Again, any comments would be welcome.

    I'll finish off in another post to keep this short (er) and somewhat readable.
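Moridin's 11- versus 22-stage comparison can be worked through numerically. The sketch below uses the clock speeds and branch penalties from the post; the workload (1000 instructions, 10 mispredicts) and the assumption of one instruction retired per clock when not stalled are made up purely for illustration, and the function names are hypothetical.

```python
def stall_ns(penalty_cycles, clock_ghz):
    """Wall-clock time lost to one branch mispredict, in nanoseconds.

    At f GHz one cycle lasts 1/f ns, so the stall is penalty / f.
    """
    return penalty_cycles / clock_ghz


def run_time_ns(instructions, mispredicts, penalty_cycles, clock_ghz):
    """Total run time under a deliberately simple model.

    Assumption: one instruction completes per clock when the pipeline
    is full, and each mispredict adds a fixed number of idle cycles.
    """
    cycles = instructions + mispredicts * penalty_cycles
    return cycles / clock_ghz


# Figures from the post: 10-cycle penalty at 1 GHz, 20-cycle penalty at 2 GHz.
short_stall = stall_ns(10, 1.0)
long_stall = stall_ns(20, 2.0)

# Hypothetical workload, chosen only to make the arithmetic easy to follow.
short_total = run_time_ns(1000, 10, 10, 1.0)
long_total = run_time_ns(1000, 10, 20, 2.0)

print(short_stall, long_stall)   # time lost per mispredict on each design
print(short_total, long_total)   # total time for the same job
```

Under these assumptions both designs lose the same 10 ns per mispredict, and the deeper pipeline finishes the job sooner because everything outside the stalls runs at twice the clock. The model ignores whether the longer pipeline can really be balanced and clocked twice as high, which is the condition Moridin attaches to the argument.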


    By Arcadian November 02, 2000, 06:04 PM

    Thanks for the long and interesting reply, Moridin. I wanted to touch on a few comments that you made.

    quote:Originally posted by Moridin:
    - High Bandwidth Caches

    No kidding, the bandwidth of the caches is high. The L1 D cache bandwidth is twice that of the PIII, while the L2 bandwidth is more than the processor can use. The L2 can deliver 256 bits per clock, while the L1 D cache is only 128 bits per clock and the (single) decoder is only 32 bits per clock. To add to the mystery (to me anyway), the L1 D cache is single-ported, and the load/store unit is not double pumped but the AGU is. This may mean that the extra-wide data path from the L1 gives something like the effect of a dual-ported L1.
    I'd be interested in any comment about this.

    Hmm... I'm wondering what you're getting at here. Can you explain further, because your idea intrigues me.

    I've noticed, too, that the cache bandwidth is almost TOO big for what the processor needs. Either they wanted to make sure the cache was the least of the bottlenecks, or Intel has something sneaky in mind.

    quote:Originally posted by Moridin:
    - 20-stage Pipeline / Branch Misprediction

    A lot has been said about the longer pipeline lowering IPC. It may or may not, given the effect of better branch prediction and new branch hint instructions.
    I'll go out on a limb and say that even if IPC is lowered it won't hurt overall performance, and here is why I think that. If you take two balanced pipelines, one 11 stages and the other 22, and both do the same amount of work (same amount of logic), the 22-stage pipeline should clock twice as high. Let's say that the 11-stage pipeline can reach 1 GHz; then the 22-stage one should reach 2 GHz. Let's also assume that the branch penalty is 10 and 20 cycles respectively.

    If you work out the amount of time each pipeline is stalled by a branch mispredict, it comes to exactly the same value (10 ns). Both will sit idle for the same length of time on a mispredict, but the longer pipeline will be faster the rest of the time and therefore complete any given job more quickly. This only applies if the pipelines are balanced.

    Again, any comments would be welcome.

    I have the feeling that Intel was sort of aiming for bandwidth-intensive programs, which would be somewhat latency tolerant. Programs that wait for user input (word processing) certainly do not need to get any faster, but video encoding, 3D rendering, and background encryption/decryption are very bandwidth-intensive programs that will probably do well on the Pentium 4. It seems that the large number of pipeline stages was a tradeoff that was needed in order to reach the next generation of clock speeds, though I'm not sure a lot of the zealots on the boards will look at it that way at first.

    Most people want immediate satisfaction in speed, and like you said, a lot of optimization is available, so we may have to wait for Pentium 4 to really get better programs written for it. I did not know that there was a Branch Hint instruction available, but that goes to show that optimizations are possible in programs, and that the Pentium 4 will probably not see these advantages at first. Also, SSE-2 needs to be optimized, as well as taking advantage of the Pentium 4's 128 byte cacheline boundaries.

    Well, I'm a little off topic right now, but we should really spawn a discussion on optimizations.

    quote:Originally posted by Moridin:
    I'll finish off in another post to keep this short (er) and somewhat readable.

    Please continue to post. I enjoy reading from you.

    By Arcadian November 02, 2000, 06:12 PM

    quote:Originally posted by Flip:
    Arcadian,
    You sure have a way with analogies. That is a great explanation, thanks for spending the time to show us the light.

    Flip

    Thanks... I appreciate the kind words. You know, I am happy to share my knowledge with everybody in this board, and actually it was one of the reasons I requested the Highly Technical Forum to be created. If you want me to write concerning a different topic, I would be interested in teaching you more.

    Take care.

    By OOAgentFiruz November 02, 2000, 06:34 PM

    Another tutorial : ) http://www.hardwarecentral.com/hardwarecentral/tutorials/2427/1/

    By OOAgentFiruz November 02, 2000, 06:45 PM

    quote:Fortunately, the Pentium 4 has an excellent branch predictor, so it's like searching your pockets for pens before putting the clothes in the wash.
    What type of branch prediction is the P4 utilising? Do you have a link to any info : )
    F.Y.I
    The K7 uses a 2,048-entry branch history table (BHT) with a simple two-bit Smith prediction algorithm. This predictor stands in sharp contrast to the K6's elaborate 8,192-entry BHT with its two-level GAs predictor, a feature that AMD now admits was overkill.
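For anyone unfamiliar with the "two-bit Smith" scheme mentioned above, here is a minimal sketch of a table of 2-bit saturating counters. The table size of 2,048 comes from the post; everything else (indexing the table with the branch address modulo the table size, starting every counter at "weakly not taken", the class and function names, the branch address used in the demo) is an assumption for illustration and not a description of how the K7 actually organises its predictor.

```python
class TwoBitPredictor:
    """Branch history table of 2-bit saturating counters (Smith predictor).

    Counter values: 0, 1 predict not taken; 2, 3 predict taken.
    """

    def __init__(self, entries=2048):
        self.entries = entries
        self.table = [1] * entries  # assumed initial state: weakly not taken

    def _index(self, pc):
        # Assumed indexing: low bits of the branch address.
        return pc % self.entries

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes the predictor gets right for one branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then falls through, repeated 10 times.
loop_outcomes = ([True] * 9 + [False]) * 10
loop_accuracy = accuracy(TwoBitPredictor(), 0x4000, loop_outcomes)
print(loop_accuracy)
```

The point of the second bit is visible in the loop case: a single fall-through at the end of each loop only moves the counter from "strongly taken" to "weakly taken", so the predictor still guesses "taken" when the loop is entered again.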

    By Arcadian November 02, 2000, 07:20 PM

    quote:Originally posted by OOAgentFiruz:
    What type of branch prediction is the P4 utilising? Do you have a link to any info : )

    F.Y.I
    The K7 uses a 2,048-entry branch history table (BHT) with a simple two-bit Smith prediction algorithm. This predictor stands in sharp contrast to the K6's elaborate 8,192-entry BHT with its two-level GAs predictor, a feature that AMD now admits was overkill.

    I do indeed have a link for you. Here is a document that should answer a lot of your questions. It is by the Pentium 4 processor's lead architect, Doug Carmean.
    http://www.intel.com/pentium4/download/nbarch.pdf

    On Page 28, it says the following regarding the branch predictor of the Pentium 4.

    "Accurate branch prediction is key to enabling longer pipelines"

    "Dramatic improvement over P6 branch
    predictor:
    8x the size (4K)
    Eliminated 1/3 of the mispredictions"

    "Proven to be better than all other
    publicly disclosed predictors (g-share, hybrid, etc)"

    You should read the rest of the document, too. It has a lot of information!

    By virusag13 November 02, 2000, 09:03 PM

    OOAgentFiruz, thanks for the link! That site should keep me busy for a while.

    By Phlux November 02, 2000, 10:45 PM

    A little off the topic of the pipeline in PIV, but out of curiosity, are there technical limitations of socket 423? Why would Intel be changing the socket of the PIV so soon after its release? Not to say that things cannot change, but it is supposed to debut with the release of Foster? I do not understand this move.

    Another question: will the PIV show a significant improvement with more memory bandwidth, or will the dual RDRAM channel keep it happy? 3.2GB/s is quite a bit. I'm also looking for a flow chart for the 850 architecture. Any ideas?


    Just a thought

    By Arcadian November 02, 2000, 11:38 PM

    quote:Originally posted by Phlux:
    A little off the topic of the pipeline in PIV, but out of curiosity, are there technical limitations of socket 423? Why would Intel be changing the socket of the PIV so soon after its release? Not to say that things cannot change, but it is supposed to debut with the release of Foster? I do not understand this move.

    Actually, you're not off topic at all. This topic deals with all the pros and cons of the Pentium 4 architecture and platform.

    To answer your question, though, there could be any number of reasons why Intel is choosing to change sockets. One thing I do want to point out, though, is that I have yet to see an official press release stating that there will be a change in socket. The only things I have seen have been unsubstantiated rumors, so I can't be sure that there will even be a change in socket. (I would really appreciate a link to verify this, if anybody has one.)

    Assuming that there is a change in socket, though, I would guess that it is probably for the following reason. Usually the team that works on a product is not the same team that works on that product's successor. I think that, because two different teams were working on Willamette (code name for Pentium 4) and Northwood (Pentium 4 on the .13u shrink), the requirements of the latter product were not received in time to be implemented in the first product. In other words, Willamette was being worked on first by one team, and they came up with a specification based on the requirements for the chip. Another team was working on Northwood, but before they came up with their requirements, the specification for socket 423 was probably already finished, so they created their own specification based on the new requirements. This is only my theory, but since miscommunication between the two groups can happen easily, it is a possibility.

    quote:Originally posted by Phlux:
    Another question: will the PIV show a significant improvement with more memory bandwidth, or will the dual RDRAM channel keep it happy? 3.2GB/s is quite a bit. I'm also looking for a flow chart for the 850 architecture. Any ideas?

    Just a thought

    RDRAM gives the Pentium 4 the same memory bandwidth as the front side bus bandwidth. Both interfaces allow for 3.2GB/s, and that allows for optimal transfers. It is doubtful that a different amount of memory bandwidth could allow for better performance. My opinion is that DDR memory may in fact reduce the performance of the Pentium 4, since the bandwidth for PC2100 is only 2.1GB/s. Of course, we will have to wait until there are Pentium 4 boards with different memory interfaces to compare the two.

    Also, the flow chart for the i850 will probably be very similar to other Intel chipsets, with the only exception being the dual memory interfaces.
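The bandwidth figures in the post above (3.2GB/s for the bus, 3.2GB/s for dual RDRAM, 2.1GB/s for PC2100) follow from clock rate times transfers per clock times bus width. The sketch below reproduces them. The 100MHz quad-pumped bus is from the thread; the bus widths and memory clocks (64-bit front side bus, 16-bit PC800 RDRAM channels at 400MHz with two transfers per clock, 64-bit DDR at a nominal 133MHz) are not stated in the thread and are filled in here as assumptions about the commonly quoted specifications.

```python
def bandwidth_gb_s(base_clock_mhz, transfers_per_clock, bus_width_bits, channels=1):
    """Peak bandwidth in GB/s (decimal), ignoring protocol overhead."""
    bytes_per_transfer = bus_width_bits // 8
    bytes_per_second = (
        base_clock_mhz * 1_000_000 * transfers_per_clock * bytes_per_transfer * channels
    )
    return bytes_per_second / 1e9


# Front side bus: 100MHz, quad pumped, assumed 64 bits wide.
fsb = bandwidth_gb_s(100, 4, 64)

# Dual-channel PC800 RDRAM: assumed 400MHz, 2 transfers per clock, 16 bits per channel.
rdram_dual = bandwidth_gb_s(400, 2, 16, channels=2)

# PC2100 DDR SDRAM: assumed nominal 133MHz, 2 transfers per clock, 64 bits wide.
ddr_pc2100 = bandwidth_gb_s(133, 2, 64)

print(fsb, rdram_dual, ddr_pc2100)
```

With these assumed widths, the two RDRAM channels exactly match the peak rate of the bus, while a single PC2100 channel comes out at roughly two thirds of it. These are peak numbers only and say nothing about latency, which is where the RDRAM versus DDR argument usually ends up.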



    Copyright 1999, 2000 internet.com Corporation. All Rights Reserved.

