XP 64 bit
Riftgarde
10-03-2004, 09:28 PM
Will the 64 bit version be something like a $50 upgrade for XP 32bit users? Or will it be a whole new $150-300 OS? With Longhorn due in a couple of years, I think the latter would make them less money than the $50 upgrade.
djnes
10-03-2004, 10:45 PM
I'm not much of a code guy, but I believe it has to be written from the ground up to be 64 bit, and cannot be in the form of an upgrade. Therefore, I am sure MS will charge a full OS price for it. I don't know if this is true either, but I've read on several sites it won't be publicly available...it will only be available with new 64 bit systems....which would make the purchasing of an AMD64 system right now (for the purpose of 64 bit) useless. I don't know if that's true or not...so take it with a grain of salt.
BillLeeLee
10-03-2004, 10:49 PM
I have heard that Microsoft will allow you a free upgrade to 64-bit if you have a full 32-bit XP license. Forgot where I heard it, but I vaguely remember it being a credible source.
I have heard that Microsoft will allow you a free upgrade to 64-bit if you have a full 32-bit XP license. Forgot where I heard it, but I vaguely remember it being a credible source.
I'm the owner of a 64 bit system and beta tested the 64bit version in the early stages, and heard this as a possibility in a MS email.
Ice Czar
10-03-2004, 11:36 PM
While I didn't wait and went with Suse 9.1 Pro,
the question remains whether we are talking about simply porting an OS to use more than 4GB of memory (2GB per process being the current 32-bit limit, with a few Xeon-oriented 3GB switch exceptions in W2K\2003) or actually "long" coding it to use the additional registers.
Both Linux and the XP 64 beta are just ported to use more memory per process, not to take advantage of the registers through long coding,
and of course the same goes for applications.
64bit computing in real life (http://www.lostcircuits.com/cpu/panorama/) @ LostCircuits
outlining the advantages of the Opteron\AMD64 Architecture
(as youll see this primarily applies to multi-processor systems but some would carry over to the AMD64 as well)
AMD Opteron Coverage - Part 1: Intro to Opteron/K8 Architecture (http://www.anandtech.com/cpu/showdoc.html?i=1815) @ anandtech
Go Deep (http://www.anandtech.com/cpu/showdoc.html?i=1815&p=4)
The difference in pipeline architectures is what makes a clock-for-clock comparison between the Xeon and Opteron invalid (much like the Pentium 4 to Athlon XP comparison was invalid on a clock-for-clock basis). The Xeon's architecture allows it to reach high clock speeds at the expense of doing less work per clock cycle, the appropriate comparison ends up being one of cost and real-world performance, not one of clock speed.
The more pipeline stages you have, the less work is done per clock and thus the higher you're able to clock your CPU; this is the reason the 20-stage Xeon is currently at speeds of 3GHz, compared to the 12-stage Opteron which is debuting at 1.8GHz.
In short, it's not just how fast you run but also how long your legs are: a Westy (http://www.leader.es/valleblanco/westy3.jpg) needs to run twice as fast as a Wolfhound (http://www.arrakis.es/~pablol/wolfhound.jpg) to cover the same ground.
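A rough sketch of that clock-for-clock point, assuming the simplest possible model where throughput is just work per clock times clock rate (real chips also depend on caches, memory latency and the workload). The function name is made up for illustration; the two clock speeds are the ones quoted above.

```python
# Simplified model: throughput = (work per clock) x (clock rate).
# This ignores caches, memory latency and workload, so treat it as a sketch.

XEON_GHZ = 3.0      # 20-stage Xeon clock quoted in the article
OPTERON_GHZ = 1.8   # 12-stage Opteron clock quoted in the article


def required_work_per_clock_ratio(fast_clock_ghz, slow_clock_ghz):
    """How much more work per clock the slower-clocked chip needs to break even."""
    return fast_clock_ghz / slow_clock_ghz


ratio = required_work_per_clock_ratio(XEON_GHZ, OPTERON_GHZ)
print(f"Opteron needs about {ratio:.2f}x the work per clock to match the Xeon")
```

Under that model the shorter-pipeline chip only has to do about two thirds more work per cycle to draw level, which is why the clock number on its own says little.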
AMD's 64-bit strategy - x86-64 (http://www.anandtech.com/cpu/showdoc.html?i=1815&p=5)
The benefits of a 64-bit microprocessor architecture are mainly memory related; if you take two identical microprocessors, make one 64-bit and one 32-bit, the advantage of the 64-bit CPU is that it can address much more memory than the 32-bit CPU (2^64 vs. 2^32). For those that were hitting the limits of 32-bit memory addressability (4GB), Intel's only high performance solution was to transition to Itanium, but if all you're looking for is more than 4GB of memory and solid x86 performance, then you're SOL from Intel's perspective.
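To put the 2^64 vs. 2^32 figure in concrete units, here is a small sketch of the arithmetic. These are the architectural limits of the pointer width only; actual AMD64 chips implement fewer address bits than the full 64, so read the second number as a ceiling rather than something you can install.

```python
GIB = 2 ** 30   # bytes in one gibibyte
EIB = 2 ** 60   # bytes in one exbibyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB, "GiB with 32-bit addresses")
print(addressable_bytes(64) // EIB, "EiB with 64-bit addresses")
```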
AMD's 64-bit strategy is significantly different; AMD has always been focused on the current customer needs, not on the vision of the computing future 5 - 10 years from now and this is reflected in their 64-bit strategy. The strategy is simple and has been done before in the past; stick with a high-performing x86 core, and simply extend the ISA to support 64-bit memory addressability - the end result is what AMD likes to call x86-64.
In legacy mode, the K8 will run all native 16 or 32-bit x86 applications, the processor basically acts just as a K7 would.
Things get interesting in "long" mode where a 64-bit x86-64 compliant OS is required; in this mode, the K8 can either operate in full 64-bit mode or in compatibility mode. Full 64-bit mode allows for all of the advantages of a 64-bit architecture to be realized, including 64-bit memory addressability. One of the major features of the K8 architecture is the fact that the number of general purpose registers is doubled when in x86-64 mode, and thus this feature is also taken advantage of in full 64-bit mode.
Compatibility mode gives you none of the advantages of a 64-bit architecture on the application level, as it is designed for running 32-bit apps on a 64-bit OS (hence the name compatibility); The extra registers and 64-bit register extensions are ignored in this mode. Compatibility mode is important because of the 2GB process size limitation under Windows OSes. Although 32-bit Windows offers support for a maximum of 4GB of memory, each process can only use a maximum of 2GB of memory - the remaining 2GB is reserved for the OS. By running a 64-bit version of Windows (when released) and a 32-bit application, compatibility mode allows for each 32-bit process to have up to a full 4GB of memory, with the OS using anything above that marker.
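The per-process limits described above can be laid out as a small lookup, shown here as a sketch. The 2GB and 4GB figures come from the excerpt and the 3GB figure from the boot switch mentioned earlier in the thread. One detail that is not in the excerpt: as far as I know, a 32-bit application has to be marked large-address-aware to get past 2GB in the 3GB and 4GB cases. The scene size is a hypothetical number.

```python
GIB = 2 ** 30

# User-mode virtual address space available to a single 32-bit process.
# Assumption not stated in the excerpt: the 3GB and 4GB cases need the
# application to be marked large-address-aware.
USER_ADDRESS_SPACE = {
    "32-bit Windows, default 2GB/2GB split": 2 * GIB,
    "32-bit Windows, 3GB switch": 3 * GIB,
    "64-bit Windows, 32-bit app in compatibility mode": 4 * GIB,
}


def configs_that_fit(needed_bytes):
    """Return the configurations whose user address space can hold the data."""
    return [name for name, limit in USER_ADDRESS_SPACE.items()
            if needed_bytes <= limit]


SCENE_BYTES = 5 * GIB // 2   # hypothetical 2.5GB render scene
for name in configs_that_fit(SCENE_BYTES):
    print("fits:", name)
```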
Finally we have 64-bit long mode, where there is more than meets the eye. In addition to > 4GB memory addressability, in 64-bit long mode, applications have access to twice as many named general purpose registers. Remember that registers are basically high speed memory locations on the microprocessor where temporary values are stored. For example, if you were to compute the sum of two numbers, both of those numbers as well as the final result would be stored in these registers.
There is an immediate advantage if you're running over 4GB of RAM and a 64-bit operating system (Linux (http://www.suse.com/us/private/products/suse_linux/prof/index.html) or Windows 64bit Evaluation (http://www.microsoft.com/windowsxp/64bit/evaluation/default.asp)), and while not that many applications are coded to employ long mode, animation software will be among the first.
Look what we found, an on-die memory controller (http://www.anandtech.com/cpu/showdoc.html?i=1815&p=6)
The benefits of an integrated memory controller are clear - low latency memory accesses and an extremely fast controller design thanks to the fact that it is manufactured using the latest processes using the fastest transistors.
(see chart)
You can see that the integrated memory controller of the Opteron is significantly lower latency than the nForce2's dual-channel DDR memory controller. It is also worth noting that the 875P memory controller is extremely low latency, especially for an external controller - but you have to keep in mind that we're comparing two different clock speed CPUs here when we're comparing to the Intel platform. While the platform may have a latency similar to that of the Opteron, the CPU is running at a much higher frequency meaning that more clock cycles are being wasted in the same amount of time:
(see chart)
The above graph shows the number of clock cycles wasted on waiting for data from main memory, here we see the clear advantage of having an on-die memory controller.
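Since the charts did not survive here, a quick sketch of why equal latency costs a faster-clocked CPU more cycles. The 100ns latency and the 3.0GHz clock are hypothetical values picked for illustration, not figures from the article; only the 1.8GHz Opteron clock is quoted in the excerpts.

```python
def stalled_cycles(latency_ns, clock_ghz):
    """Clock cycles that elapse while waiting latency_ns for main memory.

    One GHz is one cycle per nanosecond, so cycles = nanoseconds x GHz.
    """
    return latency_ns * clock_ghz


HYPOTHETICAL_LATENCY_NS = 100.0   # assumed, identical for both platforms

for name, ghz in (("1.8GHz CPU", 1.8), ("3.0GHz CPU (hypothetical)", 3.0)):
    cycles = stalled_cycles(HYPOTHETICAL_LATENCY_NS, ghz)
    print(f"{name}: about {cycles:.0f} cycles spent waiting")
```

With the same wall-clock latency, the higher-clocked part burns proportionally more cycles per memory access, which is the point the second chart was making.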
The downside to the on-die memory controller is that in order to get support for new memory technologies, you need to replace your CPU, not just your motherboard. AMD has built functionality into the K8 core that allows an external chipset to disable the on-die memory controller and use an external one. However, remember that a K8 without the integrated memory controller is basically like an optimized K7 with a longer pipeline.
Considering the need to employ ECC RAM and the amount of RAM a typical workstation would use, especially to take advantage of breaking the 2GB application limit above, it's unlikely that the typical animator would be "upgrading" RAM speeds on a regular basis. Rather, older workstations would be added to a rendering farm\cluster and new ones would replace them.
Multiprocessor Mecca (http://www.anandtech.com/cpu/showdoc.html?i=1815&p=7)
The culmination of all of this is that the K8 core (and thus the Opteron) scales very well with the number of CPUs you have in a system, much better so than any Intel processor.
Whereas the Xeon only sees an 11% increase in performance from going to two CPUs, the Opteron sees an impressive 24% performance boost! These are not numbers to scoff at; AMD has clearly designed the Opteron for serious multiprocessing environments. We hope to be able to bring you 4-way scaling benchmarks very soon.
Another interesting thing about the K8 architecture is that it has already been engineered for use in multicore designs. AMD's Fred Weber mentioned to us that the logic for multicore, single die Opteron processors has already been verified, although nothing has taped out. The process is actually quite simple; AMD produces two Opteron cores, removes the physical layers of the Hyper Transport links and connects the two on a single die.
And this leads us to the two greatest potential advantages.
One, AMD has announced that dual cores are on the way and that they will use the current Opteron 940 socket, so it's possible, and potentially likely, that a dual-processor board bought now could be upgraded to a quad-processing powerhouse in the not too distant future. Two, while Intel has also announced its intent to release dual cores, they are much farther out than AMD's, and with the current architecture they are bottlenecked: Xeons don't scale well at all,
unlike Opterons with their on-die memory controller and HyperTransport, and it's unlikely that Intel's dual core will use the current socket and architecture.
the above excerpts are from the initial release of the Opteron
a more current article would be below, and describes why the L3 cache is so important to the Xeon
AMD Opteron vs. Intel Xeon: Database Performance Shootout (http://www.anandtech.com/IT/showdoc.html?i=1982)
Future Xeon and Pentium 4 processors will ship with the x86-64 extensions enabled but architecturally they will be identical to the currently available Prescott based Pentium 4. The architectural similarity between Intel's IA-32e and IA-32 processors (IA-32e is Intel's marketing equivalent to AMD64) is an important point to note as it means that if Opteron is able to outperform Xeon in 32-bit mode, it will maintain a performance advantage in 64-bit mode as well.
FSB Impact on Performance: Intel's Achilles' Heel (http://www.anandtech.com/IT/showdoc.html?i=1982&p=3)
We've alluded to FSB bandwidth being a fundamental limitation in Intel's multiprocessor architecture, and now we're here to address the issue a bit further.
A major downside to Intel's reliance on an external North Bridge is that it becomes very expensive to implement multiple high speed FSB interfaces as well as a difficult engineering problem to solve once you grow beyond 2-way configurations. Unfortunately Intel's solution isn't a very elegant one; regardless of whether you're running 1, 2 or 4 Xeon processors they all share the same 64-bit FSB connection to the North Bridge.
The following diagram should help illustrate the bottleneck
(see chart)
In the case of a 4-way Xeon MP system with a 400MHz FSB, each processor can be offered a maximum of 800MB/s of bandwidth to the North Bridge. If you try running a single processor Pentium 4 3.0GHz with a 400MHz FSB you'll note a significant performance decrease and that's while still giving the processor a full 3.2GB/s of FSB bandwidth; now if you cut that down to 800MB/s the performance of the processor would suffer tremendously.
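The 3.2GB/s and 800MB/s figures fall straight out of the bus width and clock. A minimal sketch of that arithmetic, assuming an even split between processors (in practice the share depends on what each CPU is doing at the time); the function names are just for illustration.

```python
FSB_EFFECTIVE_MHZ = 400   # effective FSB transfer rate from the article
FSB_WIDTH_BITS = 64       # shared 64-bit FSB from the article


def fsb_bandwidth_mb_s(effective_mhz, width_bits):
    """Peak FSB bandwidth in MB/s: million transfers/s x bytes per transfer."""
    return effective_mhz * width_bits // 8


def per_cpu_share_mb_s(total_mb_s, cpu_count):
    """Even split of the shared bus; real sharing depends on the workload."""
    return total_mb_s / cpu_count


total = fsb_bandwidth_mb_s(FSB_EFFECTIVE_MHZ, FSB_WIDTH_BITS)
for cpus in (1, 2, 4):
    share = per_cpu_share_mb_s(total, cpus)
    print(f"{cpus} CPU(s): {share:.0f} MB/s each out of {total} MB/s")
```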
It is because of this limitation that Intel must rely on larger on-die L3 caches to hide the FSB bottleneck; the more information that can be stored locally in the Xeon's on-die cache, the less frequently the Xeon must request for data to be sent over the heavily trafficked FSB.
What's even worse about this shared FSB is that the problem grows larger as you increase the number of CPUs and their clock speed. A 2-way Xeon system won't experience the negative effects of this FSB bottleneck as much as a 4-way Xeon MP; and a 4-way Xeon MP running at 3GHz will be hurting even more than a 4-way 2.0GHz Xeon MP. It's not a nice situation to be in, but there's nothing you can do to skirt the issue, which is where AMD's solution begins to appear to be much more appealing:
(see chart)
First remember that each Opteron has its own on-die North Bridge and memory controller, so there are no external chipsets to deal with. Each Opteron CPU features three point-to-point Hyper Transport links, delivering 3.2GB/s of bandwidth in each direction (6.4GB/s full duplex). The advantage is clear: as you scale the number of CPUs in an Opteron server there are no FSB bottlenecks to worry about. Scalability on the Opteron is king, which is the result of designing the platform first and foremost for enterprise level server applications.
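For comparison, the HyperTransport numbers in that paragraph add up as follows. This is only a sketch of the peak link arithmetic using the figures quoted above; it says nothing about how much of that bandwidth a real workload would use.

```python
HT_LINKS_PER_CPU = 3           # point-to-point links per Opteron (article)
HT_MB_S_PER_DIRECTION = 3200   # 3.2GB/s each way per link (article)

# Full duplex means both directions can be busy at once.
per_link_full_duplex = 2 * HT_MB_S_PER_DIRECTION
per_cpu_peak = HT_LINKS_PER_CPU * per_link_full_duplex

print(f"per link: {per_link_full_duplex} MB/s full duplex")
print(f"per CPU across {HT_LINKS_PER_CPU} links: {per_cpu_peak} MB/s peak")
```

Each CPU added brings its own links and its own memory controller, so the per-CPU figure does not shrink as the CPU count grows.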
Intel may be able to add 64-bit extensions to their Xeon MPs, but the performance bottlenecks that exist today will continue to plague the Xeon line until there's a fundamental architecture change.
[H]EMI_426
10-04-2004, 02:57 AM
FreeBSD also has a release available for AMD64. Linux and Windows aren't the only OSes out there for AMD64.
Ranma_Sao
10-04-2004, 03:38 AM
While I didn't wait and went with Suse 9.1 Pro,
the question remains whether we are talking about simply porting an OS to use more than 4GB of memory (2GB per process being the current 32-bit limit, with a few Xeon-oriented 3GB switch exceptions in W2K\2003) or actually "long" coding it to use the additional registers.
Both Linux and the XP 64 beta are just ported to use more memory per process, not to take advantage of the registers through long coding,
and of course the same goes for applications.
CLIP
Umm, Windows does take care of the extra registers, that's what a compiler is for. ;)
Ice Czar
10-04-2004, 03:54 AM
The only fully coded 64-bit OSes I'm aware of are for the Itanium.
The AMD betas aren't "long coded", nor are any apps yet.
Finally we have 64-bit long mode, where there is more than meets the eye. In addition to > 4GB memory addressability, in 64-bit long mode, applications have access to twice as many named general purpose registers. Remember that registers are basically high speed memory locations on the microprocessor where temporary values are stored. For example, if you were to compute the sum of two numbers, both of those numbers as well as the final result would be stored in these registers.
Even Renderman Pro Server is just ported to take advantage of more than 2GB of memory per process, not those general purpose registers; the whole app would need to be recoded for that, and the same goes for the OSes. Thus the only OSes and apps written from the ground up are the ones for the Itanium,
or at least that is the current state of my understanding
(and according to Pixar, who I contacted regarding Renderman Pro Server)
Ranma_Sao
10-04-2004, 04:53 AM
Well, I'll look at a stack trace tomorrow, but last I remember, Windows supported the extra registers, since the AMD64 (now x64) compiler supported them, i.e. compiled code to use them. (Very, very few parts of Windows are actually coded in assembly.)
Ranma_Sao
10-04-2004, 05:00 PM
It is using them, to prove it, just attach a kernel debugger and watch R8-R13 get twiddled. ;)
almostinsane1
10-04-2004, 08:12 PM
Umm, Windows does take care of the extra registers, that's what a compiler is for. ;)
XP64 is Windows Server 2003 64bit with the XP theme. The code is all from 2K3.
Ice Czar
10-04-2004, 08:31 PM
So the OS is set up to employ the register extensions, but the applications need to be recompiled in order for that to work?
AMD and X86-64 (http://www.faculty.iu-bremen.de/birk/lectures/PC101-2003/05cpu64/amd_intel.htm)
http://images.dr3vil.com/uploads/modes_table.GIF
32-bit to 64-bit Migration Considerations (http://web.ccr.jussieu.fr/ccr/Documentation/Calcul/vac/html/en_US/doc/compiler/ref/rucl64mg.htm)
This section outlines various portability considerations in moving C programs from 32-bit to 64-bit mode.
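One of the usual portability traps in that kind of migration is code that assumes a pointer fits in an int or a long. A small sketch of the common C data models, with sizes in bytes; the table and helper are my own summary for illustration, not taken from the linked page, and the last lines just report what the running Python build uses.

```python
import ctypes
import struct

# Sizes in bytes of C types under the common data models.
DATA_MODELS = {
    "ILP32 (32-bit x86)": {"int": 4, "long": 4, "pointer": 4},
    "LP64 (64-bit Linux/BSD)": {"int": 4, "long": 8, "pointer": 8},
    "LLP64 (64-bit Windows)": {"int": 4, "long": 4, "pointer": 8},
}


def pointer_fits_in(model, c_type):
    """True if a pointer can be stored in c_type without truncation."""
    sizes = DATA_MODELS[model]
    return sizes[c_type] >= sizes["pointer"]


for model in DATA_MODELS:
    print(model,
          "| pointer fits in int:", pointer_fits_in(model, "int"),
          "| pointer fits in long:", pointer_fits_in(model, "long"))

# What this interpreter was actually built with.
pointer_bits = struct.calcsize("P") * 8
long_bytes = ctypes.sizeof(ctypes.c_long)
print(f"this Python: {pointer_bits}-bit pointers, {long_bytes}-byte long")
```

Code that stashes pointers in an int breaks on both 64-bit models, and code that stashes them in a long still breaks on 64-bit Windows, which is why a plain recompile is not always enough.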
Ranma_Sao
10-04-2004, 10:42 PM
Yes, otherwise they will be run in WoW (Windows on Windows) and emulated. But Windows on x64 does take full advantage of the processor's features.