What Does AGP Do?
What's the Beef?
AGP - Some Critical Thoughts
Should We Go and Buy AGP Boards Now?
It won't be long until the first AGP systems hit the market. Intel will
release its new 440LX Pentium II chipset with AGP support on August 26,
1997, only a few weeks away. Plenty of hope and hype about AGP can be found
all over the place, so I thought it was time to add my 2 cents to the
discussion and give you some clear facts about this new interface.
What Does AGP Do?
AGP is nothing mystical at all, and the idea behind AGP isn't even
particularly unique. If future graphics accelerators are to be significantly
faster than current PCI graphics cards, it will take much more than just
AGP. However, AGP enables graphics hardware to do its job faster while
keeping costs low.
The AGP specification is based on the 66 MHz PCI specification rev. 2.1,
which isn't in much use currently, since all current PCI cards still only
use the 33 MHz PCI bus speed. AGP, however, adds three special extensions
via so-called 'sideband' signals, carried on some special lines added to
the PCI specs. These three extensions are:
pipelined memory read/write operations
demultiplexing of address and data on the bus
timing for data transfer rates as if clocked at 133 MHz
Now what does this mean in layman's terms?
First of all, AGP offers a much higher throughput than PCI does. PCI, as
currently clocked at 33 MHz, can transport 133 MB/s at peak rate over its
32 bit data bus (33,000,000 * 4 byte * sec^-1). AGP is clocked at 66 MHz,
which enables a peak rate of 266 MB/s (66,000,000 * 4 byte * sec^-1) in the
classic so-called 'x1' mode. The 'x2' mode transports data on both the
rising and the falling edges of the 66 MHz clock and can thus reach up to
532 MB/s at peak rate (please note that it is up to the graphics
accelerator's vendor whether 'x2' mode is supported). Strictly speaking,
66,000,000 * 4 byte gives 264 MB/s and 528 MB/s; the slightly higher figures
come from the nominal 66.6 MHz clock. So much for the '133 MHz' data
transfer rate, which doesn't mean that the AGP bus is clocked at 133 MHz at
all! In the real world, AGP also gets closer to these hypothetical peak
values than PCI does, thanks to some extra signal wires which enable
pipelining and queuing.
[Figure 1: Non-pipelined PCI vs. AGP. An is the address of the request, and Dn is the result. Copyright (c) Intel Corporation]
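To make the pipelining idea concrete, here is a minimal sketch of an idealized timing model. It is not taken from the AGP specification: the function names and the cycle counts (`latency`, `transfer`) are purely illustrative assumptions, and it assumes the request queue is deep enough to hide the latency of every request but the first.

```python
def non_pipelined_cycles(requests: int, latency: int, transfer: int) -> int:
    """Each request waits for its data (Dn) before the next address (An+1) is issued."""
    return requests * (latency + transfer)


def pipelined_cycles(requests: int, latency: int, transfer: int) -> int:
    """Addresses are queued ahead of the data, so only the first request pays the latency."""
    return latency + requests * transfer


def bus_efficiency(total_cycles: int, requests: int, transfer: int) -> float:
    """Fraction of all cycles in which data is actually moving over the bus."""
    return requests * transfer / total_cycles


# Hypothetical numbers: 8 requests, 6 cycles of latency, 4 cycles of data each.
REQUESTS, LATENCY, TRANSFER = 8, 6, 4
slow = non_pipelined_cycles(REQUESTS, LATENCY, TRANSFER)
fast = pipelined_cycles(REQUESTS, LATENCY, TRANSFER)
print("non-pipelined:", slow, "cycles,", bus_efficiency(slow, REQUESTS, TRANSFER))
print("pipelined:    ", fast, "cycles,", bus_efficiency(fast, REQUESTS, TRANSFER))
```

With these made-up numbers the non-pipelined bus moves data in only 40% of its cycles, while the pipelined one comes much closer to its peak rate. Real hardware will sit somewhere in between.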
Thanks to this technology, AGP's peak transfer rate is as high as the peak
transfer rate of current main memory, which in Pentium and later systems
operates with a 64 bit wide bus at 66 MHz bus clock (528 MB/s). Future
systems will reach a main memory peak transfer rate of 800 MB/s by using a
100 MHz bus clock.
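The peak figures quoted in this article all follow from the same simple product of clock, bus width and transfers per clock. The sketch below just redoes that arithmetic; the function name is my own, and it uses the round 33/66/100 MHz clocks, so the results are slightly lower than the 133/266/532 MB/s you get from the nominal 33.3/66.6 MHz clocks.

```python
def peak_mb_per_s(clock_mhz: int, bus_bytes: int, transfers_per_clock: int = 1) -> int:
    """Theoretical peak rate in MB/s (1 MB = 1,000,000 bytes)."""
    return clock_mhz * bus_bytes * transfers_per_clock


buses = {
    "PCI, 32 bit @ 33 MHz":          peak_mb_per_s(33, 4),
    "AGP x1, 32 bit @ 66 MHz":       peak_mb_per_s(66, 4),
    "AGP x2, both clock edges":      peak_mb_per_s(66, 4, transfers_per_clock=2),
    "main memory, 64 bit @ 66 MHz":  peak_mb_per_s(66, 8),
    "main memory, 64 bit @ 100 MHz": peak_mb_per_s(100, 8),
}

for name, rate in buses.items():
    print(f"{name:32s} {rate:4d} MB/s")
```

Note that AGP 'x2' and 66 MHz main memory come out at exactly the same peak value, which matters for the critical thoughts further down.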
This main-memory-like data transfer rate is only one part of the AGP story,
but in the early days of AGP it might be the most important one.
Thanks to the high data transfer rate between the graphics accelerator and
main memory, AGP enables graphics accelerators to use main memory instead of
local memory for things like textures, which can be as big as 128 kB each.
So far these textures had to be loaded into the graphics accelerator's local
memory to be processed there by the graphics processor. Now they can be
processed in main memory without a performance impact. Intel calls this
DIME, for DIrect Memory Execute.
UMA, the 'unified memory architecture' used on low cost boards in the past,
also used main memory as graphics memory, but AGP differs from it in two
important ways:
The main memory provided via AGP, and thus called 'AGP memory', doesn't
replace the screen buffer of the graphics accelerator as in UMA. The AGP
memory is an addition to it.
UMA had to go through the much slower PCI interface.
These two differences show why UMA was particularly slow, and should help
you understand why AGP graphics accelerators should be faster than current
PCI solutions.
If this is hard to understand, let me give you an example. 3D accelerators
with the 3Dfx Voodoo chip, e.g. the Diamond Monster 3D, usually come with
4 MB of memory. 2 MB of this memory are used for textures and 2 MB for frame
buffer and Z-buffer. This is why the Monster 3D is limited to 640*480
resolution in e.g. GLQuake: only 2 MB are available for frame buffering,
while the other 2 MB are tied up by textures. That split would not be
necessary if main memory could be used for the textures, as is possible with
AGP's DIME.
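A quick back-of-the-envelope check shows why 2 MB is such a tight limit. The sketch below is only an illustration under my own assumptions, which the example above does not spell out: 16 bit colour, a 16 bit Z-buffer, and double buffering (front and back buffer).

```python
FRAME_BUFFER_BUDGET = 2 * 1024 * 1024  # the 2 MB left for frame and Z-buffer


def buffer_bytes(width: int, height: int, bytes_per_pixel: int = 2,
                 color_buffers: int = 2, z_buffer: bool = True) -> int:
    """Memory needed for the colour buffers plus an optional Z-buffer.

    Assumes the Z-buffer uses the same number of bytes per pixel as colour.
    """
    planes = color_buffers + (1 if z_buffer else 0)
    return width * height * bytes_per_pixel * planes


for width, height in [(640, 480), (800, 600)]:
    needed = buffer_bytes(width, height)
    verdict = "fits" if needed <= FRAME_BUFFER_BUDGET else "does NOT fit"
    print(f"{width}*{height}: {needed:,} bytes, {verdict} in 2 MB")
```

Under these assumptions 640*480 needs about 1.8 MB, while 800*600 with Z-buffer would need about 2.9 MB. Memory freed from texture duty could go towards exactly that.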
"Graphics local RAM is usually more expensive
than generalized system memory and it cannot be used for other purposes
by the OS when unneeded by the graphics of the running applications. The
graphics chip needs fast access to local memory for screen refresh, Z-buffers,
and pixels (front and back-buffers). For these reasons, programmers can
always expect to have more texture memory available via AGP system memory.
Keeping textures out of the frame buffer allows larger screen resolution,
or permits Z-buffering for a given large screen size. Most applications
could use 2-16 MB for texture storage. By using AGP and DIME, they can
get it." (Intel Corporation)
But for now let's get back to the theory. The chipset has to provide the
function that maps the 'AGP memory' onto normal main memory. Intel calls
this GART (Graphics Address Remapping Table).
"The processor "linear" virtual addresses get
translated by its paging hardware into physical addresses. These physical
addresses are used to access system RAM, local Frame Buffer, and AGP RAM.
The CPU accesses to the Local Frame buffer and AGP RAM use the same addresses
as the graphics chip does; for that reason, the operating system sets up
the CPU paging hardware to a straight 1:1 non-translation of virtual to
physical address. " (Intel Corporation)
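The remapping idea can be sketched in a few lines. This is a toy model, not Intel's implementation: the class, its method names, the 4 kB page size and all addresses are assumptions made up for illustration. The point is only that the graphics chip sees one contiguous range of 'AGP memory', while the pages behind it may be scattered all over physical main memory.

```python
PAGE_SIZE = 4096  # assumed page size of 4 kB


class Gart:
    """Toy remapping table: contiguous AGP addresses -> scattered physical pages."""

    def __init__(self, aperture_base: int, num_pages: int) -> None:
        self.aperture_base = aperture_base
        self.table = [None] * num_pages  # one entry per page of AGP memory

    def map_page(self, index: int, physical_page_addr: int) -> None:
        """Point one AGP page at a page-aligned physical address."""
        if physical_page_addr % PAGE_SIZE != 0:
            raise ValueError("physical page address must be page aligned")
        self.table[index] = physical_page_addr

    def translate(self, agp_addr: int) -> int:
        """Translate an address inside the AGP range to a physical address."""
        offset = agp_addr - self.aperture_base
        if not 0 <= offset < len(self.table) * PAGE_SIZE:
            raise ValueError("address outside the AGP memory range")
        index, page_offset = divmod(offset, PAGE_SIZE)
        base = self.table[index]
        if base is None:
            raise LookupError("AGP page is not mapped")
        return base + page_offset


# Hypothetical addresses: two neighbouring AGP pages backed by unrelated RAM pages.
gart = Gart(aperture_base=0xE0000000, num_pages=4)
gart.map_page(0, 0x01234000)
gart.map_page(1, 0x00ABC000)
print(hex(gart.translate(0xE0000010)))
print(hex(gart.translate(0xE0001004)))
```

In a real system the operating system would fill in the table when it hands out main memory for AGP use, and the chipset would do the lookup in hardware.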
What's the Beef?
So far so good. Let's now summarize the benefits of AGP:
higher bandwidth than PCI, up to 4 times as high
no sharing of bandwidth with other components as in the case of PCI
DIME, direct memory execution of textures
CPU accesses to system RAM can proceed concurrently with the graphics
chip's AGP RAM reads
the CPU can write directly to shared system AGP memory when it needs to
provide graphics data, such as commands or animated textures. Generally the
CPU can access main memory more quickly than it can access graphics local
memory via AGP, and certainly faster than via the PCI bus.
Obviously it doesn't take a Pentium II to meet the needs of an AGP system.
This is why Socket 7 systems with AGP (e.g. the upcoming VIA Apollo VP3
chipset) will do just the same as the AGP provided by the 440LX chipset for
Pentium II platforms.
Unfortunately, getting an AGP board plus an AGP graphics accelerator won't
be enough to take advantage of AGP's new performance. Nothing works without
a proper operating system, which has to take care of the DIME/GART part of
the AGP benefits in particular. The OS has to provide main memory for the
AGP RAM and has to make sure that enough main memory is left for the running
applications. This is to be achieved via DirectDraw in Memphis (Windows 98)
and Windows NT 5. As long as these operating systems aren't out, nobody will
be able to take advantage of DIME, and hence only half of the AGP benefits
can be used.
AGP - Some Critical Thoughts
The number one benefit of AGP is supposed to be the DIME feature, which is
meant to save video RAM on board the graphics adapter. I have some doubts,
however, whether this idea will turn out to be as wonderful as it sounds. We
have learned that AGP offers a theoretical peak throughput of 528 MB/s in
'x2' mode, and the next mode, 'x4', is already planned. That mode would
offer a throughput of about 1 GB/s. Isn't that amazing? There is a little
problem we easily forget, though. This throughput is meant to transport data
from main memory to the graphics accelerator, and currently the maximum
throughput of main memory to the CPU at 66 MHz bus clock is exactly these
528 MB/s. You certainly don't expect the whole system to do nothing while
the graphics accelerator is accessing main memory via DIME, do you? While
the graphics accelerator is doing its work, the CPU and other DMA-using
devices are of course accessing main memory as well. Therefore AGP will
never be able to reach a throughput of 528 MB/s, since this is the whole
bandwidth of main memory and has to be shared with the CPU and others.
Seen in a very simple statistical way, you can't expect AGP to get more than
50% of that main memory bandwidth, i.e. 264 MB/s. What is 'x2' mode good for
then?
On top of that, the 528 MB/s of main memory bandwidth assumed above are only
valid for SDRAM systems. EDO is considerably slower, let alone good old FPM.
What AGP really needs is the 100 MHz bus! This bus will offer 800 MB/s of
bandwidth with SDRAM, so AGP could get a good share of it. Hence there's not
much value in going on about 'x1' or 'x2' mode AGP graphics cards currently,
since there's simply no technical chance that data could be transferred at
the speed 'x2' mode offers in 66 MHz bus systems. What does this mean for
us? 'Let's wait again!' Let's wait for the 440BX or VIA Apollo VP4 chipset,
both of which use a 100 MHz system bus.
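The sharing argument can be put into numbers. The sketch below is a deliberately crude model built on the 50% assumption from the text; the function name and the fixed `agp_fraction` are my own simplifications, and real memory arbitration is certainly more complicated.

```python
def agp_usable_mb_per_s(memory_mb_per_s: float, agp_peak_mb_per_s: float,
                        agp_fraction: float = 0.5) -> float:
    """AGP can use neither more than its own peak nor more than its share of memory."""
    return min(agp_peak_mb_per_s, memory_mb_per_s * agp_fraction)


AGP_X1, AGP_X2 = 264, 528               # peak rates in MB/s
MEMORY_66_MHZ, MEMORY_100_MHZ = 528, 800  # SDRAM on a 64 bit bus

for bus_name, memory in [("66 MHz bus", MEMORY_66_MHZ), ("100 MHz bus", MEMORY_100_MHZ)]:
    x1 = agp_usable_mb_per_s(memory, AGP_X1)
    x2 = agp_usable_mb_per_s(memory, AGP_X2)
    print(f"{bus_name}: x1 gets {x1:.0f} MB/s, x2 gets {x2:.0f} MB/s")
```

Under this model 'x2' mode gains nothing over 'x1' on a 66 MHz bus, since both end up at 264 MB/s. Only with the 100 MHz bus does 'x2' start to pay off, at about 400 MB/s.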
There's one other consideration as well. Modern VRAM or WRAM cards, as well
as RAMBUS RAM cards, offer a video memory (onboard, LFB or local memory)
bandwidth of up to 1.6 GB/s (e.g. Number Nine Revolution 3D, 128 bit port
WRAM). This is much more than even 'x4' mode will offer. These cards will be
faster if they use their local memory for texture processing rather than the
much slower AGP RAM. This means that high end cards will work just like PCI
cards did in the past, only taking advantage of AGP's higher data transfer
speed, with no DIME used. Intel thinks that this will be more expensive, but
isn't it funny ... RAM prices are lower than ever, so this should not really
be a reason for a more expensive card.
This leads to the question of whether you have to use DIME to benefit from
AGP. The answer, provided even by Intel, is 'NO'.
You can use AGP without using the DIME feature at all. In this case the
graphics accelerator just benefits from the much higher transfer rates
compared to PCI. The 'sidebands' can be used, but they don't have to be.
Without 'sidebanding' the transfer rate is already 266 MB/s, which is double
what a stand-alone PCI graphics card would get. Here the card can (as with
PCI cards) use either PIO or DMA to transfer the data from main memory into
the frame buffer of the graphics accelerator.
The majority of graphics accelerators will most likely use DIME, thus saving
on-board texture memory and hence making the card cheaper without losing
performance. Of course these cards should use the 'sidebands' to enable 'x2'
mode.
The high end versions will most likely use DIMEL (Direct Memory Execute and
Local also). Frequently used textures would be stored in a (large) on-board
local memory, while less frequently used ones would reside in the AGP RAM.
These cards will come with a lot of memory on board, as e.g. the (expensive
but fast) Diamond Fire GL 4000 (PCI) with its 32 MB RAM already shows. Even
Intel admits that high end solutions will still have a very large local
memory, but will be too expensive for the mainstream.
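How a DIMEL card might split its textures between local memory and AGP RAM can be sketched as a simple greedy placement. This is a hypothetical policy for illustration only, not how any actual driver or card decides; the function name, the texture names, sizes and usage counts are all invented.

```python
def place_textures(textures, local_budget):
    """Put the most frequently used textures into local memory, the rest into AGP RAM.

    textures: list of (name, size_in_bytes, uses_per_frame) tuples.
    Returns two lists of names: (in_local_memory, in_agp_memory).
    """
    local, agp = [], []
    remaining = local_budget
    # Most frequently used textures get the first chance at fast local memory.
    for name, size, _uses in sorted(textures, key=lambda t: t[2], reverse=True):
        if size <= remaining:
            local.append(name)
            remaining -= size
        else:
            agp.append(name)
    return local, agp


KB = 1024
scene = [
    ("sky",    128 * KB, 1),
    ("wall",    64 * KB, 40),
    ("floor",   64 * KB, 35),
    ("poster", 128 * KB, 2),
]
in_local, in_agp = place_textures(scene, local_budget=192 * KB)
print("local memory:", in_local)
print("AGP memory:  ", in_agp)
```

Here the heavily used wall and floor textures end up in local memory, while the rarely used sky and poster stay in AGP RAM, where the slower access hurts least.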
Should We Go and Buy AGP Boards Now?
The answer is yes and no. As you will see from my benchmark results,
currently there isn't much to AGP at all. However, the SDRAM support and the
upgradeability (to AGP) of 440LX chipset boards will be a great advantage
over the 440FX Pentium II boards. It will probably take at least until NT 5
and Memphis are released before there is a really visible performance boost
from AGP.
Valuable AGP Links
and 3D Graphics Software by Intel - a must read for everybody who
wants to know more about AGP
GRAPHICS PORT INTERFACE SPECIFICATION also by Intel
Support in Windows 95 and Windows NT from Microsoft
Accelerated Graphics Port (AGP) - A Diamond Multimedia White Paper