F3DEX2 Microcode
This section discusses the F3DEX2 microcode. The following topics are included:
Features of the F3DEX2 Microcode
The F3DEX2 microcode is a reconfiguration of the previously released F3DEX microcode with an increase in RSP calculation speed.
F3DEX2 has the following special features:
Note: The F3DEX2 and Fast3D/F3DEX series microcodes are NOT compatible at the binary level.
The GBI used by the F3DEX2 series is compatible at the source level with that used by the Fast3D/F3DEX series. Code created with Fast3D/F3DEX can be recompiled for use with F3DEX2.
Specifically, the F3DEX_GBI_2 macro must be defined at compile time.
Either add -DF3DEX_GBI_2 to the C compiler (cc/gcc) options, or add the following "define" statement in front of the ultra64.h include statement in the source file:
#define F3DEX_GBI_2
#include <ultra64.h>
The following GBI is not supported: g[s]SPInsertMatrix()
Neither F3DLX / F3DLX.NoN (the version of F3DEX with subpixel calculations omitted) nor F3DLP.Rej (the version of F3DLX.Rej with texture correction omitted) is supported any longer. These were supported by F3DEX, but as a result of optimizing the F3DEX microcode, they no longer have any merit, since both the subpixel calculation process and the texture correction process have been eliminated.
Accordingly, you should replace these microcodes with the new microcodes shown below. (See Appendix A for information about the correspondence between old and new microcode.)
Old Microcode | Replace With New Microcode |
gspF3DLX.fifo.o | gspF3DEX2.fifo.o |
gspF3DLX.NoN.fifo.o | gspF3DEX2.NoN.fifo.o |
gspF3DLP.Rej.fifo.o | Use either gspF3DLX2.Rej.fifo.o or gspF3DEX2.Rej.fifo.o |
Support has been added for F3DEX2.Rej (the subpixel calculation version of F3DLX2.Rej). This differs from F3DEX2 in that the Rejection process is performed without performing Clipping. With a vertex cache size of 64, the processing speed is faster than F3DEX2 but slower than F3DLX2.Rej. Since you get the same picture quality as with F3DEX2, it is a good idea to use F3DLX2.Rej and F3DEX2.Rej as the situation calls for.
If you previously used gspS2DEX.fifo.o, you should replace it with gspS2DEX2.fifo.o.
The S2DEX2 microcode does not differ in performance in any way from S2DEX. It simply has the ability to self-load with the F3DEX2 series. As long as you do not mix use of the sprite microcode with the F3DEX2 series, there is no problem with continuing to use S2DEX.
S2DEX2, like F3DEX2, supports the XBUS version. However, unlike S2DEX, it does not support the S2DEX_d microcode for debugging.
The method for using the S2DEX2 microcode is the same as that for F3DEX2:
Either define the F3DEX_GBI_2 macro with a compile option, or define the macro with a define statement before the ultra64.h include statement.
Include the header file PR/gs2dex.h after the ultra64.h include statement.
Following is an example for a define statement:
#define F3DEX_GBI_2
#include <ultra64.h>
#include <PR/gs2dex.h>
There are three ways to pass this RDP command string to the RDP: the XBUS method, the FIFO method, and the DRAM(DUMP) method.
FIFO Method
In the FIFO method, the RDP commands are expanded in a FIFO buffer in RDRAM and then passed to the RDP. Of the three methods, this is the only one supported by the F3DEX series.
XBUS Method
In the XBUS method, the RDP commands are passed from the RSP to the RDP via the XBUS, which is the internal bus that directly connects the RSP and the RDP. So unlike the FIFO method, this method makes no use of RDRAM. Many of the sample programs which accompany the current library make use of this microcode.
DRAM (DUMP) Method
In the DRAM(DUMP) method, the RDP commands are simply expanded in RDRAM, and the CPU must initiate the process of passing the data to the RDP. This method makes excessive use of RDRAM, so it is not a practical method.
The Fast3D microcode series (the original microcode and the basis of F3DEX) had separate microcode supporting the three methods. However, for F3DEX, the internal buffer that had been used for passing RDP commands with the XBUS method was used instead as a vertex cache area in order to increase the vertex cache from 16 to 32. As a result, the F3DEX microcode could no longer support the XBUS method. Moreover, unlike the FIFO microcode, the XBUS microcode cannot process graphics in the RDP while the audio microcode is operating. So in order to boost overall performance, support for the XBUS method was abandoned.
With the release of the F3DEX2 series, the use of this internal buffer has been optimized, enabling support of the XBUS method microcode while maintaining the same size vertex cache as F3DEX.
The size of the internal buffer used for passing RDP commands is smaller with the XBUS microcode than with the normal FIFO microcode (around 1 Kbyte). As a result, when large objects (ones that take time for RDP graphics processing) are rendered continuously, the internal buffer fills up and the RSP halts until the internal buffer becomes free again. This creates a bottleneck and can also slow RSP calculations. Additionally, audio processing by the RSP cannot proceed in parallel with the RDP's graphics processing. Nevertheless, because I/O to RDRAM is smaller than with FIFO (around 1/2), this might be an effective way to counteract CPU/RDP slowdowns caused by competition on the RDRAM bus. So when using the XBUS microcode, please test a variety of combinations.
For those who could not use F3DEX because they were utilizing the Fast3D XBUS microcode, we recommend switching over to the F3DEX2 microcode. For more information see Appendix A, which lists the new microcode corresponding to the old microcode.
gspF3DEX2(.NoN) | 0x410 Bytes |
gspF3DEX2.Rej | 0x600 Bytes |
gspF3DLX2.Rej | 0x600 Bytes |
gspL3DEX2 | 0x540 Bytes |
gspS2DEX2 | 0x800 Bytes |
If numerous microcodes share the FIFO buffer, the value should match the microcode with the largest required size.
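The rule above can be sketched in C as follows. This is only an illustration: it assumes the table lists the required buffer size in bytes for each microcode, and the enum names and the shared_buffer_size helper are hypothetical, not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes in bytes, copied from the table above. The names are hypothetical. */
enum {
    SIZE_F3DEX2     = 0x410, /* gspF3DEX2 and gspF3DEX2.NoN */
    SIZE_F3DEX2_REJ = 0x600,
    SIZE_F3DLX2_REJ = 0x600,
    SIZE_L3DEX2     = 0x540,
    SIZE_S2DEX2     = 0x800
};

/* Returns the largest entry of sizes[0..count-1], or 0 when count is 0. */
static size_t shared_buffer_size(const size_t *sizes, size_t count)
{
    size_t largest = 0;
    size_t i;

    assert(sizes != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (sizes[i] > largest) {
            largest = sizes[i];
        }
    }
    return largest;
}
```

For example, a program that switches among F3DEX2, L3DEX2 and S2DEX2 would size the shared buffer for S2DEX2, the largest of the three.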
When microcode self-loads, a number of parameters are maintained.
The parameters that are maintained are the:
DisplayList task
GeometryMode, Light and vertex cache are not maintained. The Model and Projection matrices are maintained, but because the MP matrix is not maintained, you need to load either the M or P matrix again to reconstruct the MP matrix.
Because of these changes, self-loading with F3DEX/S2DEX microcode cannot be done. Self-loading is only possible within the F3DEX2 series, and between F3DEX2 and S2DEX2.
Self-loading is also possible between the FIFO microcode and the XBUS microcode.
F3DLX2.Rej loads all 64 vertices to the vertex cache at once.
The old version F3DLX.Rej also had a vertex cache size of 64, but it could only load 32 vertices at one time. Thus, in order to load the data for 64 vertices it was necessary to execute two gSPVertex instructions. But now, with the release of F3DLX2.Rej, that restriction no longer applies, and you can use a single gSPVertex instruction to load data for anywhere from 1 to 64 vertices. The same holds true for F3DEX2.Rej.
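The effect of the lifted restriction can be illustrated with a little arithmetic. The sketch below only counts how many vertex load instructions are needed for a given per-instruction limit. The vertex_load_commands helper is hypothetical and does not call the GBI.

```c
#include <assert.h>

/* Number of load instructions needed to load `count` vertices when a single
 * instruction can load at most `max_per_load` vertices (rounds up). */
static unsigned vertex_load_commands(unsigned count, unsigned max_per_load)
{
    assert(max_per_load > 0);
    return (count + max_per_load - 1) / max_per_load;
}
```

With a limit of 32 per instruction, as in the old F3DLX.Rej, 64 vertices take two instructions. With a limit of 64 they take one.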
F3DLX2.Rej supports CULL_FRONT
Unlike the old version F3DLX.Rej, which did not support CULL_FRONT/CULL_BOTH, the new F3DLX2.Rej microcode supports both of these. Naturally, F3DEX2.Rej also supports CULL_FRONT/CULL_BOTH.
The number of GBIs in gSPForceMatrix has been changed
In the Fast3D/F3DEX series, gSPForceMatrix had compound commands comprising 4 GBIs. But in the new F3DEX2 series, the number of GBIs in a compound command has been changed to 2. Source code which depends on this fact must be changed. Please refer to the examples below.
Example of code which must be fixed:
Before (Fast3D/F3DEX series) | After (F3DEX2 series) |
Gfx *gp = glist; | Gfx *gp = glist; |
gSPForceMatrix(gp, mptr); | gSPForceMatrix(gp, mptr); |
gp += 4; | gp += 2; |
Example of code which does not need to be fixed:
Gfx *gp = glist;
gSPForceMatrix(gp++, mptr);
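Why the gp++ form needs no fix can be shown with a small stand-alone mock. This sketch assumes that a compound macro evaluates its packet argument once per GBI it emits. MockGfx, MOCK_EMIT and MOCK_FORCE_MATRIX are hypothetical stand-ins and are not the definitions in gbi.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one display list entry; not the real Gfx type. */
typedef struct { unsigned int w0, w1; } MockGfx;

/* Hypothetical single-entry macro: evaluates `pkt` exactly once. */
#define MOCK_EMIT(pkt, a, b) \
    do { MockGfx *g_ = (pkt); g_->w0 = (a); g_->w1 = (b); } while (0)

/* Hypothetical compound macro built from two entries, mirroring the F3DEX2
 * count. `pkt` appears twice, so an argument of gp++ advances gp twice. */
#define MOCK_FORCE_MATRIX(pkt, m) \
    do { MOCK_EMIT(pkt, 1u, (m)); MOCK_EMIT(pkt, 2u, (m)); } while (0)

/* Returns how many entries the compound macro consumed from the list. */
static ptrdiff_t entries_used_by_force_matrix(void)
{
    MockGfx glist[8] = { { 0, 0 } };
    MockGfx *gp = glist;

    MOCK_FORCE_MATRIX(gp++, 0x1234u);
    assert(gp - glist <= 8);
    return gp - glist;
}
```

Because the pointer advances by however many entries the macro emits, code written this way does not hard-code the count and stays correct when the count changes from 4 to 2.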
Line microcode and FillRectangle/TextureRectangle can coexist.
There had been a problem in that FillRectangle/TextureRectangle would not render properly unless the Scissor Box was specified again after the Line microcode was used for Line rendering. That problem was fixed in L3DEX2. As a result, you can now render without re-specifying the Scissor Box even when you switch from L3DEX2 -> F3DEX2 with LoadUCode.
Method | Old - Fast3D/F3DEX Series | New - F3DEX2 Series |
FIFO | gspFast3D.fifo.o, gspF3DEX.fifo.o, gspF3DLX.fifo.o | gspF3DEX2.fifo.o |
XBUS | gspFast3D.o | gspF3DEX2.xbus.o |
FIFO | gspF3DNoN.fifo.o, gspF3DEX.NoN.fifo.o, gspF3DLX.NoN.fifo.o | gspF3DEX2.NoN.fifo.o |
XBUS | gspF3DNoN.o | gspF3DEX2.NoN.xbus.o |
FIFO | gspF3DLP.Rej.fifo.o, gspF3DLX.Rej.fifo.o | gspF3DLX2.Rej.fifo.o or gspF3DEX2.Rej.fifo.o |
XBUS | No corresponding microcode | gspF3DLX2.Rej.xbus.o or gspF3DEX2.Rej.xbus.o |
FIFO | gspLine3D.fifo.o, gspL3DEX.fifo.o | gspL3DEX2.fifo.o |
XBUS | gspLine3D.o | gspL3DEX2.xbus.o |
FIFO | gspS2DEX.fifo.o | gspS2DEX2.fifo.o |
FIFO | gspS2DEX_d.fifo.o | No support |
XBUS | No corresponding microcode | gspS2DEX2.xbus.o |
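When porting a build script or a loader, the correspondence above can be kept as a small lookup table. The following is a minimal sketch. UcodeMapping, kMappings and replacement_for are hypothetical names, and only rows that have an old microcode with a supported replacement are included. For the Rej rows the F3DLX2 variant is listed, and the F3DEX2.Rej variant is an equally valid choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the old-to-new correspondence. The type name is hypothetical. */
typedef struct {
    const char *old_name;
    const char *new_name;
} UcodeMapping;

static const UcodeMapping kMappings[] = {
    { "gspFast3D.fifo.o",     "gspF3DEX2.fifo.o" },
    { "gspF3DEX.fifo.o",      "gspF3DEX2.fifo.o" },
    { "gspF3DLX.fifo.o",      "gspF3DEX2.fifo.o" },
    { "gspFast3D.o",          "gspF3DEX2.xbus.o" },
    { "gspF3DNoN.fifo.o",     "gspF3DEX2.NoN.fifo.o" },
    { "gspF3DEX.NoN.fifo.o",  "gspF3DEX2.NoN.fifo.o" },
    { "gspF3DLX.NoN.fifo.o",  "gspF3DEX2.NoN.fifo.o" },
    { "gspF3DNoN.o",          "gspF3DEX2.NoN.xbus.o" },
    { "gspF3DLP.Rej.fifo.o",  "gspF3DLX2.Rej.fifo.o" }, /* or gspF3DEX2.Rej.fifo.o */
    { "gspF3DLX.Rej.fifo.o",  "gspF3DLX2.Rej.fifo.o" }, /* or gspF3DEX2.Rej.fifo.o */
    { "gspLine3D.fifo.o",     "gspL3DEX2.fifo.o" },
    { "gspL3DEX.fifo.o",      "gspL3DEX2.fifo.o" },
    { "gspLine3D.o",          "gspL3DEX2.xbus.o" },
    { "gspS2DEX.fifo.o",      "gspS2DEX2.fifo.o" }
};

/* Returns the replacement object file name, or NULL when the old name is
 * not in the table (for example gspS2DEX_d.fifo.o, which has no support). */
static const char *replacement_for(const char *old_name)
{
    size_t i;

    assert(old_name != NULL);
    for (i = 0; i < sizeof kMappings / sizeof kMappings[0]; i++) {
        if (strcmp(kMappings[i].old_name, old_name) == 0) {
            return kMappings[i].new_name;
        }
    }
    return NULL;
}
```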
gspF3DEX2.fifo.o/gspF3DEX2.xbus.o
- Vertex cache size is 32
- Subpixel calculations
- Clipping
gspF3DEX2.NoN.fifo.o/gspF3DEX2.NoN.xbus.o
- Vertex cache size is 32
- Subpixel calculations
- Clipping of planes other than NearPlane
gspF3DEX2.Rej.fifo.o/gspF3DEX2.Rej.xbus.o
- Vertex cache size is 64
- Subpixel calculations
- Rejection processing
- (Rendering of entire triangle stops if part of triangle is outside CLIPBOX)
gspF3DLX2.Rej.fifo.o/gspF3DLX2.Rej.xbus.o
- Vertex cache size is 64
- No subpixel calculations
- Rejection processing
- (Rendering of entire triangle stops if part of triangle is outside CLIPBOX)
gspL3DEX2.fifo.o/gspL3DEX2.xbus.o
- Line microcode
- Vertex cache size is 32
- Subpixel calculations
- Clipping
gspS2DEX2.fifo.o/gspS2DEX2.xbus.o
- Sprite microcode
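The feature list above can also be written down as data, which is handy when choosing a microcode by vertex cache size or by clipping behavior. This is a sketch only: the UcodeFeatures type, the kFeatures table and find_features are hypothetical names. The sprite microcode is left out because the list gives no geometry features for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Features of one microcode, as listed above. The type name is hypothetical. */
typedef struct {
    const char *name;
    int vertex_cache;  /* vertex cache size */
    bool subpixel;     /* subpixel calculations performed */
    bool clipping;     /* true: Clipping, false: Rejection processing */
} UcodeFeatures;

static const UcodeFeatures kFeatures[] = {
    { "F3DEX2",     32, true,  true  },
    { "F3DEX2.NoN", 32, true,  true  }, /* clips planes other than NearPlane */
    { "F3DEX2.Rej", 64, true,  false },
    { "F3DLX2.Rej", 64, false, false },
    { "L3DEX2",     32, true,  true  }  /* Line microcode */
};

/* Returns the feature row for `name`, or NULL when it is not listed. */
static const UcodeFeatures *find_features(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof kFeatures / sizeof kFeatures[0]; i++) {
        if (strcmp(kFeatures[i].name, name) == 0) {
            return &kFeatures[i];
        }
    }
    return NULL;
}
```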
06/15/98 |
Release 2.05 |
Fixed the action of G_TEXTURE_GEN_LINEAR. |
Fixed the problem so that, when F3DEX2.NoN is used, the Z coordinate values of polygons rendered in the space between the front of the focal point and the NearPlane do not exceed the defined range. |
Changed the rendering procedure so that, when Clipping is used, the drawing of the polygons that result from the clipping process produces nearly the same results as when rendering with the F3DEX series. |
05/28/98 |
Release 2.04 (patchNg980610 version) |
Changed gbi.h in order to disable G_TEXTURE_ENABLE. |
No change in the microcode from version 05/20/98. |
05/20/98 |
Release 2.04 |
Fixed the problem so lighting calculations are now performed correctly for triangles that have vertices set with normal vectors which have been normalized to 128. |
04/23/98 |
Release 2.03 |
Fixed the gSPBranchLessZ* instruction so it works correctly. |
Fixed the problem so color is correct when more than 3 Lights are used. |
04/16/98 |
Release 2.02 (patchNg980423 version) |
Changed the number of GBIs in gSPForceMatrix/Added information regarding coexistence of Line microcode and FillRectangle/TextureRectangle |
gSPPopMatrix is now ignored when the stack is empty, which fixes the problem of the system hanging when gSPPopMatrix was executed on an empty stack. |
Added S2DEX2 microcode to the package. This microcode can be loaded together with the F3DEX2 series using gSPLoadUcodeL. Also added related information. |
Fixed the gSPBgRectCopy instruction in S2DEX 1.06 so image processing functions properly on narrow-width frames. |
Fixed the problem so Flat Shading color remains defined when Flat Shading is used and the Clipping process is activated. |
Fixed the problem that caused gSPLightColor to act strangely whenever anything other than LIGHT_1 was specified. |
03/30/98 |
Release 2.01 |
Added F3DEX2.Rej, sped up F3DLX2.Rej somewhat. |
03/26/98 |
Release 2.00 |
Official release |
Copyright © 1999 Nintendo of America Inc. All Rights Reserved. Nintendo and N64 are registered trademarks of Nintendo. Last Updated January, 1999