Past Microcode
This section includes information related to older version microcode:
Note: Fast3D and Line3D microcode have migrated to F3DEX/L3DEX.
The following microcode is not currently supported:
All computations are performed with as much precision as is practical, in order to create the highest quality images. (Please see "Note for Near Clipping" for additional details.)
gSPLine3D
Unsupported GBI Macros
Performance
Vertex transformations and lighting calculations are heavily vectorized,
so it is best to operate on as many vertices as possible. Loads of an
even number of vertices are more efficient because vertices are processed
in groups of two.
Clipping and lighting are implemented as microcode overlays that are
swapped using a most-recently-used algorithm, so overlay swapping occurs
when lighting is enabled and any vertices are clipped. Lighting happens
at vertex load time and clipping happens at triangle draw time, so this
division of the microcode is acceptable. However, a display list that
loads only a few vertices at a time and then draws a small number of
triangles would not amortize the microcode swapping overhead very effectively.
The RCP is designed to draw high quality textured primitives.
Where possible, use texture-mapping to achieve visual
complexity rather than additional geometry.
Notes on Different Versions
gspFast3D
While the display list is being sent to the RDP, the RSP can execute other DRAM microcode tasks whose output_buff is different, or audio tasks. When gDPFullSync is not included in the display list being sent to the RDP, other RDP display lists can be sent (from RSP tasks that wrote to other buffers) using further osDpSetNextBuffer commands.
However, when gDPFullSync is included in the display list, do not send other RDP display lists using osDpSetNextBuffer, and do not start gspFast3D or gspFast3D.fifo tasks, until that display list has completed.
Another way to accomplish this is to use the gspF3DNoN microcode (or its DRAM or fifo version), which does not clip at the near plane. An object behind the viewer is clipped and an object beyond the near plane is visible as usual. However, an object between the near plane and the viewer is also visible. In this way, the near value can be increased without geometry between the viewpoint and the near plane disappearing.
The gspLine3D microcode supports 3D clipping, matrix stack, and Gouraud shading.
The .dram version controls output transfer of the RDP display list to the memory buffer (RDRAM) instead of the RDP.
All processing is accurate enough to create high quality images.
Each gSP1Triangle command draws the three edges of the triangle.
Please note that when two adjacent triangles are drawn, the shared
edge is drawn twice, which takes additional processing time.
Because gSP1Triangle is inefficient in this way, the line
microcode should be used only for debugging.
The default RDP state is the same as the default state described under gspFast3D.
Commands
Simple Code for Displaying a Sprite
Arguments
GBI
Note Regarding Z-Buffering
Warnings, Limitations, and Workarounds
The FIFO version transfers data to the RDP by using DRAM FIFO.
Because it is specific to this microcode, you must change the microcode when you change the structure.
This is the 'state' structure, which is linked to each object to be
rendered. It is specific to the microcode.
When you change this structure, you must also change the
gtoff.c tool and the microcode.
The following structure represents a single triangle, one element of the list of triangles to be rendered. The triangle list must be aligned to 8 bytes. This structure is only 4 bytes, so the triangle is assumed to be an element of an array, and the array is assumed to be aligned to an 8-byte boundary.
Because the calculation of triangle attributes is efficiently
vectorized, generating the Gouraud shading attributes costs
essentially nothing extra
when other attributes are also generated.
Cracks and tears sometimes appear because the calculation
of the edge slope is simplified.
gspFast3D
This microcode comprises the following six object files:
This is the optimized, high-quality, full-featured 3D polygonal geometry RSP microcode. It supports 3D clipping, lighting, texture coordinate generation, fog,
and matrix stack.
The gspFast3D, gspFast3D.dram, and gspFast3D.fifo versions of the
microcode are equivalent to the gspF3DNoN, gspF3DNoN.dram, and
gspF3DNoN.fifo versions (respectively), with the difference that
near clipping is performed at the Near Clipping plane in the former 3 versions, and performed at the eyepoint in the latter 3 versions.
GBI
The following GBI command is not supported by the microcode:
All gSPLine3D macros are compiled as no-ops, so they have no effect.
In order from the fastest to the slowest, the following types of
triangles can be generated by this microcode:
Triangle attribute computation is heavily vectorized, so generation
of Gouraud shading attributes is essentially free, if you are generating
any other attributes.
There are some differences when calling the DRAM and fifo versions of
this microcode.
The flags field of any task followed by this task should
have OS_TASK_DP_WAIT set. If more than one task using this microcode is called in the same frame, then only the last task should contain a gDPFullSync in its display list. This microcode takes care of sending all output to the RDP. When using this microcode, it is not necessary to specify output_buff or output_buff_size. (These fields of the task
header can be set to 0.)
gspFast3D.dram
Tasks using this microcode need to set the OS_TASK_DP_WAIT flag only if they follow a task using gspFast3D or gspFast3D.fifo. This microcode sends its data to a buffer in DRAM and not to the RDP. The CPU must then cause the buffer to be sent to the RDP. The buffer is pointed to by output_buff in the task header. This must point to a buffer which is at least as big as the maximum RDP display list that can be generated by the task. Remember that RDP lists expand when geometry gets clipped, so leave extra room. If the buffer is not large enough to store the entire RDP display list, other memory areas will be overwritten. After the RSP finishes its processing, the buffer can be sent to the RDP using the osDpSetNextBuffer command. The length of the data in the buffer, which is needed for osDpSetNextBuffer, is written at the address specified by rdp_output_len in the task header.
gspFast3D.fifo
A task that uses this microcode and is followed by a gspFast3D task or an osDpSetNextBuffer command needs to set the OS_TASK_DP_WAIT flag. This microcode monitors the transfer of the display list to the RDP. The buffer specified by output_buff in the task header is used. The buffer must be cache aligned. output_buff_size must be a pointer to the byte following the last byte of the buffer. The larger the buffer is, the more efficient the interface between the RSP and the RDP. When several tasks use fifo microcode, only the last task in a frame should include gDPFullSync. When several consecutive tasks use fifo microcode, they should all use the same output_buff buffer. (Each task can use a different buffer; however, it is more efficient to use one large buffer for all tasks.)
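The buffer requirements for the fifo version can be sketched in C as follows. This is only an illustration under stated assumptions: the 16-byte alignment value, the buffer size, and the helper names (fifo_start, fifo_end, fifo_check) are hypothetical and do not come from the SDK. The point it shows is that the value for output_buff_size is a pointer one byte past the end of the buffer, not a byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16 bytes is used here as the cache-line size. */
#define FIFO_ALIGN 16
/* Hypothetical size; the text only says that larger is more efficient. */
#define FIFO_BYTES (16 * 1024)

/* C11 alignment keeps the start of the buffer on a cache-line boundary. */
static _Alignas(FIFO_ALIGN) unsigned char fifo_buffer[FIFO_BYTES];

/* Start of the FIFO: the value intended for output_buff. */
void *fifo_start(void)
{
    return fifo_buffer;
}

/* One byte past the end of the FIFO: the value intended for
 * output_buff_size (a pointer, not a length). */
void *fifo_end(void)
{
    return fifo_buffer + sizeof fifo_buffer;
}

/* Debug-time check of the two properties described in the text. */
void fifo_check(void)
{
    unsigned char *start = fifo_start();
    unsigned char *end = fifo_end();

    assert(((uintptr_t)start % FIFO_ALIGN) == 0);
    assert((size_t)(end - start) == sizeof fifo_buffer);
}
```

Sharing one such buffer among all fifo tasks in a frame follows the recommendation above to prefer a single large buffer.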
Note for Near Clipping
Near clipping removes geometry that is either behind the viewer or between the viewer and the near clipping plane. In the real world, an object
never disappears as it gets closer to the viewpoint, so this should
not happen in an N64 program either. One way to achieve this is to place the
near plane very close to the viewpoint (by passing a small near value to guPerspective). However, this does not always work, because a smaller near/far ratio reduces the precision of Z and texture mapping.
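The loss of Z precision with a small near/far ratio can be illustrated with a short calculation. The sketch below assumes the conventional perspective depth mapping, in which the near plane maps to -1 and the far plane to +1; the exact mapping produced by guPerspective and the RSP may differ in detail. The function name ndc_depth is hypothetical.

```c
#include <assert.h>

/* Normalized depth in [-1, 1] for an eye-space distance z, assuming the
 * conventional perspective mapping (near plane -> -1, far plane -> +1).
 * This is an illustrative model, not the RSP's exact arithmetic. */
double ndc_depth(double z_near, double z_far, double z)
{
    assert(z_near > 0.0 && z_far > z_near);
    assert(z >= z_near && z <= z_far);

    return (z_far + z_near) / (z_far - z_near)
         - (2.0 * z_far * z_near) / ((z_far - z_near) * z);
}
```

Under this model, with near = 1 and far = 1000, a point at distance 500 already has a depth of about 0.998, so everything from 500 to 1000 shares roughly 0.1% of the depth range. With near = 100 the same point has a depth of about 0.78, which leaves far more resolution for distant geometry.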
Z-buffering never functions in the area between the viewpoint and the
near plane. As a result, objects between the near plane and the viewer never
hide each other. For example, in an asteroid-type game, when a single asteroid
comes closer to the viewpoint than the near plane, the asteroid is
drawn correctly (objects beyond the near plane are hidden by it). However,
when two asteroids come closer than the near plane, they cannot
hide each other correctly.
Default RDP State
Whenever a graphic task is first started, some of the RDP states are
initialized to their default states. The rest of the states keep their
previous values. After restarting from yield, RDP states are restored
with states set at yield. The following are RDP default settings:
gspLine3D
This is the optimized, high-quality, full-featured 3D line RSP microcode.
The calculation of line attributes is efficiently vectorized,
so the cost of generating Gouraud shading attributes
is negligible when other attributes are also generated.
gspSprite2D
This is the optimized, high-quality, full-featured 2D sprite geometry microcode. It supports automatic subdivision, loads images of any size in all of the texture types and sizes supported by the hardware, and sends them directly to the RDP. Additionally, images can be scaled up or inverted in the X or Y directions.
The sprite microcode is accessed through the following functions/macros:
Initializes a specified sprite structure, and allows the application to run without directly initializing the sprite structure.
gSPSprite2DBase initializes the common sprite parameters, then sends the structure to the microcode to begin actual processing. It does not perform actual screen drawing.
Used to specify the X/Y scaling and/or flipping parameters for a sprite. It does not perform actual screen drawing.
Specifies the screen coordinates where the sprite is to be drawn, and starts actual screen drawing using the parameters specified by gSPSprite2DBase and gSPSprite2DScaleFlip.
#include "gu.h"
#include "gbi.h"
uSprite MySprite;
/* Fill in the sprite structure. */
guSprite2DInit(&MySprite, ImagePointer,
TlutPointer, ImageWidth,
RectangleWidth, RectangleHeight,
ImageType, ImageSize,
TextureStartS, TextureStartT);
/* Send the sprite structure to the microcode. */
gSPSprite2DBase(glistp++,
OS_K0_TO_PHYSICAL(&MySprite));
/* Set the scaling and flipping parameters. */
gSPSprite2DScaleFlip(glistp++, ScaleX, ScaleY,
FlipTextureX, FlipTextureY);
/* Draw the sprite at the given screen position. */
gSPSprite2DDraw(glistp++, PScreenX, PScreenY);
typedef struct {
void *SourceImagePointer;
void *TlutPointer;
short Stride;
short SubImageWidth;
short SubImageHeight;
char SourceImageType;
char SourceImageBitSize;
short SourceImageOffsetS;
short SourceImageOffsetT;
/* 20 bytes for above */
/* padding to bring structure size to
64-bit alignment */
char dummy[4];
} uSprite_t;
typedef union {
uSprite_t s;
/* Ensure this is 64-bit aligned */
long long int force_structure_alignment[3];
} uSprite;
void guSprite2DInit(uSprite *SpritePointer,
void *SourceImagePointer,
void *TlutPointer,
int Stride,
int SubImageWidth,
int SubImageHeight,
int SourceImageType,
int SourceImageBitSize,
int SourceImageOffsetS,
int SourceImageOffsetT);
SpritePointer: The pointer to the sprite structure whose parameters are to be set.
SourceImagePointer: The base pointer of the texture image in memory containing the rectangle to be displayed.
TlutPointer: The pointer to the color index table used for CI images. Set it to NULL when CI images are not used.
Stride: The width, in texels, of the base image in memory.
SubImageWidth: The width, in texels, of the image to be displayed.
SubImageHeight: The height, in texels, of the image to be displayed.
SourceImageType: Specifies the format of the texture image in memory. All texture formats supported by the hardware are allowed, such as G_IM_FMT_RGB or G_IM_FMT_CI.
SourceImageBitSize: The number of bits per texel of the input image. All texture sizes supported by the hardware are allowed, such as G_IM_SIZ_32b or G_IM_SIZ_4b.
ScaleX: Specifies the scale in the X axis for the input screen image as an s5.10 fixed-point number. A value of 1024 specifies 1 to 1 scaling. A value of 512 enlarges each input texel to 2 output pixels.
ScaleY: Specifies the scale in the Y axis for the input screen image as an s5.10 fixed-point number. A value of 1024 specifies 1 to 1 scaling. A value of 512 enlarges each input texel to 2 output pixels. Scale values should be no greater than 1024 in order to avoid unnatural-looking results. Scale values must be positive. Use the FlipTextureY variable to create negatively scaled images.
FlipTextureX: Specifies whether the image to be displayed should be inverted in the X direction.
FlipTextureY: Specifies whether the image to be displayed should be inverted in the Y direction.
SourceImageOffsetS: The offset in texel columns from the origin of the base image. It specifies the starting point of the rectangular region for texel display within the base image.
SourceImageOffsetT: The offset in texel lines from the origin of the base image. It specifies the starting point of the rectangular region for texel display within the base image.
PScreenX: Specifies the X location in the screen coordinates of the output image. The origin is in the upper-left corner of the screen.
PScreenY: Specifies the Y location in the screen coordinates of the output image. The origin is in the upper-left corner of the screen.
The following GBI commands are not supported by this microcode:
The sprite microcode does not directly support Z-Buffering.
This is unnecessary as Z-Buffering can be accomplished outside of the sprite microcode by setting up the proper rendering mode and making use of the hardware primitive depth registers. Following is a code fragment that does Z-Buffering.
gDPSetRenderMode(glistp++,
G_RM_AA_ZB_OPA_SURF,
G_RM_AA_ZB_OPA_SURF2);
gDPSetDepthSource(glistp++,
G_ZS_PRIM);
gDPSetCombineMode(glistp++,
G_CC_DECALRGB, G_CC_DECALRGB);
gDPSetPrimDepth(glistp++,
ZBufferValue, 0);
guSprite2DInit(&MySprite, ImagePointer,
TlutPointer, ImageWidth,
RectangleWidth, RectangleHeight,
ImageType, ImageSize,
TextureStartS, TextureStartT);
gSPSprite2DBase(glistp++,
OS_K0_TO_PHYSICAL(&MySprite));
gSPSprite2DScaleFlip(glistp++,
ScaleX, ScaleY,
FlipTextureX, FlipTextureY);
gSPSprite2DDraw(glistp++, PScreenX, PScreenY);
Images that have been non-unit scaled and flipped around the Y axis may not
move smoothly in the vertical direction by sub-pixel amounts. A jump
occurs at certain sub-pixel amounts.
The workaround is to move non-unit scaled images in the vertical
direction only by whole-pixel amounts.
The Sprite Microcode was designed to be able to scale up images by
any amount. Images can also be scaled down, with some
attendant artifacts. Please note that, while the TextureScaleX and
TextureScaleY parameters are s5.10 fixed-point numbers, they are
restricted to positive values. Consequently, the largest usable scale
value is 32767, which corresponds to a texel-to-pixel ratio of 31.999.
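The s5.10 scale encoding described above can be sketched as a pair of conversion helpers. The function and macro names here are hypothetical and not part of the SDK; the sketch only assumes what the text states, namely that 1024 means 1:1, that values must be positive, and that 32767 is the largest usable value.

```c
#include <assert.h>

#define S5_10_ONE 1024   /* 1:1 scaling */
#define S5_10_MAX 32767  /* largest positive s5.10 value */

/* Convert a texel-to-pixel ratio (1.0 = unscaled, 0.5 = 2x enlargement)
 * to an s5.10 scale value, rounded to nearest and clamped to the
 * positive range 1..32767. */
short scale_to_s5_10(double texels_per_pixel)
{
    double fixed = texels_per_pixel * S5_10_ONE + 0.5;

    if (fixed < 1.0)
        return 1;
    if (fixed > (double)S5_10_MAX)
        return S5_10_MAX;
    return (short)fixed;
}

/* Convert an s5.10 scale value back to a texel-to-pixel ratio. */
double s5_10_to_ratio(short value)
{
    assert(value > 0);
    return (double)value / S5_10_ONE;
}
```

For example, scale_to_s5_10(0.5) yields 512, the value the text gives for a 2x enlargement, and a requested ratio of 32.0 is clamped to 32767.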
Texture images that are either scaled in the Y axis or placed on a
subpixel scanline boundary require filtering by the hardware texture
filter unit. This filtering requires that at least one extra line in
the screen image be loaded in the texture memory so that the filtering can occur.
The texture memory is limited to 4K bytes, so there are some restrictions:
gspTurbo3D
The gspTurbo3D microcode is a reduced-feature, reduced-precision microcode that delivers significantly faster performance.
All three subtypes (.o, .dram.o, and .fifo.o) are low accuracy, simplified 3D polygon geometry RSP microcodes that work effectively for characters and objects that are always displayed near the center of the view area. All processing is done with low accuracy to increase speed. However, this low degree of accuracy is reflected in the objects.
The DRAM version writes its output (RDP display list) into a memory buffer instead of transferring it to the RDP.
Features Not Supported by gspTurbo3D
Turbo Display List
Game programs must not send any geometric objects that require clipping to appear on screen. Scissoring is supported by using the DP command.
Calculation of dynamic lighting is not executed in this
microcode.
Perspective correction in textures is not done.
There is no matrix stack. A single matrix is a part of
the object state. It is used for vertex transformation.
Anti-aliasing is not practical with this microcode because it uses
low accuracy calculations. Anti-aliasing can be applied to
all edges, but it does not work well because of low
accuracy vertex positioning.
The gspTurbo3D microcode uses a different, simpler format for the display list. The simpler display list is not compatible with other microcodes.
The turbo display list is a linear list of object structures
that ends with a NULL object (an object whose object state pointer is NULL).
#include "gt.h"
typedef struct {
gtGlobState *gstatep; // global state, usually NULL
gtState *statep; // when NULL, object
// processing is finished
Vtx *vtxp; // when NULL, point in
// buffer is used
gtTriN *trip; // when NULL,
// nothing is drawn
} gtGfx_t;
typedef union {
gtGfx_t obj;
long long int force_structure_alignment;
} gtGfx;
Each object structure includes 4 pointers (global state, object
state, vertex list, and triangle list) for a total of 16 bytes.
When a global state pointer or vertex list pointer is NULL, the one
currently in DMEM is used. When the triangle list pointer is NULL, no triangles
are generated. When the object state pointer is NULL, the end of the display list is assumed.
Turbo Global State
Following is the turbo global state structure.
#include "gt.h"
typedef struct {
u16 perspNorm; // normalization of perspective
u16 pad0;
u32 flag;
Gfx rdpOthermode;
u32 segBases[16]; // segment base address
Vp viewport; // view-port
Gfx *rdpCmds; // RDP command block, ignored when NULL;
// block is ended by gDPEndDisplayList
} gtGlobState_t;
/* Note: Although there are 16 segment
* table entries, the first segment (segment 0)
* is reserved for physical memory mapping.
* Therefore, segment 0 cannot be used. */
typedef union {
gtGlobState_t sp;
long long int force_structure_alignment;
} gtGlobState;
The global state includes data that is unlikely to change
from object to object. The format of the global
state structure is exactly the same as in DMEM, and the
structure is simply copied to DMEM.
The perspNorm field is used while transforming a vertex (see gSPPerspNormalize).
The rdpOthermode field includes the DP command SetOtherMode
which is sent before sending any other DP commands.
The segBases array holds the 16 segment base addresses. Its entry 0 is reserved for physical memory mapping, so it cannot be used.
The viewport is used while transforming a vertex.
The rdpCmds field points to a DP command block. When this pointer
is not NULL, the macros in the DP command block are transferred to the RDP. The list of DP macros in the DP command block must end with the gDPEndDisplayList macro. Some DP macros (given later on this page) cannot be used in the DP command block.
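How a segment table such as segBases is typically used can be sketched as follows. This assumes the conventional N64 segmented-address layout (segment number in bits 24 to 27, offset in the low 24 bits); the helper names are hypothetical, and the microcode performs the equivalent lookup internally.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SEGMENTS 16

/* Pack a segment number and an offset into a segmented address
 * (assumed layout: segment in bits 24..27, offset in bits 0..23). */
uint32_t make_segment_addr(unsigned segment, uint32_t offset)
{
    assert(segment < NUM_SEGMENTS);
    assert(offset <= 0x00FFFFFFu);
    return ((uint32_t)segment << 24) | offset;
}

/* Resolve a segmented address against a table of segment bases. */
uint32_t resolve_segment_addr(const uint32_t seg_bases[NUM_SEGMENTS],
                              uint32_t seg_addr)
{
    unsigned segment = (seg_addr >> 24) & 0x0Fu;
    uint32_t offset = seg_addr & 0x00FFFFFFu;
    return seg_bases[segment] + offset;
}

/* Entry 0 stays 0 so that segment 0 maps straight to physical memory. */
uint32_t segment_demo(void)
{
    uint32_t bases[NUM_SEGMENTS] = {0};

    bases[2] = 0x00200000u; /* hypothetical base address for segment 2 */
    return resolve_segment_addr(bases, make_segment_addr(2, 0x100));
}
```

Leaving entry 0 at zero reflects the reservation of segment 0 for physical memory mapping described above.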
Turbo Object State
The turbo object state structure is shown below.
#include "gt.h"
typedef struct {
u32 renderState; // render state
u32 textureState; // texture state
u8 vtxCount; // number of vertices
u8 vtxV0; // vertex load address
u8 triCount; // number of triangles
u8 flag;
Gfx *rdpCmds;
Gfx rdpOthermode;
Mtx transform; // transformation matrix
} gtState_t;
typedef union {
gtState_t sp;
long long int force_structure_alignment;
} gtState;
// gtStateL: same as gtState,
// but without the matrix. (see flag)
// Apart from the matrix, this structure
// must match gtState.
typedef struct {
u32 renderState; // render state
u32 textureState; // texture state
u8 vtxCount; // number of vertices
u8 vtxV0; // vertex load address
u8 triCount; // number of triangles
u8 flag;
Gfx *rdpCmds; // pointer for RDP DL
// (segment address)
Gfx rdpOthermode;
} gtStateL_t;
typedef union {
gtStateL_t sp;
long long int force_structure_alignment;
} gtStateL;
The gtStateL version of the state structure can be used when a new matrix is not necessary. This is useful for large objects that
must be split across several turbo objects, because the same
transformation matrix can be used for all of the parts. You must set the GT_FLAG_NOMTX flag when using the gtStateL version of the state structure.
The renderState field is similar to geometry mode in gbi.h.
It uses the following flags which are bit OR'd together:
The textureState field has a texture tile number in the lower three bits of its field. All primitives in an object are drawn by using the same tile.
Perform Z-buffering
Perform texture mapping
Perform back-face culling
Perform smooth shading
Turbo Vertex
The vertex list is an array of vertex structures.
It uses the same vertex format as gbi.h.
Please see gSPVertex for details.
The vertex cache in the turbo microcode can hold 64 vertices. Vertices are transformed when they are loaded.
Turbo Triangle List
The triangle list is an array of the following structure.
#include "gt.h"
typedef struct {
u8 v0, v1, v2, flag; // flag for flat shading
} gtTriN;
This array must be aligned to an 8-byte boundary.
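One way to satisfy the alignment requirement is shown below. This is a sketch only: the structure is re-declared locally so the example is self-contained (real code would include gt.h), the C11 _Alignas specifier is an assumption about the toolchain, and the names quad_tris, quad_triangle_list, and tri_list_is_aligned are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned char u8;

/* Local copy of the structure shown above, for illustration only. */
typedef struct {
    u8 v0, v1, v2, flag; /* flag for flat shading */
} gtTriN;

/* Each element is 4 bytes, so the array itself carries the alignment. */
static_assert(sizeof(gtTriN) == 4, "gtTriN must be 4 bytes");

/* Two triangles forming a quad from vertices 0..3. The flag is left at 0
 * here; its meaning is defined by gt.h and is not assumed in this sketch. */
static _Alignas(8) gtTriN quad_tris[] = {
    {0, 1, 2, 0},
    {0, 2, 3, 0},
};

/* Returns nonzero when a triangle list starts on an 8-byte boundary. */
int tri_list_is_aligned(const gtTriN *tris)
{
    return ((uintptr_t)tris % 8) == 0;
}

const gtTriN *quad_triangle_list(void)
{
    return quad_tris;
}
```

Older compilers without C11 support would need a compiler-specific alignment attribute or a union with a 64-bit member, similar to the force_structure_alignment members used elsewhere on this page.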
GBI DL Command
The turbo microcode uses a completely different display list
format, so the GBI DL command is not supported.
However, the DP command blocks of the global and object states are
supported. These commands use the same format (and the same microcode)
as those in gbi.h. Some DP commands are not supported
because their DP state operations are not appropriate for the
interface between turbo geometry and turbo display list processing.
Unsupported DP GBI Macros
The turbo microcodes do not support the following DP GBI commands:
Most of these can be set by using the gtStateSetOthermode interface.
Performance
This microcode generates the following triangle types
in order of speed, beginning with the fastest:
Z-buffering a triangle requires a small amount of additional processing.
Because vertex transformation is efficiently vectorized, it
is best to operate on as many vertices as possible. Loading vertices in multiples of four is the most effective method.
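The vertex-loading guidance can be expressed as a few small helpers. This sketch uses only the two figures given on this page (a 64-entry vertex cache and groups of four); the helper names are hypothetical and nothing here reflects the microcode's internal behavior.

```c
#include <assert.h>

#define TURBO_VTX_CACHE_SIZE 64u /* vertex cache capacity, from the text */
#define TURBO_VTX_GROUP 4u       /* most effective load granularity */

/* Size of the next vertex load when `remaining` vertices are left. */
unsigned turbo_load_size(unsigned remaining)
{
    unsigned size = remaining < TURBO_VTX_CACHE_SIZE
                  ? remaining : TURBO_VTX_CACHE_SIZE;
    assert(size <= TURBO_VTX_CACHE_SIZE);
    return size;
}

/* Nonzero when a load fits the cache and is a multiple of four. */
int turbo_load_is_efficient(unsigned count)
{
    return count > 0 && count <= TURBO_VTX_CACHE_SIZE
        && count % TURBO_VTX_GROUP == 0;
}

/* Number of full-cache loads needed to cover `total` vertices. */
unsigned turbo_loads_needed(unsigned total)
{
    return (total + TURBO_VTX_CACHE_SIZE - 1) / TURBO_VTX_CACHE_SIZE;
}
```

A tool that builds turbo objects could use a check like turbo_load_is_efficient to flag vertex counts that leave a partial group.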
The RCP is designed to be able to draw high-quality texture
primitives. Texture mapping should be used where possible (instead of additional geometry) in order to achieve more complicated graphics.
Caution
This is the first release of this microcode. Its functions
and display list format will change in the future.
Copyright © 1999
Nintendo of America Inc. All Rights Reserved
Nintendo and N64 are registered trademarks of Nintendo
Last Updated January, 1999