Nintendo® Confidential

   

gspZ-Sort Microcode

Z-Sort microcode was developed to perform hidden-surface removal on the Nintendo 64 (N64) at the hardware level by using a Z-sort. The Z-Sort microcode builds the screen by sorting all the graphics to be displayed in order of their screen depth and then drawing them in order from back to front.

The N64 OS/Library supports hidden-surface processing using the Z-Buffer. This method judges whether or not a graphic is visible on a pixel-by-pixel basis. Compared with Z-Sort, it has the advantage of accurately expressing the front-to-back relationship between graphics. On the other hand, access to RAM increases. With Z-Sort, the front-to-back relationship cannot be resolved as precisely as with the Z-Buffer, but the amount of RAM access per graphic decreases. Thus, more graphics can be displayed on the screen within a given time than with the Z-Buffer method.

The advantage of Z-Sort is that the reduced demand on RAM bandwidth lightens the RDP processing load. In many applications, the time required for RDP processing is the bottleneck. Thus, a lighter RDP load is ideal when the volume of graphics is high.

One note of caution, however. The RSP processing load does not change significantly. The RDP processing load changes according to the size of the area to be filled. When drawing a small area in particular, RDP processing ends sooner than RSP processing. When there are many small drawing areas, the RDP ends up waiting for RSP processing to end, and in that situation performance is the same with Z-Sort as with the Z-Buffer. When the drawing area is somewhat larger, however, the Z-Sort method is effective. Z-Sort microcode cannot do everything. Carefully consider the screen to be drawn before using Z-Sort.

Drawing Flow Using Z-Sort
Z-Sort microcode supports triangle areas, quadrangle areas, and texture and fill rectangles using RDP commands. In the following discussion, all areas to be drawn by the RDP are called ZObjects.

In Z-Sort microcode, one screen depth value is found for each ZObject to represent its drawing area. The ZObjects are then sorted by that screen depth, and hidden-surface processing is achieved by drawing the ZObjects in order from the back to the front.

The processing flow for ZObject drawing is as follows:
  1. Multiply the model matrix by the perspective transformation matrix, and so on.
  2. Calculate the coordinate transformation, perspective transformation, and/or screen depth for model vertices.
  3. Determine whether the vertices are within the screen.
  4. Perform clipping and back-face determination.
  5. Construct the ZObject data.
  6. Create the ZObject list.
  7. Draw the ZObjects in order per the ZObject list (drawing processing).
In order to draw a ZObject, the information concerning how the ZObject will be drawn must be prepared as data. With conventional Fast3D microcode, the Vertex and Tri commands were combined to draw triangles, but with Z-Sort microcode, drawing is performed by creating ZObject structures.

Not all of these processes are available in Z-Sort microcode. The major difference between Z-Sort and other graphics microcodes is that Z-Sort microcode does not work by itself; the CPU must perform some of the processing related to drawing.

For example, sorting ZObjects in order of screen depth is not available as microcode. Because the RSP does not perform sorting, that job must be handed over to the CPU.
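
One possible CPU-side depth sort is sketched below. It is a minimal illustration only: the DepthEntry record and the function names are hypothetical and do not reflect the actual ZObject list format, and the convention that a larger depth value means farther from the viewer is an assumption.

```c
#include <stdlib.h>

/* Hypothetical CPU-side record; NOT the microcode's ZObject layout. */
typedef struct {
    int   id;     /* identifies the drawing area */
    float depth;  /* representative screen depth; larger = farther (assumed) */
} DepthEntry;

/* qsort comparator: farthest entry first, so that drawing in array
   order proceeds from back to front. */
static int farther_first(const void *a, const void *b)
{
    const DepthEntry *ea = (const DepthEntry *)a;
    const DepthEntry *eb = (const DepthEntry *)b;
    if (ea->depth > eb->depth) return -1;
    if (ea->depth < eb->depth) return 1;
    return 0;
}

/* Sort the entries in place into back-to-front drawing order. */
void sort_back_to_front(DepthEntry *entries, size_t count)
{
    qsort(entries, count, sizeof entries[0], farther_first);
}
```

A comparison sort is used here only because it is short. An application may prefer another ordering scheme if the sort itself becomes a CPU cost.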

At the very least, the CPU must perform the following processes:
  • Clipping and back-face determination
  • ZObject data construction
  • ZObject list creation
Z-Sort microcode currently does the following tasks. Each process is controlled by a Display List (DL) comprised of one or more GBI commands:
  • Multiplication of the model matrix by the perspective transformation matrix
  • Calculation of the coordinate transformation, perspective transformation, and/or screen depth for model vertices
  • Creation of flags to indicate whether or not vertices are in the screen
  • Drawing ZObjects in order according to the ZObject list (drawing processing)
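
The in-screen flags created for each vertex are the kind of information the CPU needs for its clipping determination. The following is a minimal sketch of that idea, not the microcode's actual flag format: the bit names and the functions screen_flags, triangle_offscreen, and triangle_needs_clip are all hypothetical.

```c
/* Hypothetical flag bits; the real flag format is not given in this section. */
#define OUT_LEFT   0x1u
#define OUT_RIGHT  0x2u
#define OUT_TOP    0x4u
#define OUT_BOTTOM 0x8u

/* Returns 0 when (x, y) lies inside a width x height screen,
   otherwise a mask of the screen edges the point lies beyond. */
unsigned screen_flags(int x, int y, int width, int height)
{
    unsigned flags = 0;
    if (x < 0)       flags |= OUT_LEFT;
    if (x >= width)  flags |= OUT_RIGHT;
    if (y < 0)       flags |= OUT_TOP;
    if (y >= height) flags |= OUT_BOTTOM;
    return flags;
}

/* A triangle is certainly invisible when all three vertices
   lie beyond the same screen edge. */
int triangle_offscreen(unsigned f0, unsigned f1, unsigned f2)
{
    return (f0 & f1 & f2) != 0;
}

/* Clipping is needed when the triangle is not rejected outright
   and at least one vertex is outside the screen. */
int triangle_needs_clip(unsigned f0, unsigned f1, unsigned f2)
{
    return !triangle_offscreen(f0, f1, f2) && (f0 | f1 | f2) != 0;
}
```
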
Naturally, matrix multiplication and coordinate transformation (here, called arithmetic operation processing) could also be performed by the CPU. Dividing these tasks between the CPU and the RSP according to available processor capacity is best. For the remainder of the explanation, however, it is assumed that the RSP will perform arithmetic operation processing. If the CPU is to perform operation processing, read about the arithmetic operation processing explained in the "Z-Sort Microcode" section of the N64 Programming Manual.

For detailed information about drawing and arithmetic operations as well as RSP processing implementation methods, please see the "Z-Sort Microcode" section of the N64 Programming Manual.

Two-Pass Parallel Processing
In graphics processing, the RDP processing time rarely matches the RSP processing time. The FIFO buffer exists to absorb this difference. When the RDP processing time exceeds the RSP processing time, the RDP commands that the RSP has finished processing are stored in the FIFO buffer. Because the FIFO buffer size is limited, if the wait is too long, the buffer becomes full.

In other microcodes (Fast3D, F3DEX, S2DEX), when the buffer is full, the RSP waits until space opens up in the FIFO buffer. Merely waiting for RDP processing needlessly consumes the calculation capacity of the RSP.

To eliminate this waste in Z-Sort microcode, the RSP can perform other DL processing (mainly, arithmetic operation processing) while waiting for RDP processing. This combines arithmetic operation processing and drawing processing into a single task for a pseudo-parallel processing called two-pass parallel processing.

In two-pass parallel processing, the DL processed within the RSP stand-by time is called the Sub Display List (Sub DL). Here, as in conventional microcodes, the normal DL is called the Main DL to distinguish it from the Sub DL. Just like the Main DL, the Sub DL has 18 dedicated DL stacks. Because the Sub DL is processed while the RSP is waiting for RDP processing, the GBI commands that can be processed by the Sub DL are limited. Naturally, commands using the RDP cannot be executed. Only commands using the RSP can be used. If GBI commands using the RDP are included in the Sub DL, a malfunction will result. Specific GBI commands that can be included in the Sub DL will be explained later. Mainly arithmetic operation commands can be used.
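
The scheduling idea can be modeled with a small simulation. This is a rough sketch under simplifying assumptions, not a description of the microcode internals: every Main DL command is assumed to take one tick and to emit exactly one RDP command, the RDP is assumed to retire one queued command every rdp_period ticks, and the names TwoPassResult and simulate_two_pass are hypothetical.

```c
typedef struct {
    int main_done;   /* Main DL commands processed */
    int sub_done;    /* Sub DL commands processed during RSP stand-by */
    int idle_ticks;  /* ticks in which the RSP could only wait */
} TwoPassResult;

/* rdp_period must be >= 1 and fifo_capacity must be >= 1. */
TwoPassResult simulate_two_pass(int main_cmds, int sub_cmds,
                                int fifo_capacity, int rdp_period)
{
    TwoPassResult r = {0, 0, 0};
    int fifo = 0;  /* RDP commands currently waiting in the FIFO */
    int tick = 0;

    while (r.main_done < main_cmds) {
        tick++;
        /* The RDP retires one queued command every rdp_period ticks. */
        if (fifo > 0 && tick % rdp_period == 0)
            fifo--;

        if (fifo < fifo_capacity) {
            fifo++;            /* room in the FIFO: run a Main DL command */
            r.main_done++;
        } else if (r.sub_done < sub_cmds) {
            r.sub_done++;      /* FIFO full: use the stand-by time for the Sub DL */
        } else {
            r.idle_ticks++;    /* FIFO full and no Sub DL work left */
        }
    }
    return r;
}
```

In this model, simulate_two_pass(6, 2, 2, 3) uses two stand-by ticks for the Sub DL and idles for four, whereas the same run with no Sub DL commands idles for six. With rdp_period set to 1 the FIFO never fills, so no Sub DL command is processed during stand-by.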

In actual processing, the RDP processing time usually is not longer than the RSP processing time, and if the RDP drawing area is small, the wasted RSP time mentioned above disappears. When this happens, the Sub DL cannot be processed until expressly called by the Main DL.

These specifications can be inconvenient in practice. The RDP drawing area varies depending on the scene to be drawn, so the RSP stand-by time in which the Sub DL can be processed is not constant. However, RSP arithmetic operation processing must end within a certain time to leave the CPU enough time to create ZObjects. This is why it is desirable to process the Sub DL even outside the RSP stand-by time.

For the above reasons, the microcode gspZ-Sort.pl.fifo.o (the Z-Sort.pl microcode) has been prepared. It starts the GBI commands in the Sub DL one at a time, each time a certain amount of ZObject processing is completed, even outside the RSP stand-by time. The timing for calling the Sub DL commands differs depending on the type of ZObject drawn. For polygon ZObjects, one Sub DL command is called for every two to four ZObjects.
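
The figure of one Sub DL command per two to four polygon ZObjects allows a rough budget to be estimated. The sketch below is plain arithmetic on that figure and assumes that only polygon ZObjects are drawn and that no Sub DL command is processed during RSP stand-by. The names ZObjectBudget and zobjects_needed_for_sub_dl are hypothetical.

```c
typedef struct {
    int min_zobjects;  /* one Sub DL command per 2 ZObjects */
    int max_zobjects;  /* one Sub DL command per 4 ZObjects */
} ZObjectBudget;

/* Estimate how many polygon ZObjects must be drawn before a Sub DL of
   sub_cmds commands has been fully started by the Z-Sort.pl microcode. */
ZObjectBudget zobjects_needed_for_sub_dl(int sub_cmds)
{
    ZObjectBudget b;
    b.min_zobjects = 2 * sub_cmds;
    b.max_zobjects = 4 * sub_cmds;
    return b;
}
```

For example, a Sub DL of 100 commands would need somewhere between 200 and 400 polygon ZObjects under these assumptions.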

In contrast to the Z-Sort.pl microcode, the Z-Sort.fifo microcode processes the Sub DL only during RSP stand-by. Because the Z-Sort.pl microcode performs the additional call processing, its overhead is larger than that of the Z-Sort.fifo microcode. Therefore, Z-Sort.fifo offers slightly better RDP drawing performance. The two types of microcode are identical except for how the Sub DL is called and the resulting overhead. Select the type that is best for your game's circumstances.

Drawable Objects (ZObject)
As explained earlier, graphics are drawn in drawing areas called ZObjects. The drawing parameters for each type of ZObject are defined below, according to the corresponding structure:
  • zShTri for triangles with smooth shading
  • zShQuad for quadrangles with smooth shading
  • zTxTri for triangles with textured smooth shading
  • zTxQuad for quadrangles with textured smooth shading
  • zNull for other drawing areas using RDP commands (used for Fill Rectangle and Texture Rectangle)
Unfortunately, due to size limitations, Z-Sort microcode does not provide ZObjects for drawing triangles and quadrangles with flat shading. To draw these, specify the same color for all vertices.
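
The flat-shading workaround can be sketched as follows. The DemoVertex and DemoShTri types are hypothetical stand-ins used only for illustration. The real zShTri layout is defined in the N64 Programming Manual and is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical per-vertex data; NOT the real zShTri layout. */
typedef struct {
    int16_t x, y;        /* screen position */
    uint8_t r, g, b, a;  /* vertex color */
} DemoVertex;

typedef struct {
    DemoVertex v[3];
} DemoShTri;

/* Assign one color to all three vertices so that smooth shading
   interpolates a constant color across the triangle. */
void make_flat(DemoShTri *tri, uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    for (int i = 0; i < 3; i++) {
        tri->v[i].r = r;
        tri->v[i].g = g;
        tri->v[i].b = b;
        tri->v[i].a = a;
    }
}
```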

Although the microcode supports only these simple types of graphics, every imaginable type of graphic can be drawn using the libraries in the CPU.

For details about the above-listed data structure formats, ZObject list processing, and Z-Sort processing, please see the "Z-Sort Microcode" section in your N64 Programming Manual.

Controlling RDP Commands with RDPcmd Parameters
Each ZObject structure has either one or three RDPCmd areas. The state of the RDP during ZObject drawing processing can be changed through these member variables.

To change the RDP status, use the dedicated DL that lists the GBI commands. This is called the RDP command string.

An RDP command string is mainly limited to commands that control the state of the RDP. In other words, the GBI commands that can be used in an RDP command string are limited. The RDP command string and the possible GBIs are shown below. The operation of the GBI commands below is the same as in the Fast3D-compatible microcode. GBI commands not listed below may not work correctly.

GBI Commands Usable in RDP Command Strings
Note that gSPSegment cannot be used. Although a segment address can be used for gDPSetColorImage and similar commands, the segment value cannot be set within the RDP command string.

Also note that gSPBranchList and gSPDisplayList cannot be used.

It is assumed that the three RDPCmd areas (rdpcmd1, rdpcmd2, and rdpcmd3) will be used as follows:
  • rdpcmd1 for setting RDP rendering mode
  • rdpcmd2 for loading to TMEM (mainly, loading to total TMEM/front half of TMEM)
  • rdpcmd3 for help in loading to TMEM (mainly, loading to TLUT/back half of TMEM)
Given this assumption, use only rdpcmd1 for drawing graphics without texture (zShTri, zShQuad). All three may be specified when drawing textured graphics (zTxTri, zTxQuad).

Z-Sort microcode is different from the microcode that uses the Z-buffer function in that Z-Sort draws objects in order from the back to the front. Thus, it cannot continuously draw only polygons with the same texture. Therefore, when using Z-Sort microcode, ZObjects must be provided with texture information. However, Z-Sort microcode is equipped with a mechanism for minimizing the waste that results when a texture that is already loaded to the TMEM is loaded again.

The microcode remembers the pointer to the most recently processed RDP command string. It compares this with the pointer to the RDP command string of the ZObject currently being processed, and sends the command string to the RDP only when the two pointers differ.
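
The pointer comparison can be sketched in C as shown below. This is an illustration of the idea only, not the algorithm from the Programming Manual: RdpCmdSlot and submit_rdpcmd are hypothetical names, and the sends counter merely stands in for actually transferring commands to the RDP.

```c
#include <stddef.h>

/* Hypothetical model of one saved-pointer slot. */
typedef struct {
    const void *saved;  /* pointer to the command string sent most recently */
    int sends;          /* number of times a command string was sent */
} RdpCmdSlot;

/* Send the command string only if its pointer differs from the saved one.
   Returns 1 if the string was sent, 0 if it was skipped. */
int submit_rdpcmd(RdpCmdSlot *slot, const void *cmd)
{
    if (cmd == slot->saved)
        return 0;        /* same string as last time: nothing to resend */
    slot->saved = cmd;
    slot->sends++;       /* stands in for sending the commands to the RDP */
    return 1;
}
```

Because only the pointers are compared, consecutive ZObjects benefit from this mechanism only when they refer to the same command string in memory. Two separate copies of identical commands would both be sent in this sketch.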

The microcode contains RDP command pointer memory areas for the three RDP commands rdpcmd1, rdpcmd2, and rdpcmd3 in DMEM (tentatively called rdpcmd1_save, rdpcmd2_save, and rdpcmd3_save). The algorithm for each process is written in C in the "Z-Sort Microcode" section of the N64 Programming Manual.

Clear Screen and Other Drawing Processing
One important note regarding the use of Z-Sort microcode is that direct RDP commands cannot be written to a normal Display List. This is because the microcode is internally divided into SP command processing and DP command processing. This division determines the number of microcode instructions and the processing speed.

Usually, background filling processes, such as Clear Screen, are necessary in addition to drawing the ZObjects. In Fast3D-compatible microcodes, such a GBI string is usually created in a static area and is called from the Display List side. In Z-Sort microcode, an RDP command string that controls such DP operations as screen clearing is called from the normal Display List by using a special GBI command, gSPZRdpCmd.

The GBI commands that can be used in the RDP command string are limited, as are which ones can be used during ZObject drawing. Refer to the previous list for the specific GBI commands usable in RDP command strings.

Display Objects and Arithmetic Operations
As explained previously, Z-Sort microcode can draw four types of polygons (zShTri, zShQuad, zTxTri, and zTxQuad). Though this initially appears to be a small number, many more shapes can be drawn by combining these basic four, just as you only need three primary colors to make millions of others.

The Z-Sort microcode offers the following three principal processing operations:
  • A - gSPZMultMPMtx to produce the screen coordinate vertex data
  • B - gSPZLight or gSPZLightMaterial to produce the color data
  • C - gSPZLight or gSPZLightMaterial to produce the texture coordinate (environment map) data
In all polygon ZObjects, operation A must be performed to find the screen coordinate vertex data. Operation B is required to process light, and operation C is required to process the environment map.

Each GBI used to perform operations A, B, and C, however, is insufficient by itself. The vertex data and transformation parameters (matrices, and so on) must be prepared and loaded into DMEM in the RSP before the GBI that performs the operation is executed. In addition, the operation results must be written back from DMEM to DRAM.

For more information about these operations, please see the "Z-Sort Microcode" area of the N64 Programming Manual.

Work Area for Operations in DMEM
Z-Sort microcode has specialized arithmetic operation GBIs that use the RSP to transform 3D models to the screen coordinate system, to calculate lighting, and to perform matrix operations.

By combining multiple operations, such values as coordinate and color values necessary to draw ZObjects to the screen can be obtained.

For example, the following GBI command steps transform model coordinates to screen coordinates:
  1. gSPZViewPort sets the viewport.
  2. gSPZPerspNormalize sets the perspective normalization factor.
  3. gSPZSetMtx loads the projection matrix to the work area in DMEM.
  4. gSPZSetMtx loads the modelview matrix to the work area in DMEM.
  5. gSPZMtxCat multiplies the projection and modelview matrices.
  6. gSPZSetUMem loads the model coordinate values inside DRAM to the work area in DMEM.
  7. gSPZMultMPMtx transforms the model coordinate values to the screen coordinate values.
  8. gSPZGetUMem outputs the screen coordinate values to DRAM.
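
The arithmetic behind steps 5 and 7 can be illustrated on the CPU side. The sketch below is not the microcode's implementation: it uses floating point, a column-vector convention (v' = M * v), and a simple viewport mapping with y growing downward, all of which are assumptions made for illustration. Perspective normalization is omitted, and the names Mat4, Vec3, mat4_mul, and to_screen are hypothetical.

```c
typedef struct { float m[4][4]; } Mat4;  /* assumed convention: v' = M * v */
typedef struct { float x, y, z; } Vec3;

/* Concatenate two matrices: out = a * b. */
Mat4 mat4_mul(const Mat4 *a, const Mat4 *b)
{
    Mat4 out;
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            float s = 0.0f;
            for (int k = 0; k < 4; k++)
                s += a->m[r][k] * b->m[k][c];
            out.m[r][c] = s;
        }
    }
    return out;
}

/* Transform a model vertex by the combined matrix, divide by w, and map
   the result to a vp_width x vp_height viewport.
   The caller must ensure the resulting w is not zero. */
Vec3 to_screen(const Mat4 *mp, Vec3 v, float vp_width, float vp_height)
{
    float c[4];
    for (int r = 0; r < 4; r++)
        c[r] = mp->m[r][0] * v.x + mp->m[r][1] * v.y
             + mp->m[r][2] * v.z + mp->m[r][3];

    Vec3 s;
    s.x = (c[0] / c[3] * 0.5f + 0.5f) * vp_width;
    s.y = (1.0f - (c[1] / c[3] * 0.5f + 0.5f)) * vp_height;
    s.z = c[2] / c[3];  /* depth after the perspective divide */
    return s;
}
```
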
In Z-Sort microcode, the work areas used in processing arithmetic operations are stored in DMEM. There are two types of work areas, one for general purpose use (2048 bytes) and one for matrices (192 bytes). The general purpose work area is called the "user area."

The user area occupies the address 0 to 2047. The application creator determines how this area is to be used. For more information, refer to the "Z-Sort Microcode" area of the N64 Programming Manual.
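
Because the layout of the user area is left to the application, it can help to plan the offsets in one place. The following is a minimal sketch of such a planner. UserAreaPlan and user_area_reserve are hypothetical names, and the 8-byte alignment of each block is an assumption of this sketch, not a stated requirement.

```c
#define USER_AREA_SIZE 2048  /* from the text: addresses 0 to 2047 */

typedef struct {
    int next;  /* first offset not yet reserved */
} UserAreaPlan;

/* Reserve size bytes in the user area.
   Returns the start offset, or -1 if the block does not fit.
   The 8-byte alignment is an assumption made for this sketch. */
int user_area_reserve(UserAreaPlan *plan, int size)
{
    int start = (plan->next + 7) & ~7;
    if (size <= 0 || start + size > USER_AREA_SIZE)
        return -1;
    plan->next = start + size;
    return start;
}
```

Starting from an empty plan, reserving 100 bytes and then 16 bytes yields offsets 0 and 104, since the second block is rounded up to the next multiple of 8.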

New Z-Sort GBI Macros

For more information on the following macros, please see the "Z-Sort Microcode" area of the N64 Programming Manual.

Z-Sort GBI Macro for Processing the Command List
  • gSPZRdpCmd - Processes the specified Z-Sort microcode RDP command string
Z-Sort GBI Macros for Arithmetic Operations
Z-Sort GBI Macros for Other Purposes
Compatibility with Other Microcodes
Z-Sort microcode is not compatible with the Fast3D-compatible microcodes. However, some GBIs are shared so that Z-Sort and the F3DEX series of microcodes can be switched by self-loading. This section explains the GBIs that are expected to be common to both microcodes.

The names of the GBIs explained here basically have the new prefix gSPZ instead of the corresponding prefix gSP of the GBI macro in F3DEX.

Z-Sort microcode GBIs include a subset of the F3DEX GBI Level 2. This F3DEX GBI Level 2 is a new and improved GBI set offering faster RSP processing speeds in F3DEX Microcode and will be adopted in the upcoming F3DEX Microcode release.

As a result, Level 2 is not compatible at the binary level with the GBIs adopted in F3DEX Microcode Version 1.23 or earlier. Thus, switching between Z-Sort microcode and those F3DEX microcodes by self-loading is difficult.

Because Z-Sort microcode uses F3DEX GBI Level 2, you must define F3DEX_GBI_2 when you use Z-Sort microcode, either with the #define directive or with the -D compile option. At the present time, F3DEX_GBI must also be defined.
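
A minimal sketch of the #define approach follows. The SDK include is shown only as a comment so that the fragment compiles on its own, and the helper zsort_gbi_macros_defined is a hypothetical check added for illustration.

```c
/* Both macros must be defined before the GBI definitions are included.
   The equivalent compile options would be: -DF3DEX_GBI -DF3DEX_GBI_2 */
#define F3DEX_GBI
#define F3DEX_GBI_2

/* #include <ultra64.h>   In a real build, the SDK header follows the defines. */

/* Hypothetical helper: returns 1 when this file was compiled with both
   macros defined, 0 otherwise. */
int zsort_gbi_macros_defined(void)
{
#if defined(F3DEX_GBI) && defined(F3DEX_GBI_2)
    return 1;
#else
    return 0;
#endif
}
```

Defining the macros on the compiler command line avoids having to repeat them in every source file that includes the GBI definitions.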

Common GBI Macros

CPU Support Library
In Z-Sort microcode, building plane data from the vertex data (ZObject data) on the screen is left to the CPU. Using arithmetic operation GBI commands, 3D coordinate vertices can be transformed into screen coordinate vertices. The CPU's role is to connect these vertices to build polygons. The CPU performs other processing as well. Therefore, the user must create a CPU library to perform this processing. For more information on how to do this, please see the "Z-Sort Microcode" area of the N64 Programming Manual.
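
Two small pieces of such a CPU library might look like the sketch below. It is an assumption-laden illustration, not the library described in the Programming Manual: ScreenVtx and the function names are hypothetical, the sign convention of the back-face test depends on the winding order and y direction chosen by the application, and averaging the vertex depths is only one possible way to pick a representative screen depth.

```c
/* Hypothetical screen-space vertex, as produced by the transform step. */
typedef struct { float x, y, z; } ScreenVtx;

/* Twice the signed area of the screen-space triangle a, b, c. */
float signed_area2(ScreenVtx a, ScreenVtx b, ScreenVtx c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

/* Back-face determination. Treating a non-positive area as back-facing
   is an assumption; the opposite convention is equally valid. */
int is_back_facing(ScreenVtx a, ScreenVtx b, ScreenVtx c)
{
    return signed_area2(a, b, c) <= 0.0f;
}

/* One representative depth for the whole triangle.
   Averaging the three vertex depths is an assumption of this sketch. */
float representative_depth(ScreenVtx a, ScreenVtx b, ScreenVtx c)
{
    return (a.z + b.z + c.z) / 3.0f;
}
```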

See Also
Introduction to N64 Microcode


Warning: all information in this document is confidential and covered by a non-disclosure agreement. You are responsible for keeping this information confidential and protected. Nintendo will vigorously enforce this responsibility.

Copyright © 1998
Nintendo of America Inc. All rights reserved
Nintendo and N64 are registered trademarks of Nintendo
Last updated January 1998