Limitations Due to Shader Specifications

Vertex shaders have the following limitations due to hardware characteristics.

Starting and Stopping a Shader

Shader execution starts at the main label.
Every output register specified by #pragma output_map must be written in full (all of the x, y, z, and w components). Execution for a vertex ends when the end instruction is called, after which processing starts for the next vertex.
Operation is not correct if any register specified by #pragma output_map is left unwritten. The end instruction must be called explicitly at the end of the routine.

Because all required processing is assumed to be complete once every output register has been written, it is undefined whether instructions that follow the last write to an output register are executed.
In addition, abnormal operation can result if an instruction that follows the last write to an output register reads from or writes to a register. Do not place any instruction other than nop between the last instruction that writes to an output register and the end instruction.
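
The rules above can be sketched as a minimal shader skeleton. This is an illustration only, not a verified program: the argument syntax of #pragma output_map, the semantic name position, the output register name o0, the colon after each label, and the mov instruction do not appear in this section and are assumptions to check against the assembler reference.

```
#pragma output_map ( position, o0 )   // assumed syntax: o0 carries the vertex position

main:
    mov     r0, v0          // read an input register (mov is an assumed instruction)
    mul     r0, r0, c0      // work on a temporary register, not on o0
    mov     o0, r0          // last output write; covers x, y, z, and w of o0
    nop                     // optional; nop is the only instruction permitted here
    end                     // explicit end of processing for this vertex
endmain:
```

All arithmetic is done in the temporary register r0 so that nothing other than nop sits between the final output write and end.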

Each output register can be written only once. Operation is not guaranteed if the same output register is written more than once. The same applies to individual components: no component of an output register may be written more than once.
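
One way to respect the write-once rule is to accumulate a result in a temporary register and copy it to the output register a single time. The fragment below is a sketch under assumptions: the output register name o0 and the mov instruction are not defined in this section.

```
// Not guaranteed: o0 would be written twice.
//     add     o0, r1, r2
//     add     o0, r3, r4

// Accumulate in a temporary register, then write o0 once.
    add     r0, r1, r2
    add     r0, r0, r3
    mov     o0, r0          // the only write to o0 (mov is an assumed instruction)
```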

Abnormal operation can result if no input register is read during the processing of a vertex.
Always include at least one instruction that reads at least one component of an input register.

Number of Steps

The number of program steps is limited to 512.
The def, defi, defb, ret, else, endif, and endloop instructions are not counted as steps.
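
As a rough illustration of how steps are counted, the fragment below tallies only the instructions that are not in the exempt list above. The operand syntax shown for def and ifb, and the Boolean register name b0, are assumptions for the sake of the example.

```
    def     c0, 0.0, 0.5, 1.0, 2.0  // not counted (assumed operand syntax)
    ifb     b0                      // step 1 (assumed operand syntax)
        add     r0, r1, r2          // step 2
    else                            // not counted
        mul     r0, r1, r2          // step 3
    endif                           // not counted
// Total: 3 steps toward the 512-step limit.
```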

Number of Swizzling and Masking Patterns

There is a limit on the combined number of output component masking patterns, input component swizzling patterns, and input component signs that a shader can use.
This limit is 128. Of these 128 patterns, at most 32 can be used with the mad instruction.

Example 1

add     r0, r1.xy,  -r2.zw
add     r2, r0.x,    r3
mul     r3, r2.xy,  -r3.zw
add     r4, r2.xxxx, r5.xyzw
With this set of instructions, the first and third instructions use the same pattern, and the second and fourth instructions also share a pattern, so the combined number of patterns is two.

Example 2

add     r0, r1.xy,  -r2.zw
mul     r2.xy, r0.x,   c2.y
mad     r3,    r2.xy, -r3.zw, r1.w
cmp     0, 1, r1.x, c0.y
With this set of instructions, the first and third instructions are treated as having the same pattern because, apart from src2, the operand combination of the third instruction (mad) is the same as that of the first instruction (add). This pattern can also be used with a mad instruction.
The second and fourth instructions are treated as having the same pattern because the combination of src0 and src1 of the fourth instruction (cmp) is the same as the combination of src0 and src1 of the second instruction (mul).

Limitations on Control Instructions

Every instruction that starts a control block must be matched by the instruction that terminates it. Specifically, instructions must be called in the following pairs.

ifb (- else) - endif
ifc (- else) - endif
call - ret
callb - ret
callc - ret
loop - endloop
In these control blocks, it is illegal to jump to a location outside the control block using the jpc or jpb jump instruction.
The ret instruction cannot be called within an if block or a loop block.

Operations are undefined when nested call instructions are invoked immediately before a ret instruction.

Also, you cannot use a jpc or jpb jump instruction to jump from inside a block enclosed by the main and endmain labels to an outside location.
Furthermore, you cannot jump from inside a subroutine to outside that subroutine.
You cannot jump from inside a subroutine to a ret instruction. Operations are undefined if this type of control is used.
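
The pairing rules can be sketched as follows. The operand syntax for ifb, call, and loop, the register names b0 and i0, and the label scale_sub are hypothetical and serve only to show the block structure.

```
// Caller, inside the main block: ifb is closed by endif.
    ifb     b0
        call    scale_sub       // paired with the ret that ends scale_sub
    else
        add     r0, r1, r2
    endif

// Subroutine: loop is closed by endloop, and ret sits outside
// both the loop block and any if block.
scale_sub:
    loop    i0
        mul     r0, r0, c0
    endloop
    ret
```

No jpc or jpb instruction leaves a block here, and control reaches ret only by running off the end of the loop, never by a jump.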

Registers That Cannot be Used Simultaneously

In general, an assembler instruction that takes more than one src operand cannot specify a floating-point constant register for more than one of them, even if the same register is repeated.
Likewise, it cannot specify more than one input register.
Note, however, that several operands can be input registers as long as they have the same index.

add     r0, c0, c0  // Error
add     r0, c0, c1  // Error
add     r0, v0, v0  // No error
add     r0, v0, v1  // Error
When using macro instructions, error checking is performed after macro expansion.
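
When two constants are needed in one operation, one possible workaround is to copy one of them into a temporary register first. This is a sketch that assumes a mov instruction, which is not described in this section; note that the extra instruction counts toward the step limit.

```
//  add     r0, c0, c1      // Error: two floating-point constant registers
    mov     r1, c1          // copy one constant into a temporary (assumed instruction)
    add     r0, c0, r1      // only one constant register in this instruction
```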

Calculation Results of Exceptional Processing

Vertex shaders behave as follows regarding output of exceptional calculations.


The values infinity, negative infinity, and NaN can be used with the cmp instruction. For more information, see cmp - Compare.

Limitations on the Output of Illegal Data

Among the data that a vertex shader can output (data written to output registers), operation is not guaranteed if NaN (Not a Number) is output as a vertex coordinate.
Make sure that NaN is neither produced by vertex shader calculations nor passed in from the application as a vertex attribute or uniform value.

Revision History

2011/12/20
Initial version.
