<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="en-US" lang="en-US" xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta http-equiv="Content-Style-Type" content="text/css" />
    <link rel="stylesheet" href="../css/manpage.css" type="text/css" />
    <title>Limitations Due to Shader Specifications</title>
  </head>
  <body>
    <h1><a name="top">Limitations Due to Shader Specifications</a></h1>
    <div class="section">
      <p>
        Vertex shaders have the following limitations due to hardware characteristics.
      </p>
    </div>

    <h2><a name="shader_start">Starting and Stopping a Shader</a></h2>
    <div class="section">
      <p>
        Shader execution begins at the <code>main</code> label. The shader writes every output register specified by <code>#pragma output_map</code> (all four components: x, y, z, and w), ends by executing the <code>end</code> instruction, and then starts again for the next vertex.
      </p>
      <p>
        Operation is not correct if any register specified by <code>#pragma output_map</code> is left unwritten. The <code>end</code> instruction must be called explicitly at the end of the routine.
      </p>
      <p>
        All required processing is assumed to be complete once every output register has been written, so it is undefined whether instructions placed after the last write to an output register are executed. In addition, an instruction placed after that last write can cause abnormal operation if it reads from or writes to a register. Do not place any instruction other than <code>nop</code> between the last instruction that writes to an output register and the <code>end</code> instruction.
      </p>
      <p>
        Each output register can be written only once. Operation is not guaranteed if the same output register is written more than once. The same applies to each individual component.
      </p>
      <p>
        Abnormal operation can also result if no input register is read during the processing of a single vertex. Always include at least one instruction that reads one or more components of an input register.
      </p>
    </div>
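    <div class="section">
      <p>
        The output-register rules in this section can be checked mechanically. The following Python sketch is illustrative only and is not part of the shader toolchain. The function name <code>check_output_writes</code>, the register names <code>o0</code> and <code>o1</code>, and the representation of writes as (register, mask) pairs are assumptions made for this example. It also assumes that all four components of every declared output register must be written.
      </p>

```python
COMPONENTS = "xyzw"


def check_output_writes(declared, writes):
    """Report output-register problems.

    declared: names of the output registers (hypothetical names such as "o0").
    writes:   (register, mask) pairs in program order, e.g. ("o0", "xy").
    """
    written = {name: set() for name in declared}
    problems = []
    for register, mask in writes:
        if register not in written:
            problems.append(f"{register} is not a declared output")
            continue
        for component in mask:
            # Each component may be written only once per vertex.
            if component in written[register]:
                problems.append(f"{register}.{component} written more than once")
            written[register].add(component)
    for name in declared:
        # Assumption: every component of a declared output must be written.
        missing = [c for c in COMPONENTS if c not in written[name]]
        if missing:
            problems.append(f"{name}.{''.join(missing)} never written")
    return problems
```

      <p>
        An empty result means that every declared register was written exactly once per component. The sketch does not check the placement of the <code>end</code> instruction or whether an input register is read.
      </p>
    </div>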

    <h2><a name="step">Number of Steps</a></h2>
    <div class="section">
      <p>
        A program is limited to 512 steps.<br />
        The <code>def</code>, <code>defi</code>, <code>defb</code>, <code>ret</code>, <code>else</code>, <code>endif</code>, and <code>endloop</code> instructions are not counted as steps.
      </p>
    </div>
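    <div class="section">
      <p>
        As a rough illustration of the counting rule, the Python sketch below estimates the number of steps in a piece of assembler source. It is a simplification under stated assumptions: one instruction per line, <code>//</code> comments, labels that end with a colon, and directives that begin with <code>#</code>. The function name <code>count_steps</code> is hypothetical, and macro instructions (which may expand to several steps) are counted as one.
      </p>

```python
# Mnemonics that the text lists as not counted toward the step limit.
UNCOUNTED = {"def", "defi", "defb", "ret", "else", "endif", "endloop"}
STEP_LIMIT = 512


def count_steps(source):
    """Count the instruction lines that consume a program step."""
    steps = 0
    for line in source.splitlines():
        code = line.split("//", 1)[0].strip()  # drop trailing comments
        # Assumed syntax: skip blank lines, '#' directives, and 'name:' labels.
        if not code or code.startswith("#") or code.endswith(":"):
            continue
        mnemonic = code.split()[0].lower()
        if mnemonic not in UNCOUNTED:
            steps += 1
    return steps
```

      <p>
        Comparing the result with <code>STEP_LIMIT</code> gives an early warning only. The assembler's own count is authoritative.
      </p>
    </div>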


    <h2><a name="swizzling">Number of Swizzling and Masking Patterns</a></h2>
    <div class="section">
      <p>
        There is a limit on the number of patterns, where a pattern is a combination of output component mask, input component swizzles, and input component signs.<br />
        This limit is 128. Of these, at most 32 patterns can be used with the <code>mad</code> instruction.
      </p>
      <p>
        Example 1
      </p>
<pre class="definition">
add     r0, r1.xy,  -r2.zw
add     r2, r0.x,    r3
mul     r3, r2.xy,  -r3.zw
add     r4, r2.xxxx, r5.xyzw
</pre>
      <p>
        In this set of instructions, the first and third instructions use the same pattern, and the second and fourth instructions also share a pattern, so the combined number of patterns is two.
      </p>
      <p>
        Example 2
      </p>
<pre class="definition">
add     r0, r1.xy,  -r2.zw
mul     r2.xy, r0.x,   c2.y
mad     r3,    r2.xy, -r3.zw, r1.w
cmp     0, 1, r1.x, c0.y
</pre>
      <p>
        In this set of instructions, the first and third instructions are treated as having the same pattern because, except for src2, the combination of operands of the third instruction (<code>mad</code>) is the same as that of the first instruction (<code>add</code>). This pattern can also be used with a <code>mad</code> instruction.<br />
        The second and fourth instructions are treated as having the same pattern because the combination of src0 and src1 of the fourth instruction (<code>cmp</code>) is the same as the combination of src0 and src1 of the second instruction (<code>mul</code>).
      </p>
    </div>
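    <div class="section">
      <p>
        The counting in Example 1 can be approximated with a short Python sketch. It builds a key from the destination mask and from the sign and swizzle of each source operand, then counts the distinct keys. The function names are hypothetical, and the normalization rule (a short swizzle is padded by repeating its last component) is an assumption that is consistent with Example 1 but not confirmed by it for every case. The sketch does not model the merging of patterns that differ only in unused operand slots, as in Example 2, so it over-counts such cases and should be read as a rough upper bound.
      </p>

```python
def normalize_swizzle(swizzle):
    """Expand a swizzle to four components (assumed rule: repeat the last one)."""
    if not swizzle:
        return "xyzw"
    return swizzle + swizzle[-1] * (4 - len(swizzle))


def parse_operand(text):
    """Split an operand such as '-r2.zw' into (negated, register, swizzle)."""
    text = text.strip()
    negated = text.startswith("-")
    register, _, swizzle = text.lstrip("-").partition(".")
    return negated, register, swizzle


def pattern_key(line):
    """Build a hashable key from the dest mask and each src sign and swizzle."""
    _mnemonic, _, rest = line.strip().partition(" ")
    dest, *sources = rest.split(",")
    _, _, mask = parse_operand(dest)
    key = [mask or "xyzw"]  # an omitted mask writes all four components
    for source in sources:
        negated, _, swizzle = parse_operand(source)
        key.append((negated, normalize_swizzle(swizzle)))
    return tuple(key)


def count_patterns(lines):
    """Count distinct keys; the mnemonic itself is not part of the key."""
    return len({pattern_key(line) for line in lines})
```

      <p>
        Applied to the four instructions of Example 1, <code>count_patterns</code> returns 2, which matches the count given in the text.
      </p>
    </div>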

    <h2><a name="flowcontrol">Limitations on Control Instructions</a></h2>
    <div class="section">
      <p>
        Every instruction that starts a control block must be matched by an instruction that ends it. Specifically, instructions must be used in the following pairs.
      </p>
<pre class="definition">
ifb (- else) - endif
ifc (- else) - endif
call - ret
callb - ret
callc - ret
loop - endloop
</pre>
      <p>
        Within these control blocks, it is illegal to jump to a location outside the block with the <code>jpc</code> or <code>jpb</code> jump instruction.<br />
        The <code>ret</code> instruction cannot be called within an <code>if</code> block or a <code>loop</code> block.
      </p>
      <p>
        Operation is undefined when nested <code>call</code> instructions are invoked immediately before a <code>ret</code> instruction.
      </p>
      <p>
        Also, you cannot use a <code>jpc</code> or <code>jpb</code> jump instruction to jump from inside the block enclosed by the <code>main</code> and <code>endmain</code> labels to an outside location. Furthermore, you cannot jump from inside a subroutine to outside that subroutine, and you cannot jump from inside a subroutine to a <code>ret</code> instruction. Operation is undefined if this type of control is used.
      </p>
    </div>
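    <div class="section">
      <p>
        The pairing rules for <code>if</code> and <code>loop</code> blocks lend themselves to a stack-based check. The Python sketch below is a minimal illustration that works on a plain list of mnemonics. The function name <code>check_blocks</code> and the message texts are hypothetical. It covers only block pairing and the rule that <code>ret</code> cannot appear inside an <code>if</code> or <code>loop</code> block. It does not follow <code>call</code> targets or analyze jump destinations.
      </p>

```python
# Each block opener and the mnemonic that must close it.
OPENERS = {"ifb": "endif", "ifc": "endif", "loop": "endloop"}
CLOSERS = {"endif", "endloop"}


def check_blocks(mnemonics):
    """Return a list of problems found in a sequence of instruction mnemonics."""
    problems = []
    stack = []  # closers expected for the blocks that are still open
    for index, op in enumerate(mnemonics):
        if op in OPENERS:
            stack.append(OPENERS[op])
        elif op == "else":
            if not stack or stack[-1] != "endif":
                problems.append(f"{index}: else outside an if block")
        elif op in CLOSERS:
            if not stack or stack[-1] != op:
                problems.append(f"{index}: unexpected {op}")
            else:
                stack.pop()
        elif op == "ret" and stack:
            problems.append(f"{index}: ret inside an if or loop block")
    # Anything left on the stack was opened but never closed.
    problems.extend(f"missing {closer}" for closer in reversed(stack))
    return problems
```

      <p>
        An empty list means that the blocks are balanced and no <code>ret</code> was found inside an open block.
      </p>
    </div>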

    <h2><a name="register">Registers That Cannot Be Used Simultaneously</a></h2>
    <div class="section">
      <p>
        In general, an assembler instruction that takes more than one src operand cannot specify more than one floating-point constant register.<br />
        It also cannot specify more than one input register.<br />
        Note, however, that you can specify several input registers as long as they have the same index.
      </p>

<pre class="definition">
add     r0, c0, c0  // Error
add     r0, c0, c1  // Error
add     r0, v0, v0  // No error
add     r0, v0, v1  // Error
</pre>
      <p>
        When macro instructions are used, error checking is performed after macro expansion.
      </p>
    </div>
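    <div class="section">
      <p>
        The four cases listed in this section can be reproduced with a small Python sketch. It assumes that floating-point constant registers are named <code>cN</code> and input registers are named <code>vN</code>, as in the examples. The function name <code>check_sources</code> and the message texts are hypothetical. Other register types and macro expansion are not modeled.
      </p>

```python
import re


def check_sources(sources):
    """Return an error message for a list of src operands, or None if acceptable."""
    # Strip any sign and swizzle so that only the register name remains.
    names = [s.strip().lstrip("-").split(".")[0] for s in sources]
    constants = [n for n in names if re.fullmatch(r"c\d+", n)]
    inputs = {n for n in names if re.fullmatch(r"v\d+", n)}
    # Two constant registers are rejected even when the index is the same.
    if len(constants) > 1:
        return "more than one floating-point constant register"
    # Input registers are accepted only when they all share one index.
    if len(inputs) > 1:
        return "input registers with different indices"
    return None
```

      <p>
        For example, <code>check_sources(["v0", "v0"])</code> returns <code>None</code>, while the other three operand pairs from the listing each return a message.
      </p>
    </div>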

    <h2><a name="exceptional_result">Calculation Results of Exceptional Processing</a></h2>
    <div class="section">
      <p>
        Vertex shaders produce the following output for exceptional calculations.
      </p>
      <ul>
        <li><code>NaN</code> is output for the logarithm of -0 or a negative value.</li>
        <li><code>NaN</code> is output for the square root of -0 or a negative value.</li>
        <li><code>NaN</code> is output for calculations that use <code>NaN</code> as an input value (excluding the <code>cmp</code> instruction).</li>
        <li>Negative infinity is output when infinity is subtracted from a given numeric value.</li>
        <li>Negative infinity is output for the logarithm of 0 or a non-normalized number. (The term &quot;non-normalized number&quot; indicates a number with an exponent of 0 and a mantissa other than 0.)</li>
        <li>Infinity is output if an overflow occurs.</li>
        <li>Negative infinity is output if an underflow occurs.</li>
        <li>Infinity or negative infinity is output upon division by +0 or -0.</li>
        <li><code>NaN</code> is sometimes output if the result of a calculation made using the <code>rcp</code>, <code>rsq</code>, <code>exp</code>, or <code>log</code> instruction is infinity or negative infinity.</li>
      </ul>
      <p>
        The values infinity, negative infinity, and <code>NaN</code> can be used with the <code>cmp</code> instruction. For more information, see <a href="../flowcntl/cmp.html">cmp - Compare</a>.
      </p>
    </div>

    <h2><a name="false_output">Limitations on the Output of Illegal Data</a></h2>
    <div class="section">
      <p>
        Among the data that a vertex shader can output (data written to an output register), operation is not guaranteed if <code>NaN</code> (Not a Number) is output as a vertex coordinate.<br />
        Make sure that <code>NaN</code> is neither produced by vertex shader calculations nor input from the application as a vertex attribute or uniform value.
      </p>
    </div>
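    <div class="section">
      <p>
        On the application side, one way to guard against this is to scan attribute and uniform data for <code>NaN</code> before handing it to the graphics library. The Python sketch below is a generic illustration that is not tied to any particular API. The function name <code>nan_positions</code> and the use of a flat sequence of floats are assumptions made for this example.
      </p>

```python
import math


def nan_positions(values):
    """Return the indices of NaN entries in a flat sequence of floats."""
    return [index for index, value in enumerate(values) if math.isnan(value)]
```

      <p>
        A non-empty result identifies the entries to fix before the data reaches the shader. Infinity is not reported, because this section restricts only <code>NaN</code>.
      </p>
    </div>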


  <h2>Revision History</h2>
  <div class="section">
    <dl class="history">
      <dt>2011/12/20</dt>
      <dd>Initial version.<br />
      </dd>
    </dl>
  </div>

  <hr /><p>CONFIDENTIAL</p></body>
</html>