1<html>
2<head>
3<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
4<meta http-equiv="Content-Style-Type" content="text/css">
5<title>Hardware Specifications</title>
6<link rel="stylesheet" href="../css/manpage.css" type="text/css" />
7<link rel="stylesheet" href="../css/timetable.css" type="text/css" />
8</head>
9<body>
10  <a name="top"></a>
11
12  <h1>Hardware Specifications</h1>
13  <div class="section">
14    <p>
15      This section describes the vertex shader processor and the related hardware specifications.<br>
16    </p>
17  </div>
18
19  <h2><a name="vertex_processor">Vertex Shader Processor</a></h2>
20  <div class="section">
21    <p>
      The hardware specifications of the vertex shader processor are described below.<br>
23    </p>
24
25  <h3><a name="specialty">Features</a></h3>
26  <div class="section">
27  <p>
28    Below are the main features of the vertex shader processor.<br> <br>
29    <ul>
30      <li>Four vertex shader units built into the GPU (of which one is shared with the geometry shader processor).</li>
      <li>Operations use 24-bit floating-point numbers: 1 sign bit, 7 exponent bits, and 16 mantissa bits.</li>
32      <li>32-bit fixed length instructions.</li>
33      <li>Support for flow control instructions as well as instructions for indexing.</li>
34      <li>Support for masking of output components and rearranging (swizzling) of input components.</li>
35      <li>Cannot read or write any data other than register data.</li>
36    </ul>
37    <br>
38  </p>
39  </div>
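  <p>
    The 24-bit layout listed above can be illustrated with a minimal decoding sketch. The field order (sign, exponent, mantissa from the most significant bit), the exponent bias of 63, the implicit leading 1, and the treatment of a zero exponent field as zero are assumptions made for illustration; this document does not specify them. The function name <CODE>decode_float24</CODE> is hypothetical.
  </p>

```python
def decode_float24(bits: int) -> float:
    """Decode a 24-bit pattern with 1 sign, 7 exponent, and 16 mantissa bits.

    Assumptions (not stated in the specification): exponent bias of 63,
    implicit leading 1, exponent field 0 decodes to zero, and no special
    handling of infinities or NaN.
    """
    sign = (bits >> 23) & 0x1        # bit 23
    exponent = (bits >> 16) & 0x7F   # bits 22..16
    mantissa = bits & 0xFFFF         # bits 15..0

    if exponent == 0:
        value = 0.0
    else:
        value = (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - 63)
    return -value if sign else value


if __name__ == "__main__":
    print(decode_float24(0x3F0000))  # exponent field 63, mantissa 0
    print(decode_float24(0x408000))  # exponent field 64, mantissa 0x8000
```

  <p>
    With 16 mantissa bits, values carry roughly 4 to 5 significant decimal digits, which is worth keeping in mind when constants are prepared on the host side.
  </p>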
40
41  <h3><a name="stage">Stage Structure</a></h3>
42  <div class="section">
43  <p>
44    The vertex shader processor has the following stages.<br> <br>
45    <table class="members">
46      <thead>
47      <tr>
        <th>Stage name</th>
        <td>Symbol in timetable</td>
        <td>Description</td>
51      </tr>
52      </thead>
53      <tr>
54        <th>Prefetch</th>
55        <td style="text-align: center;">
56          <table class="timetable" style="width: 60px;">
57            <tr><td class="prefetch">p.fetch</td></tr>
58          </table>
59        </td>
60        <td>
          Prefetches instructions from program RAM into the instruction cache.<br>
62        </td>
63      </tr>
64      <tr>
65        <th>Fetch</th>
66        <td style="text-align: center;">
67          <table class="timetable" style="width: 60px;">
68            <tr><td class="fetch">fetch</td></tr>
69          </table>
70        </td>
71        <td>
          Fetches an instruction from the instruction cache.<br>
73        </td>
74      </tr>
75      <tr>
76        <th>Decode</th>
77        <td style="text-align: center;">
78          <table class="timetable" style="width: 60px;">
79            <tr><td class="decode">decode</td></tr>
80          </table>
81        </td>
82        <td>
          Decodes the fetched instruction.<br>
84        </td>
85      </tr>
86      <tr>
87        <th>Read</th>
88        <td style="text-align: center;">
89          <table class="timetable" style="width: 60px;">
90            <tr><td class="read">read</td></tr>
91          </table>
92        </td>
93        <td>
94          Reads data from register.<br> This stage may not exist for some instructions.<br>
95        </td>
96      </tr>
97      <tr>
98        <th>Execute</th>
99        <td style="text-align: center;">
100          <table class="timetable">
101            <tr>
102              <td class="flow" style="width: 60px;">ifb</td>
103              <td class="dummy"></td>
104              <td class="MOV" style="width: 60px;">mov</td>
105            </tr>
106            <tr><td class="dummy" style="height: 6px;" colspan="3"></td></tr>
107            <tr>
108              <td class="MUL" style="width: 60px;">MUL</td>
109              <td class="dummy"></td>
110              <td class="MAX" style="width: 60px;">MAX</td>
111            </tr>
112            <tr><td class="dummy" style="height: 6px;" colspan="3"></td></tr>
113          </table>
114          etc.
115        </td>
116        <td>
          Executes the instruction.<br> Performs tasks such as flow control, register copies, and computations in the arithmetic units.<br> Some instructions use multiple arithmetic units, so this stage can take two or more clock cycles.<br>
118        </td>
119      </tr>
120      <tr>
121        <th>Post-processing</th>
122        <td style="text-align: center;">
123          <table class="timetable" style="width: 60px;">
124            <tr><td class="post">post</td></tr>
125          </table>
126        </td>
127        <td>
128          Performs post-processing on the instruction.<br> This stage may not exist for some instructions.<br>
129        </td>
130      </tr>
131      <tr>
132        <th>Write-back</th>
133        <td style="text-align: center;">
134          <table class="timetable" style="width: 60px;">
135            <tr><td class="write">write</td></tr>
136          </table>
137        </td>
138        <td>
139          Writes the result of instruction execution to a register.<br>
140        </td>
141      </tr>
142    </table>
143    <br>
144    <p class="notice">
      The instruction timetables in the Assembler Reference show the stages from read through write-back.<br> Execution latency is less than the number of clock ticks shown in the timetable. Processing does not stall even when the write-back of one instruction takes place in the same clock cycle as the read stage of the next instruction.<br>
146    </p>
147    <br>
148  </p>
149  </div>
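  <p>
    The overlap between write-back and the next read stage can be pictured with a small scheduling sketch. This is a simplified model, not the hardware's actual scheduler: it assumes every instruction has a one-clock read stage and a one-clock write-back stage, that each instruction depends on the result of the previous one, and that a dependent read may share a clock with the preceding write-back. The function name <CODE>schedule</CODE> and the tuple layout are hypothetical.
  </p>

```python
def schedule(instructions):
    """Return (name, read_clock, write_clock) for a chain of dependent instructions.

    instructions: list of (name, execute_cycles, has_post) tuples.
    Simplifying assumptions: read and write-back take one clock each, and
    every instruction consumes the result of the one before it.
    """
    timeline = []
    read_clock = 0
    for name, execute_cycles, has_post in instructions:
        post_cycles = 1 if has_post else 0
        write_clock = read_clock + 1 + execute_cycles + post_cycles
        timeline.append((name, read_clock, write_clock))
        # The next read may take place in the same clock as this write-back.
        read_clock = write_clock
    return timeline


if __name__ == "__main__":
    for entry in schedule([("mul", 1, False), ("rcp", 2, False)]):
        print(entry)
```

  <p>
    In this model, a one-clock operation followed by a two-clock operation finishes its second write-back at clock 5, one clock earlier than it would if the dependent read had to wait for the clock after the write-back.
  </p>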
150
151  <h3><a name="arithmetic_unit">Arithmetic Unit Structure</a></h3>
152  <div class="section">
153  <p>
154    The vertex shader processor incorporates the following arithmetic units.<br> <br>
155    <table class="members">
156      <thead>
157      <tr>
158        <th>Unit name</th>
159        <td>Number installed</td>
160        <td>Clock cycles required for operation</td>
161        <td>Symbol in timetable</td>
162        <td>Description</td>
163      </tr>
164      </thead>
165      <tbody>
166      <tr>
167        <th>MUL</th>
168        <td style="text-align: center;">4</td>
169        <td style="text-align: center;">1</td>
170        <td style="text-align: center;">
171          <table class="timetable" style="width: 60px;">
172            <tr><td class="MUL">MUL</td></tr>
173          </table>
174        </td>
175        <td>Calculates the product of two values.</td>
176      </tr>
177      <tr>
178        <th>ADD</th>
179        <td style="text-align: center;">4</td>
180        <td style="text-align: center;">1</td>
181        <td style="text-align: center;">
182          <table class="timetable" style="width: 60px;">
183            <tr><td class="ADD">ADD</td></tr>
184          </table>
185        </td>
186        <td>Calculates the sum of two values.</td>
187      </tr>
188      <tr>
189        <th>RCP / RSQ</th>
190        <td style="text-align: center;">2</td>
191        <td style="text-align: center;">2</td>
192        <td style="text-align: center;">
193          <table class="timetable" style="width: 120px;">
194            <tr><td class="RCP">RCP / RSQ</td></tr>
195          </table>
196        </td>
        <td>Calculates the reciprocal or the reciprocal square root of a value.</td>
198      </tr>
199      <tr>
200        <th>FLOOR</th>
201        <td style="text-align: center;">4</td>
202        <td style="text-align: center;">1</td>
203        <td style="text-align: center;">
204          <table class="timetable" style="width: 60px;">
205            <tr><td class="FLOOR">FLOOR</td></tr>
206          </table>
207        </td>
208        <td>Calculates the greatest integer less than or equal to the specified value.</td>
209      </tr>
210      <tr>
211        <th>LOG</th>
212        <td style="text-align: center;">1</td>
213        <td style="text-align: center;">2</td>
214        <td style="text-align: center;">
215          <table class="timetable" style="width: 120px;">
216            <tr><td class="LOG">LOG</td></tr>
217          </table>
218        </td>
219        <td>Calculates a binary logarithm.</td>
220      </tr>
221      <tr>
222        <th>EXP</th>
223        <td style="text-align: center;">1</td>
224        <td style="text-align: center;">2</td>
225        <td style="text-align: center;">
226          <table class="timetable" style="width: 120px;">
227            <tr><td class="EXP">EXP</td></tr>
228          </table>
229        </td>
230        <td>Calculates a power of two.</td>
231      </tr>
232      <tr>
233        <th>MAX</th>
234        <td style="text-align: center;">4</td>
235        <td style="text-align: center;">1</td>
236        <td style="text-align: center;">
237          <table class="timetable" style="width: 60px;">
238            <tr><td class="MAX">MAX</td></tr>
239          </table>
240        </td>
241        <td>Selects the larger of two given values.</td>
242      </tr>
243      <tr>
244        <th>MIN</th>
245        <td style="text-align: center;">4</td>
246        <td style="text-align: center;">1</td>
247        <td style="text-align: center;">
248          <table class="timetable" style="width: 60px;">
249            <tr><td class="MIN">MIN</td></tr>
250          </table>
251        </td>
252        <td>Selects the smaller of two given values.</td>
253      </tr>
254      <tr>
255        <th>SGE</th>
256        <td style="text-align: center;">4</td>
257        <td style="text-align: center;">1</td>
258        <td style="text-align: center;">
259          <table class="timetable" style="width: 60px;">
260            <tr><td class="SGE">SGE</td></tr>
261          </table>
262        </td>
        <td>Compares two values to determine whether the first is greater than or equal to the second.</td>
264      </tr>
265      <tr>
266        <th>SLT</th>
267        <td style="text-align: center;">4</td>
268        <td style="text-align: center;">1</td>
269        <td style="text-align: center;">
270          <table class="timetable" style="width: 60px;">
271            <tr><td class="SLT">SLT</td></tr>
272          </table>
273        </td>
        <td>Compares two values to determine whether the first is less than the second.</td>
275      </tr>
276      <tr>
277        <th>CMP</th>
278        <td style="text-align: center;">2</td>
279        <td style="text-align: center;">2</td>
280        <td style="text-align: center;">
281          <table class="timetable" style="width: 120px;">
282            <tr><td class="CMP">CMP</td></tr>
283          </table>
284        </td>
285        <td>Compares two values.</td>
286      </tr>
287      </tbody>
288    </table>
289    <br> <br>
290  </p>
291  </div>
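  <p>
    For tooling such as a cycle estimator, the table above can be transcribed into a lookup structure. The sketch below copies the unit counts and clock cycles from the table; the names <CODE>ARITHMETIC_UNITS</CODE> and <CODE>multi_cycle_units</CODE> are hypothetical and exist only for this illustration.
  </p>

```python
# (number installed, clock cycles per operation), transcribed from the table.
ARITHMETIC_UNITS = {
    "MUL": (4, 1),
    "ADD": (4, 1),
    "RCP / RSQ": (2, 2),
    "FLOOR": (4, 1),
    "LOG": (1, 2),
    "EXP": (1, 2),
    "MAX": (4, 1),
    "MIN": (4, 1),
    "SGE": (4, 1),
    "SLT": (4, 1),
    "CMP": (2, 2),
}


def multi_cycle_units(units):
    """Return the names of units that need more than one clock per operation."""
    return sorted(name for name, (_count, cycles) in units.items() if cycles > 1)


if __name__ == "__main__":
    print(multi_cycle_units(ARITHMETIC_UNITS))
```

  <p>
    The units returned by this query are the ones whose symbols are drawn at double width in the timetable.
  </p>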
292
293  <h3><a name="register_set">Register Set</a></h3>
294  <div class="section">
295  <p>
296    The vertex shader processor incorporates the following register set.<br> <br>
297    <table class="members">
298      <thead>
299      <tr>
300        <th>Register type</th>
301        <td>No. of components</td>
302        <td>Number</td>
303        <td>R/W</td>
304        <td>Bit width</td>
305        <td>Description</td>
306      </tr>
307      </thead>
308      <tbody>
309      <tr>
310        <th>Input registers</th>
311        <td style="text-align: center;">4</td>
312        <td style="text-align: center;">16</td>
313        <td style="text-align: center;">R</td>
314        <td style="text-align: center;">24</td>
315        <td>
316          Floating-point registers storing vertex attribute data.<br>
317        </td>
318      </tr>
319      <tr>
320        <th>Temporary registers</th>
321        <td style="text-align: center;">4</td>
322        <td style="text-align: center;">16</td>
323        <td style="text-align: center;">RW</td>
324        <td style="text-align: center;">24</td>
325        <td>
326          Reusable floating-point registers for temporarily holding the results of calculations.<br> The content is maintained until overwritten. <br>
327        </td>
328      </tr>
329      <tr>
330        <th>Floating-point constant registers</th>
331        <td style="text-align: center;">4</td>
332        <td style="text-align: center;">96</td>
333<td style="text-align: center;"><CODE>R</CODE></td>
334        <td style="text-align: center;">24</td>
335        <td>
          Floating-point registers storing constants for operations.<br> The register index can be offset by the address register or the loop-counter register.<br>
337        </td>
338      </tr>
339      <tr>
340        <th>Address registers</th>
341        <td style="text-align: center;">2</td>
342        <td style="text-align: center;">1</td>
343        <td style="text-align: center;">RW</td>
344        <td style="text-align: center;">8</td>
345        <td>
346          Integer-type register for specifying the register index offset for the floating-point constant registers.<br>
347        </td>
348      </tr>
349      <tr>
        <th>Boolean registers</th>
351        <td style="text-align: center;">1</td>
352        <td style="text-align: center;">16</td>
353<td style="text-align: center;"><CODE>R</CODE></td>
354        <td style="text-align: center;">1</td>
355        <td>
356          Boolean registers used for conditional branching and jumping.<br> One of these registers (b15) is reserved for use by the geometry shader.<br>
357        </td>
358      </tr>
359      <tr>
360        <th>Integer registers</th>
361        <td style="text-align: center;">1</td>
362        <td style="text-align: center;">4</td>
363<td style="text-align: center;"><CODE>R</CODE></td>
364        <td style="text-align: center;">24</td>
365        <td>
366          Integer registers used for controlling loop instructions.<br>
367        </td>
368      </tr>
369      <tr>
370        <th>Loop-counter register</th>
371        <td style="text-align: center;">1</td>
372        <td style="text-align: center;">1</td>
373<td style="text-align: center;"><CODE>R</CODE></td>
374        <td style="text-align: center;">8</td>
375        <td>
376          Integer-type register that stores the counter value for loop instructions.<br> Can be used for specifying the register index offset for the floating-point constant registers.<br>
377        </td>
378      </tr>
379      <tr>
380        <th>Output registers</th>
381        <td style="text-align: center;">4</td>
382        <td style="text-align: center;">16</td>
383        <td style="text-align: center;">W</td>
384        <td style="text-align: center;">24</td>
385        <td>
386          Floating-point registers for storing the vertex attribute data at the completion of processing by the vertex shader processor.<br> The contents in these registers are output to later stages in the graphics pipeline (or to the geometry shader processor).<br>
387        </td>
388      </tr>
389      <tr>
390        <th>Status registers</th>
391        <td style="text-align: center;">1</td>
392        <td style="text-align: center;">2</td>
393        <td style="text-align: center;">RW</td>
394        <td style="text-align: center;">1</td>
395        <td>
          1-bit registers storing the results of comparison (CMP) operations.<br>
397        </td>
398      </tr>
399      </tbody>
400    </table>
401    <br> <br>
402  </p>
403  </div>
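  <p>
    As a rough sanity check on the table above, the storage per register type can be computed as components × number × bit width. The sketch below assumes that the bit width column applies to each component, which this document does not state explicitly; the names <CODE>REGISTER_SET</CODE> and <CODE>storage_bits</CODE> are hypothetical.
  </p>

```python
# (register type, components, number, bit width), transcribed from the table.
REGISTER_SET = [
    ("Input registers", 4, 16, 24),
    ("Temporary registers", 4, 16, 24),
    ("Floating-point constant registers", 4, 96, 24),
    ("Address registers", 2, 1, 8),
    ("Boolean registers", 1, 16, 1),
    ("Integer registers", 1, 4, 24),
    ("Loop-counter register", 1, 1, 8),
    ("Output registers", 4, 16, 24),
    ("Status registers", 1, 2, 1),
]


def storage_bits(register_set):
    """Map each register type to components * number * bit width.

    Assumes the bit width is per component (an interpretation of the table).
    """
    return {name: comps * count * width for name, comps, count, width in register_set}


if __name__ == "__main__":
    sizes = storage_bits(REGISTER_SET)
    for name, bits in sizes.items():
        print(f"{name}: {bits} bits")
    print("Total:", sum(sizes.values()), "bits")
```

  <p>
    Under that assumption the floating-point constant registers dominate, at 9216 of the 13962 bits in total.
  </p>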
404  </div>
405
406
407  <h2><a name="post_vertexcache">Post-Vertex Cache</a></h2>
408  <div class="section">
409  <p>
    The post-vertex cache stores the results of vertex data processing by the vertex shader processor.<br><br>When a vertex buffer and vertex indices are used to input vertex data, the result produced by the vertex shader processor is stored in the cache, and the cached result is output when the same vertex index is input again. Cache hits boost performance because processing by the vertex shader processor is skipped.<br><br> The cache holds 32 entries. On a cache miss, entries are evicted starting with the least recently referenced one.<br>Cache hits are more likely when 32 or fewer sets of vertex data are referenced repeatedly. Depending on how the vertex indices are arranged, rendering with <CODE>TRIANGLES</CODE> is in some cases more efficient than rendering with <CODE>TRIANGLE_STRIP</CODE>.<br> <br> <br>
411  </p>
412  </div>
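  <p>
    The effect of index ordering on the post-vertex cache can be estimated offline with a small simulation. The sketch below models a 32-entry cache with least-recently-referenced eviction, as described above; it is an approximation for comparing index orderings, not a cycle-accurate model of the hardware. The function name <CODE>count_cache_hits</CODE> is hypothetical.
  </p>

```python
from collections import OrderedDict


def count_cache_hits(indices, capacity=32):
    """Count hits for a stream of vertex indices in an LRU cache model."""
    cache = OrderedDict()  # keys ordered from least to most recently referenced
    hits = 0
    for index in indices:
        if index in cache:
            hits += 1
            cache.move_to_end(index)       # mark as most recently referenced
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently referenced
            cache[index] = True
    return hits


if __name__ == "__main__":
    quad = [0, 1, 2, 2, 1, 3]        # two triangles sharing an edge
    print(count_cache_hits(quad))
    fits = list(range(32)) * 2       # 32 vertices referenced twice
    print(count_cache_hits(fits))
    thrash = list(range(33)) * 2     # 33 vertices referenced twice
    print(count_cache_hits(thrash))
```

  <p>
    In this model, cycling through 33 vertices evicts each entry just before it is needed again, so the second pass produces no hits, whereas 32 vertices all hit on the second pass. This matches the guidance above to keep repeatedly referenced vertex sets at 32 or fewer.
  </p>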
413
414
415  <h2>Revision History</h2>
416  <div class="section">
417    <dl class="history">
418      <dt>2011/12/20</dt>
419      <dd>Initial version.<br />
420      </dd>
421    </dl>
422  </div>
423
424<hr><p>CONFIDENTIAL</p></body>
425</html>