Added and Revised Items in 2.0.0

CPU/GPU Optimization

Reading the Optimization Document
Enabling Optimization Options
Loading Vertex Data and Textures in VRAM
Duplicating Command Lists
Avoiding Calls to glClear
Sharing the Same Shaders

CPU Optimization

Calculating Material IDs
Customizing Sort Algorithms for the Render Queue
Optimizing Render Queue Key Creation
Sharing Skeletons
Sharing Materials
Executing Traverses and Initializing Only when the Scene Tree Structure Changes
Partially Omitting Traverses and Updates
Reusing Command Lists when Displaying in 3D
Disabling Features Not Required by the Build Options
Optimizing Resource Setup (VRAM Transfers)
Speeding Up the Creation of Scene Objects
Using Frame-Format Animation Data
Using Fully-Baked Format Animation Data
Reusing AnimEvaluator when Switching Animations
Using Animation Caches Appropriately
Deleting Unnecessary Animation Members
Automatically Deleting Unnecessary Animation Members
Disabling AnimBinding for Unanimated Nodes
Deleting Alpha Value Animations Unused by Standard Features

Vertex Process Optimization

Selecting a Shader Program According to the Number of Textures
Turning Off Unnecessary Lights
Baking Lights to Vertex Colors
Using User Shaders

Reading the Optimization Document

Read the optimization document included with the SDK.
Note that these optimization tips are not contained in the SDK documentation.

Enabling Optimization Options

Enable the optimization options of the export plug-in.
In some cases, this may have a significant effect on performance.

Loading Vertex Data and Textures in VRAM

Load vertex data and textures in VRAM.
As shown below, transfer reservations can be made when initializing resources by configuring memory locations at the graphics file level.

ResGraphicsFile::ForeachTexture(nw::gfx::TextureLocationFlagSetter(NN_GX_MEM_VRAMA | GL_NO_COPY_FCRAM_DMP));
ResGraphicsFile::ForeachIndexStream(nw::gfx::IndexStreamLocationFlagSetter(NN_GX_MEM_VRAMB | GL_NO_COPY_FCRAM_DMP));
ResGraphicsFile::ForeachVertexStream(nw::gfx::VertexStreamLocationFlagSetter(NN_GX_MEM_VRAMB | GL_NO_COPY_FCRAM_DMP));

Duplicating Command Lists

Duplicate command lists to allow for simultaneous command generation and execution. However, you might get display tearing if buffers are not swapped correctly, as transfers from the render buffer to the display buffer are also accumulated as commands.
In the demo library, the CommandListSwapper class duplicates command lists.
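
As a rough illustration of the idea, the following sketch keeps two command lists and alternates between them each frame. It is a minimal sketch, not the CommandListSwapper implementation: the CommandList alias and the DoubleBufferedCommands class are hypothetical names, and the commands are plain strings standing in for GPU commands.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU command list; not an SDK type.
using CommandList = std::vector<std::string>;

// Keeps two command lists: one being filled by the CPU while the other
// is (conceptually) being executed by the GPU.
class DoubleBufferedCommands {
public:
    // List the CPU records into during the current frame.
    CommandList& Recording() { return lists_[recording_]; }

    // List recorded during the previous frame, ready for execution.
    const CommandList& Executing() const { return lists_[1 - recording_]; }

    // Call once per frame, after execution of the other list has finished.
    void Swap() {
        recording_ = 1 - recording_;
        lists_[recording_].clear();  // Reuse the storage for new commands.
    }

private:
    std::array<CommandList, 2> lists_;
    std::size_t recording_ = 0;
};
```

Because the buffer transfer is itself a queued command, a real implementation must call the equivalent of Swap only after the executing list has completed; swapping earlier is what leads to tearing.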

Avoiding Calls to `glClear`

The glClear function generates CPU interrupts even when the RenderBuffer is located in VRAM, which affects calculations and other processing.
If possible, render the most distant on-screen items first in order to clear the screen without calling glClear.
You can do this in NW4C by configuring the mesh materials of the most distant display items as follows.

Depth test: Always pass (Always)
Color depth buffer: Update
Use layer IDs and render priorities to render the most distant model first.

Also set LayerId for SubmitView to prioritize rendering of the celestial sphere, as done in the gfx demo. Whichever approach you take, the most distant model must lie between the near and far clipping planes.

Sharing Skeletons

When creating characters that share motions, share skeletons between the characters to reduce the number of calculations required.
For details on sharing skeletons, see Sharing Skeletons.

Sharing Materials

When creating models that use the same materials, share materials between the models to omit redundant material setup.
To share a material, configure the source model to be shared in a SharedMaterialModel object. When destroying objects that use shared materials, destroy all models that use the shared materials before destroying the shared model itself.

Executing Traverses and Initializing Only when the Scene Tree Structure Changes

Execute a traverse or initialization only when the tree structure has changed, such as when a child has been added to or removed from a SceneNode. To do so, pass a SceneTraverser or SceneInitializer, respectively, to SceneNode::Accept. You do not need to do this again when the Visible property changes.
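
The pattern can be sketched with a dirty flag that is set only by structural edits. This is a minimal sketch under stated assumptions: Node, Scene, and TraverseIfNeeded are hypothetical names for illustration and do not correspond to SDK classes or functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scene node; not an SDK type.
struct Node {
    bool visible = true;
    std::vector<Node*> children;
};

class Scene {
public:
    explicit Scene(Node* root) : root_(root) {}

    // Structural edits mark the cached traversal as stale.
    void AttachChild(Node* parent, Node* child) {
        parent->children.push_back(child);
        structureDirty_ = true;
    }

    // Visibility changes do not alter the tree, so no re-traverse is needed.
    void SetVisible(Node* node, bool visible) { node->visible = visible; }

    // Re-traverses only if the structure changed; returns true if it did.
    bool TraverseIfNeeded() {
        if (!structureDirty_) return false;
        flat_.clear();
        Collect(root_);
        structureDirty_ = false;
        return true;
    }

    std::size_t NodeCount() const { return flat_.size(); }

private:
    void Collect(Node* node) {
        flat_.push_back(node);
        for (Node* child : node->children) Collect(child);
    }

    Node* root_;
    std::vector<Node*> flat_;
    bool structureDirty_ = true;  // Force the first traverse.
};
```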

Partially Omitting Traverses and Updates

To skip SceneUpdater::UpdateAll or traverses for particular nodes, prepare multiple instances of the SceneContext class. For example, place the nodes you want to update and the nodes you do not in separate branches, traverse only some of the branches, and collect the traverse results in a separate SceneContext for each. Alternatively, split the nodes into separate scene trees and traverse only some of the trees. Skipping traverses and updates for SceneContext instances that collect nodes whose status does not change reduces the processing load. Use functions such as SceneUpdater::SubmitView or RenderQueue::EnqueueModel to place multiple instances of SceneContext in a single RenderQueue.
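
The following sketch shows the general shape of this approach: a static branch is traversed once, a dynamic branch is traversed every frame, and both results are merged into one queue. Node, Context, Traverse, and BuildQueue are hypothetical names for illustration only; Context merely plays the role the text gives to SceneContext.

```cpp
#include <vector>

// Hypothetical node; not an SDK type.
struct Node { int id; };

// Hypothetical per-branch traversal result.
struct Context {
    std::vector<int> visibleIds;
    int traverseCount = 0;
};

// Collects the nodes of one branch into its own context.
inline void Traverse(const std::vector<Node>& branch, Context& context) {
    context.visibleIds.clear();
    for (const Node& node : branch) context.visibleIds.push_back(node.id);
    ++context.traverseCount;
}

// Merges the results of several contexts into a single queue.
inline std::vector<int> BuildQueue(const std::vector<const Context*>& contexts) {
    std::vector<int> queue;
    for (const Context* context : contexts) {
        queue.insert(queue.end(), context->visibleIds.begin(),
                     context->visibleIds.end());
    }
    return queue;
}
```

In a frame loop, only the dynamic context would be passed to Traverse again, while the static context is reused unchanged when building the queue.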

Calculating Material IDs

The default sorting algorithm sorts materials by material ID. Use an IMaterialIdGenerator to calculate material IDs. Give materials with the same or similar settings IDs that are close to each other in order to reduce the CPU load during rendering. To calculate material IDs, pass an IMaterialIdGenerator instance to the SceneInitializer, and then call, in order, the SceneInitializer::Begin, SceneNode::Accept, and SceneInitializer::End functions. You can customize the material ID generation algorithm by inheriting from the IMaterialIdGenerator class. Refer to SortingMaterialIdGenerator when implementing your customized algorithm.
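
One way to obtain neighboring IDs for similar materials is to sort the materials by their settings and number them in that order. The sketch below assumes this approach; it is not the SortingMaterialIdGenerator implementation, and the Material struct, its fields, and AssignMaterialIds are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical material description; the fields are illustrative only.
struct Material {
    std::uint32_t shaderId = 0;
    std::uint32_t textureId = 0;
    std::uint32_t blendMode = 0;
    std::uint32_t materialId = 0;  // Filled in by AssignMaterialIds.
};

// Assigns IDs so that materials with equal settings share an ID and
// materials with similar settings receive neighboring IDs.
inline void AssignMaterialIds(std::vector<Material>& materials) {
    auto key = [](const Material& m) {
        return std::make_tuple(m.shaderId, m.textureId, m.blendMode);
    };

    // Sort indices rather than the materials to preserve caller order.
    std::vector<std::size_t> order(materials.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return key(materials[a]) < key(materials[b]);
    });

    std::uint32_t nextId = 0;
    for (std::size_t i = 0; i < order.size(); ++i) {
        if (i > 0 && key(materials[order[i]]) != key(materials[order[i - 1]])) {
            ++nextId;
        }
        materials[order[i]].materialId = nextId;
    }
}
```

The order of the fields in the key decides which setting changes least often between consecutive IDs; putting the most expensive state change first is a design choice, not something the source prescribes.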

Customizing Sort Algorithms for the Render Queue

Customize the sort algorithm by customizing RenderElementCompare and BasicRenderKeyFactory.

Optimizing Render Queue Key Creation

If you do not need to sort by depth, such as when using non-translucent meshes, you can omit depth-related calculations when generating keys, and you can cache key generation results. Skip the depth calculations by using the ISceneUpdater::SetDepthSortMode function to set SORT_DEPTH_OF_TRANSLUCENT_MESH for depth-related calculations and then using the key factory created by CreatePriorMaterialAndZeroDepthRenderKeyFactory. Enable the key cache by passing true in the cacheEnabled argument to the RenderQueue::Reset function. Invalidate the sort keys cached for each mesh by calling InvalidateRenderKeyCache. You can also disable sort key caching on a per-mesh basis by using the ResMesh::SetFlags function directly to set the ResMesh::FLAG_VALID_RENDER_KEY_CACHE flag to 0.
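
To illustrate why a zero-depth key can be cached, the sketch below packs a sort key with the material ID above the depth bits and recomputes depth only for translucent meshes. The key layout, the Mesh struct, MakeRenderKey, and GetRenderKey are all hypothetical; the actual key format used by the library is not described in this document.

```cpp
#include <cstdint>

// Hypothetical 64-bit key layout (not the SDK's):
// [63..56] layer, [55..32] material ID, [31..0] quantized depth.
inline std::uint64_t MakeRenderKey(std::uint8_t layer,
                                   std::uint32_t materialId,
                                   std::uint32_t depth) {
    return (static_cast<std::uint64_t>(layer) << 56) |
           (static_cast<std::uint64_t>(materialId & 0xFFFFFFu) << 32) |
           depth;
}

// Hypothetical mesh record with a cached key.
struct Mesh {
    std::uint8_t layer = 0;
    std::uint32_t materialId = 0;
    bool translucent = false;
    float viewDepth = 0.0f;  // Expected in [0, 1].
    bool keyCacheValid = false;
    std::uint64_t cachedKey = 0;
};

// Opaque meshes get zero depth, so their key never changes and can be cached.
// Translucent meshes still need a fresh depth every frame.
inline std::uint64_t GetRenderKey(Mesh& mesh) {
    if (mesh.translucent) {
        auto depth = static_cast<std::uint32_t>(mesh.viewDepth * 65535.0f);
        return MakeRenderKey(mesh.layer, mesh.materialId, depth);
    }
    if (!mesh.keyCacheValid) {
        mesh.cachedKey = MakeRenderKey(mesh.layer, mesh.materialId, 0);
        mesh.keyCacheValid = true;
    }
    return mesh.cachedKey;
}
```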

Reusing Command Lists when Displaying in 3D

When displaying in 3D, you can cut down on command generation by generating one command list and reusing it for both the left and right eyes, instead of generating a separate command list for each eye. The demo library renders 3D displays using the RenderStereoScene function, which uses the CommandListSwapper class to save and reuse the command list. For the process flow, see Normal Rendering and Stereoscopic Rendering.
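
The saving comes from running the expensive build step once and replaying its output twice. The following sketch shows that structure only; RecordedScene, BuildCommands, and Replay are hypothetical names, the commands are plain strings, and none of this reflects how RenderStereoScene is implemented.

```cpp
#include <string>
#include <vector>

// Hypothetical recorded command list; not an SDK type.
struct RecordedScene {
    std::vector<std::string> drawCommands;
};

// Expensive step: generates the draw commands. buildCount tracks how often
// it runs so the saving can be observed.
inline RecordedScene BuildCommands(const std::vector<std::string>& models,
                                   int& buildCount) {
    ++buildCount;
    RecordedScene scene;
    for (const std::string& name : models) {
        scene.drawCommands.push_back("draw " + name);
    }
    return scene;
}

// Cheap step: replays the same commands for one eye; only the camera differs.
inline std::vector<std::string> Replay(const RecordedScene& scene,
                                       const std::string& eye) {
    std::vector<std::string> submitted;
    submitted.push_back("set camera " + eye);
    submitted.insert(submitted.end(), scene.drawCommands.begin(),
                     scene.drawCommands.end());
    return submitted;
}
```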

Disabling Features Not Required by the Build Options

You can skip unnecessary determination and other processing by using the build options to disable unneeded features. For details, see Macro List.

Sharing the Same Shaders

Load shared shaders ahead of time from the binary files.

Use nw::gfx::ResGraphicsFile::Setup or a similar function to set up the models and other binary file resources. When the call returns a result such as nw::gfx::RESOURCE_RESULT_NOT_FOUND_SHADER, you can share a single shader by running setup again with the binary resource for the loaded shader.

See ResourceDemo for specific examples.

Optimizing Resource Setup (VRAM Transfers)

See Users Handling VRAM Transfers.

Speeding Up the Creation of Scene Objects

Call nw::gfx::SceneBuilder::GetMemorySize or a similar function to calculate the amount of memory required to create objects, and then allocate all the required memory at once to speed up object creation.

Manage such memory by using an allocator class that inherits from os::IAllocator (herein called a "suballocator"). Configure the created suballocator as an allocator for creating objects.

For speedier object creation, use classes such as the SDK's FrameHeap class when implementing your suballocator.
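
A frame-heap style suballocator can be as simple as a bump pointer over one pre-sized block. The sketch below is a hypothetical stand-alone class for illustration: it does not inherit from os::IAllocator and is not the SDK's FrameHeap, whose interfaces are not shown in this document.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal bump ("frame") allocator sketch; hypothetical, not an SDK class.
class FrameSubAllocator {
public:
    // Reserves the whole block up front, e.g. a size computed beforehand.
    explicit FrameSubAllocator(std::size_t totalBytes) : buffer_(totalBytes) {}

    // Returns aligned storage from the block, or nullptr when exhausted.
    // alignment must be a power of two.
    void* Alloc(std::size_t size, std::size_t alignment = 4) {
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (base + offset_ + mask) & ~mask;
        const std::size_t newOffset =
            static_cast<std::size_t>(aligned - base) + size;
        if (newOffset > buffer_.size()) return nullptr;
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Releases everything at once; individual frees are not supported.
    void Reset() { offset_ = 0; }

    std::size_t Used() const { return offset_; }

private:
    std::vector<std::uint8_t> buffer_;
    std::size_t offset_ = 0;
};
```

Each allocation is a few arithmetic operations with no free-list search, which is why this style suits creating many objects in one pass.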

Using Frame-Format Animation Data

Use frame-format animation data to reduce the processing load of evaluating animations. Only skeletal animations can currently be output in frame format.

The drawback to frame format is larger data sizes. The accuracy of evaluating fractional frames also declines when the playback speed is changed.
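
The accuracy loss can be seen in a small sketch that samples a baked track at a fractional frame. It assumes that values between stored frames are linearly interpolated, which this document does not confirm; SampleBakedTrack is a hypothetical helper.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Samples a baked track (one value per integer frame) at a fractional frame.
// Values between frames are linearly interpolated (an assumption), which is
// where accuracy is lost relative to evaluating the original curve.
inline float SampleBakedTrack(const std::vector<float>& frames, float frame) {
    if (frames.empty()) return 0.0f;
    if (frame <= 0.0f) return frames.front();
    const float last = static_cast<float>(frames.size() - 1);
    if (frame >= last) return frames.back();

    const std::size_t i = static_cast<std::size_t>(std::floor(frame));
    const float t = frame - static_cast<float>(i);
    return frames[i] + (frames[i + 1] - frames[i]) * t;
}
```

For a track baked from f(x) = x * x as {0, 1, 4, 9}, sampling at frame 1.5 yields 2.5, whereas the original curve gives 2.25. At integer frames the baked value is exact, so the error appears only when a changed playback speed lands between frames.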

Using Fully-Baked Format Animation Data

Use fully baked animation data to reduce the processor load for skeletal animation evaluation compared to frame format.

See Advanced Features for details on usage and disadvantages.

Reusing `AnimEvaluator` when Switching Animations

Use ChangeAnim when switching animations for a single model.

This is faster than re-creating an AnimEvaluator instance and then executing Bind again.

Using Animation Caches Appropriately

Using an evaluation result cache is a fast approach when applying animation results to more than one model. However, enabling the cache when it is not needed adds overhead. For more details, see Advanced Features.
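
The trade-off can be sketched with an evaluator that stores the result for the most recent frame. CachedEvaluator is a hypothetical class, and the sine and cosine values merely stand in for real animation evaluation; the library's cache is not implemented this way as far as this document states.

```cpp
#include <cmath>
#include <vector>

// Hypothetical evaluator that caches the result computed for the last frame
// so that several models can reuse it. Not an SDK class.
class CachedEvaluator {
public:
    explicit CachedEvaluator(bool cacheEnabled) : cacheEnabled_(cacheEnabled) {}

    const std::vector<float>& Evaluate(float frame) {
        const bool hit = cacheEnabled_ && valid_ && frame == cachedFrame_;
        if (!hit) {
            // Stand-in for the real (expensive) curve evaluation.
            pose_.assign({std::sin(frame), std::cos(frame)});
            ++evaluations_;
            cachedFrame_ = frame;
            valid_ = true;
        }
        return pose_;
    }

    int Evaluations() const { return evaluations_; }

private:
    bool cacheEnabled_;
    bool valid_ = false;
    float cachedFrame_ = 0.0f;
    int evaluations_ = 0;
    std::vector<float> pose_;
};
```

With three models sharing one evaluator at the same frame, the cached version evaluates once instead of three times. With a single model the cache never hits, so the stored result and the comparison are pure overhead.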

Deleting Unnecessary Animation Members

You can delete animation members when exporting models and other data from CreativeStudio to binary format. Delete any unnecessary animation members to speed up animation evaluation and blends in particular. Be careful not to delete any necessary animation members.

The following examples show a filter definition file and a script file for binary output of a model where only material constant colors 0 and 1 can be animated.

OptimizeFilter.xml

<?xml version="1.0" encoding="utf-8"?>
<OptimizeAnimationMemberSettings>
        <Filters Mode="positive">
                <Path>Materials["*"].MaterialColor.Constant0</Path>
                <Path>Materials["*"].MaterialColor.Constant1</Path>
        </Filters>
</OptimizeAnimationMemberSettings>

Binarize.py

You must save the script file encoded in UTF-16 with BOM.

CreativeStudio.Execute("FileLoad", "human.cmdl")
CreativeStudio.Execute("FileLoad", "human_all.ctex")
CreativeStudio.Execute("OptimizeAnimationMember", "-sf=OptimizeFilter.xml")
CreativeStudio.Execute("FileSave", "-o=human.bcres", "-t=nw4cBinary")

Run a command similar to the following in the CreativeStudio console to output to binary.

NW4C_CreativeStudioConsole.exe -s=Binarize.py

The following definition file example shows the opposite scenario, where instead of only keeping the specified members, we only delete the specified members. Note here that the Mode attribute of the Filters member is set to negative.

<?xml version="1.0" encoding="utf-8"?>
<OptimizeAnimationMemberSettings>
        <Filters Mode="negative">
                <Path>IsVisible</Path>
                <Path>Meshes["*"].IsVisible</Path>
                <Path>MeshNodeVisibilities["*"].IsVisible</Path>
        </Filters>
</OptimizeAnimationMemberSettings>

The reset value for the Mode attribute is new starting from version 1.2.0. Although the binary export works basically the same for both positive and reset, specifying reset will re-enable the specified members if they have already been deleted and disabled.

The example here uses a script file, but you can also run a binary export from the console panel.

Automatically Deleting Unnecessary Animation Members

CreativeStudio version 1.2.0 and later include an operation to extract necessary members from animation data while also automatically deleting unnecessary members. The following example shows the main way of using this feature. Note that this example also outputs an intermediate file containing information about which members were deleted (disabled). As of version 1.2.0, you cannot check from within CreativeStudio whether a member is enabled or disabled.

Optimizing Loaded Content

Load all objects and animations to optimize.
Select the objects to optimize. If no selection is made, all objects will be optimized.
Enter and execute the following command on the console panel.
```
CreativeStudio.Execute("OptimizeUnusedAnimationMember")
```
This will optimize the relevant items and display the results on the console panel.

Creating a Definition File

Load all animations to optimize.
Note that all animations will be optimized regardless of the selection you make.

Enter and execute the following command on the console panel.

CreativeStudio.Execute("OptimizeUnusedAnimationMember")

CreativeStudio.Execute("OptimizeUnusedAnimationMember", "-sf=[definition_file_name]")

This saves a definition file.
Close all animations and load the objects to optimize.
Select the objects to optimize. If no selection is made, all objects will be optimized.

Enter and execute the following command on the console panel.

CreativeStudio.Execute("OptimizeAnimationMember")

CreativeStudio.Execute("OptimizeAnimationMember", "-sf=[definition_file_name]")

This will optimize the relevant items and display the results on the console panel.

You can also use the definition file created by this procedure from a script file, as described in Deleting Unnecessary Animation Members.

Disabling `AnimBinding` for Unanimated Nodes

If you know at time of creation that a model will not be animated, you can create nodes with animation disabled. This avoids unnecessary processing during scene updates.

For details, see SceneBuilder::IsAnimationEnabled.

Deleting Alpha Value Animations Unused by Standard Features

Deleting the following material animations unused by standard features can reduce memory sizes and improve runtime efficiency.

Emission alpha colors
Specular 0,1 alpha values

You can also automatically delete these items, similar to the script shown in Deleting Unnecessary Animation Members.

CreativeStudio.Execute("FileLoad", "human.cmata")
CreativeStudio.Execute("RemoveUselessAlphaAnimation")
CreativeStudio.Execute("FileSave", "-o=human.bcres", "-t=nw4cBinary")

Selecting a Shader Program According to the Number of Textures

When using the default shader, you can select a shader program according to the number of textures.
Configure this automatic shader program selection as follows.

ResGraphicsFile::ForeachModelMaterial(nw::gfx::DefaultShaderAutoSelector());

Turning Off Unnecessary Lights

When not using fragment lights, disable the fragment lighting settings of the material instead of just turning off the lights. Also disable the settings for vertex and hemispherical lights. Disabling fragment lighting lets the vertex shader skip quaternion calculation. Normals are not included in exports from CreativeStudio when all lights are disabled.

Baking Lights to Vertex Colors

If vertex processing is the bottleneck, bake vertex lights to the vertex color. If fill is the bottleneck, bake fragment lights to the vertex color. This should improve both vertex and fill processing.
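
As an illustration of what baking means here, the sketch below precomputes a single directional light with Lambert diffuse shading and writes the result into each vertex color. It is a generic offline computation, not a description of how the authoring tools bake lights; Vec3, Vertex, and BakeDirectionalLight are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

struct Vertex {
    Vec3 normal;  // Assumed to be unit length.
    Vec3 color;   // Receives the baked lighting (RGB in [0, 1]).
};

inline float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Bakes one directional light with Lambert diffuse into each vertex color.
// toLight is the unit vector pointing from the surface toward the light.
inline void BakeDirectionalLight(std::vector<Vertex>& vertices,
                                 const Vec3& toLight,
                                 const Vec3& lightColor,
                                 const Vec3& ambient) {
    for (Vertex& v : vertices) {
        const float nDotL = std::max(0.0f, Dot(v.normal, toLight));
        v.color.x = std::min(1.0f, ambient.x + lightColor.x * nDotL);
        v.color.y = std::min(1.0f, ambient.y + lightColor.y * nDotL);
        v.color.z = std::min(1.0f, ambient.z + lightColor.z * nDotL);
    }
}
```

Once the colors are baked, the corresponding light can be disabled at run time, at the cost of lighting that no longer reacts to moving lights or rotating models.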

Using User Shaders

When rendering models with an extremely large number of vertices, use a custom shader to improve processing. However, changing shaders increases the CPU load, so you must make offsetting adjustments to keep the CPU load from increasing, such as applying the custom shader to the background model only and using render priority to render that first. For information on user-defined shaders, see Creating Shaders.
