Added and Revised Items in 1.2.0

CPU/GPU Optimization

CPU Optimization

Vertex Process Optimization


Read the optimization document

Read the optimization document included with the SDK.
Note that these optimization tips are not included in the document supplied with the SDK.

Enable the optimization option

Enable the optimization option of the export plug-in.
In some cases, this may have a significant effect on performance.

Position vertex data and textures in VRAM

Position vertex data and textures in VRAM.
As shown below, transfers can be reserved at the time of resource initialization by setting the memory location at the graphics file level.

ResGraphicsFile::ForeachTexture(nw::gfx::TextureLocationFlagSetter(NN_GX_MEM_VRAMA | GL_NO_COPY_FCRAM_DMP));
ResGraphicsFile::ForeachIndexStream(nw::gfx::IndexStreamLocationFlagSetter(NN_GX_MEM_VRAMB | GL_NO_COPY_FCRAM_DMP));
ResGraphicsFile::ForeachVertexStream(nw::gfx::VertexStreamLocationFlagSetter(NN_GX_MEM_VRAMB | GL_NO_COPY_FCRAM_DMP));

Multiplex the command list

Multiplexing the command list allows execution and command generation to be handled simultaneously.
However, transfers from the render buffer to the display buffer are also queued as commands, so tearing will occur if the buffers are not swapped correctly.
In the demo library, the CommandListSwapper class handles command list multiplexing.
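
As a rough illustration of the idea, the following sketch alternates between two command lists so that one can be recorded while the other is executed. DoubleBufferedCommandList and its members are hypothetical names made up for this example; they are not the demo library's CommandListSwapper, and a plain vector of strings stands in for a real command list.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU command list; not an SDK type.
using CommandList = std::vector<std::string>;

// Two command lists used alternately: while one is being executed,
// the next frame is recorded into the other.
class DoubleBufferedCommandList {
public:
    // List that is currently being recorded into.
    CommandList& Recording() { return lists_[recording_]; }

    // List recorded before the last Swap(), now ready to execute.
    const CommandList& Executing() const { return lists_[1 - recording_]; }

    // Call once per frame, after execution of the older list has finished.
    void Swap() {
        recording_ = 1 - recording_;
        lists_[recording_].clear();  // reuse the storage for the next frame
    }

private:
    std::array<CommandList, 2> lists_;
    std::size_t recording_ = 0;
};
```

In this sketch, the buffer-swap request would be recorded like any other command, which is why the point at which Swap is called matters.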

Do not call glClear

Even if RenderBuffer is located in VRAM, glClear still costs CPU time for things such as interrupt processing.
If possible, avoid calling glClear and instead cover the screen by rendering the most distant scenery first.
Under NW4C, mesh materials of the most distant scenery can be expressed by making the following settings.
Also, set LayerId for SubmitView so that the celestial sphere is rendered first, as is done in the gfx demo. In any case, make sure that the most distant model lies within the near/far clip range.

Share skeletons

When creating characters that share motions, you can reduce calculations by having those characters share skeletons.
For details on sharing skeletons, see Sharing skeletons.

Share materials

When creating models that use the same materials, you can omit material setup by sharing the materials.
To share materials, specify the source model with the SharedMaterialModel function. When materials are shared, destroy the source model only after destroying every model that shares its materials.

Execute traverse and initialize when the scene tree structure has changed

Execute traversal using SceneTraverser and initialization using SceneInitializer through SceneNode::Accept only when the tree structure has changed, such as when a child has been added to or removed from a SceneNode. Re-execution is not required when only Visible has changed.
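
One way to apply this rule is a dirty flag that is set only by structural edits. The following is a minimal sketch under that assumption; Node, SceneTree, and their members are hypothetical types for illustration and are not the SDK's SceneNode or SceneTraverser.

```cpp
#include <vector>

// Hypothetical minimal scene node; not the SDK's SceneNode.
struct Node {
    bool visible = true;
    std::vector<Node*> children;
};

class SceneTree {
public:
    explicit SceneTree(Node* root) : root_(root) {}

    // Structural edits mark the cached traversal result as stale.
    void AttachChild(Node* parent, Node* child) {
        parent->children.push_back(child);
        structureDirty_ = true;
    }

    // Visibility changes do not alter the tree structure, so no flag is set.
    void SetVisible(Node* node, bool visible) { node->visible = visible; }

    // Returns true if a traversal was actually performed by this call.
    bool TraverseIfNeeded() {
        if (!structureDirty_) return false;
        flat_.clear();
        Collect(root_);
        structureDirty_ = false;
        return true;
    }

    const std::vector<Node*>& Flattened() const { return flat_; }

private:
    void Collect(Node* n) {
        flat_.push_back(n);
        for (Node* c : n->children) Collect(c);
    }

    Node* root_;
    std::vector<Node*> flat_;
    bool structureDirty_ = true;  // the first call always traverses
};
```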

Omit some traverse and update processing

To skip SceneUpdater::UpdateAll and traversal for particular nodes, prepare several instances of SceneContext.
For example, separate the nodes you want to update from the ones you do not, either by placing them in different branches and traversing only some branches, or by splitting them into separate scene trees and traversing only some trees. Collect the traversal results for each group in its own SceneContext. Processing can then be reduced by skipping traversal and update for any SceneContext that collects nodes whose status does not change. Multiple instances of SceneContext can be placed in a single RenderQueue using functions such as SceneUpdater::SubmitView or RenderQueue::EnqueueModel.
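
The split between a static and a dynamic group can be sketched as follows. Item, Context, Update, Submit, and RunFrame are hypothetical stand-ins invented for this example; they only mirror the roles of SceneContext, the update step, and the render queue.

```cpp
#include <vector>

// Hypothetical stand-ins; these are not the SDK's SceneContext or RenderQueue.
struct Item {
    int id = 0;
    int updateCount = 0;  // how many times this item has been updated
};

struct Context {
    std::vector<Item> items;
};

// Per-frame update, applied only to contexts whose nodes actually change.
inline void Update(Context& ctx) {
    for (Item& it : ctx.items) ++it.updateCount;
}

// Every context is submitted into the same queue for rendering.
inline void Submit(const Context& ctx, std::vector<int>& queue) {
    for (const Item& it : ctx.items) queue.push_back(it.id);
}

// One frame: the static context is submitted but never updated.
inline std::vector<int> RunFrame(Context& staticCtx, Context& dynamicCtx) {
    Update(dynamicCtx);
    std::vector<int> queue;
    Submit(staticCtx, queue);
    Submit(dynamicCtx, queue);
    return queue;
}
```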

Calculate the material ID

With the default sort algorithm, materials are sorted using material IDs.
Material IDs can be calculated using IMaterialIdGenerator.
The CPU load during rendering can be reduced by giving materials with the same or similar settings material IDs that are close to each other.
To calculate material IDs, pass IMaterialIdGenerator to SceneInitializer, and then call the functions SceneInitializer::Begin, SceneNode::Accept, and SceneInitializer::End in the order given. The material ID generator algorithm can be customized by inheriting IMaterialIdGenerator.
Refer to SortingMaterialIdGenerator when implementing your own generator.
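
The principle of giving similar materials nearby IDs can be sketched as follows. This is not the algorithm used by SortingMaterialIdGenerator; the Material fields and AssignMaterialIds are assumptions made up for illustration, and a real generator would compare whichever settings are expensive to switch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <tuple>
#include <vector>

// Hypothetical material description; the fields are illustrative only.
struct Material {
    int shaderId = 0;
    int textureId = 0;
    std::uint32_t materialId = 0;  // filled in by AssignMaterialIds
};

// Orders materials by (shaderId, textureId) and hands out consecutive IDs,
// so identical settings share an ID and similar settings get adjacent IDs.
inline void AssignMaterialIds(std::vector<Material>& materials) {
    std::vector<std::size_t> order(materials.size());
    std::iota(order.begin(), order.end(), std::size_t(0));

    auto key = [&materials](std::size_t i) {
        return std::make_tuple(materials[i].shaderId, materials[i].textureId);
    };
    std::sort(order.begin(), order.end(),
              [&key](std::size_t a, std::size_t b) { return key(a) < key(b); });

    std::uint32_t nextId = 0;
    for (std::size_t n = 0; n < order.size(); ++n) {
        // Start a new ID whenever the settings differ from the previous entry.
        if (n > 0 && key(order[n]) != key(order[n - 1])) ++nextId;
        materials[order[n]].materialId = nextId;
    }
}
```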

Customize the sort algorithm for the render queue

Customize the sort algorithm by customizing RenderElementCompare and the render key factory (BasicRenderKeyFactory).
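
To show what a custom ordering might look like, the following sketch sorts a queue with a hand-written comparison object. RenderElement, its fields, and CompareByLayerPriorityMaterial are hypothetical and do not reproduce the interface of RenderElementCompare or BasicRenderKeyFactory; only the general technique of supplying a comparison functor is shown.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical render element; field names are illustrative.
struct RenderElement {
    std::uint8_t layer;
    std::uint8_t priority;
    std::uint32_t materialId;
    std::uint16_t depth;  // quantized depth, smaller means closer
};

// Custom ordering: layer first, then priority, then material, then depth.
struct CompareByLayerPriorityMaterial {
    bool operator()(const RenderElement& a, const RenderElement& b) const {
        if (a.layer != b.layer) return a.layer < b.layer;
        if (a.priority != b.priority) return a.priority < b.priority;
        if (a.materialId != b.materialId) return a.materialId < b.materialId;
        return a.depth < b.depth;
    }
};

inline void SortQueue(std::vector<RenderElement>& queue) {
    std::sort(queue.begin(), queue.end(), CompareByLayerPriorityMaterial());
}
```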

Optimize render queue key creation

If depth sorting is required only for translucent meshes, you can omit depth-related calculations during key generation and cache the key creation results.
To omit depth-related calculations, set "SORT_DEPTH_OF_TRANSLUCENT_MESH" using ISceneUpdater::SetDepthSortMode and use a key factory created by CreatePriorMaterialAndZeroDepthRenderKeyFactory.
The key cache can be enabled by setting the cacheEnabled argument of RenderQueue::Reset to true. You can invalidate the cached sort keys saved for each mesh by calling InvalidateRenderKeyCache. You can also invalidate the cached sort key on a per-mesh basis by clearing the ResMesh::FLAG_VALID_RENDER_KEY_CACHE flag directly with the ResMesh::SetFlags function.
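
The reason a zero-depth key can be cached is that it no longer depends on anything that changes from frame to frame. A minimal sketch of this idea follows; Mesh, KeyFactory, and InvalidateKeyCache are hypothetical names for this example, and the key layout is an assumption rather than the layout used by the library.

```cpp
#include <cstdint>

// Hypothetical mesh record holding a cached key and a validity flag.
struct Mesh {
    std::uint32_t materialId = 0;
    std::uint8_t layer = 0;
    std::uint64_t cachedKey = 0;
    bool keyCacheValid = false;
};

struct KeyFactory {
    int buildCount = 0;  // how often a key was actually rebuilt

    // Depth is treated as zero, so the key depends only on per-mesh
    // constants and stays valid across frames until invalidated.
    std::uint64_t GetKey(Mesh& mesh, bool cacheEnabled) {
        if (cacheEnabled && mesh.keyCacheValid) return mesh.cachedKey;
        ++buildCount;
        const std::uint64_t key =
            (static_cast<std::uint64_t>(mesh.layer) << 32) | mesh.materialId;
        mesh.cachedKey = key;
        mesh.keyCacheValid = true;
        return key;
    }
};

// Call after changing anything the key depends on, such as the material ID.
inline void InvalidateKeyCache(Mesh& mesh) { mesh.keyCacheValid = false; }
```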

Re-use the command list when displaying 3D

When displaying 3D, rather than generating a command list twice for the left and right eyes, you can cut down on command generation by generating one command list and re-using it.
Perform rendering during 3D display using the RenderStereoScene function in the demo library. This code uses the CommandListSwapper class to save and re-use the command list.
For the process flow, see Standard Rendering and Stereo Rendering.
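
The saving can be pictured with the following sketch, which generates commands once and replays them for each eye. It is not the implementation of RenderStereoScene; DrawCommand, SubmittedDraw, StereoFrame, and RenderStereo are hypothetical names, and the per-eye camera setup is reduced to an eye index.

```cpp
#include <vector>

// Hypothetical recorded draw command; only eye-independent data is stored.
struct DrawCommand {
    int meshId;
};

// A draw as submitted for one eye (0 = left, 1 = right).
struct SubmittedDraw {
    int eye;
    int meshId;
};

struct StereoFrame {
    int recordPasses = 0;              // how many times commands were generated
    std::vector<SubmittedDraw> draws;  // everything that was submitted
};

inline StereoFrame RenderStereo(const std::vector<int>& visibleMeshes) {
    StereoFrame frame;

    // Generate the command list a single time.
    std::vector<DrawCommand> commands;
    for (int id : visibleMeshes) commands.push_back(DrawCommand{id});
    ++frame.recordPasses;

    // Replay the saved list once per eye; only the per-eye camera would differ.
    for (int eye = 0; eye < 2; ++eye) {
        for (const DrawCommand& cmd : commands) {
            frame.draws.push_back(SubmittedDraw{eye, cmd.meshId});
        }
    }
    return frame;
}
```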

Disable features not required by the build option

Unnecessary features such as determination processes can be skipped by disabling them using a build option.
For details, see Macro List.

Use frame-format animation data

The processing load of evaluating animations can be reduced by using frame-format animation data. Currently, only skeletal animations can be output in frame format.

The drawback is that the amount of data increases. Also, the accuracy of evaluating fractional frames goes down when the playback speed is changed.
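
The trade-off can be illustrated with a minimal sketch. It assumes that frame-format data stores one value per integer frame and that fractional frames are linearly interpolated; the actual storage and interpolation used by the library are not described here, and SampleBaked is a hypothetical name.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Samples a track baked at one value per integer frame.
// Integer frames are a plain array lookup. Fractional frames are linearly
// interpolated, which only approximates the original curve between frames.
inline float SampleBaked(const std::vector<float>& frames, float frame) {
    if (frames.empty()) return 0.0f;
    const float last = static_cast<float>(frames.size() - 1);
    const float f = std::min(std::max(frame, 0.0f), last);  // clamp to the track
    const std::size_t i0 = static_cast<std::size_t>(std::floor(f));
    const std::size_t i1 = std::min(i0 + 1, frames.size() - 1);
    const float t = f - static_cast<float>(i0);
    return frames[i0] + (frames[i1] - frames[i0]) * t;
}
```

For a track baked from a curve whose value is the square of the frame number (0, 1, 4, 9), sampling frame 1.5 in this sketch yields 2.5, whereas the original curve gives 2.25. This kind of error appears when a changed playback speed causes fractional frames to be evaluated.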

Re-use the AnimEvaluator when switching animations

Use ChangeAnim when switching animations for the same model.

This is faster than re-creating AnimEvaluator and executing Bind again.

Use an animation cache as appropriate

Using an evaluation result cache is a fast approach when applying animation results to more than one model.
On the other hand, there is overhead if enabled when the cache is not needed.
For more details, see High-Speed Features.
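
Both sides of this trade-off can be seen in the following sketch. CachedEvaluator is a hypothetical class for illustration only and does not reflect the interface of AnimEvaluator; the evaluation itself is replaced by a placeholder calculation.

```cpp
#include <vector>

// Hypothetical evaluator that can keep the result of the last evaluation.
class CachedEvaluator {
public:
    explicit CachedEvaluator(bool cacheEnabled) : cacheEnabled_(cacheEnabled) {}

    // Advancing the animation makes any cached result stale.
    void SetFrame(float frame) {
        frame_ = frame;
        cacheValid_ = false;
    }

    // Called once for each model the animation is applied to.
    std::vector<float> Evaluate() {
        if (cacheEnabled_ && cacheValid_) return cache_;
        ++evalCount_;
        std::vector<float> result(3, frame_ * 2.0f);  // placeholder evaluation
        if (cacheEnabled_) {
            cache_ = result;  // extra copy: pure overhead if never re-used
            cacheValid_ = true;
        }
        return result;
    }

    int EvalCount() const { return evalCount_; }

private:
    bool cacheEnabled_;
    bool cacheValid_ = false;
    float frame_ = 0.0f;
    int evalCount_ = 0;
    std::vector<float> cache_;
};
```

When three models call Evaluate in the same frame, the cached evaluator in this sketch evaluates once and the uncached one evaluates three times. With a single model, the cached evaluator only adds the cost of storing the result.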

Delete unnecessary animation members

Animation members can be deleted when exporting binaries, such as models, from CreativeStudio. Animation evaluation, and blending in particular, can be executed at higher speed by deleting unnecessary animation members. Take care not to delete necessary animation members.

As an example, assume we have created the following filter definition file and script file for binary output of a model for which only material constant color 0 and 1 can be animated.

OptimizeFilter.xml

<?xml version="1.0" encoding="utf-8"?>
<OptimizeAnimationMemberSettings>
	<Filters Mode="positive">
		<Path>Materials["*"].MaterialColor.Constant0</Path>
		<Path>Materials["*"].MaterialColor.Constant1</Path>
	</Filters>
</OptimizeAnimationMemberSettings>

Binarize.py

The script file must be saved in UTF-16 encoding with a BOM.

CreativeStudio.Execute("FileLoad", "human.cmdl")
CreativeStudio.Execute("FileLoad", "human_all.ctex")
CreativeStudio.Execute("OptimizeAnimationMember", "-sf=OptimizeFilter.xml")
CreativeStudio.Execute("FileSave", "-o=human.bcres", "-t=nw4cBinary")

Binary will be output when the following command is executed on the CreativeStudio console.

NW4C_CreativeStudioConsole.exe -s=Binarize.py

To remove only the specified members, rather than keep only the specified members, write the definition file as follows. Note that the Mode attribute of the Filters element is set to negative.

<?xml version="1.0" encoding="utf-8"?>
<OptimizeAnimationMemberSettings>
	<Filters Mode="negative">
		<Path>IsVisible</Path>
		<Path>Meshes["*"].IsVisible</Path>
		<Path>MeshNodeVisibilities["*"].IsVisible</Path>
	</Filters>
</OptimizeAnimationMemberSettings>
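
The difference between the positive and negative modes can be expressed as a small sketch. It assumes that "*" in a path matches any sequence of characters; the exact matching rules of CreativeStudio are not stated here, and GlobMatch, FilterMode, and ApplyFilter are hypothetical names for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Assumed wildcard rule: '*' matches any sequence of characters.
inline bool GlobMatch(const std::string& pattern, const std::string& text,
                      std::size_t p = 0, std::size_t t = 0) {
    if (p == pattern.size()) return t == text.size();
    if (pattern[p] == '*') {
        for (std::size_t skip = t; skip <= text.size(); ++skip) {
            if (GlobMatch(pattern, text, p + 1, skip)) return true;
        }
        return false;
    }
    return t < text.size() && pattern[p] == text[t] &&
           GlobMatch(pattern, text, p + 1, t + 1);
}

enum class FilterMode { Positive, Negative };

// Positive keeps only members that match a path.
// Negative removes only members that match a path.
inline std::vector<std::string> ApplyFilter(
        const std::vector<std::string>& members,
        const std::vector<std::string>& paths,
        FilterMode mode) {
    std::vector<std::string> kept;
    for (const std::string& member : members) {
        const bool matched = std::any_of(
            paths.begin(), paths.end(),
            [&member](const std::string& path) { return GlobMatch(path, member); });
        if (matched == (mode == FilterMode::Positive)) kept.push_back(member);
    }
    return kept;
}
```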

Beginning with version 1.2.0, reset has been added as a value for the Mode attribute. reset basically works the same way as positive, but if a specified member has already been deleted (disabled), reset restores it to the enabled state.

Although the method presented here uses a script file, execution is also possible from the console panel.

Automatically delete unnecessary animation members

Beginning with version 1.2.0, CreativeStudio provides an operation that extracts the necessary members based on animation data and automatically deletes the unnecessary ones. The main ways to use it are given below. Note that with these methods, information on member deletion (disabling) is saved in the intermediate file. As of version 1.2.0, you cannot check from CreativeStudio whether a member is enabled or disabled.

Optimize loaded content.

  1. Load all objects and animations to be optimized.
  2. Select the objects to be optimized. If no selection is made, all objects will be optimized.
  3. Enter and execute the following command on the console panel.
    CreativeStudio.Execute("OptimizeUnusedAnimationMember")
  4. Optimization will be executed and results displayed on the console panel.

Create a definition file.

  1. Load all animations to be optimized.
  2. Note that all animations will be optimized regardless of the selection you make.
  3. Enter and execute the following command on the console panel.
    CreativeStudio.Execute("OptimizeUnusedAnimationMember")
    or
    CreativeStudio.Execute("OptimizeUnusedAnimationMember", "-sf=[definition_file_name]")
  4. The above command saves a definition file.
  5. Close all animations and load the objects to be optimized.
  6. Select the objects to be optimized. If no selection is made, all objects will be optimized.
  7. Enter and execute the following command on the console panel.
    CreativeStudio.Execute("OptimizeAnimationMember")
    or
    CreativeStudio.Execute("OptimizeAnimationMember", "-sf=[definition_file_name]")
  8. Optimization will be executed and results displayed on the console panel.

The definition file created by this procedure can also be used from a script file as described in Deleting Unnecessary Animation Members.

Do not create AnimBinding for unanimated nodes

If you know when creating a model that it will not be animated, you can create its nodes with animation disabled.
You can thereby avoid unnecessary processing during scene updates.

For details, see SceneBuilder::IsAnimationEnabled.

Select a shader program according to the number of textures

When using the default shader, you can select a shader program according to the number of textures.
Automatic shader program selection can be set as follows.

ResGraphicsFile::ForeachModelMaterial(nw::gfx::DefaultShaderAutoSelector());

Turn off unnecessary lights

If fragment lights are not being used, do not just turn off the lights; also disable the fragment lighting settings of the materials.
Similarly, disable settings for vertex and hemispherical lights, too. Quaternion calculations by the vertex shader can be omitted by disabling fragment lighting. Normals can be removed when exporting data from CreativeStudio if all lights are disabled.

Bake lights to vertex colors

If vertex processing is a bottleneck, bake vertex lights to the vertex color. If fill is the bottleneck, bake fragment lights to the vertex color.
You can expect improvements for both vertex processing and fill processing.
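
Baking amounts to evaluating the lighting once in advance and storing the result in the vertex color. The following sketch does this for a single directional light with Lambert diffuse plus ambient terms. The lighting model, Vec3, Vertex, and BakeDirectionalLight are assumptions chosen for illustration; in practice, baking is normally done in the content creation tool rather than at run time.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 {
    float x, y, z;
};

struct Vertex {
    Vec3 normal;  // assumed to be unit length
    Vec3 color;   // baked result, RGB in [0, 1]
};

inline float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Bakes one directional light (Lambert diffuse plus ambient) into vertex colors.
// toLight must be a unit vector pointing from the surface toward the light.
inline void BakeDirectionalLight(std::vector<Vertex>& vertices,
                                 const Vec3& toLight,
                                 const Vec3& lightColor,
                                 const Vec3& ambient) {
    for (Vertex& v : vertices) {
        // Surfaces facing away from the light receive only the ambient term.
        const float ndotl = std::max(0.0f, Dot(v.normal, toLight));
        v.color.x = std::min(1.0f, ambient.x + lightColor.x * ndotl);
        v.color.y = std::min(1.0f, ambient.y + lightColor.y * ndotl);
        v.color.z = std::min(1.0f, ambient.z + lightColor.z * ndotl);
    }
}
```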

Use user shaders

Vertex processing can be improved by using a custom shader when rendering models with an extremely large number of vertices.
However, the CPU load increases when changing shaders, so you must make adjustments such as applying the custom shader to the background model only and using render priority to render it first so that the CPU load does not increase.
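
The effect of render order on shader changes can be checked with a simple count. In this sketch, each draw is represented only by an integer identifying its shader; CountShaderSwitches is a hypothetical helper and the shader numbers are arbitrary.

```cpp
#include <cstddef>
#include <vector>

// Counts how many times the bound shader changes across a draw sequence.
inline int CountShaderSwitches(const std::vector<int>& shaderPerDraw) {
    int switches = 0;
    for (std::size_t i = 1; i < shaderPerDraw.size(); ++i) {
        if (shaderPerDraw[i] != shaderPerDraw[i - 1]) ++switches;
    }
    return switches;
}
```

If shader 1 is the custom shader used by the background, the interleaved order 1, 0, 1, 0 changes shaders three times, while the order 1, 1, 0, 0, with the background rendered first, changes shaders only once.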
For information on user-defined shaders, see Creating Shaders.

CONFIDENTIAL