Added and Revised Items in 1.2.0
CPU/GPU Optimization
CPU Optimization
Vertex Process Optimization
Read the optimization document
Read the optimization document included with the SDK.
Note that these optimization tips are not included in the document supplied with the SDK.
Enable the optimization option
Enable the optimization option of the export plug-in.
In some cases, this may have a significant effect on performance.
Place vertex data and textures in VRAM
Place vertex data and textures in VRAM.
As shown below, transfers can be reserved at resource initialization time by setting the memory location flags at the graphics file level.
ResGraphicsFile::ForeachTexture(nw::gfx::TextureLocationFlagSetter(NN_GX_MEM_VRAMA | GL_NO_COPY_FCRAM_DMP));
ResGraphicsFile::ForeachIndexStream(nw::gfx::IndexStreamLocationFlagSetter(NN_GX_MEM_VRAMB | GL_NO_COPY_FCRAM_DMP));
ResGraphicsFile::ForeachVertexStream(nw::gfx::VertexStreamLocationFlagSetter(NN_GX_MEM_VRAMB | GL_NO_COPY_FCRAM_DMP));
Multiplex the command list
Multiplexing the command list allows execution and command generation to be handled simultaneously.
However, tearing will occur if buffers are not swapped correctly, because even transfers from the render buffer to the display buffer can accumulate as commands.
In the demo library, the CommandListSwapper class handles command list multiplexing.
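The idea behind multiplexing can be sketched independently of the SDK. The following is a minimal illustration only: `CommandList` and `DoubleBufferedCommands` are hypothetical names invented for this sketch and do not reflect the actual interface of CommandListSwapper. It assumes two lists, one being recorded by the CPU while the other is executed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU command list: a list of command names.
using CommandList = std::vector<std::string>;

// Two command lists: the CPU records into one while the other is executed.
class DoubleBufferedCommands {
public:
    // List the CPU is currently recording into.
    CommandList& Recording() { return lists_[recordIndex_]; }

    // List that was recorded during the previous frame and is ready to execute.
    const CommandList& Executing() const { return lists_[1 - recordIndex_]; }

    // Call once per frame, after execution of Executing() has finished.
    // The list just recorded becomes the executing one, and the old
    // executing list is cleared and re-used for recording.
    void Swap() {
        recordIndex_ = 1 - recordIndex_;
        lists_[recordIndex_].clear();
    }

private:
    std::array<CommandList, 2> lists_;
    std::size_t recordIndex_ = 0;
};
```

In this sketch the swap must happen only after execution has completed. Swapping too early corresponds to the incorrect buffer swap that causes tearing.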
Do not call glClear
Even if the render buffer is located in VRAM, glClear places a load on the CPU, for example through interrupt processing.
If possible, avoid calling glClear and instead clear the screen by rendering the most distant scenery first.
Under NW4C, mesh materials of the most distant scenery can be expressed by making the following settings.
- Depth test: Always pass (Always)
- Color depth buffer: Update
- Use layer IDs and render priority to render the most distant model first
Also, set LayerId for SubmitView to prioritize rendering of the celestial sphere, as is done in the gfx demo. In either case, make sure that the most distant model lies within the Near/Far clip range.
Share skeletons
When creating characters that share motions, you can reduce calculations by having those characters share skeletons.
For details on sharing skeletons, see Sharing skeletons.
Share materials
When creating models that share materials, you can omit material setup by sharing materials.
To share a material, specify the source model using the SharedMaterialModel function. If materials are being shared, be sure to destroy the source model only after destroying all models that share its materials.
Execute traverse and initialize when the scene tree structure has changed
Execute traverse using SceneTraverser and initialization using SceneInitializer with SceneNode::Accept only when the tree structure has changed, such as when a child has been added to or removed from a SceneNode. Re-execution is not required when Visible has changed.
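The rule above amounts to a dirty flag on the tree structure. The following sketch shows the pattern in isolation. `Node` and `Scene` are hypothetical types made up for this illustration and are not the nw::gfx SceneNode or SceneTraverser classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scene node; not the nw::gfx SceneNode class.
struct Node {
    bool visible = true;
    std::vector<Node*> children;
};

class Scene {
public:
    void AttachChild(Node& parent, Node& child) {
        parent.children.push_back(&child);
        treeDirty_ = true;  // Structure changed: traverse again.
    }

    void SetVisible(Node& node, bool visible) {
        node.visible = visible;  // Not a structural change: no re-traverse.
    }

    // Collects nodes again only if the structure changed since the last call.
    // Returns true if a traversal was actually performed.
    bool TraverseIfNeeded(Node& root) {
        if (!treeDirty_) {
            return false;
        }
        collected_.clear();
        Collect(root);
        treeDirty_ = false;
        return true;
    }

    std::size_t CollectedCount() const { return collected_.size(); }

private:
    void Collect(Node& node) {
        collected_.push_back(&node);
        for (Node* child : node.children) {
            Collect(*child);
        }
    }

    std::vector<Node*> collected_;
    bool treeDirty_ = true;  // The first call always traverses.
};
```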
Omit some traverse and update processing
If you want to omit execution of SceneUpdater::UpdateAll and traverse for particular nodes, you can do so by preparing several instances of SceneContext.
For example, separate the nodes you want to update and the nodes you do not into different branches, or split them into different scene trees, and then traverse only some of them, collecting the traverse results in a separate SceneContext for each group. Processing can be reduced by skipping traverse and update for the SceneContext instances that collect nodes whose status does not change. Multiple instances of SceneContext can be placed in a single RenderQueue using functions such as SceneUpdater::SubmitView or RenderQueue::EnqueueModel.
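As a rough illustration of this split, the sketch below keeps static and dynamic nodes in separate contexts, updates only the dynamic one, and submits both to one queue. All of the names (`Node`, `Context`, `UpdateAll`, `Submit`) are hypothetical stand-ins written for this example. They are not the SceneContext, SceneUpdater, or RenderQueue interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node with a per-frame update; not an nw::gfx type.
struct Node {
    float position = 0.0f;
    float velocity = 0.0f;
};

// Hypothetical stand-in for a context holding traverse results.
struct Context {
    std::vector<Node*> nodes;
};

// Updates every node collected in the context and returns how many
// nodes were processed.
std::size_t UpdateAll(Context& context) {
    for (Node* node : context.nodes) {
        node->position += node->velocity;
    }
    return context.nodes.size();
}

// Appends the nodes of a context to a shared render queue.
void Submit(const Context& context, std::vector<const Node*>& renderQueue) {
    renderQueue.insert(renderQueue.end(),
                       context.nodes.begin(), context.nodes.end());
}
```

Each frame, only the dynamic context is passed to `UpdateAll`, while both contexts are passed to `Submit` so that every node is still rendered.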
Calculate the material ID
Execute material sort using material IDs with the default sort algorithm.
Material IDs can be calculated using IMaterialIdGenerator.
The CPU load during rendering can be reduced by setting the material IDs of materials with the same or similar settings to values that are close to each other.
To calculate material IDs, pass IMaterialIdGenerator to SceneInitializer, and then call SceneInitializer::Begin, SceneNode::Accept, and SceneInitializer::End in the order given. The material ID generation algorithm can be customized by inheriting from IMaterialIdGenerator.
Refer to SortingMaterialIdGenerator when implementing your own generator.
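The benefit of sorting by material ID can be seen in a small, SDK-independent sketch. `DrawItem`, `CountMaterialSwitches`, and `SortByMaterialId` are hypothetical names used only for this illustration. The sketch assumes that each change of material ID between consecutive draws costs one material setup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw item: a mesh together with the ID of its material.
struct DrawItem {
    std::uint32_t materialId;
    int meshIndex;
};

// Counts how many material setups are needed when drawing in the given
// order, assuming one setup each time the material ID changes.
int CountMaterialSwitches(const std::vector<DrawItem>& items) {
    int switches = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].materialId != items[i - 1].materialId) {
            ++switches;
        }
    }
    return switches;
}

// Sorts by material ID so that meshes with the same material are adjacent.
// stable_sort keeps the original order among items with equal IDs.
void SortByMaterialId(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.materialId < b.materialId;
                     });
}
```

With four items whose material IDs alternate between two values, the unsorted order needs four setups and the sorted order needs two.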
Customize the sort algorithm for the render queue
Optimize render queue key creation
If depth sorting is required only for translucent meshes, you can omit depth-related calculations during key generation and cache the key creation results.
To omit depth-related calculations, set SORT_DEPTH_OF_TRANSLUCENT_MESH using ISceneUpdater::SetDepthSortMode and use a key factory created by CreatePriorMaterialAndZeroDepthRenderKeyFactory.
The key cache can be enabled by setting the cacheEnabled argument of RenderQueue::Reset to true. You can invalidate the cached sort keys saved for each mesh by calling InvalidateRenderKeyCache. You can also invalidate the cached sort key on a per-mesh basis by clearing the ResMesh::FLAG_VALID_RENDER_KEY_CACHE flag directly with the ResMesh::SetFlags function.
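The reason a zero-depth key can be cached is that it no longer depends on the camera. The following sketch illustrates this under assumed conditions. The 64-bit key layout and the names `Mesh`, `BuildKey`, and `GetKey` are hypothetical and are not the actual RenderQueue key format or API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mesh record; the fields and key layout are assumptions.
struct Mesh {
    std::uint8_t  layerId = 0;
    std::uint8_t  priority = 0;
    std::uint32_t materialId = 0;  // Only the lower 24 bits are used.
    bool          translucent = false;
    bool          keyCacheValid = false;
    std::uint64_t cachedKey = 0;
};

// Packs [layer:8][priority:8][material:24][depth:24] into one sort key.
std::uint64_t BuildKey(const Mesh& mesh, std::uint32_t depth24) {
    return (static_cast<std::uint64_t>(mesh.layerId) << 56) |
           (static_cast<std::uint64_t>(mesh.priority) << 48) |
           (static_cast<std::uint64_t>(mesh.materialId & 0xFFFFFFu) << 24) |
           static_cast<std::uint64_t>(depth24 & 0xFFFFFFu);
}

// Returns the sort key for a mesh. keyBuilds counts how many keys were
// actually built, to make the effect of the cache visible.
std::uint64_t GetKey(Mesh& mesh, std::uint32_t depth24, bool cacheEnabled,
                     int& keyBuilds) {
    if (mesh.translucent) {
        // Depth matters for translucent meshes: rebuild every time.
        ++keyBuilds;
        return BuildKey(mesh, depth24);
    }
    if (cacheEnabled && mesh.keyCacheValid) {
        return mesh.cachedKey;  // Opaque key does not depend on depth.
    }
    ++keyBuilds;
    mesh.cachedKey = BuildKey(mesh, 0);  // Depth is zeroed for opaque meshes.
    mesh.keyCacheValid = cacheEnabled;
    return mesh.cachedKey;
}
```

In this sketch, setting `keyCacheValid` back to false plays the role of invalidating the cache for one mesh, for example after its material or priority has changed.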
Re-use the command list when displaying 3D
When displaying 3D, rather than generating a command list twice for the left and right eyes, you can cut down on command generation by generating one command list and re-using it.
Perform rendering during 3D display using the RenderStereoScene function in the demo library. Within this code, use the CommandListSwapper class to save and re-use the command list.
For the process flow, see Standard Rendering and Stereo Rendering.
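The saving comes from building the scene commands once and replaying them with only the per-eye state changed. The sketch below shows that pattern with hypothetical names (`Eye`, `BuildCommands`, `Submit`). It does not reproduce RenderStereoScene or CommandListSwapper, and it assumes the per-eye difference can be expressed as a single view offset set outside the saved list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-eye state: only the horizontal view offset differs.
struct Eye {
    float viewOffsetX;
};

using CommandList = std::vector<std::string>;

// Expensive step: walks the scene and produces draw commands.
// buildCount is incremented to show how often this step runs.
CommandList BuildCommands(int& buildCount) {
    ++buildCount;
    return {"set material", "draw mesh 0", "draw mesh 1"};
}

// Replays a saved command list for one eye. The per-eye view setting is
// issued first, followed by the shared commands.
std::vector<std::string> Submit(const CommandList& saved, const Eye& eye) {
    std::vector<std::string> submitted;
    submitted.push_back("set view offset " + std::to_string(eye.viewOffsetX));
    submitted.insert(submitted.end(), saved.begin(), saved.end());
    return submitted;
}
```

Calling `BuildCommands` once and `Submit` twice, once per eye, halves the command generation work compared with building a list for each eye.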
Disable features not required by the build option
Unnecessary features such as determination processes can be skipped by disabling them using a build option.
For details, see Macro List.
The processing load of evaluating animations can be reduced by using frame format animation data. Currently, only skeletal animations can be output in frame format.
The drawback is that the amount of data increases. Also, the accuracy of evaluating fractional frames goes down when the playback speed is changed.
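The trade-off can be illustrated with a generic comparison between curve evaluation and a baked per-frame table. This is a sketch under stated assumptions: `Key`, `EvaluateCurve`, `Bake`, and `EvaluateBaked` are hypothetical names, and the baked lookup here simply truncates fractional frames, which may differ from how the library treats them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical key of a linearly interpolated animation curve.
struct Key {
    float frame;
    float value;
};

// Curve format: find the surrounding keys and interpolate on every call.
float EvaluateCurve(const std::vector<Key>& keys, float frame) {
    assert(!keys.empty());
    if (frame <= keys.front().frame) return keys.front().value;
    if (frame >= keys.back().frame) return keys.back().value;
    std::size_t i = 1;
    while (keys[i].frame < frame) ++i;
    const Key& a = keys[i - 1];
    const Key& b = keys[i];
    const float t = (frame - a.frame) / (b.frame - a.frame);
    return a.value + (b.value - a.value) * t;
}

// Frame format: one stored value per integer frame. Uses more memory.
std::vector<float> Bake(const std::vector<Key>& keys, int frameCount) {
    std::vector<float> frames(static_cast<std::size_t>(frameCount));
    for (int f = 0; f < frameCount; ++f) {
        frames[static_cast<std::size_t>(f)] =
            EvaluateCurve(keys, static_cast<float>(f));
    }
    return frames;
}

// Frame-format lookup: a direct index. Fractional frames are truncated
// in this sketch (an assumption), which is where accuracy is lost.
float EvaluateBaked(const std::vector<float>& frames, float frame) {
    assert(!frames.empty());
    if (frame <= 0.0f) return frames.front();
    std::size_t index = static_cast<std::size_t>(frame);
    if (index >= frames.size()) index = frames.size() - 1;
    return frames[index];
}
```

For a curve that rises linearly from 0 at frame 0 to 10 at frame 10, the curve evaluation at frame 2.5 gives 2.5, while the baked lookup in this sketch gives the value stored for frame 2.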
Re-use the AnimEvaluator when switching animations
Use an animation cache as appropriate
Using an evaluation result cache is a fast approach when applying animation results to more than one model.
On the other hand, enabling the cache when it is not needed adds overhead.
For more details, see High-Speed Features.
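The following sketch shows the general idea of an evaluation result cache. `Pose` and `CachingEvaluator` are hypothetical names for this illustration and are not the AnimEvaluator interface. The evaluation itself is replaced by a trivial placeholder calculation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical evaluation result: one value per animated member.
using Pose = std::vector<float>;

class CachingEvaluator {
public:
    explicit CachingEvaluator(bool cacheEnabled) : cacheEnabled_(cacheEnabled) {}

    // Returns the result for the given frame. With the cache enabled,
    // repeated requests for the same frame re-use the stored result.
    const Pose& Evaluate(float frame) {
        if (cacheEnabled_ && hasResult_ && cachedFrame_ == frame) {
            return result_;
        }
        ++evaluationCount_;
        result_ = {frame * 0.5f, frame * 2.0f};  // Placeholder evaluation.
        cachedFrame_ = frame;
        hasResult_ = true;
        return result_;
    }

    int EvaluationCount() const { return evaluationCount_; }

private:
    bool cacheEnabled_;
    bool hasResult_ = false;
    float cachedFrame_ = 0.0f;
    int evaluationCount_ = 0;
    Pose result_;
};
```

When three models request the same frame, the cached evaluator evaluates once and the uncached one evaluates three times. When each result is used only once, the stored copy and the frame comparison are pure overhead.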
Delete unnecessary animation members
Animation members can be deleted when exporting binaries, such as models, from CreativeStudio. Animation evaluation, and blending in particular, can be executed faster by deleting unnecessary animation members. Take care not to delete necessary animation members.
As an example, assume we have created the following filter definition file and script file for binary output of a model for which only material constant color 0 and 1 can be animated.
OptimizeFilter.xml
<?xml version="1.0" encoding="utf-8"?>
<OptimizeAnimationMemberSettings>
    <Filters Mode="positive">
        <Path>Materials["*"].MaterialColor.Constant0</Path>
        <Path>Materials["*"].MaterialColor.Constant1</Path>
    </Filters>
</OptimizeAnimationMemberSettings>
Binarize.py
The script file must be saved in UTF-16 encoding with a BOM.
CreativeStudio.Execute("FileLoad", "human.cmdl")
CreativeStudio.Execute("FileLoad", "human_all.ctex")
CreativeStudio.Execute("OptimizeAnimationMember", "-sf=OptimizeFilter.xml")
CreativeStudio.Execute("FileSave", "-o=human.bcres", "-t=nw4cBinary")
The binary is output when the following command is executed on the CreativeStudio console.
NW4C_CreativeStudioConsole.exe -s=Binarize.py
To remove only the specified members, rather than keep only the specified members, write the definition file as follows. Note that the Mode attribute of the Filters element is set to negative.
<?xml version="1.0" encoding="utf-8"?>
<OptimizeAnimationMemberSettings>
    <Filters Mode="negative">
        <Path>IsVisible</Path>
        <Path>Meshes["*"].IsVisible</Path>
        <Path>MeshNodeVisibilities["*"].IsVisible</Path>
    </Filters>
</OptimizeAnimationMemberSettings>
Beginning with version 1.2.0, reset can be specified for the Mode attribute. The basic behavior of reset is the same as positive, but if a specified member has already been deleted and disabled, reset restores it to the enabled state.
Although the method presented here uses a script file, execution is also possible from the console panel.
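The difference between the positive and negative modes can be sketched as follows. This is only an illustration of the semantics described above: the matching rule (a `*` that matches any run of characters) and the names `WildcardMatch`, `Mode`, and `IsMemberKept` are assumptions made for this sketch, not the actual CreativeStudio implementation. The reset mode is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed matching rule: '*' matches any run of characters and every
// other character must match literally.
bool WildcardMatch(const std::string& pattern, const std::string& text,
                   std::size_t p = 0, std::size_t t = 0) {
    if (p == pattern.size()) {
        return t == text.size();
    }
    if (pattern[p] == '*') {
        // Try every possible length for the run matched by '*'.
        for (std::size_t next = t; next <= text.size(); ++next) {
            if (WildcardMatch(pattern, text, p + 1, next)) return true;
        }
        return false;
    }
    return t < text.size() && pattern[p] == text[t] &&
           WildcardMatch(pattern, text, p + 1, t + 1);
}

enum class Mode { Positive, Negative };

// Returns true if the member remains enabled after filtering.
// Positive keeps only matching members; Negative removes matching members.
bool IsMemberKept(Mode mode, const std::vector<std::string>& paths,
                  const std::string& member) {
    bool matched = false;
    for (const std::string& path : paths) {
        if (WildcardMatch(path, member)) {
            matched = true;
            break;
        }
    }
    return mode == Mode::Positive ? matched : !matched;
}
```

Under these assumptions, a positive filter containing `Materials["*"].MaterialColor.Constant0` keeps that member for every material and drops all other members, while a negative filter containing `IsVisible` drops only the member whose path is exactly `IsVisible`.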
Automatically delete unnecessary animation members
Beginning with version 1.2.0, CreativeStudio includes an operation that extracts the necessary members based on animation data and automatically deletes the unnecessary ones. The main ways to use it are given below. Note that with the methods shown here, information on member deletion (disabling) is saved in an intermediate file. As of version 1.2.0, you cannot check from CreativeStudio whether a member is enabled or disabled.
Optimize loaded content.
- Load all objects and animations to be optimized.
- Select the objects to be optimized. If no selection is made, all objects will be optimized.
- Enter and execute the following command on the console panel.
CreativeStudio.Execute("OptimizeUnusedAnimationMember")
- Optimization will be executed and results displayed on the console panel.
Create a definition file.
- Load all animations to be optimized.
- Note that all animations will be optimized regardless of the selection you make.
- Enter and execute the following command on the console panel.
CreativeStudio.Execute("OptimizeUnusedAnimationMember")
or
CreativeStudio.Execute("OptimizeUnusedAnimationMember", "-sf=[definition_file_name]")
- The above command saves a definition file.
- Close all animations and load the objects to be optimized.
- Select the objects to be optimized. If no selection is made, all objects will be optimized.
- Enter and execute the following command on the console panel.
CreativeStudio.Execute("OptimizeAnimationMember")
or
CreativeStudio.Execute("OptimizeAnimationMember", "-sf=[definition_file_name]")
- Optimization will be executed and results displayed on the console panel.
The definition file created by this procedure can also be used from a script file as described in Deleting Unnecessary Animation Members.
Do not create AnimBinding for unanimated nodes
If you know at creation time that a model will not be animated, you can create nodes for which animation is disabled.
You can thereby avoid unnecessary processing during scene updates.
For details, see SceneBuilder::IsAnimationEnabled.
Select a shader program according to the number of textures
When using the default shader, you can select a shader program according to the number of textures.
Automatic shader program selection can be set as follows.
ResGraphicsFile::ForeachModelMaterial(nw::gfx::DefaultShaderAutoSelector());
Turn off unnecessary lights
If fragment lights are not being used, disable the material's fragment lighting settings rather than just turning off the lights.
Similarly, disable the settings for vertex lights and hemisphere lights. Disabling fragment lighting allows the vertex shader to omit quaternion calculations. If all lights are disabled, normals can be removed when exporting data from CreativeStudio.
Bake lights to vertex colors
If vertex processing is a bottleneck, bake vertex lights to the vertex color. If fill is the bottleneck, bake fragment lights to the vertex color.
You can expect improvements for both vertex processing and fill processing.
Use user shaders
Vertex processing can be improved by using a custom shader when rendering models with an extremely large number of vertices.
However, changing shaders increases the CPU load, so keep the number of shader changes low, for example by applying the custom shader only to the background model and using render priority to render it first.
For information on user-defined shaders, see Creating Shaders.