16 Jun 2020 3 min read Announcements

Let's talk about sets

Sets are a powerful and commonly used tool in production pipelines, but they are not without their dangers. In this post we look at the details of how they work, and share some tips for avoiding their pitfalls.

Gaffer is designed to generate scenes in a deferred manner, so that it only generates the locations which have been expanded in the Viewer, and it can stream large scenes to a renderer without needing the whole scene in memory (this was taken to ludicrous extremes in a previous post). One problem with this is that when a scene is first loaded, Gaffer doesn't yet know where anything is, and to find everything, it would need to expand the whole scene anyway. This is particularly problematic for certain renderers which need to receive lights or cameras before they receive anything else.

Our solution to this problem is sets. These can be thought of as lists of locations that are guaranteed to exist when the scene is finally expanded (only the names of locations are stored by a set, not the locations themselves). There is a built in set that lists all the lights and a built in set that lists all the cameras, and these lists can be generated at any time without needing to expand the scene first. The validity of sets is preserved by all nodes that modify the scene, so for instance, a Prune node will remove locations from a set if it will later prune them from the scene itself.

Gaffer also supports user generated sets. These can be loaded from cache files, or generated on the fly using the Set node. The SetFilter node supports complex expressions that combine sets to determine which locations other nodes will operate on (we covered some recent additions to the expression syntax in a previous post). Because the validity of sets is maintained throughout Gaffer, and because they can be easily published by upstream departments, SetFilters have definite advantages over PathFilters, which may need updating manually when locations are added, removed or renamed.

Each set is only generated when it is needed, such as when it is used in a SetFilter node. But sets are not without their dangers. The entire list of locations in a set is computed in one go, unlike the progressive generation of the rest of the scene. Expensive set operations can therefore cause a delay before the scene itself is generated. And because sets are only ever processed in their entirety, generating a huge set and then pruning is not efficient in the same way that pruning the scene itself is. If an Instancer outputs a theoretical 1,000,000 locations that are subsequently pruned down to 1, then only one scene location will actually be generated. But not so with sets. If an Instancer outputs a set with a 1,000,000 locations, the names of all 1,000,000 locations will be added to the set before 999,999 are removed downstream. Not so good.

With this in mind, we can develop a few guidelines for practising "safe sets" :

Don't be afraid of sets in general. They can be a convenient and expressive way of filtering locations.
Avoid large sets where possible. Prefer to include a few parent locations in a set, rather than exhaustively list every leaf location.
Be aware that when you use the Set node to define a new set, you are triggering a search of the scene. If the filter uses /... or other wildcards, this can cause the entire scene to be searched before any results are available. Avoid /... unless absolutely necessary.
Be wary of generating huge sets using Instancers and crowd systems, and then using those sets downstream. Prefer to apply lookdev before instancing/generation, and use as few nodes downstream as possible.
Be cautious of * in set expressions. Used alone it will cause every single set in the scene to be generated, including several built in sets you no doubt don't want. Generally it only makes sense to use it when selecting specific sets, such as assets:* for all sets denoting the root locations of assets (other naming conventions are available).

John Haddon

You might also like...