tensorpack.graph_builder package

class tensorpack.graph_builder.InputDesc[source]

Bases: tensorpack.graph_builder.model_desc.InputDescTuple

Metadata about an input entry point to the graph. This metadata can later be used to build placeholders or other types of input source.

static __new__(cls, type, shape, name)[source]
Parameters:
  • type (tf.DType) – the data type of the input

  • shape (tuple) – the shape of the input; may contain None for unknown dimensions

  • name (str) – the name of the input
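
For illustration, a minimal sketch describing two inputs (the dtypes, shapes and names are made up):

import tensorflow as tf
from tensorpack.graph_builder import InputDesc

# Describe a batch of 28x28 grayscale images and their labels.
image_desc = InputDesc(tf.float32, (None, 28, 28, 1), 'input')
label_desc = InputDesc(tf.int32, (None,), 'label')

# The metadata can later be turned into an actual placeholder:
image_ph = image_desc.build_placeholder()  # a tf.Tensor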

build_placeholder()[source]

Build a tf.placeholder from the metadata.

Returns: tf.Tensor
build_placeholder_reuse()[source]

Build a tf.placeholder from the metadata, or return the one previously built by this method.

Returns: tf.Tensor
static from_placeholder(placeholder)[source]

Build an InputDesc from an existing tf.placeholder.

class tensorpack.graph_builder.ModelDesc[source]

Bases: tensorpack.graph_builder.model_desc.ModelDescBase

A ModelDesc with a single cost and a single optimizer. It has the following constraints in addition to ModelDescBase:

  1. build_graph(...) method should return a cost tensor when called under a training context. The cost will be the final cost to be optimized by the optimizer, so it should include necessary regularization.

  2. A subclass is expected to implement the optimizer() method, as shown in the sketch below.
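
A minimal sketch of such a subclass (the model itself is made up for illustration):

import tensorflow as tf
from tensorpack.graph_builder import ModelDesc

class MyModel(ModelDesc):
    def inputs(self):
        # Placeholders must be created inside this method.
        return [tf.placeholder(tf.float32, (None, 784), 'input'),
                tf.placeholder(tf.int64, (None,), 'label')]

    def build_graph(self, image, label):
        logits = tf.layers.dense(image, 10)
        # The returned cost is what the optimizer will minimize,
        # so add regularization terms here if you have any.
        return tf.losses.sparse_softmax_cross_entropy(label, logits)

    def optimizer(self):
        return tf.train.GradientDescentOptimizer(1e-2)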

get_cost()[source]

Deprecated. You are recommended to return a cost tensor from the build_graph() method directly.

This function takes the self.cost tensor defined by build_graph(), and automatically adds the losses in the tf.GraphKeys.REGULARIZATION_LOSSES collection to it.

get_optimizer()[source]

Return the memoized optimizer returned by optimizer().

Users of ModelDesc will need to implement optimizer(), which will only be called once per model instance.

Returns: a tf.train.Optimizer instance.
optimizer()[source]

Returns a tf.train.Optimizer instance. A subclass is expected to implement this method.

class tensorpack.graph_builder.ModelDescBase[source]

Bases: object

Base class for a model description.

build_graph(*args)[source]

Build the whole symbolic graph. This is supposed to be part of the “tower function” when used with TowerTrainer.

A subclass is expected to implement this method.

Parameters: args ([tf.Tensor]) – tensors that match the list of inputs defined by inputs().
Returns: In general it returns nothing, but a subclass may return information needed to build the trainer. For example, SingleCostTrainer expects this method to return the cost tensor.
get_inputs_desc()[source]
Returns: A list of InputDesc, which describes the inputs of this model. The result is cached for each instance of ModelDescBase.
input_names

Returns: [str] – the names of all the inputs.

inputs()[source]

Create and return a list of placeholders. A subclass is expected to implement this method.

The placeholders have to be created inside this method. Don’t return placeholders created in other methods.

Also, you should never call this method yourself.

Returns: a list of tf.placeholder, to be converted to InputDesc.
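
For example, with a made-up model taking a single point-cloud input, the first subclass below is correct while the second violates the rule:

import tensorflow as tf
from tensorpack.graph_builder import ModelDescBase

class GoodModel(ModelDescBase):
    def inputs(self):
        # Correct: the placeholder is created inside inputs().
        return [tf.placeholder(tf.float32, (None, 3), 'points')]

class BadModel(ModelDescBase):
    def __init__(self):
        self._ph = tf.placeholder(tf.float32, (None, 3), 'points')

    def inputs(self):
        # Wrong: this placeholder was created outside this method.
        return [self._ph]
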
class tensorpack.graph_builder.GraphBuilder[source]

Bases: object

build(**kwargs)[source]

Build the graph. A subclass is expected to implement this method.

class tensorpack.graph_builder.SyncMultiGPUParameterServerBuilder(towers, ps_device)[source]

Bases: tensorpack.graph_builder.training.DataParallelBuilder

Data-parallel training in ‘ParameterServer’ mode. It builds one tower on each GPU with shared variable scope. It synchronizes the gradients computed from each tower, averages them, and applies them to the shared variables.

It is an equivalent of --variable_update=parameter_server in tensorflow/benchmarks.

__init__(towers, ps_device)[source]
Parameters:
  • towers (list[int]) – list of GPU ids

  • ps_device (str) – either ‘gpu’ or ‘cpu’, where variables are stored.

build(get_grad_fn, get_opt_fn)[source]

Build the graph, and set self.grads to a list of (g, v), containing the averaged gradients.

Parameters:
  • get_grad_fn (-> [(grad, var)]) – callable which returns a list of (gradient, variable) pairs

  • get_opt_fn (-> tf.train.Optimizer) – callable which returns an optimizer

Returns: tf.Operation – the training op
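
A minimal sketch of driving the builder directly (normally a trainer does this for you; the toy loss below is a stand-in for a real per-tower model):

import tensorflow as tf
from tensorpack.graph_builder import SyncMultiGPUParameterServerBuilder

def get_grad_fn():
    # Called once inside each tower context; the variable is shared
    # across towers through the common variable scope.
    x = tf.get_variable('x', shape=(), initializer=tf.ones_initializer())
    loss = tf.square(x)
    return tf.train.GradientDescentOptimizer(0.1).compute_gradients(loss)

def get_opt_fn():
    return tf.train.GradientDescentOptimizer(0.1)

builder = SyncMultiGPUParameterServerBuilder(towers=[0, 1], ps_device='cpu')
train_op = builder.build(get_grad_fn, get_opt_fn)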

class tensorpack.graph_builder.DataParallelBuilder(towers)[source]

Bases: tensorpack.graph_builder.training.GraphBuilder

__init__(towers)[source]
Parameters: towers (list[int]) – list of GPU ids.
static build_on_towers(towers, func, devices=None, use_vs=None)[source]

Run func on all GPUs (towers) and return the results.

Parameters:
  • towers (list[int]) – a list of GPU ids.

  • func – a lambda to be called inside each tower

  • devices – a list of devices to be used. By default, ‘/gpu:{tower}’ will be used.

  • use_vs (list[bool]) – list of use_vs values to be passed to TowerContext

Returns: a list of outputs of func, evaluated on each tower.
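
For example, a sketch that evaluates the same toy computation on two towers:

import tensorflow as tf
from tensorpack.graph_builder import DataParallelBuilder

def tower_func():
    # Stand-in for building a real per-tower graph.
    return tf.reduce_sum(tf.ones([2, 2]))

outputs = DataParallelBuilder.build_on_towers([0, 1], tower_func)
# outputs is a list with one result per tower.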

class tensorpack.graph_builder.SyncMultiGPUReplicatedBuilder(towers, average, mode)[source]

Bases: tensorpack.graph_builder.training.DataParallelBuilder

Data-parallel training in “replicated” mode, where each GPU contains a replica of the whole model. It will build one tower on each GPU under its own variable scope. Each gradient update is averaged or summed across all GPUs through NCCL.

It is an equivalent of --variable_update=replicated in tensorflow/benchmarks.

build(get_grad_fn, get_opt_fn)[source]

Build the graph, and set self.grads to a list of lists of (g, v) pairs (one list per GPU), containing the all-reduced gradients on each device.

Parameters:
  • get_grad_fn (-> [(grad, var)]) – callable which returns a list of (gradient, variable) pairs

  • get_opt_fn (-> tf.train.Optimizer) – callable which returns an optimizer

Returns:

(tf.Operation, tf.Operation)

  1. the training op.

  2. the op which syncs variables from GPU 0 to other GPUs.

    It has to be run before training starts, and you can optionally run it again later to sync non-trainable variables.

static get_post_init_ops()[source]

Copy values of variables on GPU 0 to other GPUs.
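
A sketch of the call pattern, reusing the toy get_grad_fn / get_opt_fn from the SyncMultiGPUParameterServerBuilder example above (the mode value here is an assumption of this sketch):

import tensorflow as tf
from tensorpack.graph_builder import SyncMultiGPUReplicatedBuilder

builder = SyncMultiGPUReplicatedBuilder([0, 1], average=True, mode='nccl')
train_op, post_init_op = builder.build(get_grad_fn, get_opt_fn)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(post_init_op)  # sync variables from GPU 0 before training
    sess.run(train_op)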

class tensorpack.graph_builder.AsyncMultiGPUBuilder(towers, scale_gradient=True)[source]

Bases: tensorpack.graph_builder.training.DataParallelBuilder

Data-parallel training with async update. It builds one tower on each GPU with shared variable scope. Every tower computes the gradients and independently applies them to the variables, without synchronizing and averaging across towers.

__init__(towers, scale_gradient=True)[source]
Parameters:
  • towers (list[int]) – list of GPU ids.

  • scale_gradient (bool) – if True, will scale each gradient by 1.0/nr_gpu.

build(get_grad_fn, get_opt_fn)[source]
Parameters:
  • get_grad_fn (-> [(grad, var)]) – callable which returns a list of (gradient, variable) pairs

  • get_opt_fn (-> tf.train.Optimizer) – callable which returns an optimizer

Returns: tf.Operation – the training op
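
The call pattern is the same as for the other builders, sketched briefly (again reusing the toy get_grad_fn / get_opt_fn from the SyncMultiGPUParameterServerBuilder example):

from tensorpack.graph_builder import AsyncMultiGPUBuilder

builder = AsyncMultiGPUBuilder(towers=[0, 1], scale_gradient=True)
train_op = builder.build(get_grad_fn, get_opt_fn)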

class tensorpack.graph_builder.LeastLoadedDeviceSetter(worker_device, ps_devices)[source]

Bases: object

Helper class to assign variables on the least loaded ps-device.

Usage:

with tf.device(LeastLoadedDeviceSetter(...)):
    ...
__init__(worker_device, ps_devices)[source]
Parameters:
  • worker_device – the device to use for compute ops.

  • ps_devices – a list of devices to use for Variable ops.
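
A concrete sketch (the device strings are illustrative):

import tensorflow as tf
from tensorpack.graph_builder import LeastLoadedDeviceSetter

setter = LeastLoadedDeviceSetter(
    worker_device='/gpu:0',
    ps_devices=['/cpu:0', '/gpu:1'])
with tf.device(setter):
    w = tf.get_variable('w', shape=[1024, 1024])  # goes to the least loaded ps device
    y = tf.matmul(w, w)                           # compute op, goes to the worker device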

class tensorpack.graph_builder.OverrideCachingDevice(devices, device_for_small_variables, small_variable_size_threshold)[source]

Bases: object

Variable getter which caches variables on the least loaded device.

Variables smaller than a certain threshold are cached on a single specific device, as specified in the constructor. All other variables are load balanced across a pool of devices, by caching each variable on the least loaded device.
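
A hedged sketch, assuming the object is attached through custom_getter (as in tensorflow/benchmarks, where this class originates); the devices and threshold are made up:

import tensorflow as tf
from tensorpack.graph_builder import OverrideCachingDevice

getter = OverrideCachingDevice(
    devices=['/gpu:0', '/gpu:1'],         # pool for large variables
    device_for_small_variables='/cpu:0',  # all small variables cached here
    small_variable_size_threshold=1024)   # size measured in number of elements
with tf.variable_scope('model', custom_getter=getter):
    big = tf.get_variable('big', shape=[4096, 4096])  # load-balanced across the pool
    small = tf.get_variable('small', shape=[10])      # cached on /cpu:0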

tensorpack.graph_builder.override_to_local_variable(enable=True)[source]
Returns: a context manager, within which all variables will be created as local variables.
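
For example (a minimal sketch):

import tensorflow as tf
from tensorpack.graph_builder import override_to_local_variable

with override_to_local_variable():
    # Created in the LOCAL_VARIABLES collection instead of
    # GLOBAL_VARIABLES, so it is not saved or synced by default.
    counter = tf.get_variable('counter', shape=(), dtype=tf.int64,
                              initializer=tf.zeros_initializer())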