tensorpack.tfutils package

tensorpack.tfutils.collection module

Parameters:keys (list) – list of collection keys to back up. Defaults to all keys in the graph.
Returns:dict – the backup

Restore from a collection backup.

Parameters:backup (dict) –
Parameters:keys (list) – list of collection keys to freeze.
Returns:a context in which the collections are restored to their initial state on exit.
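
The backup, restore and freeze behaviour described in this module can be illustrated without TensorFlow. The following is a minimal sketch, assuming a plain dict stands in for the graph's collections; `COLLECTIONS`, `backup`, `restore` and `freeze` are hypothetical names for illustration and are not the tensorpack API.

```python
from contextlib import contextmanager
from copy import copy

# Hypothetical stand-in for the graph's collections: {key: list of items}.
COLLECTIONS = {"losses": ["l2"], "summaries": ["s0"]}

def backup(keys=None):
    """Return a shallow copy of the selected collections (all keys by default)."""
    keys = list(COLLECTIONS) if keys is None else keys
    return {k: copy(COLLECTIONS.get(k, [])) for k in keys}

def restore(saved):
    """Overwrite each backed-up collection with its saved contents."""
    for k, items in saved.items():
        COLLECTIONS[k] = copy(items)

@contextmanager
def freeze(keys):
    """Restore the given collections to their initial state on exit."""
    saved = backup(keys)
    try:
        yield
    finally:
        restore(saved)

with freeze(["losses"]):
    COLLECTIONS["losses"].append("temporary")   # visible inside the context
    COLLECTIONS["summaries"].append("s1")       # not frozen, so it persists
```

After the `with` block, the frozen `losses` collection is back to its initial contents, while the unfrozen `summaries` collection keeps the item added inside the context.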

tensorpack.tfutils.gradproc module

class tensorpack.tfutils.gradproc.GradientProcessor[source]

Bases: object

Base class for all gradient processors. Gradient processors can be applied to optimizers by optimizer.apply_grad_processors().

Subclasses should override the _process() method.


Process the symbolic gradients.

Parameters:grads (list) – list of (grad, var).
Returns:list – processed gradients, with the same type as input.
class tensorpack.tfutils.gradproc.FilterNoneGrad(verbose=True)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Skip the update and print a warning (instead of crashing) when the gradient of a variable is None.

Parameters:verbose (bool) – whether to print warning about None gradients.
class tensorpack.tfutils.gradproc.GlobalNormClip(global_norm)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Clip by global norm. The global norm is the square root of the sum of the squared L2 norms of all gradients.

See tf.clip_by_global_norm() for more information.

Parameters:global_norm (float) – the threshold to clip with.
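
As a rough illustration of what clipping by global norm does, here is a pure-Python sketch that follows the formula documented for tf.clip_by_global_norm(): every gradient is multiplied by `clip_norm / max(global_norm, clip_norm)`. Nested lists of floats stand in for gradient tensors, and the function below is a hypothetical helper, not tensorpack or TensorFlow code.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients by one shared factor so their global norm <= clip_norm."""
    # Global norm: sqrt of the sum of squares over every element of every gradient.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Gradients are left unchanged when the norm is already within the threshold.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads], global_norm

grads = [[3.0, 0.0], [0.0, 4.0]]          # global norm = sqrt(9 + 16) = 5
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
```

Because one factor is shared by all gradients, their relative directions are preserved, which is the main difference from clipping each gradient separately.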
class tensorpack.tfutils.gradproc.MapGradient(func, regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Apply a function to every gradient whose variable name matches regex. Keep the other gradients unchanged.

It can be used for gradient clipping, etc.

__init__(func, regex='.*')[source]
  • func – a user-supplied function which takes one or two arguments. The argument(s) can be either a grad tensor, or grad and var. The function should return the new gradient to be used. If it returns None, the gradient is discarded (hence no update to the variable will happen).

  • regex (str) – used to match variables. Defaults to match all variables.
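
The matching and discarding rules above can be sketched in plain Python. This is only an illustration under stated assumptions: floats and strings stand in for gradient tensors and variables, `map_gradients` is a hypothetical helper, and anchoring the pattern at the end of the name is a choice made for this sketch.

```python
import re

def map_gradients(grads_and_vars, func, regex=".*"):
    """Apply func to gradients whose variable name matches regex.

    grads_and_vars is a list of (grad, var_name) pairs; plain floats and
    strings stand in for tensors and variables here.
    """
    # Anchor the pattern at the end so the whole name has to match
    # (a choice made for this sketch).
    pattern = re.compile(regex + "$")
    out = []
    for grad, name in grads_and_vars:
        if pattern.match(name):
            new_grad = func(grad)
            if new_grad is None:
                continue          # discarded: no update for this variable
            out.append((new_grad, name))
        else:
            out.append((grad, name))   # non-matching gradients pass through
    return out

pairs = [(5.0, "conv0/W"), (-3.0, "conv0/b"), (0.2, "fc/W")]
clipped = map_gradients(pairs, lambda g: max(-1.0, min(1.0, g)), regex=".*/W")
dropped = map_gradients(pairs, lambda g: None, regex="conv0/.*")
```

In `clipped`, only the `*/W` gradients are clipped to [-1, 1]; in `dropped`, both `conv0` entries disappear because the function returned None for them.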

class tensorpack.tfutils.gradproc.SummaryGradient(regex='.*', collections=None)[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

For each gradient tensor, summarize its histogram and add it to moving summaries.

__init__(regex='.*', collections=None)[source]
class tensorpack.tfutils.gradproc.PrintGradient(regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Print the gradients every step with symbolic_functions.print_stat().

Parameters:regex (str) – same as in MapGradient.
class tensorpack.tfutils.gradproc.CheckGradient[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Run tf.check_numerics() for each gradient.

class tensorpack.tfutils.gradproc.ScaleGradient(multipliers, verbose=True)[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Scale the gradients of matching variables by a multiplier.

__init__(multipliers, verbose=True)[source]
  • multipliers (tuple or list) – tuple of (regex, float), or list of such tuples.

  • verbose (bool) – whether to print logs or not


Use a doubled learning rate for all biases (as in Caffe), and freeze layer0:

from tensorpack.tfutils import optimizer, gradproc
opt = optimizer.apply_grad_processors(
    opt, [gradproc.ScaleGradient(
        [('.*/b', 2.), ('layer0/.*', 0.)]
    )])

tensorpack.tfutils.tower module


When called inside a TowerContext, returns the TowerContext.

Returns:a BaseTowerContext instance or None, if not called under a TowerContext.
class tensorpack.tfutils.tower.BaseTowerContext(ns_name, vs_name='')[source]

Bases: object

A context where the current model is built in. You need to use TowerContext() to create a BaseTowerContext.

__init__(ns_name, vs_name='')[source]

This is not supposed to be used by users. You need to use TowerContext() to create a BaseTowerContext.

  • ns_name (str) – The name scope of the tower.

  • vs_name (str) – Open a new variable scope with this name.


From a collection, get items that are __added__ to the collection in this tower.

Note that it works by tracking the collection at the beginning and end of the tower function. Therefore it does not guarantee that the items are __created__ in this tower.
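
The tracking approach described above amounts to comparing two snapshots of a collection. A minimal sketch, assuming lists of strings stand in for a TensorFlow collection; `items_added` is a hypothetical helper, not part of tensorpack.

```python
def items_added(before, after):
    """Return items in `after` that were not in `before`, preserving order."""
    seen = set(before)
    return [item for item in after if item not in seen]

# Snapshot of a collection when the tower function starts and when it ends.
at_entry = ["global_step", "shared/W"]
at_exit = ["global_step", "shared/W", "tower0/loss", "tower0/acc"]

new_items = items_added(at_entry, at_exit)
```

The comparison only sees what entered the collection between the two snapshots, which is why an item created earlier but added during the tower would also be reported.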


Whether this tower is supposed to have its own trainable variables.


Whether this tower is the main (i.e., the first) training tower.


Returns – str - The name scope of the tower.


Returns – str - The name scope of the tower.


Returns – str - The variable scope of the tower.

tensorpack.tfutils.tower.TowerContext(tower_name, is_training, vs_name='')[source]

The context for a tower function, containing metadata about the current tower. Tensorpack trainers use TowerContext to manage tower functions. Many tensorpack layers have to be called under a TowerContext.


with TowerContext('', is_training=True):
    # call a tensorpack layer or a tower function
class tensorpack.tfutils.tower.TowerFuncWrapper(tower_fn, inputs_desc)[source]

Bases: object

A wrapper around a tower function (see [tutorial on tower function](http://tensorpack.readthedocs.io/tutorial/trainer.html#tower-trainer)). It keeps track of the name scope, variable scope and input/output tensors each time the function is called.

TowerTrainer needs this so that it knows how to build a predictor.

__init__(tower_fn, inputs_desc)[source]
  • tower_fn – a function which builds one tower in the graph. It takes several input tensors and could return anything.

  • inputs_desc ([InputDesc]) – list of InputDesc. They are used to figure out the names for the input tensors.


Returns – a TowerTensorHandles object, that can access the tower handles by either indices or names.

class tensorpack.tfutils.tower.TowerTensorHandle(ctx, input, output, inputs_desc=None)[source]

Bases: object

When a function is called multiple times under each tower, it becomes hard to keep track of the scope and access those tensors in each tower. This class provides easy access to the tensors as well as the inputs/outputs created in each tower.


The same as get_tensor().

get_collection(key=None, name=None)[source]

See BaseTowerContext.get_collection_in_tower().

  • key (str) – the key of the collection

  • name – deprecated


Get a tensor in this tower. The name can be:

  1. The name of the tensor without any tower prefix.

  2. The name of an InputDesc, if it is used when building the tower.

In the second case, this method will return the tensor that’s used as the corresponding input to the tower. Note that this tensor may have a different name (e.g. may be an output of a queue).


Like get_tensor(), but takes a list and returns a list.


Get a variable used in this tower. The name should not contain the variable scope prefix of the tower.

When the tower has the same variable scope and name scope, this is equivalent to get_tensor().


Like get_variable(), but takes a list and returns a list.


The list of input tensors used to build the tower.


The output returned by the tower function.

class tensorpack.tfutils.tower.TowerTensorHandles(handles)[source]

Bases: object

Wrap a list of TowerTensorHandle, to support access to them by index or name.

Parameters:name_or_index (str or int) –
Returns:a TowerTensorHandle.
Returns:A TowerTensorHandles, containing only the inference towers.
Returns:A TowerTensorHandles, containing only the training towers.

tensorpack.tfutils.scope_utils module


A decorator which automatically reuses the current variable scope if the function has been called with the same variable scope before.


@auto_reuse_variable_scope
def myfunc(x):
    return tf.layers.conv2d(x, 128, 3)

myfunc(x1)  # will inherit parent scope reuse
myfunc(x2)  # will reuse
with tf.variable_scope('newscope'):
    myfunc(x3)  # will inherit parent scope reuse
    myfunc(x4)  # will reuse
tensorpack.tfutils.scope_utils.cached_name_scope(name, top_level=True)[source]

Return a context which either opens and caches a new name scope, or re-enters an existing one.

Parameters:top_level (bool) – if True, the name scope will always be top-level. It will not be nested under any existing name scope of the caller.
Parameters:name_scope (str) – the default scope to use. If None, will use the name of the function.
Returns:A decorator which makes the function run under a name scope. The name scope is obtained by the following:

  1. The ‘name_scope’ keyword argument when the decorated function is called.

  2. The ‘name_scope’ argument of the decorator.

  3. (default) The name of the decorated function itself.


@under_name_scope()
def rms(x):
    return tf.sqrt(
        tf.reduce_mean(tf.square(x)))

rms(tensor)  # will be called under name scope 'rms'
rms(tensor, name_scope='scope')  # will be called under name scope 'scope'


Add a reuse option.

tensorpack.tfutils.optimizer module

tensorpack.tfutils.optimizer.apply_grad_processors(opt, gradprocs)[source]

Wrapper around optimizers to apply gradient processors.

  • opt (tf.train.Optimizer) –

  • gradprocs (list[GradientProcessor]) – gradient processors to add to the optimizer.


Returns:a tf.train.Optimizer instance which runs the gradient processors before updating the variables.

class tensorpack.tfutils.optimizer.ProxyOptimizer(opt, name='ProxyOptimizer')[source]

Bases: tensorflow.python.training.optimizer.Optimizer

A transparent proxy which delegates all methods of tf.train.Optimizer.

class tensorpack.tfutils.optimizer.PostProcessOptimizer(opt, func, colocate=True)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which applies some “post-processing operation” per variable (e.g. clipping, quantization) after the gradient update.

__init__(opt, func, colocate=True)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Operation or None) – the operation to perform on this variable after the gradient update.

  • colocate (boolean) – colocate the function with the variable.

class tensorpack.tfutils.optimizer.VariableAssignmentOptimizer(opt, func)[source]

Bases: tensorpack.tfutils.optimizer.PostProcessOptimizer

An optimizer which assigns each variable a new value (e.g. clipping, quantization) after the gradient update.

__init__(opt, func)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Tensor or None) – the new value to be assigned to this variable after the gradient update.
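
To make the "update, then assign" order concrete, here is a toy sketch with scalar weights. It is not the tensorpack implementation: `sgd_then_assign` and `clip_weights` are hypothetical, and `func` here receives a (name, value) pair rather than a tf.Variable.

```python
def sgd_then_assign(weights, grads, lr, func):
    """One SGD step per variable, then replace each value by func(name, value).

    weights and grads are dicts keyed by variable name. func returns the
    new value, or None to leave the updated value as it is.
    """
    updated = {}
    for name, value in weights.items():
        value = value - lr * grads[name]      # the ordinary gradient update
        new_value = func(name, value)         # post-update assignment
        updated[name] = value if new_value is None else new_value
    return updated

def clip_weights(name, value):
    # Only clip variables named */W to [-1, 1]; skip everything else.
    if not name.endswith("/W"):
        return None
    return max(-1.0, min(1.0, value))

weights = {"fc/W": 0.5, "fc/b": 0.5}
grads = {"fc/W": -2.0, "fc/b": -2.0}
result = sgd_then_assign(weights, grads, lr=0.5, func=clip_weights)
```

Both variables move to 1.5 after the gradient step; only `fc/W` is then clipped back to 1.0, while `fc/b` is untouched because the function returned None for it.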

class tensorpack.tfutils.optimizer.AccumGradOptimizer(opt, niter)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which accumulates gradients across \(k\) minimize() executions, and applies them together on every \(k\)-th minimize() execution. This is roughly the same as using a \(k\) times larger batch size plus a \(k\) times larger learning rate, but uses much less memory.

Note that this implementation may not support all models. E.g., it doesn’t support sparse gradient update.

__init__(opt, niter)[source]
  • opt (tf.train.Optimizer) – the underlying sub-optimizer.

  • niter (int) – number of iterations to accumulate gradients.
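
The accumulation schedule can be sketched with a toy scalar optimizer. This is an illustration only, assuming the accumulated gradients are summed (which is consistent with the "larger batch size plus larger learning rate" description); `AccumulatingSGD` is a hypothetical class, not the tensorpack implementation.

```python
class AccumulatingSGD:
    """Toy scalar SGD that applies the summed gradient every `niter` calls."""

    def __init__(self, lr, niter):
        self.lr = lr
        self.niter = niter
        self.accum = 0.0      # running sum of gradients
        self.counter = 0      # calls since the last applied update

    def step(self, weight, grad):
        """Accumulate grad; return the (possibly updated) weight."""
        self.accum += grad
        self.counter += 1
        if self.counter < self.niter:
            return weight                     # no update on this call
        new_weight = weight - self.lr * self.accum
        self.accum, self.counter = 0.0, 0     # reset for the next cycle
        return new_weight

opt = AccumulatingSGD(lr=0.5, niter=3)
w = 1.0
history = []
for g in [1.0, 2.0, 3.0, 4.0]:
    w = opt.step(w, g)
    history.append(w)
```

The weight stays at 1.0 for the first two calls, jumps to -2.0 on the third call (1.0 - 0.5 * 6.0), and the fourth call starts a new accumulation cycle without changing it.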

tensorpack.tfutils.sesscreate module

class tensorpack.tfutils.sesscreate.NewSessionCreator(target='', config=None)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(target='', config=None)[source]
  • target, config – same as in Session.__init__().

  • config – a tf.ConfigProto instance, defaults to tfutils.get_default_sess_config()

class tensorpack.tfutils.sesscreate.ReuseSessionCreator(sess)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

Parameters:sess (tf.Session) – the session to reuse
class tensorpack.tfutils.sesscreate.SessionCreatorAdapter(session_creator, func)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(session_creator, func)[source]
  • session_creator (tf.train.SessionCreator) – a session creator

  • func (tf.Session -> tf.Session) – takes a session created by session_creator, and returns a new session to be returned by self.create_session.


tensorpack.tfutils.sessinit module

class tensorpack.tfutils.sessinit.SessionInit[source]

Bases: object

Base class for utilities to load variables into an (existing) session.


Initialize a session

Parameters:sess (tf.Session) – the session
class tensorpack.tfutils.sessinit.ChainInit(sess_inits)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Initialize a session with a list of SessionInit instances, executed one by one. This can be useful for, e.g., loading several models from different files to form a composition of models.

Parameters:sess_inits (list[SessionInit]) – list of SessionInit instances.
class tensorpack.tfutils.sessinit.SaverRestore(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore a TensorFlow checkpoint saved by tf.train.Saver or ModelSaver.

__init__(model_path, prefix=None, ignore=[])[source]
  • model_path (str) – a model name (model-xxxx) or a checkpoint file.

  • prefix (str) – during restore, add prefix/ to the name of every variable in this checkpoint.

  • ignore (list[str]) – list of tensor names that should be ignored during loading, e.g. learning-rate

class tensorpack.tfutils.sessinit.SaverRestoreRelaxed(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SaverRestore

Same as SaverRestore, but has more relaxed constraints.

It allows upcasting certain variables, or reshaping certain variables, when there is a mismatch that can be fixed. Another advantage is that it doesn’t add any new ops to the graph. But it is also slower than SaverRestore.

class tensorpack.tfutils.sessinit.DictRestore(variable_dict)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore variables from a dictionary.

Parameters:variable_dict (dict) – a dict of {name: value}
class tensorpack.tfutils.sessinit.JustCurrentSession[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

This is a no-op placeholder


Get a corresponding model loader by looking at the file name.

Returns:SessionInit – either a DictRestore (if the name ends with ‘npy’ or ‘npz’) or SaverRestore (otherwise).
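
The dispatch rule stated above can be written as a small sketch. `pick_loader` is a hypothetical helper that only returns the name of the class that would be chosen; it does not construct any tensorpack object.

```python
def pick_loader(filename):
    """Return the name of the loader class suggested by the file name."""
    if filename.endswith((".npy", ".npz")):
        return "DictRestore"     # a saved {name: value} dictionary
    return "SaverRestore"        # anything else is treated as a checkpoint

choices = [pick_loader(f) for f in ("resnet.npz", "vgg16.npy", "model-10000")]
```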

tensorpack.tfutils.summary module

tensorpack.tfutils.summary.add_tensor_summary(x, types, name=None, collections=None, main_tower_only=True)[source]

Summarize a tensor by different methods.

  • x (tf.Tensor) – a tensor to summarize

  • types (list[str]) – summary types, can be scalar/histogram/sparsity/mean/rms

  • name (str) – summary name. Defaults to the op name.

  • collections (list[str]) – collections of the summary ops.

  • main_tower_only (bool) – Only run under main training tower. If set to True, calling this function under other TowerContext has no effect.


with tf.name_scope('mysummaries'):  # to not mess up tensorboard
    add_tensor_summary(
        tensor, ['histogram', 'rms', 'sparsity'], name='mytensor')
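
For readers unfamiliar with the 'rms' and 'sparsity' summary types, the sketch below shows the statistics these names usually refer to: the root mean square of the values and the fraction of values that are zero. This reading is an assumption about the summary types, and the helpers are hypothetical pure-Python functions operating on a flat list instead of a tensor.

```python
import math

def rms(values):
    """Root mean square of a flat list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)

activations = [0.0, 3.0, 0.0, 4.0]
stats = {"rms": rms(activations), "sparsity": sparsity(activations)}
```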
tensorpack.tfutils.summary.add_param_summary(*summary_lists, **kwargs)[source]

Add summary ops for all trainable variables matching the regex, under a reused ‘param-summary’ name scope. This function is a no-op if not called from the main training tower.

  • summary_lists (list) – each is (regex, [list of summary type]). Summary type is defined in add_tensor_summary().

  • collections (list[str]) – collections of the summary ops.


add_param_summary(
    ('.*/W', ['histogram', 'rms']),
    ('.*/gamma', ['scalar']),
)
tensorpack.tfutils.summary.add_activation_summary(x, types=None, name=None, collections=None)[source]

Call add_tensor_summary() under a reused ‘activation-summary’ name scope. This function is a no-op if not called from the main training tower.

  • x (tf.Tensor) – the tensor to summarize.

  • types (list[str]) – summary types, defaults to ['sparsity', 'rms', 'histogram'].

  • name (str) – if None, use x.name.

  • collections (list[str]) – collections of the summary ops.

tensorpack.tfutils.summary.add_moving_summary(*args, **kwargs)[source]

Summarize the moving average for scalar tensors. This function is a no-op if not called from the main training tower.

  • args – scalar tensors to summarize

  • decay (float) – the decay rate. Defaults to 0.95.

  • collection (str or None) – the name of the collection to add EMA-maintaining ops. The default will work together with the default MovingAverageSummary callback.

  • summary_collections ([str]) – the names of collections to add the summary op. Default is TF’s default (tf.GraphKeys.SUMMARIES).



Returns:list of tensors returned by assign_moving_average, which can be used to maintain the EMA.
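
The exponential moving average (EMA) that this function maintains can be sketched as the usual recurrence `ema = decay * ema + (1 - decay) * value`. The snippet below is a plain-Python illustration with the default decay of 0.95; the starting value of 0.0 and the absence of any bias correction are assumptions of this sketch, not statements about the library.

```python
def update_ema(ema, value, decay=0.95):
    """One moving-average step: keep `decay` of the old average."""
    return decay * ema + (1.0 - decay) * value

ema = 0.0                      # starting value chosen for this sketch
trace = []
for loss in [1.0, 1.0, 1.0]:
    ema = update_ema(ema, loss)
    trace.append(ema)
```

With a decay this close to 1, the average moves slowly: after three identical observations of 1.0 it has only reached about 0.14.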

tensorpack.tfutils.varmanip module


Dump value of all TRAINABLE + MODEL variables to a dict, and save as npz format (loadable by sessinit.get_model_loader()).

Parameters:path (str) – the file name to save the parameters. Must end with npz.

Load all variables from a checkpoint to a dict.

Parameters:model_path (str) – path to a checkpoint.
Returns:dict – a name:value dict
tensorpack.tfutils.varmanip.save_chkpt_vars(dic, path)[source]

Save variables in dic to path.

  • dic – {name: value}

  • path – save as npz if the name ends with ‘.npz’, otherwise save as a checkpoint.


Work around TF problems in checkpoint path handling.

Parameters:model_path – a user-input path
Returns:str – the argument that can be passed to NewCheckpointReader

tensorpack.tfutils.varreplace module

Parameters:custom_getter – the same as in tf.get_variable()
Returns:The current variable scope with a custom_getter.
tensorpack.tfutils.varreplace.freeze_variables(stop_gradient=True, skip_collection=False)[source]

Return a context to freeze variables, by wrapping tf.get_variable with a custom getter. It works by either applying tf.stop_gradient on the variables, or by keeping them out of the TRAINABLE_VARIABLES collection, or both.


with varreplace.freeze_variables(stop_gradient=False, skip_collection=True):
    x = FullyConnected('fc', x, 1000)   # fc/* will not be trained
  • stop_gradient (bool) – if True, variables returned from get_variable will be wrapped with tf.stop_gradient and therefore have no gradient when used later. Note that the created variables may still have gradients when accessed by other approaches (e.g. by name, or by collection). Also note that this makes tf.get_variable return a Tensor instead of a Variable, which may break existing code. Therefore, it’s recommended to use the skip_collection option instead.

  • skip_collection (bool) – if True, do not add the variable to TRAINABLE_VARIABLES collection, but to MODEL_VARIABLES collection. As a result they will not be trained by default.


Use fn to map the output of any variable getter.

Parameters:fn (tf.Variable -> tf.Tensor) –
Returns:The current variable scope with a custom_getter that maps all the variables by fn.


with varreplace.remap_variables(lambda var: quantize(var)):
    x = FullyConnected('fc', x, 1000)   # fc/{W,b} will be quantized

Other functions in tensorpack.tfutils module


Return a tf.ConfigProto to use as default session config. You can modify the returned config to fit your needs.

  • mem_fraction (float) – see the per_process_gpu_memory_fraction option in TensorFlow's GPUOptions protobuf: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto


Returns:tf.ConfigProto – the config to use.

Returns:the global_step variable in the current graph. It is created if it doesn’t exist.
Returns:int – global_step value in current graph and session

Has to be called under a default session.

Parameters:layers (list or layer) – layer or list of layers to apply the arguments.
Returns:a context where all appearances of these layers will by default have the arguments specified by kwargs.


with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu, out_channel=32):
    x = Conv2D('conv0', x)
    x = Conv2D('conv1', x)
    x = Conv2D('conv2', x, out_channel=64)  # override argscope
Returns:dict – the current argscope.

An argscope is a dict of dict: dict[layername] = {arg: val}
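
The "dict of dict" structure and the override behaviour can be sketched in plain Python. This is a simplified illustration, not the tensorpack implementation: layers are identified by string names, and `_ARGSCOPE_STACK`, `current_argscope` and the toy `conv2d` are hypothetical.

```python
from contextlib import contextmanager

_ARGSCOPE_STACK = [{}]          # each entry: {layer_name: {arg: value}}

def current_argscope():
    """Return the innermost scope, a dict of dicts."""
    return _ARGSCOPE_STACK[-1]

@contextmanager
def argscope(layer_names, **kwargs):
    """Push a scope in which the given layers get extra default arguments."""
    new_scope = {name: dict(args) for name, args in current_argscope().items()}
    for name in layer_names:
        new_scope.setdefault(name, {}).update(kwargs)
    _ARGSCOPE_STACK.append(new_scope)
    try:
        yield
    finally:
        _ARGSCOPE_STACK.pop()

def conv2d(name, **explicit):
    """Toy layer: resolve its arguments as scope defaults plus overrides."""
    args = dict(current_argscope().get("conv2d", {}))
    args.update(explicit)       # explicit arguments win over the scope
    return name, args

with argscope(["conv2d"], kernel_shape=3, out_channel=32):
    first = conv2d("conv0")
    second = conv2d("conv2", out_channel=64)
outside = conv2d("conv3")
```

Inside the scope, `conv0` picks up both defaults and `conv2` overrides only `out_channel`; outside the scope, no defaults apply.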