TensorFlow: Mutating variables and control flow

How to control operation order and variable mutation in TF

Published in metaflow-ai, Mar 23, 2017 (5 min read)

In this article, we are going to take a deeper look at TensorFlow’s capabilities for variable mutation and control flow statements.

Mutation

So far, we’ve used Variables exclusively as weights in our models, updated by an optimiser’s operation (like Adam). But optimisers are not the only way to update Variables: there is a whole set of higher order functions to do so (again, see those functions as a way to add operations to your graph).

The most basic function for custom updates is the tf.assign() operation. It takes a Variable and a value, and assigns the value to the Variable. Simple.

Let’s start with an example:

tf.assign updates the value of a Variable
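A minimal sketch of such an update, assuming TensorFlow 1.x graph mode. The import through tensorflow.compat.v1 is an assumption made so the snippet also runs on newer installations; on 1.0.0 a plain import tensorflow as tf should do. The names x, assign_op and results are illustrative.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: keep TF 1.x graph/session semantics

graph = tf.Graph()
with graph.as_default():
    x = tf.Variable(0, name="x")
    # tf.assign only adds an operation to the graph; nothing is mutated yet
    assign_op = tf.assign(x, x + 1)
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    # Each run of assign_op mutates x and returns its new value
    results = [int(sess.run(assign_op)) for _ in range(3)]
    final_x = int(sess.run(x))

print(results)  # [1, 2, 3]
print(final_x)  # 3
```

The mutation only happens when assign_op is actually run in the Session; building the graph alone leaves x untouched.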

Nothing fancy here. It works just like any other operation: you call it within a Session and the operation ensures that the mutation happens, so your Variable gets updated.

Compare this assign call to the usual optimiser train_op call. Both do the same thing: mutate data. The only difference is that the optimiser does a whole lot of computation before applying mutations to your Variables.

TF supports many other functions for manual updates; see them as helper functions. All of them could be replaced by some clever tensor operations followed by a tf.assign call, but that would be cumbersome. So, TF provides two kinds of mutation operations for us:

Those to apply sparse updates (update only a subset of elements of your variables): https://www.tensorflow.org/api_guides/python/state_ops#Sparse_Variable_Updates
Those to apply dense updates (update the whole Variable at once): https://www.tensorflow.org/api_guides/python/state_ops#Variable_helper_functions
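As a rough illustration of the two families, the sketch below uses tf.assign_add as a dense helper and tf.scatter_update as a sparse one. It assumes TensorFlow 1.x graph mode (imported through tensorflow.compat.v1 so it also runs on newer installations); the Variable v and the chosen indices are made up for the example.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: keep TF 1.x graph/session semantics

graph = tf.Graph()
with graph.as_default():
    v = tf.Variable([1.0, 2.0, 3.0, 4.0], name="v")
    # Dense update: every element of v is touched
    dense_op = tf.assign_add(v, [10.0, 10.0, 10.0, 10.0])
    # Sparse update: only the elements at indices 0 and 2 are overwritten
    sparse_op = tf.scatter_update(v, indices=[0, 2], updates=[-1.0, -3.0])
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    after_dense = sess.run(dense_op).tolist()
    after_sparse = sess.run(sparse_op).tolist()

print(after_dense)   # [11.0, 12.0, 13.0, 14.0]
print(after_sparse)  # [-1.0, 12.0, -3.0, 14.0]
```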

I won’t dig into all those helper functions. Some of them can be hard to wrap your head around. My best advice is to experiment with them in a very simple script before using them in your models; you will save time.

One last word about mutation: what if we would like to change the shape of our Variables? For example, adding a row/column on the fly right inside our graph? So far I’ve been only talking about “assigning” new values.

That’s possible but trickier:

First, a tf.Variable has a validate_shape parameter that defaults to True. It prevents the Variable’s shape from changing, so we have to set it to False.
This parameter also exists in the tf.assign function itself, so we have to turn it off there as well.

Let’s see an example:

tf.assign function updating the shape of a Variable
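A minimal sketch of the two steps above, assuming TensorFlow 1.x graph mode with non-resource Variables (the import through tensorflow.compat.v1 is an assumption so that it also runs on newer installations). The initial value [1, 2] and the appended 0 are arbitrary choices for the example.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: keep TF 1.x graph/session semantics

graph = tf.Graph()
with graph.as_default():
    # Step 1: validate_shape=False on the Variable, so its shape may change
    x = tf.Variable([1, 2], dtype=tf.int32, validate_shape=False,
                    trainable=False, name="x")
    # Build a tensor with one more element than x
    grown = tf.concat([x, [0]], 0)
    # Step 2: validate_shape=False on the assign call as well
    assign_op = tf.assign(x, grown, validate_shape=False)
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    shapes = []
    for _ in range(3):
        sess.run(assign_op)
        # Read x in a separate run call to observe its new shape
        shapes.append(sess.run(x).shape)
    final_x = sess.run(x).tolist()

print(shapes)   # [(3,), (4,), (5,)]
print(final_x)  # [1, 2, 0, 0, 0]
```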

OK! That was not too hard, let’s move on.

Control dependency

We can update Variables, but if you start to put assign calls all around your code, you will soon end up calling sess.run multiple times to control them. This is neither practical nor efficient. Remember, the more we stay in the graph, the more efficient we are.

Welcome to the realm of control flow. TF provides a set of functions to order your operations when they are not fully dependent on each other.

Let’s start simple: we will build a graph doing a simple multiplication between a placeholder and a Variable. We would like to increment this Variable before each call we make to our multiplication. How do we actually do that?

If we start the naive way, by just adding a tf.assign call, we will end up with something like this:

Graph computation without control dependency
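A minimal sketch of this naive version, assuming TensorFlow 1.x graph mode (imported through tensorflow.compat.v1 so it also runs on newer installations) and a placeholder that is always fed the value 1. The names x, y and out are illustrative.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: keep TF 1.x graph/session semantics

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.int32, shape=[], name="x")
    y = tf.Variable(2, dtype=tf.int32, name="y")
    # The assign operation exists in the graph...
    assign_op = tf.assign(y, y + 1)
    # ...but nothing ties it to the multiplication
    out = x * y
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    # Only `out` is fetched, so assign_op is never executed
    outputs = [int(sess.run(out, feed_dict={x: 1})) for _ in range(3)]

print(outputs)  # [2, 2, 2]
```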

It doesn’t work at all: our Variable is not incremented and we keep outputting 2.

If you look at the code above and try to mentally build the computation graph, you will clearly see that the graph doesn’t need assign_op to compute the multiplication between x and y: y is already perfectly defined by its initial value, 2.

To fix this, we need a way to force TF to run the assign_op.

Fortunately, that does exist! We can add what is called a control dependency. It works just like a Graph or Variable scope: we use it in conjunction with the Python with statement.

Let’s see an example:

Graph computation with control dependency
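A minimal sketch of the fix, under the same assumptions as before (TensorFlow 1.x graph mode through tensorflow.compat.v1, placeholder always fed 1). One defensive choice here is mine and not necessarily what the original listing did: y is read with read_value() inside the scope, so that the read itself is created under the dependency.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: keep TF 1.x graph/session semantics

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.int32, shape=[], name="x")
    y = tf.Variable(2, dtype=tf.int32, name="y")
    assign_op = tf.assign(y, y + 1)
    # Every operation created inside this block waits for assign_op
    with tf.control_dependencies([assign_op]):
        out = x * y.read_value()
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    # Fetching `out` now triggers the increment first
    outputs = [int(sess.run(out, feed_dict={x: 1})) for _ in range(3)]

print(outputs)  # [3, 4, 5]
```

Note that the dependency only applies to operations created inside the with block; operations built before entering it are not affected.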

Everything works fine. TF sees a dependency, so it runs the assign_op before computing anything under the dependency scope. Here is a visualisation:

On the left, the graph just doesn’t care to compute the assign_op
On the right, the control dependency forces the graph to compute the assign_op before computing the multiplication operation

One pitfall

Earlier I talked about mutating the shape of a Variable. Sadly, combining shape mutations with control dependencies leads us into the dark side of TF’s code optimiser.

Before trying to explain anything here is a piece of code showing the result:

Subtle behaviour with mutations and control flow
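A sketch of the kind of code that exposes this, assuming TensorFlow 1.x graph mode with non-resource Variables (through tensorflow.compat.v1). Here tf.identity stands in for the print operation so that the value can be fetched, and the names cached_read and fresh_read are illustrative. What the plain read returns may depend on the TF version, so no output is claimed for it; only the read_value() result is expected to follow the assignment.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: keep TF 1.x graph/session semantics

graph = tf.Graph()
with graph.as_default():
    x = tf.Variable([], dtype=tf.int32, validate_shape=False,
                    trainable=False, name="x")
    # Append one element to x on every run
    assign_op = tf.assign(x, tf.concat([x, [0]], 0), validate_shape=False)
    with tf.control_dependencies([assign_op]):
        # Plain use of x: may go through a read created before this scope
        cached_read = tf.identity(x)
        # Explicit read created inside the scope, after assign_op
        fresh_read = x.read_value()
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)
    fresh_lengths = []
    for _ in range(3):
        cached, fresh = sess.run([cached_read, fresh_read])
        print("plain read:", cached, "read_value:", fresh)
        fresh_lengths.append(len(fresh))

print(fresh_lengths)  # [1, 2, 3]
```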

Look closely at the code and the outputs:

The print operation depends on the assign_op, so it should only be computed after x has been updated.
Yet x looks like it has not been updated when we print it…
But in fact it has been, since I can get the true value of x using the special read_value function.

What the hell is happening? This behaviour is misleading and is probably closer to a bug than a feature: TF caches aggressively to optimise your computations, and a stale read like this one is one of the drawbacks you can run into. Be careful!

That’s it!

So, how could you use that?

One idea off the top of my head concerns people in NLP who have to deal with <unk> words. You could now “technically” update the shape of your embeddings online (while learning) to add words as you encounter them!

ONE BIG REMARK: I have no idea if such a model would still learn a useful (dynamic) word embedding, but if you test this I would love to hear about your experiments!

TensorFlow best practice series

This article is part of a more complete series of articles about TensorFlow. I’ve not yet defined all the different subjects of this series, so if you want to see any area of TensorFlow explored, add a comment! So far I wanted to explore those subjects (this list is subject to change and is in no particular order):

A primer
How to handle shapes in TensorFlow
TensorFlow saving/restoring and mixing multiple models
How to freeze a model and serve it with a python API
TensorFlow: A proposal of good practices for files, folders and models architecture
TensorFlow howto: a universal approximator inside a neural net
How to optimise your input pipeline with queues and multi-threading
Mutating variables and control flow (this one :) )
How to handle preprocessing with TensorFlow.
How to control the gradients to create custom back-prop operations.
How to monitor and inspect my models to gain insight into them.

Note: TF is evolving fast right now; these articles are currently written for version 1.0.0.

Reference

How to do a while loop: http://stackoverflow.com/questions/38994037/tensorflow-while-loop-for-training
A nice implementation of what we’ve seen: https://github.com/PrajitR/fast-pixel-cnn/blob/master/fast_pixel_cnn_pp/fast_nn.p
Some explanation about the dark side of optimisations: https://github.com/tensorflow/tensorflow/issues/7782