TensorFlow: Mutating variables and control flow
How to control the order of operations and variable mutation in TF
In this article, we are going to take a deeper look at TensorFlow’s capabilities in terms of variable mutation and control flow statements.
So far, we’ve used Variables exclusively as weights in our models, updated by an optimiser’s operation (like Adam). But optimisers are not the only way to update Variables: there is a whole set of higher-order functions to do so (again, see those functions as a way to add operations to your graph).
The most basic function to make custom updates is the tf.assign() operation. It takes a Variable and a value, and assigns the value to the Variable. Simple.
Let’s start with an example:
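A minimal sketch of such an update, assuming the TF 1.x graph API. The import through tensorflow.compat.v1 is an assumption made so the snippet also runs on newer installs (on 1.0.0 a plain import tensorflow as tf is enough), and the names x and assign_op are only illustrative:

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

# A Variable holding a single scalar, initialised to 0
x = tf.Variable(0.0)

# Each run of this operation writes x + 1 back into x
assign_op = tf.assign(x, x + 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Running the op returns the freshly assigned value
    values = [float(sess.run(assign_op)) for _ in range(3)]

print(values)  # [1.0, 2.0, 3.0]
```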
Nothing fancy here. It works just like any other operation: you call it within a Session and the operation ensures that the mutation happens, so your Variable gets updated. You can compare the assign call to the usual optimiser train_op call. Both do the same thing: mutate data. The only difference is that the optimiser does a whole lot of calculus before applying mutations to your Variables.
TF supports many other functions to do manual updates; see them as helper functions. All of them could be replaced by some clever tensor operations followed by a tf.assign call, but that would be cumbersome. So, TF provides two kinds of mutation operations for us:
- Those to apply sparse updates (update only a subset of elements of your variables): https://www.tensorflow.org/api_guides/python/state_ops#Sparse_Variable_Updates
- Those to apply dense updates (update the whole Variable at once): https://www.tensorflow.org/api_guides/python/state_ops#Variable_helper_functions
I won’t dig into all those helper functions. Some of them can be hard to wrap your head around; my best advice is to experiment with them in a very simple script before using them in your models. You will save time.
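In that spirit, here is the kind of very simple script I have in mind: a sketch with one dense helper (tf.assign_add) and one sparse helper (tf.scatter_update). The choice of these two helpers and the values are mine, and the compat import is an assumption for newer installs:

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

x = tf.Variable([0.0, 0.0, 0.0, 0.0])

# Dense update: add 1 to every element of x
add_op = tf.assign_add(x, [1.0, 1.0, 1.0, 1.0])

# Sparse update: overwrite only the elements at indices 0 and 2
scatter_op = tf.scatter_update(x, [0, 2], [10.0, 30.0])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    after_add = sess.run(add_op).tolist()
    after_scatter = sess.run(scatter_op).tolist()

print(after_add)      # [1.0, 1.0, 1.0, 1.0]
print(after_scatter)  # [10.0, 1.0, 30.0, 1.0]
```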
One last word about mutation: what if we would like to change the shape of our Variables? For example, adding a row/column on the fly right inside our graph? So far I’ve only been talking about “assigning” new values.
That’s possible but trickier:
- tf.Variable has a parameter validate_shape, which defaults to True. It prevents you from updating the shape of the Variable, so we have to set it to False.
- This parameter also exists in the tf.assign function itself, so we have to turn it off again.
Let’s see an example:
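A minimal sketch of a shape mutation, assuming the TF 1.x graph API (the compat import is an assumption for newer installs; the names x and grow_op and the initial values are illustrative). Both the Variable and the assign call get validate_shape=False:

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

# validate_shape=False leaves the static shape of x undefined,
# so its shape is allowed to change later
x = tf.Variable([1.0, 2.0], validate_shape=False)

# Append one element to x; shape validation is turned off here as well
grow_op = tf.assign(x, tf.concat([x, [0.0]], 0), validate_shape=False)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Each run returns the new, longer value of x
    shapes = [sess.run(grow_op).shape for _ in range(2)]

print(shapes)  # [(3,), (4,)]
```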
OK! That was not too hard, let’s move on.
We can update Variables, but if you start to put assign calls all around your code, you will soon end up calling sess.run multiple times to control them. This is neither practical nor efficient. Remember, the more we stay in the graph, the more efficient we are.
Welcome to the realm of control flow. TF provides a set of functions to order your operations when they are not fully dependent.
Let’s start simple: we will build a graph doing a simple multiplication between a placeholder and a Variable. We would like to increment this Variable before each call we make to our multiplication. How do we actually do that?
If we start the naive way, by just adding a tf.assign call, we will end up with something like this:
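A sketch of this naive version, assuming the TF 1.x graph API (the compat import is an assumption for newer installs; the placeholder name p and the fed value 1.0 are illustrative):

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

x = tf.Variable(0.0)
p = tf.placeholder(tf.float32)

# Meant to increment x, but nothing below depends on this operation
assign_op = tf.assign(x, x + 1)

# The multiplication only needs x and p
y = x * p

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs = [float(sess.run(y, feed_dict={p: 1.0})) for _ in range(3)]

print(outputs)  # [0.0, 0.0, 0.0]
```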
It doesn’t work at all: our Variable is not incremented and we keep getting the same output. If you look at the code above and try to mentally build the computation graph, you will clearly see that this graph doesn’t need to compute the assign_op to compute the output of the multiplication between the Variable and the placeholder: y is already perfectly defined with the initialised value of the Variable.
To fix this, we need a way to force TF to run the assign_op. Fortunately, that does exist! We can add what is called a control dependency. This works just like Graph or Variable scopes: we use it in conjunction with the Python with statement.
Let’s see an example:
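A sketch of the same graph with a control dependency, assuming the TF 1.x graph API (the compat import is an assumption for newer installs; p and the fed value 1.0 are illustrative). The only change from the naive version is that the multiplication is created inside a tf.control_dependencies scope:

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

x = tf.Variable(0.0)
p = tf.placeholder(tf.float32)
assign_op = tf.assign(x, x + 1)

with tf.control_dependencies([assign_op]):
    # Operations created in this scope only run after assign_op has run
    y = x * p

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs = [float(sess.run(y, feed_dict={p: 1.0})) for _ in range(3)]

print(outputs)  # [1.0, 2.0, 3.0]
```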
Everything works fine. TF sees a dependency, so it runs the assign_op before computing anything under the dependency scope. Here is a visualisation:
- On the left, the graph just doesn’t care to compute the assign_op
- On the right, the control dependency forces the graph to compute the assign_op before computing the multiplication operation
Earlier I talked about mutating the shape of a Variable. Sadly, using shape mutations with control dependencies leads us into the dark side of TF’s code optimiser.
Before trying to explain anything, here is a piece of code showing the result:
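A hedged sketch of such an experiment, assuming the TF 1.x graph API with the old (non-resource) Variables; the compat import is an assumption for newer installs. It compares sizes instead of printing x, and the use of x.read_value() to force a new read inside the dependency scope is my own choice for illustration:

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

x = tf.Variable([0.0], validate_shape=False)

# Grow x by one element on every run
grow_op = tf.assign(x, tf.concat([x, [0.0]], 0), validate_shape=False)

with tf.control_dependencies([grow_op]):
    # Uses the read of x that was created together with the Variable,
    # outside of this scope, so it may show a stale value
    cached_size = tf.size(x)
    # Creates a new read of x inside the dependency scope
    fresh_size = tf.size(x.read_value())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    fresh_sizes = []
    for _ in range(3):
        cached, fresh = sess.run([cached_size, fresh_size])
        print("cached read:", cached, "fresh read:", fresh)
        fresh_sizes.append(int(fresh))

# The fresh read sees the grown Variable on every run: 2, 3, 4
```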
Look closely at the code and the outputs:
- Since the print is under a control dependency on assign_op, it should only be computed after x has been updated.
- Yet x looks like it has not been updated when we print it…
- But in fact it has been, since I can still get the true value of x.
What the hell is happening? This behaviour can be misleading and is probably closer to a bug than a feature, but TF caches aggressively to optimise your computations. This is one of the drawbacks you can encounter, so be careful!
So, how could you use that?
One idea off the top of my head concerns people in NLP who have to deal with <unk> words. Now you could “technically” update the shape of your embeddings online (while learning) to add words as you encounter them!
ONE BIG REMARK: I have no idea if such a model would still learn a useful (dynamic) word embedding, but if you test this I would love to hear about your experiments!
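To make the idea concrete, here is a rough sketch of growing an embedding matrix by one row, assuming the TF 1.x graph API (the compat import is an assumption for newer installs). The names embeddings, new_row and add_word_op, the sizes, and the random initialisation are all hypothetical, and nothing here says anything about whether training would still work:

```python
import tensorflow.compat.v1 as tf  # assumption: on TF 1.0.0, use `import tensorflow as tf`

tf.disable_v2_behavior()  # keep the TF 1.x graph/session semantics

embedding_dim = 4  # hypothetical embedding size

# Hypothetical embedding matrix for a 2-word vocabulary
embeddings = tf.Variable(tf.zeros([2, embedding_dim]), validate_shape=False)

# A randomly initialised row for a newly encountered word
new_row = tf.random_normal([1, embedding_dim])

# Append the new row; shape validation is turned off on both sides
add_word_op = tf.assign(
    embeddings, tf.concat([embeddings, new_row], 0), validate_shape=False)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    shapes = [sess.run(add_word_op).shape for _ in range(2)]

print(shapes)  # [(3, 4), (4, 4)]
```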
TensorFlow best practice series
This article is part of a more complete series of articles about TensorFlow. I’ve not yet defined all the different subjects of this series, so if you want to see any area of TensorFlow explored, add a comment! So far I wanted to explore those subjects (this list is subject to change and is in no particular order):
- A primer
- How to handle shapes in TensorFlow
- TensorFlow saving/restoring and mixing multiple models
- How to freeze a model and serve it with a python API
- TensorFlow: A proposal of good practices for files, folders and models architecture
- TensorFlow howto: a universal approximator inside a neural net
- How to optimise your input pipeline with queues and multi-threading
- Mutating variables and control flow (this one :) )
- How to handle preprocessing with TensorFlow
- How to control the gradients to create custom back-prop or fine-tune my models
- How to monitor my training and inspect my models to gain insight about them
Note: TF is evolving fast right now; these articles are currently written for the 1.0.0 version.
- How to do a while loop: http://stackoverflow.com/questions/38994037/tensorflow-while-loop-for-training
- A nice implementation of what we’ve seen: https://github.com/PrajitR/fast-pixel-cnn/blob/master/fast_pixel_cnn_pp/fast_nn.p
- Some explanation about the dark side of optimisations: https://github.com/tensorflow/tensorflow/issues/7782