TensorFlow: Shapes and dynamic dimensions
Here is a simple HowTo to understand the concept of shapes in TensorFlow and hopefully avoid losing hours debugging them.
What is a tensor?
A tensor is an n-dimensional array whose elements all have the same data type (int32, bool, etc.).
A tensor can be described with what we call a shape: it is a list (or tuple) of numbers describing the size of each dimension of our tensor, for example:
- For a tensor of n dimensions: (D_0, D_1, …, D_n-1)
- For a tensor of size W x H (usually called a matrix): (W, H)
- For a tensor of size W (usually called a vector): (W,)
- For a simple scalar: () (a shape of (1,) is often treated the same way, but strictly speaking it describes a vector holding one element)
Note: D_*, W and H are integers.
Note on the vector (1-D tensor): it is impossible to determine whether a vector is a row or a column vector by looking at its shape in TensorFlow, and in fact it doesn’t matter. For more information, please look at this Stack Overflow answer about NumPy notation (which is roughly the same as TensorFlow notation): http://stackoverflow.com/questions/22053050/difference-between-numpy-array-shape-r-1-and-r
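To make the shape notation above concrete, here is a small sketch that builds a scalar, a one-element vector, a vector and a matrix, then reads their shapes. The values are arbitrary. The `tensorflow.compat.v1` import is my assumption so that the 1.x-style code also runs on newer installs; on 1.0.0 itself you would use a plain `import tensorflow as tf`.

```python
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

scalar = tf.constant(3.0)                      # 0-D tensor
one_element_vector = tf.constant([3.0])        # 1-D tensor holding one value
vector = tf.constant([1.0, 2.0, 3.0])          # 1-D tensor
matrix = tf.constant([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])        # 2-D tensor

# get_shape() returns the shape known at graph-definition time;
# as_list() turns it into a plain Python list.
scalar_shape = scalar.get_shape().as_list()                   # []
one_element_shape = one_element_vector.get_shape().as_list()  # [1]
vector_shape = vector.get_shape().as_list()                   # [3]
matrix_shape = matrix.get_shape().as_list()                   # [2, 3]

print(scalar_shape, one_element_shape, vector_shape, matrix_shape)
```

Notice that the scalar and the one-element vector do not report the same shape, even if they often behave alike in practice.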
A tensor looks like this in TensorFlow when you print it:
Tensor("Const:0", shape=(6, 3, 7), dtype=float32)
We can see we have a Tensor object:
- It has a name used in a key-value store to retrieve it later: Const:0
- It has a shape describing the size of each dimension: (6, 3, 7)
- It has a type: float32
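As a rough sketch, a tensor with the properties listed above could be created and inspected like this. The fill value of 0.0 is arbitrary, and the exact name depends on what is already in your graph, so treat the printed name as an example only. The `tensorflow.compat.v1` import is an assumption to keep the 1.x-style code runnable on newer installs.

```python
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

# A constant tensor of shape (6, 3, 7) filled with an arbitrary value.
my_tensor = tf.constant(0.0, shape=[6, 3, 7], dtype=tf.float32)

print(my_tensor)              # e.g. Tensor("Const:0", shape=(6, 3, 7), dtype=float32)
print(my_tensor.name)         # the key used to retrieve the tensor from the graph
print(my_tensor.get_shape())  # the size of each dimension
print(my_tensor.dtype)        # the element type
```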
Now, here is the most important piece of this article: Tensors in TensorFlow have 2 shapes!
The static shape AND the dynamic shape
The static shape
The static shape is the shape you provided when creating a tensor OR the shape inferred by TensorFlow when you define an operation resulting in a new tensor. It is a tuple or a list.
TensorFlow will do its best to guess the shape of your different tensors (between your different operations), but it won’t always be able to do it, especially if you start to do operations with placeholders defined with unknown dimensions (like when you want to use a dynamic batch size).
To use the static shape (accessing/changing it) in your code, you will use the functions which are attached to the Tensor itself and have an underscore in their names: get_shape() to read it (call as_list() on the result to get a plain Python list) and set_shape() to refine it.
Note: The static shape is very useful for debugging your code.
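Here is a minimal sketch of reading and refining a static shape. The placeholder sizes (128 features, 10 outputs, a batch of 32) are made up for illustration, and the `tensorflow.compat.v1` import is an assumption so the 1.x-style code also runs on newer installs.

```python
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

# A placeholder whose batch dimension is unknown when the graph is defined.
inputs = tf.placeholder(tf.float32, shape=[None, 128])
shape_before = inputs.get_shape().as_list()   # [None, 128]

# TensorFlow infers the static shape of the result of an operation.
weights = tf.ones([128, 10])
logits = tf.matmul(inputs, weights)
logits_shape = logits.get_shape().as_list()   # [None, 10]

# set_shape() refines the static shape with information you know
# but TensorFlow could not infer.
inputs.set_shape([32, 128])
shape_after = inputs.get_shape().as_list()    # [32, 128]

print(shape_before, logits_shape, shape_after)
```

Printing these lists while you build the graph is a cheap way to spot a wrong dimension before you ever run a session.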
The dynamic shape
The dynamic shape is the actual one used when you run your graph. It is itself a tensor describing the shape of the original tensor.
If you defined a placeholder with undefined dimensions (with the None type as a dimension), those None dimensions will only have a real value when you feed an input to your placeholder, and the same goes for any tensor depending on this placeholder.
To use the dynamic shape (accessing/changing it) in your code, you will use the functions which are attached to the main scope and don’t have an underscore in their names: tf.shape() to read it and tf.reshape() to change it.
The dynamic shape is very handy for dealing with dimensions that you want to keep dynamic.
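The sketch below shows the dynamic shape in action: it is a tensor, so you only get actual numbers once you run the graph. The placeholder sizes and the fed batch of 5 rows are arbitrary, and the `tensorflow.compat.v1` import is an assumption so the 1.x-style code also runs on newer installs.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

inputs = tf.placeholder(tf.float32, shape=[None, 4])

# tf.shape() returns a 1-D integer tensor, evaluated when the graph runs.
dynamic_shape = tf.shape(inputs)

# Reshape each row of 4 values into a 2 x 2 block, keeping the batch dynamic.
reshaped = tf.reshape(inputs, tf.stack([dynamic_shape[0], 2, 2]))

with tf.Session() as sess:
    shape_value, reshaped_value = sess.run(
        [dynamic_shape, reshaped],
        feed_dict={inputs: np.zeros((5, 4), dtype=np.float32)})

print(shape_value)           # the real shape for this particular feed
print(reshaped_value.shape)  # the batch dimension followed the input
```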
A real use case: the RNN
So here we are: we like dynamic inputs because we want to build a dynamic RNN able to handle inputs of any length.
In the training phase we will define a placeholder with a dynamic batch_size, and then we will use the TensorFlow API to create an LSTM cell.
And now you need to initialize the init_state with:
init_state = cell.zero_state(batch_size, tf.float32) ...
But what should the batch_size input be equal to? Remember, you want it to be dynamic, so what are our options? TensorFlow allows different types here; if you read the source code you will find:
batch_size: int, float, or unit Tensor representing the batch size.
int and float can’t be used because when you define your graph, you actually don’t know what the batch_size will be (that’s the point).
The interesting piece is the last type: “unit Tensor representing the batch size”. If you dig through the docs from there, you will find that a unit Tensor is a “0-d Tensor”, which is just a scalar. So how do you get that scalar tensor anyway?
If you try with the static shape, for example by taking the first element of get_shape() on your placeholder, batch_size will be of the Dimension(None) type (printed as ‘?’). This type can only be used as a dimension for placeholders.
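A small sketch of that dead end, assuming a hypothetical placeholder of shape [batch, time, features]; the sizes 10 and 8 are made up. The `tensorflow.compat.v1` import is an assumption so the 1.x-style code also runs on newer installs.

```python
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

# Hypothetical RNN input: the batch dimension is left undefined.
inputs = tf.placeholder(tf.float32, shape=[None, 10, 8])

# Static shape: the batch entry is unknown when the graph is defined.
batch_size = inputs.get_shape()[0]
print(batch_size)  # shown as "?" in TensorFlow 1.x

# It is not a tensor, so it cannot carry the real batch size at run time.
is_tensor = isinstance(batch_size, tf.Tensor)
print(is_tensor)
```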
What you actually want is to keep the dynamic batch_size “flowing” through the graph, so you must use the dynamic shape, for example by taking the first element of tf.shape() applied to your placeholder. batch_size will then be a TensorFlow 0-d Tensor (scalar Tensor) describing the batch dimension, hooray!
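Here is a sketch of the dynamic version. To avoid depending on a particular RNN cell class, I use tf.zeros as a stand-in for cell.zero_state: it simply builds a zero tensor of shape [batch_size, num_units]. The names num_units and init_state and all the sizes are illustrative assumptions, as is the `tensorflow.compat.v1` import.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

# Hypothetical RNN input: [batch, time, features] with a dynamic batch.
inputs = tf.placeholder(tf.float32, shape=[None, 10, 8])

# Dynamic shape: a 0-d integer tensor that gets its value at run time.
batch_size = tf.shape(inputs)[0]
batch_size_rank = len(batch_size.get_shape().as_list())  # 0, i.e. a scalar

# Stand-in for cell.zero_state(batch_size, tf.float32): a zero state
# whose first dimension follows the batch that is actually fed.
num_units = 16  # illustrative value
init_state = tf.zeros(tf.stack([batch_size, num_units]), dtype=tf.float32)

with tf.Session() as sess:
    batch_value, state_value = sess.run(
        [batch_size, init_state],
        feed_dict={inputs: np.zeros((3, 10, 8), dtype=np.float32)})

print(batch_value)        # the batch size of this feed
print(state_value.shape)  # the state grew to match it
```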
- Use the static shape for debugging
- Use the dynamic shape everywhere else, especially when you have undefined dimensions
Remark: In the RNN API, TensorFlow takes care of the init_state and initializes it to the zero_state, so why would I need to define it manually this way?
You might want to control the initialization of the init_state when you run your graph. By having access to the init_state tensor this way, we can do it, because when you run a graph you can actually use the feed_dict to feed any tensor at hand in your graph!
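To illustrate feeding a tensor that is not a placeholder, here is a toy sketch. The "step" is a simple addition rather than a real RNN cell, and tf.zeros again stands in for cell.zero_state; the names, sizes and fed values are all made up. The `tensorflow.compat.v1` import is an assumption so the 1.x-style code also runs on newer installs.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via the compat module

tf.disable_v2_behavior()  # build a graph, as TensorFlow 1.x does

inputs = tf.placeholder(tf.float32, shape=[None, 8])
batch_size = tf.shape(inputs)[0]

# Stand-in for cell.zero_state: a zero state with a dynamic batch dimension.
init_state = tf.zeros(tf.stack([batch_size, 4]), dtype=tf.float32)

# Toy "step" (not a real RNN cell): add the mean of the inputs to the state.
next_state = init_state + tf.reduce_mean(inputs)

batch = np.ones((2, 8), dtype=np.float32)
custom_state = np.full((2, 4), 5.0, dtype=np.float32)

with tf.Session() as sess:
    # Without feeding init_state, the zero state is used.
    default_out = sess.run(next_state, feed_dict={inputs: batch})
    # Feeding init_state overrides the zeros with our own values.
    custom_out = sess.run(
        next_state, feed_dict={inputs: batch, init_state: custom_state})

print(default_out)
print(custom_out)
```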
And now you can predict words Ad vitam æternam, Cheers! 🍺
Dive deep by reading the doc and the code!
TensorFlow best practice series
This article is part of a more complete series of articles about TensorFlow. I’ve not yet defined all the different subjects of this series, so if you want to see any area of TensorFlow explored, add a comment! So far I wanted to explore those subjects (this list is subject to change and is in no particular order):
- A primer
- How to handle shapes in TensorFlow (this one :) )
- TensorFlow saving/restoring and mixing multiple models
- How to freeze a model and serve it with a python API
- TensorFlow: A proposal of good practices for files, folders and models architecture
- TensorFlow howto: a universal approximator inside a neural net
- How to optimise your input pipeline with queues and multi-threading
- Mutating variables and control flow
- How to handle preprocessing with TensorFlow.
- How to handle hyper-parameters saving/loading and configuration files
- How to control the gradients to create custom back-prop with, or fine-tune my models.
- How to monitor my training and inspect my models to gain insight about them.
Note: TF is evolving fast right now; these articles are currently written for the 1.0.0 version.