Basic Elements in TF

While struggling with the second project of the Udacity course, image classification, I found the key is to understand the process as a whole. Starting with:

1. What is a tensor?

First of all, think of everything in graph mode.


TensorFlow is a programming system in which you represent computations as graphs. Nodes in the graph are called ops (operations). An op takes zero or more Tensors, performs some computation, and produces zero or more Tensors. A Tensor is a typed multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D array of floating point numbers with dimensions [batch, height, width, channels].
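For example, a minimal sketch of such a tensor (assuming TensorFlow 1.x, to match the tf.Session usage later in this post):

```python
import numpy as np
import tensorflow as tf

# A mini-batch of 16 RGB images, 28x28 pixels each:
# shape [batch, height, width, channels].
images = tf.constant(np.zeros((16, 28, 28, 3), dtype=np.float32))

print(images.dtype)  # <dtype: 'float32'> -- a tensor is typed
print(images.shape)  # (16, 28, 28, 3)   -- and multi-dimensional
```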

Then we can claim: TensorFlow ~ Tensor + Flow

I.e. tensors (units of array-like data) flow through nodes (different kinds of operations: inner product, sigmoid, softmax, ReLU, etc.), and together the nodes form a neural network graph.

(figure: demo of a small NN)
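To make the Tensor + Flow idea concrete, here is a minimal sketch of such a graph in TF 1.x style (the shapes and names are illustrative, not from the course):

```python
import tensorflow as tf

# A tiny one-layer "network": tensors flow through matmul -> add -> relu.
x = tf.placeholder(tf.float32, (None, 4))     # input tensor, fed at run time
W = tf.Variable(tf.truncated_normal((4, 2)))  # weight tensor
b = tf.Variable(tf.zeros(2))                  # bias tensor
logits = tf.matmul(x, W) + b                  # inner-product node
activation = tf.nn.relu(logits)               # relu node
# Nothing is computed yet -- these lines only define the graph.
```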

2. What is a session?

```python
sess = tf.Session()
sess.run(x, feed_dict={...})
```

A session is what actually brings the graph into execution: it allocates resources and runs the computation the graph describes.
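As a minimal runnable sketch of the build-then-run pattern (TF 1.x; the graph and fed values here are illustrative):

```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, (None, 3))
y = x * 2.0  # part of the graph; nothing is computed yet

with tf.Session() as sess:
    # Only now does data actually flow through the graph.
    result = sess.run(y, feed_dict={x: np.ones((2, 3), dtype=np.float32)})
    print(result)  # [[2. 2. 2.], [2. 2. 2.]]
```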

On the other hand, much like the lazy computation of Spark RDDs, tf.placeholder() only declares a symbolic input to be fed later in sess.run(), so the computation graph/pipeline can be built ahead of any real data flow. Example of some commands that are part of the graph:

```python
import tensorflow as tf

# h, w, d, H, W, in_d, out_d, conv_num_outputs are defined elsewhere.
input = tf.placeholder(tf.float32, (None, h, w, d))  # fed later via feed_dict
filter_weights = tf.Variable(tf.truncated_normal((H, W, in_d, out_d)))
bias = tf.Variable(tf.zeros(conv_num_outputs))
```
```python
conv_layer = tf.nn.conv2d(input, filter_weights,
                          strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
conv_layer = tf.nn.max_pool(conv_layer, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')
# optimizer, x, y, keep_prob are defined elsewhere in the project.
session.run(optimizer, feed_dict={x: feature_batch,
                                  y: label_batch,
                                  keep_prob: keep_probability})
```
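Putting the fragments above into one self-contained sketch (the concrete sizes are my own picks; the optimizer, x, y, and keep_prob belong to the course project and are omitted here):

```python
import numpy as np
import tensorflow as tf

h, w, d = 32, 32, 3   # input height, width, depth (illustrative)
out_d = 20            # number of filters == output depth

inputs = tf.placeholder(tf.float32, (None, h, w, d))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, d, out_d), stddev=0.1))
bias = tf.Variable(tf.zeros(out_d))

conv_layer = tf.nn.conv2d(inputs, filter_weights,
                          strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
conv_layer = tf.nn.max_pool(conv_layer, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(4, h, w, d).astype(np.float32)
    out = sess.run(conv_layer, feed_dict={inputs: batch})
    print(out.shape)  # (4, 8, 8, 20): conv stride 2 -> 16x16, pool -> 8x8
```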

3. Why CNN?


  • Regular neural nets: nodes in a single layer are independent, which generates a huge number of weights.
  • Full connectivity wastes the “adjacent” (spatial) information, and the sheer number of parameters leads to overfitting.
  • The depth of the filter plays the role of different nodes in a single layer (to capture different levels of info).
  • Number of filters = depth of the output.
  • The same filter (same depth slice) shares all its parameters across spatial positions. Example (verified in the sketch below):
    • Input shape 32x32x3 (H, W, D_channels)
    • 20 filters of shape 8x8x3, stride 2, padding 1
    • Output shape -> 14x14x20
    • Without parameter sharing: (8x8x3+1)x(14x14x20) parameters
    • With parameter sharing: (8x8x3+1)x20 parameters
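A quick pure-Python check of the arithmetic above, using the standard output-size formula (W - F + 2P) / S + 1:

```python
input_hw, filter_hw, depth_in, n_filters = 32, 8, 3, 20
stride, pad = 2, 1

out_hw = (input_hw - filter_hw + 2 * pad) // stride + 1
print(out_hw)  # 14 -> output shape 14x14x20

params_per_filter = filter_hw * filter_hw * depth_in + 1  # +1 for the bias
print(params_per_filter * out_hw * out_hw * n_filters)  # 756560, no sharing
print(params_per_filter * n_filters)                    # 3860, with sharing
```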