{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deep Learning & Neural Networks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Project 1A - non-linear classification"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this exericse, we will implement a simple neural network that classifies whether a 3D point is inside the unit sphere centered at 0. This neural network essentially *learns*, or tries to approximate, what the shape of a sphere is based on *limited data*.\n",
"\n",
"The training data will be 100 arbitrary $\\left(x, y, z\\right)$ pairs together with labels indicating whether the points are in the sphere or not, i.e. whether for pair $(x, y, z)$:\n",
"\n",
"$$ x^2 + y^2 + z^2 \\le 1 $$\n",
"\n",
"I'll explain a little later the format of the labels (this is somewhat important!)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"The first thing we'll do is load all the Python libraries we'll be using. See the comments below on what each library is for:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Load tensorflow\n",
"import tensorflow as tf\n",
"# Load numpy - adds MATLAB/Julia-style math to Python\n",
"import numpy as np\n",
"# Load matplotlib for plotting\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"# This is so that we can do some 3D plotting\n",
"from mpl_toolkits.mplot3d import Axes3D\n",
"# This is a convenient package for generating cartesian products\n",
"import itertools"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll define a Python function that returns **True** if $(x, y, z)$ is contained inside a sphere and **False** otherwise. This is our *ground truth function*:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def classify_inside_sphere(x, y, z):\n",
" return x**2 + y**2 + z**2 < np.ones(x.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code plots the sphere as a sanity check. We will be using the matplotlib library (loaded above) to generate a scatter plot."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"N_POINTS = 20\n",
"\n",
"# generate some evenly spaced 3D points that are inside the sphere and we will plot them, so that what shows up is a sphere\n",
"points = filter(lambda arg: classify_inside_sphere(arg[0], arg[1], arg[2]),\n",
" itertools.product(np.linspace(-1,1, N_POINTS), repeat=3))\n",
"xs = map(lambda x: x[0], points)\n",
"ys = map(lambda x: x[1], points)\n",
"zs = map(lambda x: x[2], points)\n",
"fig = plt.figure()\n",
"ax = fig.add_subplot(111, projection='3d')\n",
"ax.scatter(xs, ys, zs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just like in a lot of introductory machine learning classes, we will randomly generate training data.\n",
"\n",
"For the labels we will use a 1-hot encoding. That is, each label is either the pair $(1, 0)$ if the corresponding point is outside the sphere and $(0, 1)$ if it's inside.\n",
"\n",
"Think of this as something like data for a logistic regression problem with two classes. We will generalize this idea later (several classes, multinomial regression) with the MNIST project. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Generate 100 random training data points\n",
"N_TRAIN = 100\n",
"SCALE = 1\n",
"xs = np.random.normal(scale=SCALE, size=N_TRAIN)\n",
"ys = np.random.normal(scale=SCALE, size=N_TRAIN)\n",
"zs = np.random.normal(scale=SCALE, size=N_TRAIN)\n",
"# Generate labels for the training points\n",
"inside_sphere = classify_inside_sphere(xs, ys, zs).astype(int)\n",
"true_labels = np.zeros((N_TRAIN, 2))\n",
"for i in range(N_TRAIN):\n",
" true_labels[i, inside_sphere[i]] = 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have our training data, we will finally start using *TensorFlow*! \n",
"\n",
"We need to describe, in Python code, a model for our neural network. In TensorFlow this model is called a **computation graph**, or simply a **graph**, and it tells us what mathematical functions should be applied to data.\n",
"\n",
"If you're familiar with JuMP, you will see that TensorFlow is based on exactly the same idea of describing some mathematical model first and then running computations based on it!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Entry points to our graph\n",
"\n",
"The first thing to describe is what data will be used in computations (that are executed only **after** the graph is defined!).\n",
"\n",
"For this we use the ``tf.placeholder`` function which returns us a *Tensor*, which is really just a multi-dimensional array. \n",
"\n",
"For our application we want a graph that calculates some loss function from training data, so our input tensors will be a batch of training points with their labels, so 2 tensors called: ``inputs`` and ``exp_output``.\n",
"\n",
"When defining this placeholder Tensor, we should specify its dimensions but we want to give ourselves flexibility in what batch size (number of training points) is used. For this reason the dimensions of the ``inputs`` tensor will be $(?, 3)$ and for ``exp_output`` $(?, 2)$ where $?$ is a wildcard. That is, the batch size is unspecified and each training point consists of a 3D input vector and 2D expected binary output vector. In the Python code the wildcard is given by ``None`` reference."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"inputs = tf.placeholder(tf.float32, shape=[None, 3], name=\"inputs\")\n",
"exp_output = tf.placeholder(tf.float32, [None,2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ok, so we have now defined our input and we will later be able to feed in data -- two numpy arrays with shape ``(?, 3)`` and ``(?, 2)``, for the training data and labels respectively. In the definition, we also supplied a ``name`` parameter: this is just to make debugging easier.\n",
"\n",
"Let's now describe in our model what we will do with the ``inputs`` tensor -- we'll worry about ``exp_output`` tensor later.\n",
"\n",
"In our neural network, the first layer is a fully connected one, so they operation we want to model is\n",
"\n",
"$$Wx + b $$\n",
"\n",
"where $W$ and $b$ are tensor *variables* for the weights, and biases, and $x$ stands for the input tensor. Let's define these variables and let's have 20 neurons for the hidden layer:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"N_HIDDEN = 20\n",
"hidden_W = tf.Variable(tf.random_normal([3, N_HIDDEN], stddev=0.3), name=\"hidden_weights\")\n",
"hidden_b = tf.Variable(tf.zeros([N_HIDDEN]), name=\"hidden_bias\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are now ready to describe the operation and using the TensorFlow functions we return and store a reference to the output of this fully connected layer (the result of $Wx + b$). The reference will be stored in the Python variable ``hidden_in`` (that is, the input to the hidden layer)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"hidden_in = tf.matmul(inputs, hidden_W) + hidden_b\n",
"hidden_out = tf.sigmoid(hidden_in)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have immediately passed the ``hidden_in`` tensor to the sigmoid activation function. This adds a node to our graph which represents the element-wise sigmoid function, and we call the output tensor ``hidden_out``.\n",
"\n",
"**Question:** What is the dimension of the ``hidden_out`` tensor? Why?\n",
"\n",
"**Answer:** $(?, N)$ where $N$ is the number of hidden layer neurons. Recall that we are multiplying a $(?, 3)$ tensor with $W$, the latter having dimensions $(3, N)$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that we can easily check his using TensorFlow:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"hidden_out.get_shape()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's add another fully connected layer. This time the output will have dimension $(?, 2)$ -- use the previously mentioned function to check this!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"output_W = tf.Variable(tf.random_normal([N_HIDDEN, 2], stddev=0.3), name=\"output_weights\")\n",
"output_b = tf.Variable(tf.zeros([2]), name=\"output_bias\")\n",
"output = tf.matmul(hidden_out, output_W) + output_b\n",
"output.get_shape()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now the final thing our neural network needs to do is give us a 2-dimensional vector as output whose entries are non-negative and sum to 1 (probability distribution). We do this using the softmax function.\n",
"\n",
"Recall the softmax function is defined as:\n",
"\n",
"$$\\sigma(y)_k = \\frac{e^{y_k}}{\\sum_j e^{y_j}}$$"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"net_output = tf.nn.softmax(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since this like the sigmoid function is elementwise, the ``net_output`` tensor has shape $(?,2)$. Again - please check this for yourself!\n",
"\n",
"Voila! We now have the complete definition of our neural network, which has as an input layer, the tensor ``inputs``, as well as 1 hidden layer. When the network is instantiated, all its weights and bias parameters will be initialized to random values. However, the point is we want to train the parameters using the data, so in the following code we will define the operations that are necessary for this.\n",
"\n",
"But before we go ahead with that, let's see how we can get some output from our neural network. We need to do this with a ``Session`` object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"session = tf.Session()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have our ``session`` let's first do the random initialization of parameters. First we create the operation:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"init = tf.initialize_all_variables()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we get the ``session`` object to run this initialization:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"session.run(init)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our network is now initialized, and since it's not been trained at all, will output garbage. But let's just see that it's the right format.\n",
"\n",
"To run a forward pass through the newtwork, we need to *run* the operation corresponding to ``net_output`` using some Numpy array that we shall feed in:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# First mash the inputs to answer\n",
"xs_tensor = np.vstack((xs, ys, zs))\n",
"\n",
"# Then feed into network by running the session\n",
"session.run(net_output, feed_dict={inputs: xs_tensor.T})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can convince yourself, as an exercise, that this output is a bad classifier of points in a sphere. \n",
"\n",
"Let's now train our network, to do this we need to define a loss function. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cross_entropy = tf.reduce_mean(-exp_output*tf.log(net_output))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have essentially extended our computation graph with yet more operations on tensors and this time notice how we finally got to use our ``exp_output`` placeholder. The final output, referenced by ``cross_entropy``, represents a scalar value that we want to minimize.\n",
"\n",
"The minimization will be done using Stochastic Gradient Descent (SGD). This is another step in our code (but behind the scenes TensorFlow adds all the low level operations that occur during backpropogation, building the graph that computes gradients etc.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"optimizer = tf.train.GradientDescentOptimizer(0.5)\n",
"train_step = optimizer.minimize(cross_entropy)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The 0.5 passed into the ``tf.train.GradientDescentOptimizer`` constructor is the learning rate for SGD. Note that the reference for the minimization step is the Python variable ``train_step``."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training and running the network\n",
"\n",
"We have finished describing the network in our code and TensorFlow has \"built\" a training algorithm from all this.\n",
"\n",
"We run the gradient step 10,000 times by **running** the train step operation and feeding in appropriate Numpy arrays for both the ``inputs`` and ``exp_output`` tensors."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [],
"source": [
"# Before running the algorithm, we need to mash our training data into\n",
"# a tensor using numpy\n",
"xs_tensor = np.vstack((xs, ys, zs))\n",
"\n",
"# Run some gradient steps and plot the cross-entropy loss over time\n",
"N_STEPS = 10000\n",
"errors = []\n",
"for i in xrange(N_STEPS):\n",
" train_error, _ = session.run((cross_entropy, train_step), \n",
" feed_dict={inputs : xs_tensor.T, exp_output: true_labels})\n",
" if i%1000==0: print i, train_error\n",
" errors.append(train_error)\n",
"plt.plot(range(N_STEPS), errors, 'b-')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The training is now complete and we will want to measure the accuracy of our network on training data. We do this by building another TensorFlow operation!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Now we are going to create an operation for measuring the network's accuracy, \n",
"# that is the percentage of samples that the network classifies correctly.\n",
"# Think of this as an alternative loss function, which is **not** the one being minimized\n",
"correct_points = tf.cast(tf.equal(tf.argmax(exp_output, 1), tf.argmax(net_output, 1)), tf.float32)\n",
"accuracy = tf.reduce_mean(correct_points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can run the new operation using the exact same Numpy arrays as inputs like before:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# We know the cross-entropy is low -- what about the accuracy?\n",
"# By calling the accuracy operation inside the session and feeding in the training data we'll get \n",
"session.run(accuracy, feed_dict={inputs : xs_tensor.T, exp_output : true_labels})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We know from Machine Learning principles that just because our network fits the training data well, doesn't always mean it generalizes. We are now going to check what actually happens when we test \"edge cases\", points close to the surface of the sphere which are more likely to be mis-classified."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# We can see that the accuracy is close to 100% on the training data (good!)\n",
"# However, did we just overfit or did our classifier accurately learn the shape of the sphere?\n",
"\n",
"# We're going to generate some test data around the surface of one of the sphere's quadrants\n",
"# The idea is we want to essentially test these difficult 'corner cases'\n",
"thetas = np.arange(0.2, np.pi/2, 0.02)\n",
"phis = np.arange(0.2, np.pi/2, 0.02)\n",
"xs_test = []\n",
"ys_test = []\n",
"zs_test = []\n",
"for phi in phis:\n",
" z = np.cos(phi)\n",
" r = np.sin(phi)\n",
" for theta in thetas:\n",
" y = r*np.cos(theta)\n",
" x = r*np.sin(theta)\n",
" xs_test.append(x)\n",
" ys_test.append(y)\n",
" zs_test.append(z)\n",
"xs_test = np.asarray(xs_test)\n",
"ys_test = np.asarray(ys_test)\n",
"zs_test = np.asarray(zs_test)\n",
"xs_test += np.random.normal(scale=0.5, size=len(xs_test))\n",
"\n",
"# Let's try to trick our classifier by using points outside the sphere but close to its surface\n",
"bad_indices = filter(lambda i: not classify_inside_sphere(xs_test[i], ys_test[i], zs_test[i]), range(len(xs_test)))\n",
"xs_test = xs_test[bad_indices]\n",
"ys_test = ys_test[bad_indices]\n",
"zs_test = zs_test[bad_indices]\n",
"xs_test_tensor = np.vstack((xs_test, ys_test, zs_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above code generated the edge cases: points close to the surface but outside of the sphere. Now we define yet another TensorFlow operation to count and average out the number of mistakes. We immediately run it in our ``session`` to get the answer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Finally we can just get the accuracy by subtracting from 1 the fraction of false positives \n",
"# (remember there are no false negatives)\n",
"classifications = tf.argmax(net_output, 1)\n",
"1 - np.mean(session.run(classifications, feed_dict={inputs : xs_test_tensor.T}))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So unfortunately, our network doesn't perform that well out-of-sample. Using the ``classifications`` operations again (which simply takes argmax of the softmax output) we can plot the boundary of our classifier"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Let's plot what the boundary of our non-linear classifier looks like\n",
"points_array = np.asarray([elem for elem in itertools.product(np.linspace(-1,1, N_POINTS), repeat=3)])\n",
"classified = session.run(classifications, feed_dict={inputs: points_array}).astype(bool)\n",
"points_to_plot = points_array[filter(lambda i: classified[i], range(len(points_array)))]\n",
"xs = map(lambda x: x[0], points_to_plot)\n",
"ys = map(lambda x: x[1], points_to_plot)\n",
"zs = map(lambda x: x[2], points_to_plot)\n",
"fig = plt.figure()\n",
"ax = fig.add_subplot(111, projection='3d')\n",
"ax.scatter(xs, ys, zs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looks somewhat like a deformed sphere, but this is what happens when you have a complex model like a neural network and not enough training points."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Project 1B. Exercise: non-linear regression\n",
"\n",
"In this exercise, you will implement a neural network that learns the Rosenbrock function, defined as:\n",
"\n",
"$$ f(x,y) = (a-x)^2 + b(y -x^2)^2 $$\n",
"\n",
"The architecture should almost the same as the previous network, only differences are:\n",
"\n",
" * 2-dimensional input layer\n",
" * 1-dimensional linear output layer - so **no** softmax!\n",
" * Square loss function used for training $\\frac{1}{2}(f(x) - y)^2$\n",
"\n",
"But remember to use (like before):\n",
" * 1 hidden layer with sigmoid activation function\n",
" * 20 neurons in the hidden layer\n",
" \n",
"The only thing I haven't shown yet is that ``tf.square`` is the function for implementing the elementwise squaring operation in TensorFlow.\n",
"\n",
"Once you've finished the exercise check out the slides to see what extensions you can try for this and the previous project."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# We'll use the classic Rosenbrock function\n",
"def rosenbrock(x,y):\n",
" a, b = 1.0, 100.0\n",
" return (a - x) ** 2 + b * (y - x**2) ** 2\n",
"# Lets plot our function\n",
"x = np.outer(np.linspace(-2,+2,100), np.ones(100))\n",
"y = np.outer(np.ones(100), np.linspace(-1,+4,100))\n",
"z = rosenbrock(x, y)\n",
"fig = plt.figure()\n",
"ax = plt.axes(projection='3d')\n",
"ax.plot_surface(x, y, z, cmap=plt.cm.jet)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we've defined the Rosenbrock function, let's get some training data to fit our network to:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Lets generate some training data\n",
"N_TRAIN = 500\n",
"# Sample x from -2 to +2\n",
"x_train = np.random.rand(N_TRAIN) * 4 - 2\n",
"# Sample y from -1 to 4\n",
"y_train = np.random.rand(N_TRAIN) * 5 - 1\n",
"# Calculate z\n",
"z_train = rosenbrock(x_train, y_train)\n",
"# Plot the training set\n",
"fig = plt.figure()\n",
"ax = plt.axes(projection='3d')\n",
"ax.plot_surface(x, y, z, cmap=plt.cm.jet, alpha=0.2)\n",
"ax.scatter(x_train, y_train, z_train, c=\"white\", s=10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Exercise:** build the network! Make sure you call the input ``net_input`` and labels ``exp_output`` and the training operation, ``train_step`` (so that you can run the code that follows after this cell). Basically, fill in the parts marked below with comments:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"N_HIDDEN = 20\n",
"\n",
"net_input = tf.placeholder(tf.float32, [None,2])\n",
"hidden_W = tf.Variable(tf.random_uniform([2,N_HIDDEN], -1.0, +1.0))\n",
"hidden_b = tf.Variable(tf.random_uniform([ N_HIDDEN], -1.0, +1.0))\n",
"hidden_in = tf.nn.bias_add(tf.matmul(net_input, hidden_W), hidden_b)\n",
"hidden_out = tf.nn.sigmoid(hidden_in)\n",
"\n",
"output_W = tf.Variable(tf.random_uniform([N_HIDDEN,1], -1.0, +1.0))\n",
"output_b = tf.Variable(tf.random_uniform([1], -1.0, +1.0))\n",
"output_in = tf.nn.bias_add(tf.matmul(hidden_out, output_W), output_b)\n",
"net_output = output_in # linear!\n",
"\n",
"exp_output = tf.placeholder(tf.float32, [None,1])\n",
"sq_error = tf.square(net_output - exp_output)\n",
"mse = tf.reduce_mean(sq_error)\n",
"\n",
"\n",
"optimizer = tf.train.GradientDescentOptimizer(0.0005)\n",
"train_step = optimizer.minimize(mse)\n",
"\n",
"init = tf.initialize_all_variables()\n",
"sess = tf.Session()\n",
"sess.run(init)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The below cells will train and test your network. If your solution is correct, the accuracy should be close to 100% on training data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Run the network to get the output on our training data\n",
"# Before we do, we need to mash our training data into\n",
"# a tensor\n",
"train_xy_tensor = np.vstack((x_train, y_train)).T\n",
"train_z_tensor = np.reshape(z_train, (N_TRAIN,1))\n",
"initial_z = sess.run(net_output, feed_dict={\n",
" net_input: train_xy_tensor,\n",
" exp_output: train_z_tensor})\n",
"initial_z = np.reshape(initial_z, len(initial_z))\n",
"fig = plt.figure()\n",
"ax = plt.axes(projection='3d')\n",
"ax.scatter(x_train, y_train, z_train, c=\"white\", s=10)\n",
"ax.scatter(x_train, y_train, initial_z, c=\"purple\", s=10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"train_error, trained_z = sess.run((mse,net_output), feed_dict={\n",
" net_input: train_xy_tensor,\n",
" exp_output: train_z_tensor})\n",
"trained_z = np.reshape(trained_z, len(trained_z))\n",
"fig = plt.figure()\n",
"ax = plt.axes(projection='3d')\n",
"ax.scatter(x_train, y_train, z_train, c=\"white\", s=10)\n",
"ax.scatter(x_train, y_train, trained_z, c=\"purple\", s=10)\n",
"print z_train[1:10]\n",
"print trained_z[1:10]\n",
"print train_error"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you've built the function approximator, you can look at what it thinks the function is"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"x_test_mat = np.outer(np.linspace(-2,+2,100), np.ones(100))\n",
"y_test_mat = np.outer(np.ones(100), np.linspace(-1,+4,100))\n",
"x_test = np.reshape(x_test_mat, (100*100,1))\n",
"y_test = np.reshape(y_test_mat, (100*100,1))\n",
"test_xy_tensor = np.hstack((x_test, y_test))\n",
"net_z_test = sess.run(net_output, feed_dict={net_input: test_xy_tensor})\n",
"z_test_mat = np.reshape(net_z_test, (100,100))\n",
"fig = plt.figure()\n",
"ax = plt.axes(projection='3d')\n",
"ax.plot_surface(x_test_mat, y_test_mat, z_test_mat, cmap=plt.cm.jet)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also check out its accuracy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"x = np.outer(np.linspace(-2,+2,100), np.ones(100))\n",
"y = np.outer(np.ones(100), np.linspace(-1,+4,100))\n",
"z = rosenbrock(x, y)\n",
"fig = plt.figure()\n",
"ax = plt.axes(projection='3d')\n",
"ax.plot_surface(x, y, z, cmap=plt.cm.jet)\n",
"print np.mean((z - z_test_mat)**2)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
}