gradient descent using python and numpy


    import numpy as np

    def gradient(x_norm, y, theta, alpha, m, n, num_it):
        temp = np.array(np.zeros_like(theta, float))
        for i in range(0, num_it):
            h = np.dot(x_norm, theta)
            # temp[j] = theta[j] - (alpha/m) * (np.sum((h - y) * x_norm[:, j][np.newaxis, :]))
            temp[0] = theta[0] - (alpha/m) * (np.sum(h - y))
            temp[1] = theta[1] - (alpha/m) * (np.sum((h - y) * x_norm[:, 1]))
            theta = temp
        return theta

    x_norm, mean, std = featurescale(x)
    # length of x (number of rows)
    m = len(x)
    x_norm = np.array([np.ones(m), x_norm])
    n, m = np.shape(x_norm)
    num_it = 1500
    alpha = 0.01
    theta = np.zeros(n, float)[:, np.newaxis]
    x_norm = x_norm.transpose()
    theta = gradient(x_norm, y, theta, alpha, m, n, num_it)
    print(theta)

My theta from the above code is 100.2 100.2, but it should be 100.2 61.09, which is what MATLAB gives and which is correct.

I think your code is a bit too complicated and it needs more structure, because otherwise you'll be lost in all the equations and operations. In the end this regression boils down to four operations (see the sketch right after this list):

  1. Calculate the hypothesis: h = x * theta
  2. Calculate the loss: loss = h - y, and maybe the squared cost: (loss^2)/2m
  3. Calculate the gradient: gradient = x' * loss / m
  4. Update the parameters: theta = theta - alpha * gradient
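
As a minimal sketch of one such pass in NumPy (assuming x already carries the bias column; the names here are mine, for illustration only):

    import numpy as np

    def gradient_step(x, y, theta, alpha):
        m = len(y)                        # number of training examples
        hypothesis = np.dot(x, theta)     # 1. hypothesis
        loss = hypothesis - y             # 2. loss; the squared cost would be np.sum(loss ** 2) / (2 * m)
        gradient = np.dot(x.T, loss) / m  # 3. average gradient over all examples
        return theta - alpha * gradient   # 4. parameter update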

In your case, I guess you have confused m with n. Here m denotes the number of examples in your training set, not the number of features.
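
For example, with one row per training example, the shapes look like this (a hypothetical dataset of 100 examples):

    import numpy as np

    x = np.zeros(shape=(100, 2))  # 100 examples, 2 features (bias column + input)
    m, n = np.shape(x)            # m = 100 examples, n = 2 features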

Let's have a look at a variation of your code:

    import numpy as np
    import random

    # m denotes the number of examples here, not the number of features
    def gradientDescent(x, y, theta, alpha, m, numIterations):
        xTrans = x.transpose()
        for i in range(0, numIterations):
            hypothesis = np.dot(x, theta)
            loss = hypothesis - y
            # avg cost per example (the 2 in 2*m doesn't really matter here,
            # but to be consistent with the gradient, I include it)
            cost = np.sum(loss ** 2) / (2 * m)
            print("Iteration %d | Cost: %f" % (i, cost))
            # avg gradient per example
            gradient = np.dot(xTrans, loss) / m
            # update
            theta = theta - alpha * gradient
        return theta


    def genData(numPoints, bias, variance):
        x = np.zeros(shape=(numPoints, 2))
        y = np.zeros(shape=numPoints)
        # basically a straight line
        for i in range(0, numPoints):
            # bias feature
            x[i][0] = 1
            x[i][1] = i
            # our target variable
            y[i] = (i + bias) + random.uniform(0, 1) * variance
        return x, y

    # gen 100 points with a bias of 25 and 10 variance as a bit of noise
    x, y = genData(100, 25, 10)
    m, n = np.shape(x)
    numIterations = 100000
    alpha = 0.0005
    theta = np.ones(n)
    theta = gradientDescent(x, y, theta, alpha, m, numIterations)
    print(theta)

At first we create a small random dataset which should look like this:

[figure: linear regression]

As you can see, I also added the generated regression line and the formula that was calculated by Excel.

You need to take care of the intuition of the regression using gradient descent. As you do a complete batch pass over your data x, you need to reduce the m-losses of every example to a single weight update. In this case, this is the average of the sum over the gradients, thus the division by m.

The next thing you need to take care about is to track the convergence and adjust the learning rate. For that matter you should always track your cost every iteration, and maybe even plot it.
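
A minimal sketch of such tracking, reusing x, y, m and n from the example above and assuming matplotlib is available:

    import numpy as np
    import matplotlib.pyplot as plt

    def gradientDescentWithHistory(x, y, theta, alpha, m, numIterations):
        xTrans = x.transpose()
        costHistory = []
        for i in range(0, numIterations):
            loss = np.dot(x, theta) - y
            costHistory.append(np.sum(loss ** 2) / (2 * m))  # track the cost every iteration
            theta = theta - alpha * np.dot(xTrans, loss) / m
        return theta, costHistory

    theta, costHistory = gradientDescentWithHistory(x, y, np.ones(n), 0.0005, m, 100000)
    plt.plot(costHistory)  # a curve that flattens out indicates convergence
    plt.xlabel("iteration")
    plt.ylabel("cost")
    plt.show()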

If you run my example, the theta returned will look like this:

    Iteration 99997 | Cost: 47883.706462
    Iteration 99998 | Cost: 47883.706462
    Iteration 99999 | Cost: 47883.706462
    [ 29.25567368   1.01108458]

which is quite close to the equation that was calculated by Excel (y = x + 30). Note that as we passed the bias into the first column, the first theta value denotes the bias weight.
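
To use the learned parameters for a prediction, remember to put the bias feature first (the input value 50 below is just a hypothetical example):

    # theta[0] is the bias weight, theta[1] the slope
    x_new = np.array([1, 50])          # bias feature first, then the input value
    prediction = np.dot(x_new, theta)  # roughly 29.26 + 1.01 * 50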

