Andrew Ng Deep Learning Course 2, Week 2: Programming Example



Goal: use mini-batches to speed up learning, and compare the behaviour of plain gradient descent, momentum, and Adam.

Core idea: how the exponentially weighted average is computed and what it means; it is the building block of the momentum, RMSProp, and Adam algorithms.
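
For reference, here is a minimal sketch of the exponentially weighted average with bias correction (my own illustration, not code from the original exercise; the input sequence and beta value are made up):

    import numpy as np

    def ewa_with_bias_correction(values, beta=0.9):
        # exponentially weighted average of a 1-D sequence, with bias correction
        v = 0.0
        averages = []
        for t, theta in enumerate(values, start=1):   # t starts at 1
            v = beta * v + (1 - beta) * theta          # v_t = beta * v_(t-1) + (1 - beta) * theta_t
            averages.append(v / (1 - beta ** t))       # bias correction: divide by (1 - beta^t)
        return averages

    print(ewa_with_bias_correction([1.0, 2.0, 3.0, 4.0]))

Momentum keeps exactly this kind of running average of the gradients, RMSProp keeps one of the squared gradients, and Adam combines the two.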

Limitations: this example does not include a learning-rate decay step, and the code is written only for a 3-layer binary-classification network.
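
If you wanted to add the missing decay step, a common schedule looks like the sketch below (my own sketch, not part of the original code; the inverse-time formula and the numbers are arbitrary choices):

    def decayed_learning_rate(learning_rate0, epoch_num, decay_rate=1.0):
        # inverse-time decay: the effective rate shrinks as the epoch number grows
        return learning_rate0 / (1 + decay_rate * epoch_num)

    # e.g. recompute at the top of every epoch, starting from 0.0007
    print(decayed_learning_rate(0.0007, 100))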

Points to remember:

1. Bias correction divides by (1 - β^t); note the minus sign, and t starts at 1.

2. L = len(parameters) // 2. This L is not the number of layers listed in layers_dims (the input layer is left out), so range(1, L + 1) is the same as range(1, len(layers_dims)).

3. When Adam builds s, the gradients are squared with np.square, because the update later divides by their square root (np.sqrt) in the denominator.

4. np.random.permutation(m) returns range(m) in a random order; it is used to shuffle the training examples, and the shuffle is redone once in every epoch (see the short sketch after this list).

5. arr[:, :]: the index before the comma selects rows, the index after the comma selects columns.
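
To make points 4 and 5 concrete, here is a small self-contained sketch (my own illustration; the array sizes and the batch size are made-up numbers) of shuffling columns with np.random.permutation and slicing out one mini-batch:

    import numpy as np

    m = 10                                  # number of examples (one per column)
    X = np.arange(2 * m).reshape(2, m)      # 2 features x m examples
    Y = np.arange(m).reshape(1, m)          # 1 x m labels

    permutation = list(np.random.permutation(m))   # range(m) in random order
    shuffled_X = X[:, permutation]                 # rows kept, columns reordered
    shuffled_Y = Y[:, permutation]                 # same order, so X and Y stay paired

    mini_batch_size = 4
    first_batch_X = shuffled_X[:, 0:mini_batch_size]   # before the comma: rows; after: columns
    first_batch_Y = shuffled_Y[:, 0:mini_batch_size]
    print(first_batch_X.shape, first_batch_Y.shape)    # (2, 4) (1, 4)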

    '''
    This example implements:
    1. parameter update with plain gradient descent
    2. mini-batch splitting
    3. momentum
    4. Adam
    '''
    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.io
    import math
    import sklearn
    import sklearn.datasets
    import opt_utils      # helper file shipped with the course assignment
    import testCase       # helper file shipped with the course assignment

    plt.rcParams['figure.figsize'] = (7.0, 4.0)
    plt.rcParams['image.interpolation'] = 'nearest'
    plt.rcParams['image.cmap'] = 'gray'

    # plain (batch) gradient descent: update every W and b once
    def update_parameters_with_gd(parameters, grads, learning_rate):
        L = len(parameters) // 2                      # number of layers that carry W and b
        for l in range(1, L + 1):                     # l runs from 1 to L
            parameters['W' + str(l)] = parameters['W' + str(l)] - learning_rate * grads['dW' + str(l)]
            parameters['b' + str(l)] = parameters['b' + str(l)] - learning_rate * grads['db' + str(l)]
        return parameters

    '''
    mini-batch splitting
    '''
    def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
        np.random.seed(seed)
        m = X.shape[1]
        mini_batches = []
        # shuffle the columns (examples) with one common permutation
        permutation = list(np.random.permutation(m))
        shuffled_X = X[:, permutation]
        shuffled_Y = Y[:, permutation].reshape((1, m))
        # cut the shuffled set into pieces of mini_batch_size
        num_complete_minibatches = math.floor(m / mini_batch_size)
        for k in range(0, num_complete_minibatches):
            mini_batch_X = shuffled_X[:, k * mini_batch_size:(k + 1) * mini_batch_size]
            mini_batch_Y = shuffled_Y[:, k * mini_batch_size:(k + 1) * mini_batch_size]
            mini_batches.append((mini_batch_X, mini_batch_Y))
        # the last mini-batch may be smaller than mini_batch_size
        if m % mini_batch_size != 0:
            mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size:]
            mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size:]
            mini_batches.append((mini_batch_X, mini_batch_Y))
        return mini_batches

    '''
    momentum
    '''
    # initialize the velocity v with zeros of the same shape as the parameters
    def initialize_velocity(parameters):
        L = len(parameters) // 2
        v = {}
        for l in range(1, L + 1):
            v['dW' + str(l)] = np.zeros_like(parameters['W' + str(l)])
            v['db' + str(l)] = np.zeros_like(parameters['b' + str(l)])
        return v

    # momentum update: exponentially weighted average of the gradients
    def update_parameters_with_momentum(parameters, grads, v, beta, learning_rate):
        L = len(parameters) // 2
        for l in range(1, L + 1):
            v['dW' + str(l)] = beta * v['dW' + str(l)] + (1 - beta) * grads['dW' + str(l)]
            v['db' + str(l)] = beta * v['db' + str(l)] + (1 - beta) * grads['db' + str(l)]
            parameters['W' + str(l)] = parameters['W' + str(l)] - learning_rate * v['dW' + str(l)]
            parameters['b' + str(l)] = parameters['b' + str(l)] - learning_rate * v['db' + str(l)]
        return parameters, v

    '''
    Adam
    '''
    # initialize v (first moment) and s (second moment) with zeros
    def initialize_adam(parameters):
        L = len(parameters) // 2
        v = {}
        s = {}
        for l in range(1, L + 1):
            v['dW' + str(l)] = np.zeros_like(parameters['W' + str(l)])
            v['db' + str(l)] = np.zeros_like(parameters['b' + str(l)])
            s['dW' + str(l)] = np.zeros_like(parameters['W' + str(l)])
            s['db' + str(l)] = np.zeros_like(parameters['b' + str(l)])
        return v, s

    # Adam parameter update
    def update_parameters_with_adam(parameters, grads, v, s, t, learning_rate=0.01,
                                    beta1=0.9, beta2=0.999, epsilon=1e-8):
        # t counts the Adam steps taken so far and starts at 1
        L = len(parameters) // 2
        v_corrected = {}
        s_corrected = {}
        for l in range(1, L + 1):
            # moving average of the gradients
            v['dW' + str(l)] = beta1 * v['dW' + str(l)] + (1 - beta1) * grads['dW' + str(l)]
            v['db' + str(l)] = beta1 * v['db' + str(l)] + (1 - beta1) * grads['db' + str(l)]
            # bias correction: divide by (1 - beta1^t)
            v_corrected['dW' + str(l)] = v['dW' + str(l)] / (1 - np.power(beta1, t))
            v_corrected['db' + str(l)] = v['db' + str(l)] / (1 - np.power(beta1, t))
            # moving average of the squared gradients (np.square)
            s['dW' + str(l)] = beta2 * s['dW' + str(l)] + (1 - beta2) * np.square(grads['dW' + str(l)])
            s['db' + str(l)] = beta2 * s['db' + str(l)] + (1 - beta2) * np.square(grads['db' + str(l)])
            # bias correction: divide by (1 - beta2^t)
            s_corrected['dW' + str(l)] = s['dW' + str(l)] / (1 - np.power(beta2, t))
            s_corrected['db' + str(l)] = s['db' + str(l)] / (1 - np.power(beta2, t))
            # update; the denominator uses np.sqrt of the corrected s
            parameters['W' + str(l)] = parameters['W' + str(l)] - learning_rate * (
                v_corrected['dW' + str(l)] / (np.sqrt(s_corrected['dW' + str(l)]) + epsilon))
            parameters['b' + str(l)] = parameters['b' + str(l)] - learning_rate * (
                v_corrected['db' + str(l)] / (np.sqrt(s_corrected['db' + str(l)]) + epsilon))
        # v and s start at 0, so epsilon keeps the denominator away from zero early on
        return parameters, v, s

    '''
    the 3-layer model that ties everything together
    '''
    def model(X, Y, layers_dims, optimizer, learning_rate=0.0007, mini_batch_size=64, beta=0.9,
              beta1=0.9, beta2=0.999, epsilon=1e-8, num_epochs=10000, print_cost=True):
        L = len(layers_dims)           # number of layers including the input layer
        costs = []
        t = 0                          # Adam step counter
        seed = 10
        # initialize the parameters and the state (v and/or s) of the chosen optimizer
        parameters = opt_utils.initialize_parameters(layers_dims)
        if optimizer == 'gd':
            pass
        elif optimizer == 'momentum':
            v = initialize_velocity(parameters)
        elif optimizer == 'adam':
            v, s = initialize_adam(parameters)
        else:
            print('unknown optimizer: ' + str(optimizer))
            exit(1)
        # main training loop
        for i in range(num_epochs):
            # re-split into mini_batches with a new shuffle every epoch
            seed = seed + 1
            minibatches = random_mini_batches(X, Y, mini_batch_size, seed)
            for minibatch in minibatches:
                # unpack one mini_batch into X, Y
                (minibatch_X, minibatch_Y) = minibatch
                # forward propagation
                A3, cache = opt_utils.forward_propagation(minibatch_X, parameters)
                # compute the cost
                cost = opt_utils.compute_cost(A3, minibatch_Y)
                # backward propagation
                grads = opt_utils.backward_propagation(minibatch_X, minibatch_Y, cache)
                # update the parameters with the chosen optimizer
                if optimizer == 'gd':
                    parameters = update_parameters_with_gd(parameters, grads, learning_rate)
                elif optimizer == 'momentum':
                    parameters, v = update_parameters_with_momentum(parameters, grads, v, beta, learning_rate)
                elif optimizer == 'adam':
                    t = t + 1
                    parameters, v, s = update_parameters_with_adam(parameters, grads, v, s, t,
                                                                   learning_rate, beta1, beta2, epsilon)
            if i % 100 == 0:
                costs.append(cost)
                if print_cost and i % 1000 == 0:
                    print('epoch ' + str(i) + ': cost = ' + str(cost))
        if print_cost:
            plt.plot(costs)
            plt.ylabel('cost')
            plt.xlabel('epoch (per 100)')
            plt.title('learning rate = ' + str(learning_rate))
            plt.show()
        return parameters

    '''
    optimizer can be "gd", "momentum" or "adam"
    '''
    train_X, train_Y = opt_utils.load_dataset()    # the moons dataset shipped with the assignment
    layers_dims = [train_X.shape[0], 5, 2, 1]      # 3-layer binary-classification network

    '''
    train with adam; change the optimizer string to compare the three methods
    '''
    parameters = model(train_X, train_Y, layers_dims, optimizer='adam')

Original post: https://www.cnblogs.com/sytt3/p/9363326.html
