Gradient Descent in Python: A Detailed Code Walkthrough

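Gradient descent is an essential part of learning machine learning algorithms. Python is easy to learn and use, and open-source libraries such as NumPy and SciPy make gradient descent straightforward to implement. This article looks at the Python implementation of gradient descent from several angles.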

1. Initial Settings for Gradient Descent in Python

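Gradient descent finds an optimal solution by repeatedly updating the values of the unknown parameters. Before implementing it in Python, we need to fix a few hyperparameters, such as the learning rate and the number of iterations.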

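The learning rate controls how far the parameters move in each iteration. If it is too small, gradient descent converges very slowly; if it is too large, the updates can overshoot the minimum, so the cost oscillates or even diverges. The number of iterations trades running time against accuracy: with too few the algorithm may stop before reaching the optimum, and with too many it wastes computation.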
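
The effect of the learning rate can be seen on a toy problem. The following is a minimal sketch, assuming the one-dimensional function f(w) = w**2 (gradient 2*w) as the objective; the function name minimize_quadratic and the three learning rates are illustrative choices, not part of the original article.

```python
def minimize_quadratic(alpha, iterations=50, w0=1.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2*w."""
    w = w0
    for _ in range(iterations):
        # Each step multiplies w by (1 - 2*alpha)
        w = w - alpha * 2 * w
    return w

# Too small, reasonable, and too large learning rates (illustrative values)
for alpha in (0.001, 0.1, 1.1):
    w = minimize_quadratic(alpha)
    print(f"alpha={alpha}: w after 50 steps = {w:.4g}")
```

With the smallest rate, w has barely moved from its starting value of 1; with the middle rate it is very close to the minimum at 0; with the largest rate its magnitude grows at every step, so the iteration diverges.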

In practice, we can try several learning rates and iteration counts and use cross-validation to pick the combination that gives the best model.
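
As a rough illustration of such a search, the sketch below compares a few learning rates on a held-out validation split. It is a simplified stand-in for full cross-validation, and everything in it is an assumption made for the example: the synthetic data (y = 1 + 2x plus noise), the 150/50 split, the candidate rates, and the helper name mse.

```python
import numpy as np

def gradient_descent(x, y, theta, alpha, iterations):
    # Batch gradient descent for linear regression (squared-error cost)
    m = len(y)
    for _ in range(iterations):
        gradient = x.T @ (x @ theta - y) / m
        theta = theta - alpha * gradient
    return theta

def mse(x, y, theta):
    # Mean squared prediction error
    return float(np.mean((x @ theta - y) ** 2))

# Synthetic data (assumed for this example): y = 1 + 2*x plus small noise
rng = np.random.default_rng(0)
features = rng.uniform(0, 1, size=(200, 1))
x = np.hstack((np.ones((200, 1)), features))
y = x @ np.array([1.0, 2.0]) + rng.normal(0, 0.1, size=200)

# Simple hold-out split: first 150 rows for training, the rest for validation
x_train, y_train = x[:150], y[:150]
x_val, y_val = x[150:], y[150:]

results = {}
for alpha in (0.001, 0.01, 0.1, 0.5):
    theta = gradient_descent(x_train, y_train, np.zeros(2), alpha, 500)
    results[alpha] = mse(x_val, y_val, theta)

best_alpha = min(results, key=results.get)
print("validation MSE per learning rate:", results)
print("best learning rate:", best_alpha)
```

With a fixed budget of 500 iterations, the smaller rates have not converged yet and therefore score worse on the validation set. A proper k-fold cross-validation would repeat the same comparison over several splits and average the scores.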

2. Implementing Gradient Descent in Python

The basic steps for implementing gradient descent in Python are:

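1. Initialize the model parameters.
2. Compute the cost function.
3. Compute the partial derivatives of the cost function with respect to the model parameters.
4. Update the model parameters.
5. Repeat steps 2-4 until the algorithm converges or the maximum number of iterations is reached.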

The corresponding Python code is shown below:

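import numpy as np

def gradient_descent(x, y, theta, alpha, iterations):
    # Batch gradient descent for linear regression with a squared-error cost
    m = len(y)  # number of training samples
    for i in range(iterations):
        h = np.dot(x, theta)              # predictions for all samples
        loss = h - y                      # residuals
        gradient = np.dot(x.T, loss) / m  # gradient of the cost w.r.t. theta
        theta = theta - alpha * gradient  # step against the gradient
    return theta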

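Here x is the matrix of input data (one row per sample) and y is the vector of labels; theta is the initial parameter vector, alpha is the learning rate, and iterations is the number of iterations. In this code np.dot performs matrix-vector products (x times theta gives the predictions, x.T times loss gives the summed gradient), and / divides every component of the result by the scalar m.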

3. Gradient Descent in Python: A Two-Parameter Function

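Gradient descent on a function of two variables is also easy to implement in Python. In the example below, the cost depends on two parameters, the intercept and the slope of a straight line: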

import numpy as np
import matplotlib.pyplot as plt

def gradient_descent(x, y, theta, alpha, iterations):
    m = len(y)                        # number of training samples
    J_history = np.zeros(iterations)  # cost recorded at every iteration

    for i in range(iterations):
        h = np.dot(x, theta)                        # predictions
        loss = h - y                                # residuals
        J_history[i] = np.sum(loss ** 2) / (2 * m)  # squared-error cost
        gradient = np.dot(x.T, loss) / m            # gradient of the cost
        theta = theta - alpha * gradient            # parameter update

    return theta, J_history

def plot_data(x, y):
    # Scatter plot of the raw data
    plt.plot(x, y, 'o')
    plt.show()

def plot_cost(J_history, iterations):
    # Cost as a function of the iteration number
    plt.plot(np.arange(iterations), J_history, 'r')
    plt.xlabel('Iterations')
    plt.ylabel('Cost Function')
    plt.show()

def main():
    x = np.array([1, 2, 3, 4, 5, 6])
    y = np.array([3, 6, 9, 12, 15, 18])
    # Reshape both arrays into column vectors
    x = x[:, np.newaxis]
    y = y[:, np.newaxis]
    m = len(y)
    iterations = 1000
    alpha = 0.01
    theta = np.zeros((2, 1))  # [intercept, slope], both starting at 0
    # Prepend a column of ones so that theta[0] acts as the intercept
    ones = np.ones((m, 1))
    x = np.hstack((ones, x))

    theta, J_history = gradient_descent(x, y, theta, alpha, iterations)

    plot_data(x[:, 1], y)
    plot_cost(J_history, iterations)

if __name__ == '__main__':
    main()

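The code defines gradient_descent, which now also records the cost at every iteration in J_history, along with two helper functions, plot_data and plot_cost, which plot the data and the cost function respectively. In main we build a simple single-feature linear model whose data follow y = 3x, prepend a column of ones for the intercept, and run gradient descent with iterations=1000 and alpha=0.01.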

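Finally, we draw a scatter plot of the data and the curve of the cost function. As the number of iterations grows, the cost J decreases steadily and approaches its minimum.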
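
One way to check how close the result is to the true optimum is to compare it with a direct least-squares solution. The sketch below is an addition for illustration: it reuses the data and settings from main, but the comparison against np.linalg.lstsq is not part of the original example.

```python
import numpy as np

# Same data as in main(): y = 3x for x = 1..6
x = np.arange(1, 7, dtype=float)
y = 3 * x
design = np.column_stack((np.ones_like(x), x))  # columns: intercept, slope

# Reference solution from NumPy's least-squares solver
theta_exact, *_ = np.linalg.lstsq(design, y, rcond=None)

# Gradient descent with the same settings as main()
theta = np.zeros(2)
alpha, iterations = 0.01, 1000
for _ in range(iterations):
    gradient = design.T @ (design @ theta - y) / len(y)
    theta = theta - alpha * gradient

print("least squares:   ", theta_exact)
print("gradient descent:", theta)
```

The least-squares solution is an intercept of 0 and a slope of 3. The gradient descent estimate should be close to it, although with these settings the intercept may still be slightly away from 0 after 1000 iterations; running more iterations closes the gap.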

4. Stochastic Gradient Descent in Python

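Stochastic gradient descent (SGD) is a variant of gradient descent that is commonly used to train on large data sets. At each update, SGD computes the cost and gradient from a single sample instead of the full data set. Here is a simple example: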

import numpy as np

def stochastic_gradient_descent(x, y, theta, alpha, iterations):
    m = len(y)  # number of training samples

    for i in range(iterations):
        # Pick one sample at random; slicing keeps x_i two-dimensional
        random_index = np.random.randint(m)
        x_i = x[random_index : random_index + 1]
        y_i = y[random_index : random_index + 1]
        h = np.dot(x_i, theta)           # prediction for this sample
        loss = h - y_i                   # residual for this sample
        gradient = np.dot(x_i.T, loss)   # single-sample gradient
        theta = theta - alpha * gradient

    return theta

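The code defines stochastic_gradient_descent, which carries out the SGD procedure. In each iteration it uses np.random.randint to pick one sample at random and computes the gradient from that sample alone. The function returns the resulting estimate of the parameters theta.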
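
A short usage sketch follows. The function is repeated so that the block runs on its own; the synthetic noise-free data (y = 1 + 2x), the seed, and the hyperparameter values are assumptions chosen for the example.

```python
import numpy as np

def stochastic_gradient_descent(x, y, theta, alpha, iterations):
    m = len(y)
    for i in range(iterations):
        random_index = np.random.randint(m)
        x_i = x[random_index : random_index + 1]
        y_i = y[random_index : random_index + 1]
        loss = np.dot(x_i, theta) - y_i
        gradient = np.dot(x_i.T, loss)
        theta = theta - alpha * gradient
    return theta

# Fix the global seed so the random sample choices are reproducible
np.random.seed(0)

# Noise-free synthetic data (assumed for this example): y = 1 + 2*x
features = np.random.uniform(0, 1, size=(200, 1))
x = np.hstack((np.ones((200, 1)), features))
y = x @ np.array([1.0, 2.0])

theta = stochastic_gradient_descent(x, y, np.zeros(2), alpha=0.1, iterations=5000)
print("estimated [intercept, slope]:", theta)
```

Because the data contain no noise, the single-sample gradients all vanish at the true parameters, so the estimate should settle near an intercept of 1 and a slope of 2. With noisy data and a constant learning rate, the estimate would keep fluctuating around the optimum instead.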

5. How Gradient Descent Works

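The core idea of gradient descent is to compute the gradient of the cost function and use it to update the parameters repeatedly until the model reaches its best solution. In Python, the algorithm can be summarized in the following steps: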

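1. Initialize the parameter values.
2. Compute the value of the cost function.
3. Compute the gradient of the cost function.
4. Update the parameter values.
5. Repeat steps 2-4 until the convergence condition is met.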
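
For the linear regression examples in this article, these steps correspond to the formulas below. They are written out here to match the code, where X is the data matrix x (with rows x^{(i)}), m = len(y) is the number of samples, and \alpha is the learning rate alpha.

```latex
% Step 2: cost function (squared error, as stored in J_history)
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right)^2
          = \frac{1}{2m} \lVert X\theta - y \rVert^2

% Step 3: gradient of the cost with respect to the parameters
\nabla_\theta J(\theta) = \frac{1}{m} X^{\top} (X\theta - y)

% Step 4: parameter update
\theta \leftarrow \theta - \alpha \, \nabla_\theta J(\theta)
```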

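Note that gradient descent can converge slowly, especially when the learning rate is small or the features differ greatly in scale, so in practice hyperparameters such as the learning rate and the number of iterations need careful tuning to obtain good results.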
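
The examples in this article always run a fixed number of iterations. A convergence condition can be added in several ways; the sketch below shows one possibility, stopping once the cost changes by less than a tolerance between two iterations. The function name gradient_descent_with_tolerance, the tol parameter, and the values used are assumptions for illustration.

```python
import numpy as np

def gradient_descent_with_tolerance(x, y, theta, alpha, max_iterations, tol=1e-9):
    """Batch gradient descent that stops early once the cost stops changing."""
    m = len(y)
    previous_cost = np.inf
    for i in range(max_iterations):
        loss = x @ theta - y
        cost = np.sum(loss ** 2) / (2 * m)
        # Convergence test: the cost changed by less than tol since last step
        if abs(previous_cost - cost) < tol:
            return theta, i
        previous_cost = cost
        theta = theta - alpha * (x.T @ loss) / m
    return theta, max_iterations

# Same toy data as the earlier example: y = 3x for x = 1..6
features = np.arange(1, 7, dtype=float)
x = np.column_stack((np.ones_like(features), features))
y = 3 * features

theta_hat, steps = gradient_descent_with_tolerance(
    x, y, np.zeros(2), alpha=0.05, max_iterations=10000
)
print("theta:", theta_hat, "iterations used:", steps)
```

A tolerance on the gradient norm or on the change in theta would work as well; which test is appropriate depends on the problem.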

6. Putting the Implementation Together

The complete program for this two-parameter example consists of gradient_descent (which records the cost history), plot_data, plot_cost, and main. The listing is identical to the one given in Section 3, so it is not repeated here.

The code fits a straight line to data with a single feature, using a learning rate of alpha=0.01 and iterations=1000. After the run finishes, it also draws the scatter plot of the data and the curve of the cost function.

7. Mini-batch Gradient Descent in Python

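Mini-batch gradient descent sits between batch gradient descent and stochastic gradient descent. Each update computes the gradient on a small random subset (a mini-batch) of the samples, which balances the stable but costly updates of the batch method against the cheap but noisy updates of the stochastic method.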

Here is a simple example:

import numpy as np

def minibatch_gradient_descent(x, y, theta, alpha, iterations, batch_size):
    m = len(y)  # number of training samples
    for i in range(iterations):
        # Draw batch_size random row indices (sampled with replacement)
        random_index = np.random.randint(m, size=batch_size)
        x_batch = x[random_index]
        y_batch = y[random_index]
        h = np.dot(x_batch, theta)   # predictions for the batch
        loss = h - y_batch           # residuals for the batch
        gradient = np.dot(x_batch.T, loss) / batch_size  # batch-averaged gradient
        theta = theta - alpha * gradient
    return theta

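The batch_size parameter is the number of samples drawn in each iteration. Because np.random.randint samples indices with replacement, a batch may contain the same sample more than once. The algorithm trades off the full-batch and single-sample approaches to get faster learning together with more stable updates.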
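
A common variant, shown here only as a sketch and not as the method used above, shuffles the data once per epoch and walks through it in consecutive batches, so every sample is used exactly once per pass. The function name minibatch_epochs, the seed parameter, the synthetic noise-free data, and the hyperparameter values are assumptions for the example.

```python
import numpy as np

def minibatch_epochs(x, y, theta, alpha, epochs, batch_size, seed=0):
    """Mini-batch gradient descent with per-epoch shuffling (no replacement)."""
    rng = np.random.default_rng(seed)
    m = len(y)
    for _ in range(epochs):
        order = rng.permutation(m)  # new random order for every epoch
        for start in range(0, m, batch_size):
            batch = order[start:start + batch_size]
            loss = x[batch] @ theta - y[batch]
            # Divide by the actual batch length; the last batch may be shorter
            theta = theta - alpha * (x[batch].T @ loss) / len(batch)
    return theta

# Noise-free synthetic data (assumed for this example): y = 1 + 2*x
data_rng = np.random.default_rng(1)
features = data_rng.uniform(0, 1, size=(200, 1))
x = np.hstack((np.ones((200, 1)), features))
y = x @ np.array([1.0, 2.0])

theta = minibatch_epochs(x, y, np.zeros(2), alpha=0.1, epochs=300, batch_size=20)
print("estimated [intercept, slope]:", theta)
```

On this data the estimate should end up close to an intercept of 1 and a slope of 2.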

8. Summary

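This article has shown several ways to implement gradient descent in Python, covering the basics of the method, a two-parameter example, stochastic gradient descent, and mini-batch gradient descent. In practice, hyperparameters must be chosen carefully, and the best model is found through repeated experiments and evaluation. We hope you find this article helpful!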

Original article by CTGLT. If you repost it, please credit the source: https://www.506064.com/zh-tw/n/331244.html