Best Python Libraries for Machine Learning

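Machine learning is the science of programming computers so that they can learn from different kinds of data. Arthur Samuel defined it as the "field of study that gives computers the ability to learn without being explicitly programmed." The concept is used to solve many kinds of real-world problems.

In the past, users typically performed machine learning tasks by hand-coding every algorithm and applying the mathematical and statistical formulas themselves.

Compared with using Python libraries, frameworks, and modules, that process was time-consuming, inefficient, and tedious. Today, users can rely on Python, which is one of the most popular and effective languages for machine learning. Python has displaced many other languages in this field because it offers a vast collection of libraries that make the work easier and simpler.

In this tutorial, we will discuss the best Python libraries for machine learning: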

NumPy
SciPy
Scikit-learn
Theano
TensorFlow
Keras
PyTorch
Pandas
Matplotlib

NumPy

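NumPy is one of the most popular Python libraries. It is used to work with large multidimensional arrays and matrices through a large collection of high-level mathematical functions. It is mainly used for fundamental scientific computation in machine learning, and it is widely used for linear algebra, Fourier transforms, and random number generation. Other high-end libraries such as TensorFlow use NumPy internally to manipulate tensors.

Example: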


import numpy as nup

# Create two 2-D arrays (matrices)
K = nup.array([[2, 4], [6, 8]])
R = nup.array([[1, 3], [5, 7]])

# Create two 1-D arrays (vectors)
P = nup.array([10, 12])
S = nup.array([9, 11])

# Print the inner product of the two vectors
print ("Inner product of vectors: ", nup.dot(P, S), "\n")

# Print the product of a matrix and a vector
print ("Matrix and Vector product: ", nup.dot(K, P), "\n")

# Print the product of two matrices
print ("Matrix and matrix product: ", nup.dot(K, R))

Output:

Inner product of vectors: 222 

Matrix and Vector product: [ 68 156] 

Matrix and matrix product: [[22 34]
 [46 74]]
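
The NumPy description also mentions linear algebra, Fourier transforms, and random numbers, which the example above does not cover. The following is a minimal sketch of those three areas; the matrix, the 5 Hz test signal, the sample rate, and the seed are illustrative values chosen here, not taken from the original article.

```python
import numpy as np

# Linear algebra: solve the linear system A @ x = b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print("Solution of A @ x = b:", x)

# Fourier transform: find the dominant frequency of a sampled sine wave
sample_rate = 64                          # samples per second (illustrative)
t = np.arange(sample_rate) / sample_rate  # one second of sample times
signal = np.sin(2 * np.pi * 5 * t)        # a 5 Hz sine wave
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(sample_rate, d=1 / sample_rate)
dominant_freq = freqs[np.argmax(spectrum)]
print("Dominant frequency (Hz):", dominant_freq)

# Random numbers: a seeded generator gives reproducible draws
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)
print("Sample mean:", samples.mean())
```

The system has the solution x = [2, 3], and the spectrum should peak at 5 Hz because the signal was built from a single 5 Hz sine.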

SciPy

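SciPy is popular among machine learning developers because it contains many modules for optimization, linear algebra, integration, and statistics. The SciPy library is not the same thing as the SciPy stack: the library is one of the core packages that make up the stack. The SciPy library is also used for signal and image processing tasks.

Example 1: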


from scipy import signal as sg
import numpy as nup
K = nup.arange(45).reshape(9, 5)
domain_1 = nup.identity(3)
print (K, end = 'KK')
print (sg.order_filter (K, domain_1, 1))

Output:

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]
 [30 31 32 33 34]
 [35 36 37 38 39]
 [40 41 42 43 44]] KK [[ 0.  1.  2.  3.  0.]
 [ 5.  6.  7.  8.  3.]
 [10. 11. 12. 13.  8.]
 [15. 16. 17. 18. 13.]
 [20. 21. 22. 23. 18.]
 [25. 26. 27. 28. 23.]
 [30. 31. 32. 33. 28.]
 [35. 36. 37. 38. 33.]
 [ 0. 35. 36. 37. 38.]]

Example 2:


from scipy.signal import chirp as cp
from scipy.signal import spectrogram as sp
import matplotlib.pyplot as plot
import numpy as nup
t_T = nup.linspace(3, 10, 300)
w_W = cp(t_T, f0 = 4, f1 = 2, t1 = 5, method = 'linear')
plot.plot(t_T, w_W)
plot.title ("Linear Chirp")
plot.xlabel ('Time in Seconds')
plot.show()

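Output: a line plot of the chirp signal against time, titled "Linear Chirp" (figure not reproduced here).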
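
The SciPy description lists optimization, integration, and statistics, while both examples above use only the signal module. The sketch below touches each of those three areas once; the objective function, the integrand, and the synthetic samples are assumptions made purely for illustration.

```python
import numpy as np
from scipy import integrate, optimize, stats

# Optimization: minimize f(x) = (x - 2)^2 + 1, whose minimum lies at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2 + 1)
print("Minimizer:", result.x)

# Integration: integrate sin(x) from 0 to pi (the exact value is 2)
area, abs_error = integrate.quad(np.sin, 0, np.pi)
print("Integral of sin on [0, pi]:", area)

# Statistics: two-sample t-test on synthetic data with different means
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=0.0, scale=1.0, size=200)
group_b = rng.normal(loc=1.0, scale=1.0, size=200)
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print("t statistic:", t_stat, "p-value:", p_value)
```

Because the two groups are drawn with means one standard deviation apart, the t-test should report a very small p-value.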

Scikit-learn

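Scikit-learn is a Python library for classical machine learning algorithms. It is built on top of two fundamental Python libraries, NumPy and SciPy. Scikit-learn is popular among machine learning developers because it supports both supervised and unsupervised learning algorithms. The library can also be used for data analysis and data mining.

Example: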


from sklearn import datasets as ds
from sklearn import metrics as mt
from sklearn.tree import DecisionTreeClassifier as dtc

# load the iris datasets
dataset_1 = ds.load_iris()

# fit a CART model to the data
model_1 = dtc()
model_1.fit(dataset_1.data, dataset_1.target)
print(model_1)

# make predictions
expected_1 = dataset_1.target
predicted_1 = model_1.predict(dataset_1.data)

# summarize the fit of the model
print (mt.classification_report(expected_1, predicted_1))
print(mt.confusion_matrix(expected_1, predicted_1))

Output:

DecisionTreeClassifier()
              precision    recall f1-score   support

           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00        50
           2       1.00      1.00      1.00        50

    accuracy                           1.00       150
   macro avg       1.00      1.00      1.00       150
weighted avg       1.00      1.00      1.00       150

[[50  0  0]
 [ 0 50  0]
 [ 0  0 50]]
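
The example above predicts on the same samples it was trained on, which is why every score is 1.00. A more realistic check holds out part of the data, and the unsupervised side of the library can be shown with clustering. The following is a minimal sketch of both; the 30% split, the seeds, and the choice of KMeans with three clusters are assumptions made here for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Supervised learning: hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, classifier.predict(X_test))
print("Held-out accuracy:", test_accuracy)

# Unsupervised learning: cluster the same features without using the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)
print("Cluster sizes:", sorted(int((cluster_ids == c).sum()) for c in range(3)))
```

The held-out accuracy is usually a little below 1.00, which gives a fairer picture of how the tree behaves on unseen flowers.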

Theano

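Theano is a well-known Python library for defining, evaluating, and optimizing mathematical expressions, including expressions that involve multidimensional arrays, in an efficient way.

It achieves this by optimizing its use of the CPU and GPU. Because machine learning relies heavily on mathematics and statistics, Theano makes it easy for users to carry out mathematical operations.

It makes extensive use of unit testing and self-verification to detect and diagnose many types of errors. Theano is a powerful library that can be used in large-scale, computationally intensive scientific projects, yet it is simple and approachable enough for individuals to use in their own projects.

Example: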


import theano as th
import theano.tensor as Tt
k = Tt.dmatrix('k')
r = 1 / (1 + Tt.exp(-k))
logistic_1 = th.function([k], r)
logistic_1([[0, 1], [-1, -2]])

Output:

array([[0.5, 0.73105858],
       [0.26894142, 0.11920292]])
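
In case Theano is not available in your environment, the same logistic expression, 1 / (1 + exp(-k)), can be evaluated with plain NumPy as a cross-check. This is only a sketch of the arithmetic and not a replacement for Theano's symbolic graph; the helper name `logistic` is chosen here for illustration.

```python
import numpy as np

def logistic(k):
    """Element-wise logistic (sigmoid) function 1 / (1 + exp(-k))."""
    k = np.asarray(k, dtype=float)
    return 1.0 / (1.0 + np.exp(-k))

# Same input matrix as in the Theano example
values = logistic([[0, 1], [-1, -2]])
print(values)
```

For this input the values are approximately 0.5, 0.731, 0.269, and 0.119.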

TensorFlow

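TensorFlow is an open-source Python library for high-performance numerical computation. It is a popular library developed by the Google Brain team. TensorFlow is a framework for defining and running computations on tensors. It can be used to train and run deep neural networks, which in turn can be used to build many kinds of artificial intelligence applications.

Example: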


import tensorflow as tsf

# Initialize two constants
K_1 = tsf.constant([2, 4, 6, 8])
K_2 = tsf.constant([1, 3, 5, 7])

# Multiply the two tensors element-wise
result = tsf.multiply(K_1, K_2)

# TensorFlow 2.x executes operations eagerly, so no Session is needed
# (tsf.Session exists only in TensorFlow 1.x)
print (result.numpy())

Output:

[ 2 12 30 56]

Keras

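Keras is a high-level neural network API capable of running on top of TensorFlow, CNTK, and Theano. It is a very well-known Python library among machine learning developers. It runs seamlessly on both CPUs and GPUs. It makes designing neural networks much simpler, especially for machine learning beginners, and it is also used for fast prototyping.

Example: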


import numpy as nup
from tensorflow import keras as ks
from tensorflow.keras import layers as ls
number_classes = 10
input_shapes = (28, 28, 1)

# Here, we will import the data, and split it between train and test sets
(x_1_train, y_1_train), (x_2_test, y_2_test) = ks.datasets.mnist.load_data()

# now, we will Scale images to the [0, 1] range
x_1_train = x_1_train.astype("float32") / 255
x_2_test = x_2_test.astype("float32") / 255
# we have to make sure that the images have shape (28, 28, 1)
x_1_train = nup.expand_dims(x_1_train, -1)
x_2_test = nup.expand_dims(x_2_test, -1)
print ("x_train shape:", x_1_train.shape)
print (x_1_train.shape[0], "Training samples")
print (x_2_test.shape[0], "Testing samples")

# Then we will convert class vectors to binary class matrices
y_1_train = ks.utils.to_categorical(y_1_train, number_classes)
y_2_test = ks.utils.to_categorical(y_2_test, number_classes)
model_1 = ks.Sequential(
    [
        ks.Input(shape = input_shapes),
        ls.Conv2D(32, kernel_size = (3, 3), activation = "relu"),
        ls.MaxPooling2D(pool_size = (2, 2)),
        ls.Conv2D(64, kernel_size = (3, 3), activation = "relu"),
        ls.MaxPooling2D(pool_size = (2, 2)),
        ls.Flatten(),
        ls.Dropout(0.5),
        ls.Dense(number_classes, activation = "softmax"),
    ]
)

model_1.summary()

Output:

x_train shape: (60000, 28, 28, 1)
60000 Training samples
10000 Testing samples
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1600)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                16010     
=================================================================
Total params: 34,826
Trainable params: 34,826
Non-trainable params: 0
_________________________________________________________________

PyTorch

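PyTorch is an open-source Python library for machine learning based on Torch, with a core implemented largely in C++. It supports many tools and libraries for computer vision, natural language processing (NLP), and many other machine learning applications. The library also lets users perform computations on tensors with GPU acceleration.

Example: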


import torch as tch
d_type = tch.float
device_1 = tch.device("cpu")
# Use device = tch.device("cuda:0") for GPU

# Here, N_1 is batch size; D_in_1 is input dimension;
# H_1 is hidden dimension; D_out_1 is output dimension.
N_1 = 62
D_in_1 = 1000
H_1 = 110
D_out_1 = 11

# Now, we will create random input and output data
K = tch.randn(N_1, D_in_1, device = device_1, dtype = d_type)
R = tch.randn(N_1, D_out_1, device = device_1, dtype = d_type)

# Then, we will Randomly initialize weights
K_1 = tch.randn(D_in_1, H_1, device = device_1, dtype = d_type)
K_2 = tch.randn(H_1, D_out_1, device = device_1, dtype = d_type)

learning_rate_1 = 1e-6
for Q in range(500):
    # Forward pass: compute the predicted output
    h_1 = K.mm(K_1)
    h_relu_1 = h_1.clamp(min = 0)
    y_pred_1 = h_relu_1.mm(K_2)

    # Compute and print loss
    loss = (y_pred_1 - R).pow(2).sum().item()
    print (Q, loss)

    # Backprop to compute gradients of K_1 and K_2 with respect to loss
    grad_y_pred = 2.0 * (y_pred_1 - R)
    grad_K_2 = h_relu_1.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(K_2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h_1 < 0] = 0
    grad_K_1 = K.t().mm(grad_h)

    # Then we will Update the weights by using gradient descent
    K_1 -= learning_rate_1 * grad_K_1
    K_2 -= learning_rate_1 * grad_K_2

Output:

0 35089116.0
1 33087792.0
2 42227192.0
3 56113208.0
4 61125684.0
5 45541204.0
6 21011108.0
7 6972017.0
8 2523046.5
9 1342124.5
10 950067.5625
11 753290.25
12 620475.875
13 519006.71875
14 437975.9375
15 372063.125
16 317840.8125
17 272874.46875
18 235348.421875
.
.
.
497 7.426088268402964e-05
498 7.348413055296987e-05
499 7.258950790856034e-05
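
The example above derives every gradient by hand. PyTorch can also compute them automatically through its autograd engine. The following is a minimal sketch of the same two-layer network using autograd; the smaller layer sizes, the 0.1 weight scale, the learning rate, and the seed are assumptions chosen here so the sketch runs quickly, not values from the original example.

```python
import torch

torch.manual_seed(0)  # make the random data reproducible

# Small illustrative sizes (assumed; not the ones used above)
batch_size, d_in, d_hidden, d_out = 16, 20, 8, 2
x = torch.randn(batch_size, d_in)
y = torch.randn(batch_size, d_out)

# requires_grad_() asks autograd to track operations on the weights
w1 = (0.1 * torch.randn(d_in, d_hidden)).requires_grad_()
w2 = (0.1 * torch.randn(d_hidden, d_out)).requires_grad_()

learning_rate = 1e-3
losses = []
for step in range(200):
    # Forward pass: linear layer, ReLU, linear layer
    y_pred = (x @ w1).clamp(min=0) @ w2
    loss = (y_pred - y).pow(2).sum()
    losses.append(loss.item())

    # Backward pass: autograd fills w1.grad and w2.grad
    loss.backward()

    # Gradient descent step, excluded from gradient tracking
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # Reset the gradients so they do not accumulate across steps
        w1.grad.zero_()
        w2.grad.zero_()

print("first loss:", losses[0], "last loss:", losses[-1])
```

The forward pass is the same as in the manual version; only the gradient computation is delegated to `loss.backward()`.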

Pandas

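Pandas is a Python library used mainly for data analysis. Users have to prepare a dataset before they can use it to train a machine learning model. Pandas makes this easy for developers because it was built specifically for data extraction and preparation. It offers a wide variety of tools for analyzing data in detail and provides high-level data structures.

Example: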


import pandas as pad

data_1 = {"Countries": ["Bhutan", "Cape Verde", "Chad", "Estonia", "Guinea", "Kenya", "Libya", "Mexico"],
       "capital": ["Thimphu", "Praia", "N'Djamena", "Tallinn", "Conakry", "Nairobi", "Tripoli", "Mexico City"],
       "Currency": ["Ngultrum", "Cape Verdean escudo", "CFA Franc", "Estonia Kroon; Euro", "Guinean franc", "Kenya shilling", "Libyan dinar", "Mexican peso"],
       "population": [20.4, 143.5, 12.52, 135.7, 52.98, 76.21, 34.28, 54.32] }

data_1_table = pad.DataFrame(data_1)
print(data_1_table)

Output:

    Countries      capital             Currency  population
0      Bhutan      Thimphu             Ngultrum       20.40
1  Cape Verde        Praia  Cape Verdean escudo      143.50
2        Chad    N'Djamena            CFA Franc       12.52
3     Estonia      Tallinn  Estonia Kroon; Euro      135.70
4      Guinea      Conakry        Guinean franc       52.98
5       Kenya      Nairobi       Kenya shilling       76.21
6       Libya      Tripoli         Libyan dinar       34.28
7      Mexico  Mexico City         Mexican peso       54.32
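
The example above only builds and prints a table. Since Pandas is described as a tool for preparing data before training, the following minimal sketch shows a few typical preparation steps: filling a missing value, deriving a column, and summarizing by group. The records, column names, and the threshold of 25 are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical records with one missing temperature, for illustration only
raw = pd.DataFrame({
    "city": ["A", "A", "B", "B", "C"],
    "temperature": [21.0, None, 30.5, 29.5, 18.0],
    "humidity": [40, 42, 55, 57, 70],
})

# Fill the missing temperature with the mean of the known values
clean = raw.copy()
clean["temperature"] = clean["temperature"].fillna(clean["temperature"].mean())

# Add a derived column, then average the numeric columns per city
clean["is_warm"] = clean["temperature"] > 25
summary = clean.groupby("city")[["temperature", "humidity"]].mean()

print(clean)
print(summary)
```

Filling with the column mean is just one simple strategy; depending on the data, dropping the row or using a per-group mean may be more appropriate.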

Matplotlib

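Matplotlib is a Python library for data visualization. Developers use it when they want to visualize data and the patterns in it. It is a 2D plotting library used to create two-dimensional graphs and plots.

It has a module, pyplot, that is used for plotting and that provides features for controlling line styles, font properties, axis formatting, and more. Matplotlib offers different kinds of charts, such as histograms, error charts, and bar charts.

Example 1: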


import matplotlib.pyplot as plot
import numpy as nup

# Prepare the data
K = nup.linspace(2, 4, 8)
R = nup.linspace(5, 7, 9)
Q = nup.linspace(0, 1, 3)

# Plot the data
plot.plot(K, K, label = 'K')
plot.plot(R, R, label = 'R')
plot.plot(Q, Q, label = 'Q')

# Add a legend
plot.legend()

# Show the plot
plot.show()

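Output: a figure with three straight line segments labeled K, R, and Q, plus a legend (figure not reproduced here).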

Example 2:


import matplotlib.pyplot as plot

# Creating dataset-1
K_1 = [8, 4, 6, 3, 5, 10, 
      13, 16, 12, 21]

R_1 = [11, 6, 13, 15, 17, 5, 
      3, 2, 8, 19]

# Creating dataset2
K_2 = [6, 9, 18, 14, 16, 15,
      11, 16, 12, 20]

R_2 = [16, 4, 10, 13, 18, 
      20, 6, 2, 17, 15]

plot.scatter(K_1, R_1, c = "Black", 
            linewidths = 2, 
            marker = "s", 
            edgecolor = "Brown", 
            s = 50)

plot.scatter(K_2, R_2, c = "Purple",
            linewidths = 2,
            marker = "^", 
            edgecolor = "Grey", 
            s = 200)

plot.xlabel ("X-axis")
plot.ylabel ("Y-axis")
print ("Scatter Plot")
plot.show()

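Output: the console prints "Scatter Plot", and a figure shows black square markers for the first dataset and larger purple triangle markers for the second (figure not reproduced here).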
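
The Matplotlib section names histograms, error charts, and bar charts, but the two examples draw only line and scatter plots. The following minimal sketch draws one of each of the three named chart types side by side. All data are synthetic, the `Agg` backend is selected on the assumption that no display is available, and the output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumes no display is available)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)
values = rng.normal(size=500)      # synthetic data for the histogram
categories = ["a", "b", "c"]       # made-up bar labels
heights = [3, 7, 5]
x = np.arange(5)
y = x ** 2
y_err = np.full(5, 1.5)            # made-up error magnitudes

fig, (ax_hist, ax_bar, ax_err) = plt.subplots(1, 3, figsize=(12, 3.5))

# Histogram of the synthetic samples
ax_hist.hist(values, bins=20, color="steelblue")
ax_hist.set_title("Histogram")

# Bar chart with one bar per category
ax_bar.bar(categories, heights, color="darkorange")
ax_bar.set_title("Bar chart")

# Line plot with vertical error bars
ax_err.errorbar(x, y, yerr=y_err, fmt="o-", capsize=3)
ax_err.set_title("Error bars")

fig.tight_layout()
fig.savefig("chart_types.png")     # hypothetical output file name
```

In an interactive session, the backend line can be dropped and `plt.show()` used in place of `fig.savefig(...)`.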

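Conclusion

In this tutorial, we discussed the different Python libraries used to perform machine learning tasks, and we showed examples for each library.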

Original article by 小藍. If you repost it, please credit the source: https://www.506064.com/zh-tw/n/238545.html
