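When working with data in a Pandas DataFrame in Python, we often encounter time series data. Pandas is a powerful tool for handling time series data in Python, and we may need to convert strings in a given dataset to datetime format.
In this tutorial, we will learn how to convert a DataFrame column of date strings to datetime format (the examples use layouts such as "mm/dd/yyyy", "yymmdd", and "YYYYMMDD"). If the dates are stored as strings rather than as datetime values, we cannot perform time-series operations on them. To solve this problem, we need to convert the column to the datetime type.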
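To make the motivation concrete, here is a minimal sketch (using made-up sample dates, not data from a real source) of what happens when a time-series accessor is used on a string column, and how the same accessor behaves after conversion. The variable names are illustrative assumptions.

```python
import pandas as pnd

# Hypothetical sample data: dates stored as plain strings (month/day/year).
events = pnd.DataFrame({"Date": ["12/05/2021", "11/21/2018", "01/12/2020"]})

# The .dt accessor only works on datetime-like columns, so this attempt fails.
try:
    events["Date"].dt.year
    accessor_worked = True
except AttributeError:
    accessor_worked = False

# After conversion, the same accessor returns the year of each date.
converted = pnd.to_datetime(events["Date"], format="%m/%d/%Y")
years = converted.dt.year.tolist()

print(accessor_worked)  # False
print(years)            # [2021, 2018, 2020]
```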
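Different methods for converting the data type in Python:
In this section, we will discuss different ways to change the data type of a Pandas DataFrame column from string to datetime:
Method 1: Using the pandas.to_datetime() function
In this method, we will use the pandas.to_datetime() function to convert the data type of a Pandas DataFrame column.
Example: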
import pandas as pnd
# Creating the dataframe
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],
'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],
'Cost':[15400, 7000, 25000]})
# Print the dataframe
print ("The data is: ")
print (data_frame)
# Here, we are checking the data type of the 'Date' column
data_frame.info()
Output:
The data is:
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    3 non-null      object
 1   Event   3 non-null      object
 2   Cost    3 non-null      int64
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes
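Here, the output shows that the data type of the "Date" column in the DataFrame is "object", which means it holds strings. Now, we will use the pnd.to_datetime() function to convert the column to datetime: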
import pandas as pnd
# Creating the dataframe
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],
'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],
'Cost':[15400, 7000, 25000]})
# Print the dataframe
print ("The data is: ")
print (data_frame)
# For converting the 'Date' column of DataFrame into datetime format
data_frame['Date'] = pnd.to_datetime(data_frame['Date'])
# Here, we are checking the data type of the 'Date' column
data_frame.info()
Output:
The data is:
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    3 non-null      datetime64[ns]
 1   Event   3 non-null      object
 2   Cost    3 non-null      int64
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes
Now we can see that the "Date" column in the DataFrame has been converted to the datetime type (datetime64[ns]).
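When no format is given, pandas has to work out the layout of the strings itself, and by default it reads an ambiguous value such as "01/12/2020" month first. The sketch below, which uses two made-up date strings, shows how passing an explicit format argument removes that ambiguity. Treat it as an illustration under those assumptions rather than as the only correct way to call the function.

```python
import pandas as pnd

# Hypothetical strings that are valid both as month/day and as day/month.
dates = pnd.Series(["01/12/2020", "03/04/2021"])

# Same strings, two different explicit interpretations.
month_first = pnd.to_datetime(dates, format="%m/%d/%Y")
day_first = pnd.to_datetime(dates, format="%d/%m/%Y")

print(month_first.dt.month.tolist())  # [1, 3]
print(day_first.dt.month.tolist())    # [12, 4]
```

Stating the format explicitly also documents which layout the data is expected to have.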
Method 2: Using the DataFrame.astype() function
In this method, we will use the DataFrame.astype() function to convert the data type of a Pandas DataFrame column.
Example:
import pandas as pnd
# Creating the dataframe
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],
'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],
'Cost':[15400, 7000, 25000]})
# Print the dataframe
print ("The data is: ")
print (data_frame)
# Here, we are checking the data type of the 'Date' column
data_frame.info()
Output:
The data is:
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    3 non-null      object
 1   Event   3 non-null      object
 2   Cost    3 non-null      int64
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes
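Here, the output shows that the data type of the "Date" column in the DataFrame is "object", which means it holds strings. Now, we will use the astype() function to convert the column to datetime: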
import pandas as pnd
# Creating the dataframe
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],
'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],
'Cost':[15400, 7000, 25000]})
# Print the dataframe
print ("The data is: ")
print (data_frame)
# For converting the 'Date' column of DataFrame into datetime format
data_frame['Date'] = data_frame['Date'].astype('datetime64[ns]')
# Here, we are checking the data type of the 'Date' column
data_frame.info()
Output:
The data is:
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    3 non-null      datetime64[ns]
 1   Event   3 non-null      object
 2   Cost    3 non-null      int64
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes
Now we can see that, by using data_frame['Date'].astype('datetime64[ns]'), the "Date" column in the DataFrame has been converted to the datetime type.
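One practical difference between the two approaches shows up with dirty data. astype() takes no option for handling unparseable values, whereas pandas.to_datetime() accepts errors="coerce", which turns them into NaT. The sketch below uses a made-up series containing one invalid entry, and the variable names are illustrative assumptions.

```python
import pandas as pnd

# Hypothetical column with one value that is not a date.
raw = pnd.Series(["12/05/2021", "not a date", "01/12/2020"])

# astype() has no way to skip the bad value, so the conversion fails.
try:
    raw.astype("datetime64[ns]")
    astype_failed = False
except (ValueError, TypeError):
    astype_failed = True

# to_datetime() can replace unparseable values with NaT instead of raising.
parsed = pnd.to_datetime(raw, format="%m/%d/%Y", errors="coerce")

print(astype_failed)           # True
print(parsed.isna().tolist())  # [False, True, False]
```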
Method 3: Converting a "yymmdd" string column
Suppose a DataFrame column holds dates in "yymmdd" format, and we have to convert it from string to datetime.
Example:
import pandas as pnd
# Now, we will initialize the nested list with Dataset
play_list = [['210302', 67000], ['210901', 62000], ['210706', 61900],
['210402', 59000], ['210802', 74000],
['210804', 54050], ['210109', 57650], ['210509', 67300], ['210209', 76600]]
# Creating a pandas DataFrame
data_frame = pnd.DataFrame(play_list,columns = ['Date','Patient Number'])
# Print the dataframe
print ("The data is: ")
print (data_frame)
# Here, we are checking the data type of the 'Date' column
print (data_frame.dtypes)
Output:
The data is:
     Date  Patient Number
0  210302           67000
1  210901           62000
2  210706           61900
3  210402           59000
4  210802           74000
5  210804           54050
6  210109           57650
7  210509           67300
8  210209           76600
Date              object
Patient Number     int64
dtype: object
Here, the output shows that the data type of the "Date" column in the DataFrame is "object", that is, it holds strings. Now, we will use data_frame['Date'] = pnd.to_datetime(data_frame['Date'], format='%y%m%d') to convert the column to datetime.
import pandas as pnd
# Now, we will initialize the nested list with Dataset
play_list = [['210302', 67000], ['210901', 62000], ['210706', 61900],
['210402', 59000], ['210802', 74000],
['210804', 54050], ['210109', 57650], ['210509', 67300], ['210209', 76600]]
# creating a pandas dataframe
data_frame = pnd.DataFrame(play_list,columns = ['Date','Patient Number'])
# Print the dataframe
print ("The data is: ")
print (data_frame)
# For converting the 'Date' column of DataFrame into datetime format
data_frame['Date'] = pnd.to_datetime(data_frame['Date'], format = '%y%m%d')
# Here, we are checking the data type of the 'Date' column
print (data_frame.dtypes)
Output:
The data is:
     Date  Patient Number
0  210302           67000
1  210901           62000
2  210706           61900
3  210402           59000
4  210802           74000
5  210804           54050
6  210109           57650
7  210509           67300
8  210209           76600
Date              datetime64[ns]
Patient Number             int64
dtype: object
In the code above, we changed the data type of the "Date" column from "object" to "datetime64[ns]" by using pnd.to_datetime(data_frame['Date'], format='%y%m%d').
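To see what the "%y%m%d" format actually produces, the following sketch converts three of the rows above and inspects the result. It assumes the two-digit years are meant as 20xx, which is how "%y" interprets the value "21". The names iso_dates and earliest_count are illustrative, not part of the original example.

```python
import pandas as pnd

# A small subset of the rows from the example above.
data_frame = pnd.DataFrame({
    "Date": ["210302", "210901", "210109"],
    "Patient Number": [67000, 62000, 57650],
})

# %y = two-digit year, %m = month, %d = day.
data_frame["Date"] = pnd.to_datetime(data_frame["Date"], format="%y%m%d")

# Render the parsed dates in ISO form to check the interpretation.
iso_dates = data_frame["Date"].dt.strftime("%Y-%m-%d").tolist()

# Datetime columns can be compared, so we can look up the earliest row.
earliest_count = data_frame.loc[data_frame["Date"].idxmin(), "Patient Number"]

print(iso_dates)       # ['2021-03-02', '2021-09-01', '2021-01-09']
print(earliest_count)  # 57650
```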
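Method 4: Converting multiple columns
We can also use the pandas.to_datetime() function to convert multiple columns from string to datetime. In this example, the dates are in "YYYYMMDD" format.
Example:
import pandas as pnd
# Initializing the nested list with Data set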
Dataset_list = [['20210612', 54000, '20210812'],
['20210814', 65000, '20210614'],
['20210316', 71500, '20210316'],
['20210519', 45000, '20210119'],
['20210221', 98000, '20210221'],
['20210124', 23000, '20210724'],
['20210929', 12000, '20210924']]
# creating a pandas dataframe
data_frame = pnd.DataFrame(
Dataset_list, columns = ['Treatment_starting_Date',
'Patients Number',
'Treatment_ending_Date'])
# Print the dataframe
print ("The data is: ")
print (data_frame)
# Here, we are checking the data types of all the columns
print (data_frame.dtypes)
Output:
The data is:
   Treatment_starting_Date  Patients Number  Treatment_ending_Date
0                 20210612            54000               20210812
1                 20210814            65000               20210614
2                 20210316            71500               20210316
3                 20210519            45000               20210119
4                 20210221            98000               20210221
5                 20210124            23000               20210724
6                 20210929            12000               20210924
Treatment_starting_Date    object
Patients Number             int64
Treatment_ending_Date      object
dtype: object
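Here, the output shows that the data type of both date columns in the DataFrame is "object", which means they hold strings. Now, we will call pnd.to_datetime() with format='%Y%m%d' on each of the two date columns to convert them to datetime.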
import pandas as pnd
# Initializing the nested list with Data set
Dataset_list = [['20210612', 54000, '20210812'],
['20210814', 65000, '20210614'],
['20210316', 71500, '20210316'],
['20210519', 45000, '20210119'],
['20210221', 98000, '20210221'],
['20210124', 23000, '20210724'],
['20210929', 12000, '20210924']]
# creating a pandas dataframe
data_frame = pnd.DataFrame(
Dataset_list, columns = ['Treatment_starting_Date',
'Patients Number',
'Treatment_ending_Date'])
# Print the dataframe
print ("The data is: ")
print (data_frame)
# For converting the multiple columns of DataFrame into datetime format
data_frame['Treatment_starting_Date'] = pnd.to_datetime(
data_frame['Treatment_starting_Date'],
format = '%Y%m%d'
)
data_frame['Treatment_ending_Date'] = pnd.to_datetime(
data_frame['Treatment_ending_Date'],
format = '%Y%m%d'
)
# Here, we are checking the data types of all the columns
print (data_frame.dtypes)
Output:
The data is:
   Treatment_starting_Date  Patients Number  Treatment_ending_Date
0                 20210612            54000               20210812
1                 20210814            65000               20210614
2                 20210316            71500               20210316
3                 20210519            45000               20210119
4                 20210221            98000               20210221
5                 20210124            23000               20210724
6                 20210929            12000               20210924
Treatment_starting_Date    datetime64[ns]
Patients Number                     int64
Treatment_ending_Date      datetime64[ns]
dtype: object
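In the output above, we can see that the data types of "Treatment_starting_Date" and "Treatment_ending_Date" have been changed to datetime by using the pnd.to_datetime() function.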
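When several columns share the same layout, one possible shortcut is to convert them in a single step with DataFrame.apply(), which forwards the format keyword to pandas.to_datetime() for each column. The sketch below uses the first three rows of the example data. The date_columns list and the duration_days result are illustrative names, not part of the original code.

```python
import pandas as pnd

# First three rows of the example data above.
data_frame = pnd.DataFrame({
    "Treatment_starting_Date": ["20210612", "20210814", "20210316"],
    "Patients Number": [54000, 65000, 71500],
    "Treatment_ending_Date": ["20210812", "20210614", "20210316"],
})

# Convert every listed column with the same format in one call.
date_columns = ["Treatment_starting_Date", "Treatment_ending_Date"]
data_frame[date_columns] = data_frame[date_columns].apply(
    pnd.to_datetime, format="%Y%m%d"
)

# Datetime columns support arithmetic: subtracting them gives a timedelta.
duration = data_frame["Treatment_ending_Date"] - data_frame["Treatment_starting_Date"]
duration_days = duration.dt.days.tolist()

print(duration_days)  # [61, -61, 0]
```

The negative value comes from the second row of the sample data, where the ending date precedes the starting date. This kind of check only becomes possible once the columns are real datetime values.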
Conclusion
In this tutorial, we learned different ways to convert the column type of a Pandas DataFrame from string to datetime in Python.
Original article by AE2NK. If you repost it, please credit the source: https://www.506064.com/zh-tw/n/127496.html