Convert the Column Type from String to Datetime Format in Pandas DataFrame

Convert the Column Type from String to Datetime Format in Pandas DataFrame

When we work with data in Pandas DataFrame of Python, it is pretty usual to encounter time series data. Panday is a strong tool that can handle time-series data in Python, and we might need to convert the string into Datetime format in the given dataset.

In this tutorial, we will learn how to convert the DataFrame column of string into datetime format, “dd/mm/yy”. The user cannot execute any time-series based operations on the dates if they are not in the required format. To deal with this, we need to convert the dates into the required date-time format.

Different Approaches for Converting Datatype Format in Python:

In this section, we will discuss different approaches we can use for changing the datatype of Pandas DataFrame column from string to datetime:

Approach 1: Using pandas.to_datetime() Function

In this approach, we will use “pandas.to_datetime()” function for converting the datatype in Pandas DataFrame column.

Example:

Output:

The data is: 
         Date           Event                     Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000

RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Date    3 non-null      object 
 
1   Event   3 non-null      object
 2   Cost    3 non-null      int64 
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes

Here, in the output, we can see that the Datatype of the “Date” column in the DataFrame is “object”, which means it is a string. Now, we will convert the Datatype into datetime format by using the “pnd.to_datetime()” function:

Output:

The data is: 
         Date           Event                     Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
 ---  ------  --------------  -----         
 0   Date    3 non-null      datetime64[ns] 
 
1   Event   3 non-null      object        
 2   Cost    3 non-null      int64         
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes

Now, we can see that the format of the “Data” column in the DataFrame has been changed to the datetime format.

Approach 2: Using DataFrame.astype() Function.

In this approach, we will use “DataFrame.astype()” function for converting the datatype in Pandas DataFrame column.

Example:

Output:

The data is: 
         Date           Event                     Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Date    3 non-null      object 
 
1   Event   3 non-null      object
 2   Cost    3 non-null      int64 
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes

Here, in the output, we can see that the Datatype of the “Date” column in the DataFrame is “object”, which means it is a string. Now, we will convert the datatype into datetime format by using the “Data_Frame.astype()” function:

Output:

The data is: 
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Date    3 non-null      datetime64[ns] 

1   Event   3 non-null      object        
 2   Cost    3 non-null      int64         
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes

Now, we can see that the format of the “Data” column in the DataFrame has been changed to the datetime format by using data_frame[‘Date’].astype(‘datetime64[ns]’.

Approach 3:

Suppose we have a date in “yymmdd” format in the DataFrame column, and we have to convert it from a string to a datetime format.

Example:

Output:

The data is: 
     Date          Patient Number
0  210302           67000
1  210901           62000
2  210706           61900
3  210402           59000
4  210802           74000
5  210804           54050
6  210109           57650
7  210509           67300
8  210209           76600
 Date              object 

Patient Number     int64
dtype: object

Here, in the output we can see that the Datatype of the “Date” column in the DataFrame is “object”, that means, it is string. Now, we will convert the datatype into datetime format by using “data_frame[‘Date’] = pnd.to_datetime(data_frame[‘Date’], format = ‘%y%m%d’)” function.

Output:

The data is: 
     Date         Patient Number
0  210302           67000
1  210901           62000
2  210706           61900
3  210402           59000
4  210802           74000
5  210804           54050
6  210109           57650
7  210509           67300
8  210209           76600
 Date              datetime64[ns] 

Patient Number             int64
dtype: object

In the above code, we have changed the datatype of the column “Date” from “object” to “datetime64[ns]” by using “pnd.to_datetime(data_frame[‘Date’], format = ‘%y%m%d’)” function.

Approach 4:

We can convert multiple columns from “string” to “datetime” format, which means “YYYYMMDD” format, by using the “pandas.to_datetime()” function.

Output:

The data is: 
  Treatment_starting_Date   Patients Number     Treatment_ending_Date
0   20210612                54000                20210812
1   20210814                65000                20210614
2   20210316                71500                20210316
3   20210519                45000                20210119
4   20210221                98000                20210221
5   20210124                23000                20210724
6   20210929                12000                20210924
 Treatment_starting_Date    object 

Patients Number             int64
 Treatment_ending_Date      object 

dtype: object

Here, in the output, we can see that the Datatype of the “Date” column in the DataFrame is “object”, which means it is a string. Now, we will convert the datatype “Date” column into datetime format by using “pnd.to_datetime(data_frame[”], format = ‘%y%m%d’)” function.

Output:

The data is: 
  Treatment_starting_Date  Patients Number Treatment_ending_Date
0                20210612            54000              20210812
1                20210814            65000              20210614
2                20210316            71500              20210316
3                20210519            45000              20210119
4                20210221            98000              20210221
5                20210124            23000              20210724
6                20210929            12000              20210924
 Treatment_starting_Date    datetime64[ns] 

Patients Number                     int64
 Treatment_ending_Date      datetime64[ns] 

dtype: object

In the above output, we can see the datatype of “Treatment_starting_Date” and “Treatment_ending_Date” has been changed to datetime format by using the “pnd.to_datetime()” function.

Conclusion

In this tutorial, we learned different methods of converting the column type of Pandas DataFrame from string to datetime in Python.


原创文章,作者:ItWorker,如若转载,请注明出处:https://blog.ytso.com/tech/courses/263370.html

(0)
上一篇 2022年5月30日 18:47
下一篇 2022年5月30日 18:47

相关推荐

发表回复

登录后才能评论