Convert Dates from Excel Format to Standard with Python

Excel files store dates as serial numbers, representing the number of days since January 1, 1900 (Windows system) or January 1, 1904 (Mac system). To work properly with dates in Python, these numbers need to be converted into datetime objects.

1. Install the Required Library

The openpyxl library is commonly used to read Excel files (.xlsx). To install it:

pip install openpyxl

2. Conversion Example

In the following example, we will read an Excel file and convert the cell values containing dates into the standard ISO format (YYYY-MM-DD).

from openpyxl import load_workbook
from datetime import datetime, timedelta

# Load the Excel file
workbook = load_workbook('example.xlsx')
sheet = workbook.active

# Function to convert Excel number to date
def excel_date_to_datetime(excel_serial, date_system='windows'):
    if date_system == 'windows':
        start_date = datetime(1899, 12, 30)  # Excel treats 1900 as a leap year
    elif date_system == 'mac':
        start_date = datetime(1904, 1, 1)
    else:
        raise ValueError("Invalid system. Use 'windows' or 'mac'.")

    return start_date + timedelta(days=excel_serial)

# Extract and convert dates from the first column
for row in sheet.iter_rows(min_row=2, max_col=1, values_only=True):
    excel_value = row[0]
    if isinstance(excel_value, (int, float)):
        date = excel_date_to_datetime(excel_value)
        print(date.strftime('%Y-%m-%d'))
    else:
        print(f'Unconvertible value: {excel_value}')

3. Considerations

  • Always check the date reference system (Windows or Mac).
  • Cells with non-numeric values or already in datetime format should not be converted.
  • Excel treats 1900 as a leap year, which introduces a 1-day error for dates prior to March 1900.

Conclusion

With Python, it is easy to convert Excel-formatted dates into standard datetime objects. This allows for proper handling and manipulation of dates in further applications and analysis.

Back to top