This article will discuss best practices for parsing dates and getting a specific month, day, or year in the past or the future.
We will use different Python libraries to manipulate dates and times. Apart from the standard datetime module, we will use third-party packages such as dateparser and dateutil to extract dates from free-form, localized strings.
For example, the packages should be able to parse strings like “20 days ago”, “two months and two days ago”, “yesterday”, “445 days ago at noon”, et cetera.
Getting the first and the last day of any month
In this case, the datetime package can do the job. All we need to do is get the date today and then set the day to 1.
    from datetime import datetime, date  # date is needed for the date-only variant
    first_day = datetime.today().replace(day=1)  # keep the current time, set the day to 1
    print(first_day)
Output:
2022-05-01 09:11:00.894081
If you are only interested in the date and not the time, you can call date.today() in line two, then set the day value to 1.
Syntax:
<datetime.datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, …)>
<datetime.date(year, month, day)>
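As a minimal sketch of the date-only variant, the helper below resets the day of any date to 1. The name first_day_of_month is hypothetical (it is not part of datetime), and a fixed date is used so that the result is reproducible.

```python
from datetime import date


def first_day_of_month(d: date) -> date:
    """Return a copy of d with the day set to 1 (hypothetical helper)."""
    return d.replace(day=1)


# A fixed date keeps the example reproducible; pass date.today() for "now".
print(first_day_of_month(date(2022, 5, 17)))  # 2022-05-01
```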
We can get the last day of the month using calendar, a module in Python's standard library. Specifically, we will use the monthrange() function. It takes a year and a month and returns a tuple of two values: the weekday of the first day of that month and the number of days in the month. The weekday is an integer from 0 (Monday) through 6 (Sunday).
Syntax:
<weekday_first_day, number_of_days = calendar.monthrange(year, month)>
    import calendar
    a = calendar.monthrange(2024, 2)  # February of a leap year
    b = calendar.monthrange(2022, 5)  # May of 2022
    print(a)
    print(b)
Output:
(3, 29)
(6, 31)
From this example, we can note that February 2024 has 29 days (as expected, being a leap year) and that the first day of the month is 3 (Thursday).
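The tuple can also be unpacked into named variables, which makes the meaning of each position explicit. A small sketch follows; the variable names are illustrative, and calendar.THURSDAY is one of the weekday constants the calendar module provides.

```python
import calendar

# Unpack the tuple returned by monthrange() into named variables.
first_weekday, days_in_month = calendar.monthrange(2024, 2)

print(first_weekday)                       # 3 (Monday is 0, so 3 is Thursday)
print(days_in_month)                       # 29
print(first_weekday == calendar.THURSDAY)  # True
```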
We can then use the calendar.monthrange() function to get the last day of the month as follows:
    from datetime import date
    import calendar
    today = date.today()  # 2022-05-17 when this example was run
    days_in_month = calendar.monthrange(today.year, today.month)[1]  # number of days in the current month
    last_day_of_month = today.replace(day=days_in_month)
    print(last_day_of_month)
Output:
2022-05-31
Note that we have just borrowed the ideas we already know from how we got the month’s first day and used the calendar.monthrange() function to pick the number of days in a month for the last day of the month.
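The same idea works for any month, not just the current one. The sketch below wraps it in a function; last_day_of_month is a hypothetical name chosen for illustration.

```python
import calendar
from datetime import date


def last_day_of_month(d: date) -> date:
    """Return the last day of d's month (hypothetical helper)."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


print(last_day_of_month(date(2022, 5, 17)))  # 2022-05-31
print(last_day_of_month(date(2024, 2, 1)))   # 2024-02-29 (leap year)
print(last_day_of_month(date(2023, 2, 1)))   # 2023-02-28
```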
Getting the last day of the previous month
Since we already know how to get the first day of the month, we can use the timedelta() function to subtract one day from that.
Syntax:
<datetime.timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)>
    from datetime import date, timedelta
    last_day_of_prev_month = date.today().replace(day=1) - timedelta(days=1)
    print(last_day_of_prev_month)
Output:
2022-04-30
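Because the subtraction is done by timedelta, the approach also handles year boundaries and leap years without special cases. A short sketch with fixed dates follows; last_day_of_previous_month is a hypothetical helper name.

```python
from datetime import date, timedelta


def last_day_of_previous_month(d: date) -> date:
    """Step back one day from the first day of d's month (hypothetical helper)."""
    return d.replace(day=1) - timedelta(days=1)


print(last_day_of_previous_month(date(2022, 5, 17)))  # 2022-04-30
print(last_day_of_previous_month(date(2022, 1, 15)))  # 2021-12-31 (crosses the year boundary)
print(last_day_of_previous_month(date(2024, 3, 10)))  # 2024-02-29 (leap year)
```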
Getting the day of the week for a given date
Let us discuss two approaches here:
Approach #1 Using date.strftime(format)
The strftime() method returns a string representation of the given date, controlled by an explicit format string. To get the weekday, we will use the %A directive as follows. (For a complete list of formatting directives, you can read the documentation of strftime.)
    from datetime import datetime, date
    print(datetime.today().strftime("%A"))
    print(date(2021, 11, 21).strftime("%A"))
Output:
Tuesday
Sunday
Today is Tuesday, based on the first line of the output, and November 21, 2021, was a Sunday.
Approach #2 Using date.weekday()
This method returns the day of the week as an integer, where Monday is 0 and Sunday is 6. We can then convert the integer into the full name if we choose to, using either a list of day names or calendar.day_name.
    from datetime import date
    intd = date(2022, 1, 30).weekday()
    days_week = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]
    print(days_week[intd])
Or,
    import calendar
    from datetime import date
    intd = date(2022, 1, 30).weekday()
    print(calendar.day_name[intd])
Output:
Sunday
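Two details are worth keeping in mind. The date class also offers isoweekday(), which numbers the days from 1 (Monday) to 7 (Sunday), and both strftime("%A") and calendar.day_name return names in the current locale, so the text may not be English on every system. The sketch below compares the calls on a fixed date.

```python
import calendar
from datetime import date

d = date(2022, 1, 30)

print(d.weekday())     # 6 (Monday is 0, Sunday is 6)
print(d.isoweekday())  # 7 (Monday is 1, Sunday is 7)

# Both lookups follow the current locale, so they return the same name.
print(d.strftime("%A") == calendar.day_name[d.weekday()])  # True
```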
Smart Dates
This section discusses parsing localized dates in strings that do not follow a fixed format string of the kind datetime.strptime() requires (it takes the same format directives as strftime(), which we discussed in the previous section).
We will be parsing relative dates like “tomorrow”, “in 20 days”, “2 years and 2 weeks ago”, “yesterday”, etc. Let’s now discuss the two packages we can use to parse such dates.
Use case: Parsing relative dates is crucial when dealing with dates that have been recorded in different formats.
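To make the idea of a relative date concrete, here is a deliberately tiny sketch that resolves phrases such as “20 days ago” or “in 2 weeks” against a fixed base date, using only the standard library. The function parse_relative and its pattern are hypothetical. This is not how dateparser or dateutil work internally, and it only understands days and weeks.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Matches "<N> day(s)/week(s) ago" or "in <N> day(s)/week(s)".
_PATTERN = re.compile(
    r"^(?:(?P<past_n>\d+) (?P<past_unit>day|week)s? ago"
    r"|in (?P<future_n>\d+) (?P<future_unit>day|week)s?)$"
)


def parse_relative(text: str, base: datetime) -> Optional[datetime]:
    """Resolve a simple relative phrase against base; return None if unsupported."""
    words = text.strip().lower()
    if words == "yesterday":
        return base - timedelta(days=1)
    if words == "tomorrow":
        return base + timedelta(days=1)
    match = _PATTERN.match(words)
    if match is None:
        return None
    if match.group("past_n"):
        amount, unit, sign = int(match.group("past_n")), match.group("past_unit"), -1
    else:
        amount, unit, sign = int(match.group("future_n")), match.group("future_unit"), 1
    days = amount * (7 if unit == "week" else 1)
    return base + sign * timedelta(days=days)


base = datetime(2022, 5, 17, 12, 0)
print(parse_relative("20 days ago", base))     # 2022-04-27 12:00:00
print(parse_relative("in 2 weeks", base))      # 2022-05-31 12:00:00
print(parse_relative("yesterday", base))       # 2022-05-16 12:00:00
print(parse_relative("next blue moon", base))  # None
```

Even this toy version shows why a dedicated library helps: months, years, time zones, and other languages each need their own rules.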
Parsing relative dates using the dateparser library
The dateparser package can parse relative dates while taking time zones and languages into account; it can also look up dates in longer strings and supports different calendar systems. The general syntax of its parse() function is:
Syntax:
<dateparser.parse(date_string, date_formats=None, languages=None, locales=None, region=None, settings=None, detect_languages_function=None)>
Example:
    import dateparser

    date1 = dateparser.parse('12/12/12')
    print(date1)  # 2012-12-12 00:00:00
    # No time specified, so it defaults to 0H.

    date2 = dateparser.parse("today EST")
    print(date2)  # 2022-05-17 13:23:29.404542-05:00
    # The time now in the Eastern time zone.

    date3 = dateparser.parse("aujourd'hui +3.00", languages=["fr"])
    print(date3)  # 2022-05-17 21:23:29.429235
    # The time now, with "today" written in French and a
    # +3 hours offset.

    date4 = dateparser.parse('12 August 2012 at 11:02am', languages=['en'])
    print(date4)  # 2012-08-12 11:02:00

    date5 = dateparser.parse('next month', languages=['en'])
    print(date5)  # 2022-06-17 21:23:29.433097
    # Same day next month.

    date6 = dateparser.parse('in 2 months', languages=['en'])
    print(date6)  # 2022-07-17 21:23:29.434164

    date7 = dateparser.parse('2 years, 2 months and 2 weeks ago')
    print(date7)  # 2020-03-03 21:23:29.435484

    date8 = dateparser.parse('2 years, 2 months and 2 weeks ago 2hours')
    print(date8)  # 2020-03-03 19:23:29.436993
    # Worked fine even with a shoddy string description.

    date9 = dateparser.parse('445 days ago midnight', languages=["en"])
    print(date9)  # 2021-02-26 00:00:00
Parsing dates with dateutil package
Like dateparser.parse(), the parse() function in dateutil parses date strings and helps remove ambiguity between the date formats found in a dataset. Here are some examples of what dateutil can do:
    from dateutil.parser import parse
    from datetime import datetime

    default = datetime(year=2022, month=5, day=2)
    # Setting the default date explicitly;
    # otherwise, the default is set to today's date at 0H.

    date1 = parse("Wed Sep 30")
    print(date1)  # 2022-09-30 00:00:00

    date1 = parse("Wed Sep 30 at 2:09pm", default=default)
    print(date1)  # 2022-09-30 14:09:00

    date1 = parse("October 1")
    print(date1)  # 2022-10-01 00:00:00

    date1 = parse("1pm May 3")
    print(date1)  # 2022-05-03 13:00:00

    date1 = parse("13hours May 3")
    print(date1)  # 2022-05-03 13:00:00

    date1 = parse("13 May 1:00pm")
    print(date1)  # 2022-05-13 13:00:00

    date1 = parse("13 May 13:00")
    print(date1)  # 2022-05-13 13:00:00
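The default argument is what makes results like these reproducible: any field missing from the string is copied from the default datetime. A brief sketch follows, assuming dateutil is installed; the variable name base is illustrative.

```python
from datetime import datetime

from dateutil.parser import parse

# Fields missing from the string are copied from this base datetime.
base = datetime(2022, 5, 2)

print(parse("October 1", default=base))  # 2022-10-01 00:00:00
print(parse("1pm", default=base))        # 2022-05-02 13:00:00
print(parse("Sep 2019", default=base))   # 2019-09-02 00:00:00
```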
That is just an intro to what this beautiful package can do. You can read more about the different uses of dateutil in the documentation.