Preface
Time formats are unavoidable knowledge for any engineer, and crawler engineers are no exception. Crawler engineers need to store the same type of content from different websites in the same data table. Common examples are:
- The time format of site A is 2018-5
- The time format of site B is 3 days ago
- The time format of site C is 5-10 8:25
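To make the site B case concrete, here is a minimal sketch of turning a relative time such as "3 days ago" into an absolute one. The helper name parse_relative and the exact wording it accepts are assumptions made for illustration; real sites phrase relative times in many ways.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern: only "<n> minute(s)/hour(s)/day(s) ago" is recognized.
_RELATIVE = re.compile(r"(\d+)\s+(minute|hour|day)s?\s+ago")

def parse_relative(text, now):
    """Turn text such as '3 days ago' into a datetime relative to `now`."""
    match = _RELATIVE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported relative time: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    # timedelta takes plural keyword names: minutes, hours, days
    return now - timedelta(**{unit + "s": amount})

crawl_time = datetime(2018, 12, 14, 20, 0, 0)
print(parse_relative("3 days ago", crawl_time))  # 2018-12-11 20:00:00
```

Passing the crawl time in explicitly, instead of calling datetime.now() inside the helper, keeps the result reproducible.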
What time format should be used for database storage? Should you choose DATE or DATETIME when creating a field? What is YEAR for?
Python has built-in time modules such as time and datetime. When should you use time, and when should you choose datetime?
MySQL time types
When creating a database table, there are usually five time-related field types to choose from: TIME, DATE, DATETIME, TIMESTAMP, and YEAR.
The storage space and time format of each type are as follows:
- TIME type:
  - Storage space: 3 bytes
  - Time format: HH:MM:SS
  - Time range: -838:59:59 to 838:59:59
- DATE type:
  - Storage space: 3 bytes
  - Time format: YYYY-MM-DD
  - Time range: 1000-01-01 to 9999-12-31
- DATETIME type:
  - Storage space: 8 bytes (5 bytes plus fractional-seconds storage since MySQL 5.6.4)
  - Time format: YYYY-MM-DD HH:MM:SS
  - Time range: 1000-01-01 00:00:00 to 9999-12-31 23:59:59
- TIMESTAMP type:
  - Storage space: 4 bytes
  - Time format: YYYY-MM-DD HH:MM:SS
  - Time range: 1970-01-01 00:00:01 to 2038-01-19 03:14:07 UTC (in seconds)
- YEAR type:
  - Storage space: 1 byte
  - Time format: YYYY
  - Time range: 1901 to 2155 (in years)
YEAR is rarely used, and TIME is not common either. The common ones are DATE, DATETIME and TIMESTAMP.
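One practical difference between DATETIME and TIMESTAMP is the range. As a rough sketch, the check below suggests a column type for a given value; pick_column_type is a hypothetical helper, and it assumes the value is a timezone-aware datetime so that it can be compared with the UTC bounds listed above.

```python
from datetime import datetime, timezone

# Range of MySQL TIMESTAMP as listed above, interpreted as UTC.
TIMESTAMP_MIN = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
TIMESTAMP_MAX = datetime(2038, 1, 19, 3, 14, 7, tzinfo=timezone.utc)

def pick_column_type(dt):
    """Hypothetical helper: suggest TIMESTAMP when `dt` fits its range, else DATETIME.

    `dt` must be timezone-aware, otherwise the comparison raises TypeError.
    """
    if TIMESTAMP_MIN <= dt <= TIMESTAMP_MAX:
        return "TIMESTAMP"
    return "DATETIME"

print(pick_column_type(datetime(2018, 12, 14, tzinfo=timezone.utc)))  # TIMESTAMP
print(pick_column_type(datetime(2040, 1, 1, tzinfo=timezone.utc)))    # DATETIME
```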
Python's time
Python provides three time-related modules: time, datetime, and calendar. The time module has many functions for converting between common date formats. For example, the time.time() function returns the current timestamp:
import time
timestamp = time.time()
print(timestamp, type(timestamp))
The output timestamp is of type float:
1544788687.041193 <class 'float'>
Timestamps are well suited to date arithmetic. However, dates before 1970 cannot be represented as a non-negative timestamp, and dates too far in the future do not work either: systems that store the timestamp as a signed 32-bit integer only support times up to 2038. The time module contains built-in functions both for working with time and for converting between time formats.
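To see where the 2038 limit comes from, the sketch below converts timestamps to dates by adding seconds to the epoch. It assumes the counter is a signed 32-bit integer, which is the case the limit applies to; the names EPOCH and LIMIT_32BIT are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# A timestamp counts seconds since the epoch, so it converts to a date by addition.
print(EPOCH + timedelta(seconds=1544788687))  # 2018-12-14 11:58:07+00:00

# The largest value a signed 32-bit counter can hold is 2**31 - 1 seconds.
LIMIT_32BIT = EPOCH + timedelta(seconds=2**31 - 1)
print(LIMIT_32BIT)  # 2038-01-19 03:14:07+00:00
```

The second result matches the upper bound of the MySQL TIMESTAMP range.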
Python’s calendar
The calendar module's functions are calendar-related. Monday is the default first day of the week and Sunday the default last day; to change this setting, call the calendar.setfirstweekday() function.
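A small sketch of the effect of calendar.setfirstweekday(), using the first week of December 2018 (the 1st was a Saturday). The variable names are illustrative; zeros in the lists stand for days that belong to the previous month.

```python
import calendar

print(calendar.firstweekday())  # 0, i.e. Monday

# Weeks run Monday..Sunday by default.
week_monday_first = calendar.monthcalendar(2018, 12)[0]
print(week_monday_first)  # [0, 0, 0, 0, 0, 1, 2]

# Switch to Sunday..Saturday weeks.
calendar.setfirstweekday(calendar.SUNDAY)
week_sunday_first = calendar.monthcalendar(2018, 12)[0]
print(week_sunday_first)  # [0, 0, 0, 0, 0, 0, 1]

# The setting is module-wide, so restore the default afterwards.
calendar.setfirstweekday(calendar.MONDAY)
```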
The calendar module is used less often (in practice, in crawler and Django development), while the time and datetime modules appear more. What is the relationship between time and datetime?
- The time module: closer to the underlying system
- The datetime module: built on top of time, it adds a lot of functionality and provides more functions
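Because the two modules describe the same moment in different ways, values can be moved between them. The following is a minimal sketch of converting a time-module struct_time to a datetime and back; the variable names are illustrative.

```python
import time
from datetime import datetime

# time.strptime returns a struct_time
parsed = time.strptime("2018-12-14 20:07:33", "%Y-%m-%d %H:%M:%S")

# struct_time -> datetime: the first six fields are year, month, day, hour, minute, second
as_datetime = datetime(*parsed[:6])
print(as_datetime)  # 2018-12-14 20:07:33

# datetime -> struct_time
back = as_datetime.timetuple()
print(back.tm_year, back.tm_yday)  # 2018 348
```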
Usage comparison
1. Get the current time
import datetime, time
print(time.time())
print(datetime.datetime.now())
The resulting output is:
1544789253.025471
2018-12-14 20:07:33.025502
2. Format the current time
import datetime, time
# Current time with the time module
localtime = time.localtime(time.time())
print("Current time tuple:", localtime)
print("Unformatted:", time.time())
res1 = time.strftime('%Y-%m-%d', localtime)
print("strftime, date only:", res1)
res2 = time.strftime('%Y-%m-%d %H:%M:%S', localtime)
print("strftime, date and time:", res2)
# ------------------------------------------------
# Current time with the datetime module
time_now = datetime.datetime.now()
res3 = time_now.strftime("%Y-%m-%d")
res4 = time_now.strftime("%Y-%m-%d %H:%M:%S")
print("Unformatted current time:", time_now)
print("datetime strftime, date only:", res3)
print("datetime strftime, date and time:", res4)
The result is:
Current time tuple: time.struct_time(tm_year=2018, tm_mon=12, tm_mday=14, tm_hour=20, tm_min=18, tm_sec=11, tm_wday=4, tm_yday=348, tm_isdst=0)
Unformatted: 1544789891.681039
strftime, date only: 2018-12-14
strftime, date and time: 2018-12-14 20:18:11
Unformatted current time: 2018-12-14 20:18:11.681079
datetime strftime, date only: 2018-12-14
datetime strftime, date and time: 2018-12-14 20:18:11
As you can see here, neither module returns the current time in the date format we want by default (time.time() is not human-readable at all), so both need to be formatted using the strftime function.
3. Text time conversion
Here I'm referring to the times that crawlers get from other websites, usually in several formats:
- Long time: 2018-01-06 18:35:05, 2018-01-06 18:35
- Date: 2018-01-06
- Month: January 2018
- Time: 18:35:05
All the times that crawlers get are written for people to read, just with different delimiters. When storing them, crawler engineers want a unified time format, such as year-month-day hour:minute:second, or year-month-day. Where possible, timestamps can be used for convenient calculation. Note that a full date-time can be converted to another full date-time format, but a date alone cannot be converted directly into a full date-time, because the time of day is missing.
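One way to reach a unified format is to try a list of known formats in turn. The sketch below is only an illustration: to_unified and KNOWN_FORMATS are hypothetical names, the format list covers just the numeric examples above, and missing fields are filled with zeros, which may or may not be what a given table should store.

```python
from datetime import datetime

# Candidate formats, tried in order; extend the list for each new site.
KNOWN_FORMATS = (
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M",
    "%Y-%m-%d",
)

def to_unified(text, out_format="%Y-%m-%d %H:%M:%S"):
    """Hypothetical helper: parse `text` with the first matching format and re-format it."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(out_format)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized time format: {text!r}")

print(to_unified("2018-01-06 18:35"))  # 2018-01-06 18:35:00
print(to_unified("2018-01-06"))        # 2018-01-06 00:00:00
```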
If the time is a date such as 2018-01-06, no function can change it directly into the long time format (such as 2018-01-06 18:35:05); the time has to be converted. Conversions fall into two kinds, conversion between times of the same kind (only the delimiters differ) and conversion between times of different kinds:
The first scenario
Target: 2018-01-06 18:35:05 converted to 2018/01/06 18:35:05
There are two ways to do this.
Method 1: to convert between formats, first parse the string into a time array (a struct_time), then format the time array into the desired form:
import time
a = "2013-10-10 23:40:00"  # want to convert to "2013/10/10 23:40:00"
timeArray = time.strptime(a, "%Y-%m-%d %H:%M:%S")
otherStyleTime = time.strftime("%Y/%m/%d %H:%M:%S", timeArray)
print(timeArray)
print(otherStyleTime)
From the output results:
time.struct_time(tm_year=2013, tm_mon=10, tm_mday=10, tm_hour=23, tm_min=40, tm_sec=0, tm_wday=3, tm_yday=283, tm_isdst=-1)
2013/10/10 23:40:00
As you can see, I first use time.strptime to convert the string to a time array, and then use time.strftime to format the time array into the format I want.
Method 2: since the final formatted time is also a string (str), we can simply use replace in this case:
a = "2013-10-10 23:40:00"  # want to convert to "2013/10/10 23:40:00"
print(a.replace("-", "/"))
The output is:
2013/10/10 23:40:00
The second scenario
Target: 2018-01-06 converted to 2018-01-06 18:35:05
There are also two ways to do this.
The idea is to append an hours-minutes-seconds string to the date string, and then convert the result using either of the two methods above, for example:
a = "2013-10-10"  # want to convert to "2013/10/10 00:00:00"
ac = a + " 00:00:00"  # the leading space separates the date from the time
print(ac.replace("-", "/"))
The output is:
2013/10/10 00:00:00
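As an alternative to string concatenation, the date can be parsed first and the time attached afterwards. This is a sketch using datetime rather than the string approach described above; the variable names and the 23:40:00 time of day are illustrative.

```python
from datetime import datetime, time

a = "2013-10-10"
midnight = datetime.strptime(a, "%Y-%m-%d")  # missing time fields default to 00:00:00
print(midnight.strftime("%Y/%m/%d %H:%M:%S"))  # 2013/10/10 00:00:00

# If the time of day is known from elsewhere, attach it explicitly.
with_time = datetime.combine(midnight.date(), time(23, 40, 0))
print(with_time.strftime("%Y/%m/%d %H:%M:%S"))  # 2013/10/10 23:40:00
```

Parsing first also validates the date, which plain concatenation does not.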
The third scenario
Target: 2018-01-06 18:35:05 converted to 2018-01-06
Convert to time array, and then format by time array:
import time
a = "2013-10-10 23:40:00"  # want to convert to "2013/10/10"
timeArray = time.strptime(a, "%Y-%m-%d %H:%M:%S")
otherStyleTime = time.strftime("%Y/%m/%d", timeArray)
print(type(timeArray))
print(otherStyleTime)
From the output, timeArray is of type time.struct_time:
<class 'time.struct_time'>
2013/10/10
4. Time comparison operation
Strings cannot be subtracted from one another, so the time needs to be converted to another type first. The time array produced by time.strptime does not support arithmetic, but datetime objects do.
First, when the time formats are the same:
import datetime
d1 = datetime.datetime.strptime('2012-03-05 17:41:20', '%Y-%m-%d %H:%M:%S')
d2 = datetime.datetime.strptime('2012-03-05 16:41:20', '%Y-%m-%d %H:%M:%S')
delta = d1 - d2
print(type(d1))
print(delta.seconds)
print(delta)
The resulting output is:
<class 'datetime.datetime'>
3600
1:00:00
As the results show, two times in the same format can be subtracted after each is converted with datetime.datetime.strptime. On the result, .seconds gives the seconds part of the difference and .days gives the number of whole days.
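One detail worth a short illustration: .seconds holds only the part of the difference that is left after whole days are removed, so it is not the total. The sketch below uses two illustrative datetimes that are one day and one hour apart; total_seconds() gives the full difference.

```python
from datetime import datetime

d1 = datetime(2012, 3, 6, 17, 41, 20)
d2 = datetime(2012, 3, 5, 16, 41, 20)
delta = d1 - d2

print(delta)                  # 1 day, 1:00:00
print(delta.days)             # 1
print(delta.seconds)          # 3600 (the remainder after whole days)
print(delta.total_seconds())  # 90000.0 (days and seconds combined)
```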
Second, if the time formats are different but the converted types are the same, the times can also be compared:
import datetime
d1 = datetime.datetime.strptime('2012/03/05 17:41:20', '%Y/%m/%d %H:%M:%S')
d2 = datetime.datetime.strptime('2012-03-05 16:41:20', '%Y-%m-%d %H:%M:%S')
delta = d1 - d2
print(delta.seconds)
print(delta)
In this code, the two time strings are in different formats, but once each is parsed with its own format string they can be compared.
Third, calculating between a full date-time and a date alone works on the same principle: after conversion both are datetime objects, so they can be subtracted. For example, the difference between 2012/03/05 17:41:20 and 2012-03-01:
import datetime
d1 = datetime.datetime.strptime('2012/03/05 17:41:20', '%Y/%m/%d %H:%M:%S')
d2 = datetime.datetime.strptime('2012-03-01', '%Y-%m-%d')
delta = d1 - d2
print(delta.days, delta.seconds)
print(delta)
print(type(delta))
The output is
4 63680
4 days, 17:41:20
<class 'datetime.timedelta'>
As the output of type(delta) shows, the result is a datetime.timedelta, not a str. A timedelta can itself take part in time calculations.
For example, computing the time three days from now:
import datetime,time
now = datetime.datetime.now()
delta = datetime.timedelta(days=3)
n_days = now + delta
print(type(n_days))
print(n_days.strftime('%Y-%m-%d %H:%M:%S'))
The result is:
<class 'datetime.datetime'>
2018-01-21 10:26:14
Create a datetime.timedelta of 3 days, add it to the current time to get a datetime.datetime three days later, and format it for human reading with strftime.
5. Timestamp
Convert string time to timestamp:
import time
a = "2013-10-10 23:40:00"
# Convert to a time array
timeArray = time.strptime(a, "%Y-%m-%d %H:%M:%S")
# Convert to a timestamp (the time array is interpreted as local time)
timeStamp = time.mktime(timeArray)
print(timeArray)
print(timeStamp)
The output is:
time.struct_time(tm_year=2013, tm_mon=10, tm_mday=10, tm_hour=23, tm_min=40, tm_sec=0, tm_wday=3, tm_yday=283, tm_isdst=-1)
1381419600.0
As you can see, the time array produced by the time module and the timestamp are two different things.
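Since time.mktime treats the time array as local time, the timestamp above depends on the machine's time zone (1381419600.0 corresponds to UTC+8). If the crawled string is known to be in UTC, a zone-independent variant might look like the sketch below; treating the string as UTC is an assumption, not something the string itself states.

```python
import calendar
import time
from datetime import datetime, timezone

a = "2013-10-10 23:40:00"
fmt = "%Y-%m-%d %H:%M:%S"

# Interpret the time array as UTC instead of local time.
ts_utc = calendar.timegm(time.strptime(a, fmt))
print(ts_utc)  # 1381448400

# The same conversion with datetime, and back to a string again.
dt = datetime.strptime(a, fmt).replace(tzinfo=timezone.utc)
print(dt.timestamp())  # 1381448400.0
print(datetime.fromtimestamp(ts_utc, tz=timezone.utc).strftime(fmt))  # 2013-10-10 23:40:00
```

The two timestamps differ by 28800 seconds, which is exactly the eight-hour offset.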
strftime and strptime
These two functions are commonly used in Python.
The strftime function:
- The function receives a time tuple and returns it as a readable string in the format given by the format argument.
- Syntax: time.strftime(format[, t])
- format: the format string. t: an optional struct_time object; if omitted, the current local time is used.
- Returns a readable string.
import time
t = (2009, 2, 17, 17, 3, 38, 1, 48, 0)
t = time.mktime(t)
print(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(t)))
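The resulting output (on a machine whose local time zone is UTC+8) is:
2009-02-17 09:03:38
The strptime function:
- The function parses a time string into a time tuple according to the specified format.
- Syntax: time.strptime(string[, format])
- string: the time string. format: the format string.
- Returns a struct_time object.
The example below uses datetime.datetime.strptime, which takes the same two arguments but returns a datetime object: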
import datetime
d1 = datetime.datetime.strptime('20120305 17:41:20', '%Y%m%d %H:%M:%S')
d2 = datetime.datetime.strptime('2012-03-01', '%Y-%m-%d')
print(d1)
print(d2)
Results obtained:
2012-03-05 17:41:20
2012-03-01 00:00:00
Time formats and database storage
After all this groundwork, the ultimate goal is still to store the data. Here are four database time types:
- Field name => Data type
- r_time => time
- r_date => date
- r_datetime => datetime
- r_timestamp => timestamp
The time formats accepted by MySQL are as follows:
- Time format => Data type
- 17:35:05 => time
- 2018-3-1 => date
- 2018/3/1 17:35 => datetime
- 2018/3/1 17:35 => timestamp
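On the Python side, one parsed datetime can be formatted into a string for each of the four example fields. The sketch below is only an illustration: to_row is a hypothetical helper, and it writes values in full rather than relying on the database to complete abbreviated ones.

```python
from datetime import datetime

def to_row(dt):
    """Hypothetical helper: format one datetime for the four example columns."""
    full = dt.strftime("%Y-%m-%d %H:%M:%S")
    return {
        "r_time": dt.strftime("%H:%M:%S"),
        "r_date": dt.strftime("%Y-%m-%d"),
        "r_datetime": full,
        "r_timestamp": full,
    }

row = to_row(datetime(2018, 3, 1, 17, 35, 5))
print(row["r_time"])      # 17:35:05
print(row["r_date"])      # 2018-03-01
print(row["r_datetime"])  # 2018-03-01 17:35:05
```

The resulting strings could then be passed as query parameters to whichever MySQL driver is in use.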
The time type
The format of the time type is 17:35:05; other delimiters (17-35-05 or 17/35/05) cannot be used instead.
It can be shortened to 17:35; the database automatically fills in 00, and the stored value is 17:35:00.
If it is shortened to 17, the stored value is 00:00:17.
Of course, more exotic forms such as 17: or 17:35: produce an error.
The date type
The date type accepts 2018-3-1 and 2018/3/1; the stored value is 2018-03-01, with the zeros filled in automatically.
It can be abbreviated as [18/3/1], [17/3/1], [07/3/1], [97/3/1]; the database automatically completes the year, and the stored values are 2018-03-01, 2017-03-01, 2007-03-01 and 1997-03-01.
It cannot be abbreviated to [2017] or [2017/3], which may produce an error; the value must be a complete date.
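The same expansion can be done in Python before the value reaches the database, which makes the stored year explicit. In this sketch expand_short_date is a hypothetical helper; Python's %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068, which may not match the database's rule at the boundary.

```python
from datetime import datetime

def expand_short_date(text):
    """Hypothetical helper: expand a two-digit-year date such as '18/3/1' to YYYY-MM-DD."""
    return datetime.strptime(text, "%y/%m/%d").strftime("%Y-%m-%d")

for short in ("18/3/1", "17/3/1", "07/3/1", "97/3/1"):
    print(short, "->", expand_short_date(short))
```

An incomplete value such as 2017/3 does not match the format, so strptime raises ValueError instead of storing a guess.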
The datetime type
The datetime type accepts 2018-3-1 17:35:00 and 2018/3/1 17:35:00; the stored value is 2018-03-01 17:35:00.
It is a combination of date and time and shares many of their features.
It can be abbreviated as [18/3/1 17:35:05], [17/3/1 17:35], [07/3/1 17], [97/3/1 17]; the database automatically completes the year, and the stored values are 2018-03-01 17:35:05, 2017-03-01 17:35:00, 2007-03-01 17:00:00 and 1997-03-01 17:00:00. As you can see, it completes the time into a unified format. The difference is that if you write only 17 without minutes and seconds, the time type treats 17 as seconds, whereas datetime treats it as hours.
As with date, the year, month and day cannot be omitted; the date part must be complete.
The timestamp type
As described above, the format of timestamp is the same as that of datetime; the differences are the time range and the storage space. Its format and usage are otherwise the same as datetime's.