When sending and receiving express delivery address filling, we will often manually input the address to let the program intelligent identification, the standard address, such as xx city xx county/district XX road XX, but sometimes you can simply write: XX city XX county/district XX road XX, or XX province XX county/district XX road XX, or XX city XX road XX.
But some are not legitimate addresses, such as xx Street XX Province xx, or XX city XX province XX district XX.
So the question is, how do you know if an address is valid, and specifically, how do you program to know if a Chinese address is valid?
Although our brains can recognize it at a glance, it is not easy for a calculator to recognize it, the fundamental reason is that although the description of the address looks simple, it is still relatively complex context-dependent grammar.
For example, “Shanghai Beijing East Road XX, Nanjing Beijing East Road XX”, when scanning To Beijing East Road, whether the number behind it constitutes a correct address depends on the following, that is, the city name.
Fortunately, the context of the address is relatively simple and limited, although we can violently enumerate all provinces, cities, districts and streets. But what works is a finite state machine.
Every a finite state machine has a beginning and an end state, and a number of intermediate state, not on a arc with a state to the next state conditions, such as the current state of the above if it is a province, if you encounter a phrase and the area of into the area, if you meet a phrase and the city of entering the city.
An address is valid if it can go from the start state of the state machine to several intermediate states of the state machine and finally to the end state. Otherwise, it is invalid.
For example, xx City, xx province, XX District, XX is an invalid address, unable to walk from the city to the province.
Now we do this with a simple priority state machine, the code is annotated and easy to read
from enum import Enum
def isAddress(address: str) - >bool:
# define state
State = Enum("State"["STATE_INITIAL".# start
"STATE_PROVINCE".# province
"STATE_CITY".#,
"STATE_AREA".# district/county
"STATE_STREET".# street
"STATE_NUM".# no.
"STATE_END".# end
"STATE_ILLEGAL".# error status
])
def toAddressType(addr_slice : str) -> State:
if "Save" in addr_slice:
return State.STATE_PROVINCE
elif "The city" in addr_slice:
return State.STATE_CITY
elif "Area" in addr_slice or "County" in addr_slice:
return State.STATE_AREA
elif "The road" in addr_slice or "Street" in addr_slice:
return State.STATE_STREET
elif "No." in addr_slice:
return State.STATE_NUM
else:
return State.STATE_ILLEGAL
Define state transitions
transfer = {
It can be converted to provinces or municipalities initially
State.STATE_INITIAL: {
State.STATE_PROVINCE,
State.STATE_CITY,
},
Provinces can be transferred to cities or districts or counties
State.STATE_PROVINCE:{
State.STATE_CITY,
State.STATE_AREA,
},
# Cities can be subdistricts or streets
State.STATE_CITY: {
State.STATE_AREA,
State.STATE_STREET,
},
# Districts and counties can switch streets
State.STATE_AREA: {
State.STATE_STREET,
},
# Streets can be switched or terminated
State.STATE_STREET: {
State.STATE_NUM,
State.STATE_END,
},
The # sign can only be terminated
State.STATE_NUM: {
State.STATE_END,
},
}
st = State.STATE_INITIAL
for ch in address:
current_state = toAddressType(ch)
if current_state not in transfer[st]:
return False
st = current_state
return st in [State.STATE_STREET, State.STATE_NUM,State.STATE_END]
if __name__ == '__main__':
address1 = [Jiangsu Province."Suzhou".Wuzhong District.Zhongshan North Road."208"]
address2 = ["Suzhou".Wuzhong District.Zhongshan North Road."208"]
address3 = ["Suzhou"."Wujiang District".Zhongshan North Road."208"]
address4 = ["Suzhou"."Wujiang District"."208"]
address5 = ["Suzhou".Zhongshan North Road]
assert isAddress(address1)
assert isAddress(address2)
assert isAddress(address3)
assert isAddress(address5)
assert isAddress(address4) == False
Copy the code
Instead of dividing the entire address string, the address is directly written in the form of a list, mainly to illustrate the implementation and application of the state machine. The above code can only ensure that the address is valid from the format, but not ensure that the address is real and effective. If you want to judge that the address is real and effective, It is necessary to establish a hash table for all provinces, cities, districts, counties and streets in the country. The number of doors can be represented by the scope, and then the judgment of state transfer is made.
The transfer of the above code is a hash table, which is equivalent to enumerating all the correct transfer cases. It exhausts all the states that need to be escaped in any case and corresponding to any input.
The last word
This article shares how to implement a simple finite state machine, the code is more general, the above programming problem, let a person can not stop using this code to achieve, if it is helpful to you, please also like, pay attention to support, give people in the look, hand left lingering sweet.
Open source implementation of finite state machines:
- django-fsm
- python-state-machine
Follow me and learn a Little Python technique every day.