Pac Man: Understanding and Application of AI Search Algorithm (1)

Author: Light fire

Email address: [email protected]

UC Berkeley’s CS188: Introduction to AI is a well-structured and detailed course for getting started with AI. As the homework of its search algorithm chapter, Pac Man is more ingenious and exquisite in design, which is worthy of repeated study and thinking. Considering that there are only a few articles related to mining rare earth, the author plans to publish a series of articles to analyze Pac Man’s tasks in detail, so as to facilitate readers to learn and understand artificial intelligence and search algorithms.

Once you have the source code, the first step is to understand the overall structure of the project:

assetsA folder is used to store static resources. Initially, there is only one folder in itdemo.pngThe picture

The layouts folder houses map resources and allows us to make our own maps for testing. For a specific.lay file, it contains the following elements:
- %: stands for wall
- .: Bonus beans
- o: A capsule that can scare monsters
- P: The player’s (Pac-Man) initial position
- G: Initial position of monsters (multiple monsters can be placed)
By placing these elements, we can create a map of our own. This is a common way to store maps using text files for small games, and.lay is essentially the same as regular.txt.
Code files we need to read:
- $utils.py:$ It’s implemented insideStack,Queue,PriorityQueue,PriorityWithFunction,CounterAnd other data structures. Among themPriorityQueueIs implemented based on the small top heap, whose inner element is a triplet, but we only need to focus on the elementsitemAnd its prioritypriorityCan.PriorityQueueWithFunctionIt’s inheritedPriorityQueueAllows users to pass in custom evaluation functions.
- $pacman.py:$ Defines theGameStateClass and provides a set of interfaces through which you can not only learn the location and number of Pac-man and monster, but also specify themagentThe substates that occur after performing a particular action (critical in game problems). Of course, food, capsules, scores also support access. So, in general, yesGameStateClass, you can get the full state of the game.
- $game.py:$ Defines thePac ManSome of the basic classes of the game that need to be read are marked in the source code. Among them, special attention should be paid toGridClass, which will be used when rewriting evaluation functions.
We need to write the code file:
- $search.py:$ Here, we should implementDFS,BFS,UCS,A*The algorithm is applied to the path finding problem.
- SearchAgents. Py: searchAgents. Py: searchAgents. Py: at this point, we should be custom heuristic function, and in view of the maze, two concrete by modifying the cost function, let the pac-man as much as possible to get high marks.
- $multiAgents.py:$ Here, we should implementMinimaxThe algorithm,Alpha-BetaPrune and modify the rating function to create a smart Pac-Man mini-game.

If you are unfamiliar with some of the above terms, don’t worry, we will go through the basics and code structure in more detail later in the article.

Temporarily ignore the monster, respectively implement DFS, BFS, UCS, A* four search algorithms, let Pac Man eat A food in the maze.

This task requires us to write code in search.py. In fact, the file already declares that the above four functions should return a sequence of actions from which Pac-Man will act.

Each of the four functions needs to receive a problem parameter, which is actually an object of the PositionSearchProblem class in SearchAgents.py. It tells us where Pac-Man is and whether he has reached his destination.

By printing problem.getStartState(), we see that state refers to Pac-Man’s current position (x,y), as prompted by the source code comment. Given pac-Man’s mobility, we should use graph search and introduce a set of explorations to avoid expanding the same node.

Depth-first search

def depthFirstSearch(problem) :
    explored = set()
    result = util.Stack()
    frontier = util.Stack()

    result.push([])
    frontier.push(problem.getStartState())

    while True:
        if frontier.isEmpty():
            return []

        node = frontier.pop()
        action = result.pop()

        if problem.isGoalState(node):
            return action

        explored.add(node)
        children = problem.expand(node)

        for child in children:
            if child[0] not in explored and child[0] not in frontier.list:
                frontier.push(child[0])
                result.push(action + [child[1]])
Copy the code

Four search algorithms can be implemented by the above mode, but the data structure is different. forDFSWe’re used to writing it recursively, which is essentially taking advantage of the stack. If we do it iterativelyDFS, you need to manually open oneStackSimulate the behavior of a stack.
The challenge is to effectively record the search path, because ultimately what we need to return is a sequence of actions that should guide Pac-Man from the beginning to the end. By readingproblem.expandFunction source, know that the function return value for alistAnd thelistEach element in is one(child, action, stepCost)Triplet, whereactionRepresents theparentMoving tochildThe steps that need to be taken, that’s what we need to document. So, a straightforward way to do this is to just add this triple tofrontier, and then layer by layer maintenanceaction, let it representThe steps required to move the position from the starting point. This method is universal, and we’ll do it inUCSThe practice is adopted in.
However, we used an extra one hereresultStack, for tracingfrontierIn and out. In fact, the core point of the recording path is thatWe’re going toparentPart of the content is moved tochildFrom, and then fromparentHow to get to thechild, the path is recorded. This is also in the coderesult.push(action + [child[1]]The meaning of,actionIs afterparentHow do you get there from the starting pointparent.[child[1]]saidparenttochild“, splicing the two together, is from the starting point to the presentchildThe action sequence of.
willresultIs selected as andStack, can be synchronizedfrontierTo ensure that when the final state is searched,result popOut of theactionIt is also the path from the beginning to the end.

Width first search

def breadthFirstSearch(problem) :
    explored = set()
    result = util.Queue()
    frontier = util.Queue()

    result.push([])
    frontier.push(problem.getStartState())

    while True:
        if frontier.isEmpty():
            return []

        node = frontier.pop()
        action = result.pop()

        if problem.isGoalState(node):
            return action

        explored.add(node)
        children = problem.expand(node)

        for child in children:
            if child[0] not in explored and child[0] not in frontier.list:
                frontier.push(child[0])
                result.push(action + [child[1]])
Copy the code

As mentioned above,BFSAnd the implementation ofDFSSame thing, but willLIFOtheStackReplace toFIFOtheQueue. Because we generally use iteration to do thisBFS, so the above code looks more natural.
In contrast toDFS, layer by layer searchBFSThe global optimal solution can be guaranteed. Therefore, the actual run time can be found and utilizedBFSYou get more points thanDFSA few higher. But on the other hand,BFSOn average, it takes longer and consumes more memory.
We used oneQueueTo synchronize the tracking.frontierThe joining and leaving of the team. The above code can be used as a template program for scenarios with similar requirements.

Uniform cost search

def uniformCostSearch(problem) :
    explored = set()
    frontier = util.PriorityQueue()
    initial = (problem.getStartState(), [], 0)

    frontier.push(initial, 0)

    while True:
        if frontier.isEmpty():
            return []

        (node, result, value) = frontier.pop()

        if problem.isGoalState(node):
            return result

        explored.add(node)
        children = problem.expand(node)

        for child, action, cost in children:
            if child not in explored:
                temp = value + cost
                frontier.push((child, result + [action], temp), temp)
Copy the code

by $Dijkstra$ Proposed consistent cost searchUCSIt can be interpreted as the contour lineBFSBecause it is based on the root point to the current nodecostTo expand. thiscostIs true and definite, and should be distinguished from the evaluation value obtained by using heuristic function in the following paper.
If you want to rely oncostFor the node out of the queue and child node expansion, then the traditionalQueueIt was no longer enough for our needs, so we used a small top heap implementationPriorityQueueEach layer expandsfrontierThe node with the lowest cost in. Of course, since the priority queue data structure has been implemented in the source code, we can directly call. Here attachedPriorityQueueSource code, I personally think the implementation is quite wonderful.

class PriorityQueue:
    """ Implements a priority queue data structure. Each inserted item has a priority associated with it and the client is usually interested in quick retrieval of the lowest-priority item in the queue. This data structure allows O(1) access to the lowest-priority item. """
    def  __init__(self) :
        self.heap = []
        self.count = 0

    def push(self, item, priority) :
        entry = (priority, self.count, item)
        heapq.heappush(self.heap, entry)
        self.count += 1

    def pop(self) :
        (_, _, item) = heapq.heappop(self.heap)
        return item

    def isEmpty(self) :
        return len(self.heap) == 0

    def update(self, item, priority) :
        # If item already in priority queue with higher priority, update its priority and rebuild the heap.
        # If item already in priority queue with equal or lower priority, do nothing.
        # If item not in priority queue, do the same thing as self.push.
        for index, (p, c, i) in enumerate(self.heap):
            if i == item:
                if p <= priority:
                    break
                del self.heap[index]
                self.heap.append((priority, c, item))
                heapq.heapify(self.heap)
                break
        else:
            self.push(item, priority)
Copy the code

One thing to notepriorityandcostIs negatively correlated,costThe lower,priorityThe higher, therefore, inupdateIn the delta function, if we find the original deltapThan the new incoming parameterpriorityIf it is lower, it proves that the original path is better, so it is directbreak, not to update.
Go back toUCSCode implementation, this time wepushtofrontierThe element is a triplet, and the purpose is of course to record the path:

initial = (problem.getStartState(), [], 0)
frontier.push(initial, 0)
Copy the code

The reason for not using the aboveDFSandBFSIs recorded because in addition toaction, path costcostYou have to add up again. However, with this logging method, there is no need to call the source codeupdateDelta function, because even thoughstateThe same,pathandcostAlso different, so directpushGo to the priority queue.

A * search

def aStarSearch(problem, heuristic=nullHeuristic) :
    explored = set()
    frontier = util.PriorityQueue()
    initial = problem.getStartState()
    tot = heuristic(initial, problem)
    frontier.push((initial, [], tot), tot)

    while True:
        if frontier.isEmpty():
            return []

        (node, result, value) = frontier.pop()

        if problem.isGoalState(node):
            return result

        explored.add(node)
        children = problem.expand(node)

        for child, action, cost in children:
            if child not in explored:
                tmp = value + cost + heuristic(child, problem)
                frontier.push((child, result + [action], tmp), tmp)
Copy the code

A*The search isUCSandGreedy SearchA combination ofGreedy SearchIs to search entirely based on heuristics. This method does not guarantee top priority and completeness.A*Algorithm code structure andUCSIt’s basically the same, but with the addition of heuristic functionsheuristic. I first came into contact with the concept of heuristics when I was learning eight digits in my freshman year. It was a relatively basic problem. However, heuristics are ubiquitous, and even operational research transport planning uses them to quickly obtain initial base-feasible solutions. A good heuristic function can greatly optimize the original algorithm on the constant and speed up the search. But even so,A*The algorithm is still exponentially complex, but in large state space problems, it is much faster than the naive search algorithms described above.
If we don’t callheuristicSo here’s theA*The algorithm degenerates intoUCS. Therefore, we need to design our own heuristic function forA*In terms of algorithms, heuristic functions need to satisfy two properties:
- Admissibility: For any node, the estimate obtained by the heuristic function should $\leq$ The real path cost of the point to the termination state, i.e $heuristic\quad cost \leq actual\quad cost$ ;
- ConsistencyIt is equivalent to the triangle inequality.
For every node NNN and every n’n ‘of NNN generated by any action AAA, The estimated cost of reaching the goal from NNN is no greater than the step cost of getting to N ‘n ‘n ‘plus the Estimated cost of reaching the goal from n’n ‘
Under the premise of not exceeding the true path cost, the larger the value calculated by the heuristic function, the better. So forPac ManThe Hamiltonian distance is better than Euclidean distance.

def yourHeuristic(position, problem, info={}) :
    goal = problem.goal
    return abs(position[0] - goal[0]) + abs(position[1] - goal[1])
Copy the code

At this point, the task of the Pac Man job is complete. At present, we solve a simple pathfinding problem by using four search algorithms. In task 2, we will have a closer understanding of cost functions and project source code, while in task 3, we will be introduced to game problems, using Minimax algorithm, alpha-beta pruning, heuristic evaluation function to achieve a truly intelligent pac-Man mini-game.

Pac Man: Understanding and Application of AI Search Algorithm (1)

Depth-first search

Width first search

Uniform cost search

A * search

Related Posts

[Optimization algorithm] improved LMS algorithm -SVSLMS algorithm

Repetition of Pytorch and Tensorflow training results

Interview | $410 million of financing, valuation of $1.5 billion, the AI start-up company why so expensive?