AI PAC-MAN | Amit Solanki

Build an AI Pac-Man Agent with a Convolutional Neural Network

Problem Statement:

The Ms. Pac-Man environment is a classic reinforcement learning challenge in which an agent must learn to play the game well. The core objective is to achieve the highest possible score by:

Collecting all pellets: The primary goal is to consume all small and large pellets scattered across the maze.

Evading ghosts: Four ghosts (Blinky, Pinky, Inky, and Sue) constantly pursue Ms. Pac-Man, and contact with any of them costs a life.

Utilizing power pellets: Consuming a large "power pellet" temporarily makes the ghosts vulnerable, allowing Ms. Pac-Man to eat them for bonus points.

Strategic navigation: The agent must learn optimal paths through the maze to efficiently collect pellets and avoid or strategically consume ghosts.

Concretely, the agent must select actions from a discrete space of 9 possible movements (no-operation, the four cardinal directions, and the four diagonals) based only on visual observations of the game screen, an RGB image of 210x160x3 pixels. The challenge is to develop an AI that learns strategies for pathfinding, risk assessment, and timing that maximize its score across game difficulties and modes, while adapting to the moving ghosts and changing pellet configurations.
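To make the action-selection step concrete, here is a minimal sketch of epsilon-greedy selection over 9 actions. Epsilon-greedy is a common exploration rule for Q-learning agents, but it is an assumption here, since the text above does not name the rule used. The function name select_action and the sample Q-values are hypothetical.

```python
import random

N_ACTIONS = 9  # size of the Discrete(9) action space described above


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice given one estimated Q-value per action.

    With probability epsilon a random action is returned (exploration);
    otherwise the action with the highest Q-value is returned (exploitation).
    """
    if len(q_values) != N_ACTIONS:
        raise ValueError(f"expected {N_ACTIONS} Q-values, got {len(q_values)}")
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: q_values[a])


# Made-up Q-values, for illustration only.
example_q = [0.1, 0.5, 0.2, 0.0, 0.9, 0.3, 0.1, 0.0, 0.4]
greedy_action = select_action(example_q, epsilon=0.0)  # always exploits
```

In a real agent the Q-values would come from the network's forward pass on the current frame, and epsilon would usually be decayed over training.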

Project Details:

Description

Your goal is to collect all of the pellets on the screen while avoiding the ghosts.

Actions

MsPacman has the action space Discrete(9). The nine actions are NOOP, UP, RIGHT, LEFT, DOWN, UPRIGHT, UPLEFT, DOWNRIGHT, and DOWNLEFT, indexed 0 to 8 in that order. To enable all 18 possible actions that can be performed on an Atari 2600, pass full_action_space=True to gymnasium.make when creating the environment.
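For readable agent code, the reduced action set can be given names. The sketch below assumes the index order of the reduced action set as I understand it from the Gymnasium documentation; the enum name MsPacmanAction and the helper is_diagonal are hypothetical. It is worth checking the order against env.unwrapped.get_action_meanings() on a live environment before relying on it.

```python
from enum import IntEnum


class MsPacmanAction(IntEnum):
    """Assumed index-to-meaning mapping for the Discrete(9) action space."""
    NOOP = 0
    UP = 1
    RIGHT = 2
    LEFT = 3
    DOWN = 4
    UPRIGHT = 5
    UPLEFT = 6
    DOWNRIGHT = 7
    DOWNLEFT = 8


DIAGONALS = {
    MsPacmanAction.UPRIGHT,
    MsPacmanAction.UPLEFT,
    MsPacmanAction.DOWNRIGHT,
    MsPacmanAction.DOWNLEFT,
}


def is_diagonal(action):
    """Return True if the integer action index is one of the four diagonals."""
    return MsPacmanAction(action) in DIAGONALS
```

Because IntEnum members are integers, a value such as MsPacmanAction.LEFT can be passed wherever an action index is expected.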

Observation Space

Atari environments have three possible observation types:

obs_type="rgb" -> observation_space=Box(0, 255, (210, 160, 3), np.uint8)
obs_type="ram" -> observation_space=Box(0, 255, (128,), np.uint8)
obs_type="grayscale" -> observation_space=Box(0, 255, (210, 160), np.uint8), a grayscale version of the "rgb" type

See the variants section of the Gymnasium Atari documentation for the observation type each environment id uses by default.
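The three observation types differ a lot in size, which matters when choosing network inputs and replay memory capacity. The short sketch below only restates the shapes listed above and counts the values per observation; the names OBS_SHAPES and values_per_observation are hypothetical.

```python
from math import prod

# Shapes copied from the three observation types listed above (all uint8).
OBS_SHAPES = {
    "rgb": (210, 160, 3),
    "ram": (128,),
    "grayscale": (210, 160),
}


def values_per_observation(obs_type):
    """Number of uint8 values (and therefore bytes) in one observation."""
    return prod(OBS_SHAPES[obs_type])


sizes = {name: values_per_observation(name) for name in OBS_SHAPES}
```

The "grayscale" type holds one third as many values as "rgb", which is one reason frames are often converted to grayscale or downscaled before being fed to a network.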

Starting State

Each episode starts from a reset game: the maze is full of pellets, and Ms. Pac-Man and the ghosts are at their starting positions.

Episode Termination

The episode finishes if:

Ms. Pac-Man loses all of her lives (game over); losing a single life does not end the episode;
a step limit is configured for the environment and is reached, in which case the episode is truncated rather than terminated.

Project Key Flow Chart for Ms. Pac-Man:

Installation of required packages and importing the libraries
- Installing Gymnasium
- Importing the Libraries

Building the AI
- Creating the Architecture of the Neural Network
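When designing a convolutional architecture, the size of the flattened feature map decides the input size of the first fully connected layer. The sketch below works through that arithmetic. The layer stack and the 84x84 input are assumptions borrowed from the widely used DQN layout, not the architecture of this project, and the names conv_output_size and feature_map_shape are hypothetical.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Output length along one axis of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_shape(height, width, layers):
    """Apply (out_channels, kernel, stride) layers in order.

    Returns (channels, height, width) of the final feature map.
    """
    channels = None
    for out_channels, kernel, stride in layers:
        height = conv_output_size(height, kernel, stride)
        width = conv_output_size(width, kernel, stride)
        channels = out_channels
    return channels, height, width


# Assumed example stack: (out_channels, kernel, stride) per layer.
ASSUMED_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]

c, h, w = feature_map_shape(84, 84, ASSUMED_LAYERS)
flattened = c * h * w  # input size of the first fully connected layer
```

For this assumed stack the side length shrinks from 84 to 20, then 9, then 7, so the flattened size is 64 * 7 * 7 = 3136. Changing the input resolution or any kernel or stride changes this number, so it is worth recomputing whenever the architecture changes.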

Training the AI
- Setting up the Environment
- Initializing the Hyperparameters
- Preprocessing the Frames
- Implementing the DCQN (Deep Convolutional Q-Network) Class
- Initializing the DCQN Agent
- Training the DCQN Agent
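At the heart of training a Q-network is the learning target for each sampled transition. The sketch below shows the standard one-step Q-learning target, reward plus the discounted best next Q-value, with no bootstrapping on terminal transitions. It is written in plain Python for clarity; the function name td_targets and the discount factor gamma=0.99 are assumptions, since the outline above does not list the hyperparameter values.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step Q-learning targets for a batch of transitions.

    rewards:        list of floats, one per transition
    next_q_values:  list of lists, Q-values of the next state per action
    dones:          list of bools, True if the transition ended the episode
    gamma:          discount factor (assumed value, tune as needed)
    """
    targets = []
    for reward, next_q, done in zip(rewards, next_q_values, dones):
        # No future value is added once the episode has ended.
        bootstrap = 0.0 if done else gamma * max(next_q)
        targets.append(reward + bootstrap)
    return targets


# Tiny made-up batch: one ongoing transition, one terminal transition.
example = td_targets(
    rewards=[1.0, 10.0],
    next_q_values=[[0.0, 2.0], [5.0, 1.0]],
    dones=[False, True],
    gamma=0.5,
)
```

In practice the same computation is done on tensors, with the next-state Q-values typically coming from a separate target network, and the loss is the error between these targets and the network's predicted Q-values for the actions actually taken.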

Visualizing the Results

Final Output: