
Building an AI for the Kung-Fu Master Environment

Problem Statement:

The Kung-Fu Master environment presents a challenging reinforcement learning task centered around a classic arcade beat 'em up game. The primary objective for an agent is to navigate through the Evil Wizard's temple, defeat a variety of enemies, and ultimately rescue Princess Victoria.

 

The problem specifically involves training an agent to:

 

  • Engage in combat: The agent must learn to effectively utilize a discrete action space of 14 possible movements and attacks (including directional movement, punching, and kicking) to defeat incoming adversaries.

 

  • Progress through levels: The agent needs to develop strategies for advancing through the temple, overcoming successive waves of enemies and boss characters.

 

  • Manage health and resources: Implicitly, the agent must learn to avoid taking excessive damage from enemies to survive and reach the princess.

 

  • Optimize for score: While the ultimate goal is rescue, the agent's performance will be measured by its ability to achieve a high score, indicating efficient enemy dispatch and progression.

 

The challenge lies in developing an AI that can interpret real-time visual observations (an RGB image of 210x160x3 pixels) and translate them into a sequence of strategic actions to successfully defeat opponents, traverse the game world, and complete the game's objective, all within the constraints of the game's mechanics and enemy behaviors.
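Because each raw frame is a 210×160×3 array of bytes, Atari agents commonly grayscale, downscale, and normalize observations before feeding them to a network. A minimal sketch in NumPy (the function name and the 2× striding downsample are illustrative choices, not part of the project code):

```python
import numpy as np

def preprocess_frame(frame: np.ndarray) -> np.ndarray:
    """Convert a 210x160x3 RGB frame to a normalized 105x80 grayscale array."""
    # Luminance-weighted grayscale conversion
    gray = frame @ np.array([0.299, 0.587, 0.114])
    # Naive 2x downsampling by striding (a real pipeline might use cv2.resize)
    small = gray[::2, ::2]
    # Scale pixel values to [0, 1] for the neural network
    return (small / 255.0).astype(np.float32)

frame = np.random.randint(0, 256, size=(210, 160, 3), dtype=np.uint8)
obs = preprocess_frame(frame)
print(obs.shape)  # (105, 80)
```

This shrinks each observation by a factor of 12 while keeping the spatial layout the agent needs to recognize enemies.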

Project Details:

 

 

Description

 

You are a Kung-Fu Master fighting your way through the Evil Wizard’s temple. Your goal is to rescue Princess Victoria, defeating various enemies along the way.

 

Actions

 

KungFuMaster has the action space Discrete(14), with the table below listing the meaning of each action. To enable all 18 possible actions that can be performed on an Atari 2600, pass full_action_space=True to gymnasium.make during initialization.

(Image: table of action values and their meanings)
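For reference, the 14 default actions as listed in the Gymnasium documentation for this environment can be written out as a plain lookup table (FIRE corresponds to an attack in this game); the dictionary below is only a readable reference, not code the environment requires:

```python
# Default (reduced) action set for KungFuMaster, per the Gymnasium/ALE docs
KUNG_FU_ACTIONS = {
    0: "NOOP",
    1: "UP",
    2: "RIGHT",
    3: "LEFT",
    4: "DOWN",
    5: "DOWNRIGHT",
    6: "DOWNLEFT",
    7: "RIGHTFIRE",
    8: "LEFTFIRE",
    9: "DOWNFIRE",
    10: "UPRIGHTFIRE",
    11: "UPLEFTFIRE",
    12: "DOWNRIGHTFIRE",
    13: "DOWNLEFTFIRE",
}

print(len(KUNG_FU_ACTIONS))  # 14
```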

Observation Space

 

Atari environments have three possible observation types:

 

  • obs_type="rgb" -> observation_space=Box(0, 255, (210, 160, 3), np.uint8)

  • obs_type="ram" -> observation_space=Box(0, 255, (128,), np.uint8)

  • obs_type="grayscale" -> Box(0, 255, (210, 160), np.uint8), a grayscale version of the "rgb" type
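Each element of these observations is a single uint8 byte, so the three types differ sharply in memory footprint, which matters when storing many frames for training. A quick calculation (pure Python, no environment needed):

```python
from math import prod

# Shapes of the three Atari observation types
shapes = {
    "rgb": (210, 160, 3),
    "grayscale": (210, 160),
    "ram": (128,),
}

# Bytes per observation = product of the shape dimensions (uint8 = 1 byte each)
for name, shape in shapes.items():
    print(f"{name}: {shape} -> {prod(shape)} bytes per observation")
```

The RGB frames (100,800 bytes each) are roughly 800 times larger than the 128-byte RAM observations, which is why pixel-based agents usually preprocess frames before learning from them.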

Project Key Flow Chart for Kung-Fu Master:

 

 

  • Installation of required packages and importing the libraries

    • Installing Gymnasium

    • Importing the Libraries

  • Building the AI

    • Creating the Architecture of Neural Network

  • Training the AI

    • Setting up the Environment

    • Initializing the Hyperparameters

    • Implementing the A3C class

    • Initializing the A3C Agent

    • Evaluating our A3C agent on a certain number of episodes

    • Managing multiple environments simultaneously

    • Training the A3C Agent

  • Visualizing the Results
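The "Creating the Architecture of Neural Network" and "Implementing the A3C class" steps rest on one core idea: a single network with a shared trunk and two heads, a policy head (actor) over the 14 actions and a scalar value head (critic). The forward pass can be sketched in plain NumPy; the layer sizes and variable names here are illustrative assumptions, not the project's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 14        # KungFuMaster's default discrete action space
STATE_DIM = 105 * 80  # e.g. a flattened, downscaled grayscale frame
HIDDEN = 128          # illustrative hidden-layer width

# Shared trunk weights plus separate actor and critic heads
W_shared = rng.normal(0, 0.01, (STATE_DIM, HIDDEN))
W_policy = rng.normal(0, 0.01, (HIDDEN, N_ACTIONS))
W_value = rng.normal(0, 0.01, (HIDDEN, 1))

def forward(state: np.ndarray):
    """Return (action_probs, state_value) for one flattened observation."""
    h = np.maximum(state @ W_shared, 0.0)  # ReLU trunk shared by both heads
    logits = h @ W_policy                  # actor head: one logit per action
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                   # softmax over the 14 actions
    value = float(h @ W_value)             # critic head: scalar state value
    return probs, value

state = rng.random(STATE_DIM).astype(np.float32)
probs, value = forward(state)
action = rng.choice(N_ACTIONS, p=probs)    # sample an action from the policy
print(probs.shape)  # (14,)
```

In A3C proper, several worker copies of such a network collect experience in parallel environments and update shared parameters with gradients from the actor and critic losses; the sketch above only shows the forward pass that both training and evaluation rely on.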

Final Output:
