Artificial Intelligence (CO3519) Tutorial 2.3: Calendar Week 46

2.3 Agent functions

Scenario description

The agent function Jupyter Notebook simulates a single agent moving on an 8x8 chessboard with a changing temperature field.

The agent can survive temperatures between -4 and +4; temperatures outside this range kill the agent immediately. In each time step, the agent can perform one of the following five actions (implemented as method calls to an object of the class Surroundings):

action_increment_x0(), i.e., moving one square in the positive x₀ direction.
action_decrement_x0(), i.e., moving one square in the negative x₀ direction.
action_increment_x1(), i.e., moving one square in the positive x₁ direction.
action_decrement_x1(), i.e., moving one square in the negative x₁ direction.
action_wait(), by which the agent remains on the present square until the next time step.
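As a reading aid, the five actions above can be summarised as displacements on the board. The following is a minimal sketch only: ACTION_DELTAS and position_after are hypothetical helper names that do not appear in the notebook, and the sketch deliberately ignores what happens at the edge of the board, since that behaviour is not described here.

```python
# Hypothetical lookup table (not part of the notebook): displacement
# (dx0, dx1) caused by each of the five actions described above.
ACTION_DELTAS = {
    "action_increment_x0": (1, 0),
    "action_decrement_x0": (-1, 0),
    "action_increment_x1": (0, 1),
    "action_decrement_x1": (0, -1),
    "action_wait": (0, 0),
}


def position_after(position, action_name):
    """Return the square reached from `position` by `action_name`.

    Assumption: no board-edge handling is attempted here.
    """
    dx0, dx1 = ACTION_DELTAS[action_name]
    x0, x1 = position
    return (x0 + dx0, x1 + dx1)


print(position_after((3, 4), "action_increment_x0"))  # (4, 4)
print(position_after((3, 4), "action_wait"))          # (3, 4)
```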

The changing temperature field is caused by moving heat sources which have an average velocity of about one half square per time step, theoretically giving the agent the opportunity to escape extreme temperatures in most cases. However, for decision making within the agent function, the agent only has access to the following percepts:

perceive_local_temperature(), which returns the temperature at the present square.
perceive_temperature_x0_higher(), which returns a list with two elements: the temperatures of the squares with x₀ increased by one and by two, respectively.
perceive_temperature_x0_lower(), which returns a list with two elements: the temperatures of the squares with x₀ decreased by one and by two, respectively.
perceive_temperature_x1_higher(), which returns a list with two elements: the temperatures of the squares with x₁ increased by one and by two, respectively.
perceive_temperature_x1_lower(), which returns a list with two elements: the temperatures of the squares with x₁ decreased by one and by two, respectively.

In each time step, these percepts are automatically collected by the agent method sensor_input() and stored in the following properties of the agent object:

self._local_temperature, a floating-point number
self._percept_x0_higher, a list containing two floating-point numbers
self._percept_x0_lower, a list containing two floating-point numbers
self._percept_x1_higher, a list containing two floating-point numbers
self._percept_x1_lower, a list containing two floating-point numbers
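If the percepts of one time step are to be passed around or stored together, it can help to bundle the five properties into a single record. The sketch below is one way of doing this, not something the notebook provides: PerceptSnapshot and from_lists are hypothetical names, and the numbers in the example are made up.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class PerceptSnapshot:
    """Hypothetical immutable record of the percepts of one time step."""
    local: float
    x0_higher: Tuple[float, float]
    x0_lower: Tuple[float, float]
    x1_higher: Tuple[float, float]
    x1_lower: Tuple[float, float]

    @classmethod
    def from_lists(cls, local: float,
                   x0_higher: Sequence[float], x0_lower: Sequence[float],
                   x1_higher: Sequence[float], x1_lower: Sequence[float]):
        # Copy the two-element lists into tuples so that later changes to
        # the agent's properties cannot alter a stored snapshot.
        return cls(local, tuple(x0_higher), tuple(x0_lower),
                   tuple(x1_higher), tuple(x1_lower))


# Made-up example values; index 0 is one square away, index 1 is two away.
snapshot = PerceptSnapshot.from_lists(
    0.5, [1.0, 2.0], [-0.5, -1.5], [0.0, 0.25], [3.0, 4.5])
print(snapshot.x0_higher[1])  # 2.0
```

Inside the agent, such a record would presumably be built from self._local_temperature, self._percept_x0_higher, and the other three properties after sensor_input() has run.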

The method live() simulates the life of the agent and returns the number of time steps after which the agent died.
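Because the temperature field changes from run to run, a single call to live() says little about the quality of an agent function; averaging over many runs gives a more useful figure. The following sketch assumes only that live() returns a number of time steps, as stated above. The names average_survival and make_agent are hypothetical, and since the constructor arguments of Agent and Surroundings are not given here, the agent is created by a caller-supplied factory. A dummy class stands in for the real agent in the example.

```python
def average_survival(make_agent, runs=100):
    """Average the value returned by live() over `runs` fresh agents.

    `make_agent` is a caller-supplied function (an assumption of this
    sketch) that builds a new agent in a new environment on every call.
    """
    total = 0
    for _ in range(runs):
        total += make_agent().live()
    return total / runs


class DummyAgent:
    """Stand-in for demonstration only; always 'dies' after 7 steps."""

    def live(self):
        return 7


print(average_survival(DummyAgent, runs=10))  # 7.0
```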

The agent function

Decision making is done in the agent function, i.e., the method agent_function of the class Agent. The solution currently given in the notebook makes some use of the sensory percepts, but presumably not in an optimal way:

        if abs(self._percept_x0_higher[1]) < abs(self._local_temperature):
            self._environment.action_increment_x0()
        elif abs(self._percept_x0_lower[1]) < abs(self._local_temperature):
            self._environment.action_decrement_x0()
        elif abs(self._percept_x1_higher[1]) < abs(self._local_temperature):
            self._environment.action_increment_x1()
        elif abs(self._percept_x1_lower[1]) < abs(self._local_temperature):
            self._environment.action_decrement_x1()
        else:
            self._environment.action_wait()

Above, self._percept_x0_higher[1] and the other similar quantities contain the temperature two squares away (one square away would be self._percept_x0_higher[0], etc.).
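Note that the rule above compares the temperature two squares away with the local one, although a move takes the agent only one square. One possible alternative, given purely as a sketch and not as the intended solution, is to rank each action by the temperature of the square the agent would land on and to use the square beyond it only as a tie-breaker. The function name choose_action and the example numbers are assumptions of this sketch.

```python
def choose_action(local, x0_higher, x0_lower, x1_higher, x1_lower):
    """Return the name of the action whose destination looks mildest.

    Each action is scored by the pair (|T| on the landing square,
    |T| on the square beyond it); pairs are compared lexicographically.
    For waiting, both entries are the local temperature.
    """
    scores = {
        # Listed first so that waiting wins exact ties.
        "action_wait": (abs(local), abs(local)),
        "action_increment_x0": (abs(x0_higher[0]), abs(x0_higher[1])),
        "action_decrement_x0": (abs(x0_lower[0]), abs(x0_lower[1])),
        "action_increment_x1": (abs(x1_higher[0]), abs(x1_higher[1])),
        "action_decrement_x1": (abs(x1_lower[0]), abs(x1_lower[1])),
    }
    return min(scores, key=scores.get)


# Made-up percepts: the local temperature is 2.0.
print(choose_action(2.0, [1.0, -3.5], [3.0, 0.5], [2.5, 2.0], [-1.5, -1.0]))
# action_increment_x0
```

With these made-up values, the rule from the notebook would pick action_decrement_x0 (because |0.5| < |2.0| two squares away) and step onto a square at 3.0, whereas the sketch steps onto the square at 1.0. Within the agent, the returned name could be dispatched with getattr(self._environment, name)().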

The task is to improve this agent function such that, on average, the agent will survive this scenario for a longer time.

You may (but need not) extend the data stored by the agent, e.g., to remember previous percepts, which is not done at present, or to evaluate the percepts in an intelligent way. The scenario itself (Surroundings, temperature tolerance, etc.) may not be changed, and collection of the percepts need not be changed since this is already done by the sensor_input() method. The agent and the surroundings may not interact in any other way than by means of the five percepts and the five possible actions mentioned above.
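As an illustration of remembering previous percepts, the sketch below keeps the last few local temperatures and estimates how quickly the temperature at the agent's position is changing. It uses only data the agent has already perceived, so it stays within the rules above. TemperatureTrend, record, and trend are hypothetical names, and the window length of three is an arbitrary choice.

```python
from collections import deque


class TemperatureTrend:
    """Hypothetical memory of the most recent local temperatures."""

    def __init__(self, maxlen=3):
        # A deque with maxlen silently drops the oldest reading when full.
        self._history = deque(maxlen=maxlen)

    def record(self, local_temperature):
        self._history.append(local_temperature)

    def trend(self):
        """Average change per time step over the stored window.

        Returns 0.0 while fewer than two readings are available.
        """
        if len(self._history) < 2:
            return 0.0
        span = len(self._history) - 1
        return (self._history[-1] - self._history[0]) / span


memory = TemperatureTrend()
for reading in (0.5, 1.0, 2.0):   # made-up local temperatures
    memory.record(reading)
print(memory.trend())  # 0.75
```

A positive trend together with an already high local temperature might be taken as a sign that a heat source is approaching; how to act on that is left open here.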

Submission deadline: 4th December 2021; discussion planned for 17th December 2021. Group work in groups of up to four people is welcome.