UCLan, CO3519 (2021/22), semester 1: Glossary

Agent

An agent is a system that interacts with its surroundings. It receives percepts through sensors and can carry out actions through actuators.
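
As an illustration (a minimal sketch, not part of the module materials; the environment and the agent's rule table below are made up), the percept-action cycle of a simple reflex agent can be written as a loop in which the agent repeatedly senses its environment and acts on it:

    # Minimal sketch of an agent's percept-action cycle (illustrative only;
    # the environment and the agent's rule table are made up).

    class ReflexAgent:
        def __init__(self, rules):
            self.rules = rules                     # maps a percept to an action

        def act(self, percept):
            return self.rules.get(percept, "wait") # default: do nothing

    class Environment:
        def __init__(self):
            self.state = "dirty"

        def percept(self):                         # what the agent's sensors report
            return self.state

        def apply(self, action):                   # what the agent's actuators change
            if action == "clean":
                self.state = "clean"

    agent = ReflexAgent({"dirty": "clean", "clean": "wait"})
    env = Environment()
    for _ in range(3):
        env.apply(agent.act(env.percept()))
    print(env.state)                               # -> clean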

See also: Inductive reasoning, Knowledge base, Rationality, Turing test.

Dimension

Colloquially, the dimension of a space, set, or object is very clear to us: A line or curve is one-dimensional, hence it has dimension 1; surfaces are two-dimensional, hence they have dimension 2; volumes have dimension 3; and so on. There are two major ways of defining the dimension of something that is, in the broadest possible sense, a geometrical object:

See also: Hypothesis, Optimization parameter, Optimization objective, Pareto optimality.

Hypothesis

In machine learning, a hypothesis is a function y = f(x0, x1, …) that predicts an outcome variable y on the basis of values of one or multiple independent variables x0, x1, …

Colloquially, it is also common to refer to the hypothesis space as "the hypothesis."
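
As a small sketch (the parameter values below are chosen arbitrarily), the hypothesis space of all straight lines can be written as a parameterized family of functions; fixing concrete parameter values selects a single hypothesis from that space:

    # Hypothesis space: all functions of the form f(x) = w1*x + w0,
    # parameterized by (w0, w1). The values below are chosen arbitrarily.

    def make_hypothesis(w0, w1):
        return lambda x: w1 * x + w0

    f = make_hypothesis(w0=1.0, w1=2.0)   # one hypothesis from the space
    print(f(3.0))                         # predicted outcome y for x = 3.0 -> 7.0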

See also: Dimension, Linear regression, Optimization objective, Optimization parameter, Overfitting, Regression analysis, Supervised learning, Validation and testing.

Inductive reasoning

There are three major ways for an intelligent agent to acquire knowledge, but only one of them, namely inductive reasoning, is commonly called learning in an AI context. These three ways to improve the agent's understanding of the world are:

  1. Direct input as part of the percepts received through the agent's sensors. Depending on the kind of agent, this may include observation of the surroundings or data entered into the system by an authorized user.
  2. Logical or mathematical reasoning by which the agent explores the consequences of its axioms, i.e., of the propositions that it accepts as true to begin with. All that can be proven to be true on that basis must also be accepted as true. This way of thinking is called deductive reasoning. For example: a) Jack talked to me yesterday, b) Jack is a human, and c) humans can only talk after they have been born; therefore, d) Jack was born yesterday or before. If the three premises a), b), and c) are indeed true, then the consequence d) must be true as well. This is called automated reasoning if it is done by an algorithm.
  3. Detecting patterns and trends, i.e., correlations between phenomena, in the knowledge available to the agent. This can be done to better explain and understand that knowledge, creating a mental model of it. It can also be done in order to predict the behaviour of an observed system under conditions for which no data have been provided so far. Normally, both of these goals are pursued at the same time. This is inductive reasoning; it is called machine learning if it is done by an algorithm.

Colloquially, all three of the items above might be called "learning," since they are ways of expanding an agent's knowledge. For example, learning from a book or a teacher is of the first type, whereas studying mathematics is often of the second type. In AI, the terms learning and, in particular, machine learning are typically understood to refer to inductive reasoning only, whereas reasoning without any further qualification usually means deductive reasoning.

See also: Agent, Knowledge base, Regression analysis, Supervised learning, Validation and testing.

Knowledge base

"The central component of a knowledge-based agent is its knowledge base" (Russell & Norvig 2021, p. 227). Interactions with a knowledge base take two forms: New sentences can be added to it (TELL), and it can be queried about what is known or what follows from what is known (ASK).

Knowledge bases are typically designed to support deductive reasoning (logical inference and theorem proving).
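
As a toy sketch (this is not the knowledge representation from Russell & Norvig, just an illustration with made-up facts and rules), a knowledge base can be modelled as a set of facts plus simple if-then rules; TELL adds a sentence, and ASK checks whether a sentence follows by exhaustive forward chaining:

    # Toy knowledge base: facts are strings, rules are (premises, conclusion)
    # pairs. ASK answers queries by forward chaining (deductive reasoning).

    class KnowledgeBase:
        def __init__(self):
            self.facts = set()
            self.rules = []                       # list of (premises, conclusion)

        def tell_fact(self, fact):
            self.facts.add(fact)

        def tell_rule(self, premises, conclusion):
            self.rules.append((frozenset(premises), conclusion))

        def ask(self, query):
            derived = set(self.facts)
            changed = True
            while changed:                        # apply rules until nothing new follows
                changed = False
                for premises, conclusion in self.rules:
                    if premises <= derived and conclusion not in derived:
                        derived.add(conclusion)
                        changed = True
            return query in derived

    kb = KnowledgeBase()
    kb.tell_fact("jack talked yesterday")
    kb.tell_rule({"jack talked yesterday"}, "jack was born before today")
    print(kb.ask("jack was born before today"))   # -> True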

See also: Agent, Inductive reasoning.

Linear regression

Linear regression is the most common way of conducting regression analysis. It uses the hypothesis space in which the model is linear in all the independent variables: The outcome variable is expressed as a linear combination of the independent variables.
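
A minimal sketch (with made-up data) of an ordinary least-squares fit with two independent variables, using numpy:

    # Ordinary least-squares fit y ~ w0 + w1*x1 + w2*x2 on made-up data.
    import numpy as np

    x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    x2 = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
    y = np.array([1.1, 2.9, 6.2, 6.8, 10.1])

    # Design matrix with a column of ones for the intercept w0.
    X = np.column_stack([np.ones_like(x1), x1, x2])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(w)          # fitted coefficients (w0, w1, w2)
    print(X @ w)      # predictions of the fitted linear model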

See also: Hypothesis, Regression analysis, Root mean square deviation.

Optimization objective

An optimization objective is a quantity that is used to formulate preferences over the outcomes of a decision-making scenario. In the case of a maximization objective, greater values are preferred; in the case of a minimization objective, smaller values are preferred.

See also: Dimension, Hypothesis, Optimization parameter, Pareto optimality, Rationality, SMART objective.

Optimization parameter

In decision making, an optimization parameter is a quantity over which the decision maker has direct control; a parameter value (or parameterization) is selected in order to obtain the best possible outcome for the optimization objective(s).
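
As a small sketch (the cost model below is made up), a decision maker might control a single parameter x and search over candidate values for the one that minimizes an objective:

    # Grid search over an optimization parameter x, minimizing a made-up
    # objective (total cost as a function of x).

    def objective(x):
        return 2.0 * x + 50.0 / x                  # minimization objective

    candidates = [i / 10 for i in range(10, 101)]  # parameter values 1.0 to 10.0
    best_x = min(candidates, key=objective)
    print(best_x, objective(best_x))               # -> 5.0 20.0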

See also: Dimension, Hypothesis, Optimization objective, Pareto optimality.

Overfitting

"We say a function is overfitting the data when it pays too much attention to the particular data set it is trained on, causing it to perform poorly on unseen data." Conversely, "a hypothesis is underfitting when it fails to find a pattern in the data" even though such a pattern is actually present (Russell & Norvig 2021, p. 673). Overfitting leads to a model that has excellent agreement with the training data, but poor predictive quality for the validation data. Therefore, such models can be eliminated during validation if they are compared against other, simpler models that do not exhibit overfitting.

See also: Hypothesis, Regression analysis, Supervised learning, Validation and testing.

Pareto optimality

In multicriteria optimization (MCO), rational compromises between multiple conflicting optimization objectives are characterized by Pareto optimality. A point in objective space is Pareto optimal if it is accessible and no other accessible point in objective space dominates it.
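
As a sketch (with made-up points in a two-dimensional objective space, both objectives to be minimized), the Pareto-optimal points of a finite set can be found by filtering out every point that is dominated by another accessible point:

    # Pareto filter for a finite set of points in a 2-dimensional objective
    # space; both objectives are to be minimized (made-up example data).

    def dominates(p, q):
        # p dominates q if p is at least as good in every objective
        # and strictly better in at least one of them.
        return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

    points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 2.5)]
    pareto_optimal = [p for p in points
                      if not any(dominates(q, p) for q in points if q != p)]
    print(pareto_optimal)   # -> [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (2.5, 2.5)]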

See also: Dimension, Optimization parameter, Optimization objective, Rationality, SMART objective.

Rationality

A rational agent is an agent that exhibits a tendency toward maximizing a performance measure (or minimizing it, depending on how it is formulated). In particular, rational preferences, or decisions and choices made by a rational agent, satisfy a series of constraints including, but not limited to, orderability (any two outcomes can be compared) and transitivity of preferences (Russell & Norvig 2021, p. 520).

For a more complete and more mathematically oriented discussion of rational choice, cf. Russell & Norvig (2021, p. 520f.).

See also: Agent, Pareto optimality, Optimization objective.

Regression analysis

Regression is a method or process in quantitative inductive reasoning, i.e., in machine learning applied to numerical data. The learning problem consists in finding out how an outcome variable y (also called the dependent variable) depends on the values of one or multiple independent variables.

Regression analysis can refer to a discussion of regression methodology (e.g., ordinary least squares fits based on the root mean square deviation) or to analysing the outcome of a regression, such as assessing the confidence in the model. Standardized techniques and concepts for analysing the regression outcome are particularly widespread for linear regression.

See also: Hypothesis, Inductive reasoning, Linear regression, Overfitting, Root mean square deviation, Supervised learning, Validation and testing.

Root mean square deviation

The root mean square deviation is a common measure for describing how far two data sets are apart. As the name suggests, it is the square root of the mean square deviation between the two data sets.
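
For two data sets a and b of equal length N, this means RMSD(a, b) = sqrt((1/N) * sum over i of (a_i - b_i)^2). A minimal sketch of the computation:

    # Root mean square deviation between two equally long data sets.
    import numpy as np

    def rmsd(a, b):
        a, b = np.asarray(a), np.asarray(b)
        return np.sqrt(np.mean((a - b) ** 2))

    print(rmsd([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))   # -> about 0.408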

See also: Linear regression, Regression analysis, Validation and testing.

SMART objective

Following Doran (1981), a management objective should be:

  1. Specific: target a specific area for improvement.
  2. Measurable: quantify, or at least suggest, an indicator of progress.
  3. Assignable: specify who will do it.
  4. Realistic: state what results can realistically be achieved with the available resources.
  5. Time-related: specify when the result(s) can be achieved.

Formulating SMART objectives is not only good organizational or management practice. By including an "indicator of progress," which in decision making and decision support is usually called a key performance indicator (KPI), it can help establish the optimization objective when expressing a scenario as an optimization problem. Multiple conflicting KPIs give rise to a multicriteria optimization (MCO) problem with a multidimensional objective space.

See also: Optimization objective, Pareto optimality.

Supervised learning

Supervised learning is one of the major approaches to machine learning, i.e., to inductive reasoning using computers. In supervised learning, an algorithm is given input-output pairs (or, equivalently, combinations of independent variables x and outcomes y). On this basis, the algorithm proceeds to develop a model of the provided data. However, the hypothesis space (that is, the kind of model) needs to be specified by the user; the algorithm will only determine a parameterization of the model.
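
As a sketch (with made-up input-output pairs), the division of labour looks as follows: the user specifies the hypothesis space (here, polynomials of a given degree), and the learning algorithm determines the parameterization:

    # The user chooses the hypothesis space (the polynomial degree); the
    # least-squares fit determines the parameters from input-output pairs.
    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # independent variables (inputs)
    y = np.array([0.9, 3.2, 4.8, 7.1, 9.0])   # outcome variable (outputs)

    for degree in (1, 2):                      # two candidate hypothesis spaces
        params = np.polyfit(x, y, degree)      # parameterization found by the algorithm
        print(degree, params)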

It is good practice to develop multiple candidate models (i.e., hypotheses taken from different hypothesis spaces) and compare their performance by validation.

The other two major approaches to machine learning are unsupervised learning, where a data set is given to the algorithm without any additional supporting information or hypothesis, and reinforcement learning, where the algorithm receives feedback, such as rewards or penalties, that drives it toward developing better models.

See also: Hypothesis, Inductive reasoning, Overfitting, Regression analysis, Validation and testing.

Turing test

The Turing test is a game-like criterion devised by Turing (1950) addressing the question: "Can machines think?"

The test is a game with three players: The machine, a human, and an interrogator (who is also human). The interrogator communicates with the other two players only in writing, without seeing or hearing them, and must decide which of the two is the machine; the machine passes the test, or wins the game, if the interrogator fails to identify it correctly.

According to Turing, the success rate of a machine at passing this test (winning the game) is a measure of its capacity to intelligently emulate human behaviour and communication. The success rate will depend on a multitude of factors, including the amount of text that can be exchanged until the interrogator must make a decision; Turing suggested five minutes, using typewriters. Since the opponent of the AI is an actual human, even a perfect AI cannot be expected to win more than 50% of the time.

See also: Agent.

Validation and testing

In supervised learning, it is often unclear which hypothesis is best suited for modelling the phenomena underlying a given data set. In that case, it is common practice to develop multiple candidate models based on different hypotheses (e.g., a linear, quadratic, and cubic model), compare them to each other by validation, and finally assess the accuracy of the selected model by testing.

For this purpose, the overall data set can be split up into three parts:

  1. Training data, used to fit the parameters of each candidate model.
  2. Validation data, used to compare the candidate models against each other and to select the best one.
  3. Test data, used to assess the accuracy of the selected model on data that played no role in training or model selection.

The split between training and validation data helps to prevent overfitting. The split between validation and test data prevents a selection bias: Since the validation data are used to choose the best hypothesis, the performance of the selected hypothesis, as measured on the validation data, will usually slightly overestimate its true performance; the test data then provide a final assessment that is free of this bias.
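
A minimal sketch (with synthetic data) of this workflow: linear, quadratic, and cubic candidate models are fitted on the training data, compared on the validation data, and only the selected model is assessed on the test data:

    # Train/validation/test workflow with synthetic data: fit candidate models
    # on the training set, select the best one on the validation set, and
    # report the selected model's error on the test set.
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 4.0, 60)
    y = 0.5 * x**2 - x + 2.0 + rng.normal(0.0, 0.2, x.size)   # quadratic trend plus noise

    idx = rng.permutation(x.size)
    train, val, test = idx[:36], idx[36:48], idx[48:]          # 60 / 20 / 20 percent split

    def rmsd(a, b):
        return np.sqrt(np.mean((a - b) ** 2))

    errors = {}
    for degree in (1, 2, 3):                                   # candidate hypothesis spaces
        coeffs = np.polyfit(x[train], y[train], degree)
        errors[degree] = rmsd(np.polyval(coeffs, x[val]), y[val])

    best = min(errors, key=errors.get)
    coeffs = np.polyfit(x[train], y[train], best)
    print(best, rmsd(np.polyval(coeffs, x[test]), y[test]))    # selected degree and test error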

See also: Hypothesis, Inductive reasoning, Overfitting, Regression analysis, Root mean square deviation, Supervised learning.

Referenced literature

Doran, G. T. (1981). There's a S.M.A.R.T. way to write management's goals and objectives. Management Review, 70(11), 35-36.

Russell, S. J. & Norvig, P. (2021). Artificial Intelligence: A Modern Approach, fourth edition. Pearson.

Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433-460.
