NMBU, DAT121: Glossary

NMBU REALTEK, DAT121, 2023 (August block): Glossary

Agent

nn agent f.

nb agent m.

Definition: An agent is a system that interacts with its surroundings. It receives percepts through sensors and can carry out actions through actuators.

Besides its sensors and actuators, an agent is characterized by its agent function: The way in which the past and present percepts determine or influence the present and future actions.
A goal-oriented agent is an agent that exhibits the tendency "to achieve a certain state of the world" (Conte 2009, p. 2578). Goal-orientation can emerge by a multitude of mechanisms, including biological evolution. It does not necessarily require the agent to be consciously aware of its goals.
"Intelligent agents are goal-oriented agents using their knowledge to solve problems, including taking decisions and planning actions" (Conte 2009, p. 2578). This requires the agent to have some kind of internal representation of its surroundings, and to store and process information about its surroundings.
A knowledge-based agent is an intelligent agent that uses a knowledge base to store and process its information about its surroundings.
A rational agent is an intelligent agent that exhibits rationality, i.e., a tendency toward optimizing a quantity: The performance measure of the agent. As in the case of goal-orientation, this does not necessarily require the agent to be aware of its performance measure.
"Goal-directed agents are intelligent agents that have an internal representation of the goals they [tend to] achieve" (Conte 2009, p. 2578).

Argument passing

nn parameteroverføring f.

nb parameteroverføring m., f.

Definition: Argument passing is the process of handing over data items to a procedure (e.g., a function in Python) when that procedure is called.

Most procedural or object-oriented programming languages provide "pass by value" or "pass by reference," or both, as a mechanism of argument passing. "Pass by value" means that the value of a data item is handed over directly, whereas in "pass by reference," the procedure receives the memory address of that value.
In Python, the mechanism of argument passing is called "pass by object reference," which means that an object reference is passed by value.
The variables through which a procedure receives its input are called parameters. Every time that the procedure is called, data items are passed to the procedure through these variables; these data items are called arguments. In other words, parameters are the names declared in the procedure definition, whereas arguments are the concrete values associated with the parameters at runtime.
It is therefore more correct to speak of "argument passing" than "parameter passing," and that is also more common in English. In Norwegian the expression with the word "parameter," as given above, seems to be more common.
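The behaviour of "pass by object reference" can be illustrated with a minimal sketch. The function names rebind and mutate, as well as the list data, are made up for this illustration.

```python
def rebind(items):
    # Assigning to the parameter name only changes the local reference;
    # the object held by the caller is not affected.
    items = [0, 0, 0]


def mutate(items):
    # Modifying the object through the reference is visible to the caller,
    # since caller and procedure refer to the same object.
    items.append(4)


data = [1, 2, 3]           # the argument: a concrete object passed at call time
rebind(data)
after_rebind = list(data)  # snapshot after the first call
mutate(data)
after_mutate = list(data)  # snapshot after the second call

print(after_rebind, after_mutate)
```

The first call leaves the caller's list unchanged, whereas the second call extends it; this is what would be expected if the reference itself is handed over by value.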

Competency question

(under construction)

Concept (class, entity type)

nn omgrep n. (klasse f., m., entitetstype m.)

nb begrep n. (klasse m., f., entitetstype m.)

Definition: A concept is a universal that is only instantiated by individuals.

From SKOS, a semantic artefact for organizing conceptual schemes: "Concepts are the units of thought - ideas, meanings, or (categories of) objects and events - which underlie many knowledge organization systems" (Isaac & Summers 2009).
In many settings and use cases, including in object-oriented programming, a concept is usually called a class. In E-R diagram terminology, it is called an entity type.
E-R terminology distinguishes between an entity type and the corresponding entity set, i.e., the set of all individuals that instantiate the entity type. In nominalist ontology, these two are the same - a universal is the set of its individual instances.

Dark data

(under construction)

Decorrelation time (autocorrelation time)

(under construction)

Dictionary (hash)

Definition and translation left open for discussion on 14th/15th August.

Dynamic array

nn dynamisk array n., m. (dynamisk rekke f.)

nb dynamisk array n., m. (dynamisk rekke m., f.)

Definition: A dynamic array is a dynamic data structure; specifically, it is an array with a dynamically adjustable size, usually reserving free memory capacity at its end or beginning (or both) for additional future elements. Therein, an array is a variable referring to a contiguous region in memory that can hold the content of multiple elementary variables or objects.

In Python, the dynamic array data structure is called a list; nn liste f., nb liste m., f.
Most programming languages require the elements of an array to be all of the same type. This does not apply to Python, however; any sort of object references can be elements of the same Python list.
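As a rough sketch of both points above, the following snippet builds a list with elements of different types and then observes how the allocated size of a growing list changes. It assumes the CPython interpreter, where sys.getsizeof reports the memory reserved for the list itself; the exact growth pattern is an implementation detail and may differ between versions.

```python
import sys

# Elements of different types can be held by the same Python list.
mixed = [1, "two", 3.0, [4]]

# Append elements one by one and record the reported size of the list.
numbers = []
sizes = []
for i in range(64):
    numbers.append(i)
    sizes.append(sys.getsizeof(numbers))

# Count how often the reported size changed between consecutive appends.
resizes = sum(1 for a, b in zip(sizes, sizes[1:]) if a != b)
print(len(numbers), "elements,", resizes, "size changes")
```

If free capacity is reserved at the end of the array, the size changes far less often than elements are appended.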

Dynamic typing

nn dynamisk typetildeling f.

nb dynamisk typetildeling m., f.

Definition: In a program or programming language using dynamic typing, variables do not need a declaration (and their type does not need to be explicitly specified) before being used; instead, the type is determined at runtime.
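A minimal sketch of dynamic typing in Python; the variable name value is chosen arbitrarily for this illustration.

```python
value = 42                          # no declaration; the name refers to an int
first_type = type(value).__name__

value = "forty-two"                 # the same name may later refer to a str
second_type = type(value).__name__

print(first_type, second_type)
```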

Foundational ontology (top-level ontology)

(under construction)

Global variable

nn global variabel m.

nb global variabel m.

Definition: A global variable is a variable that can be accessed through a name with an unrestricted scope. It has a name that resolves everywhere in the code.

The term local variable can sometimes refer to any variable that is not global (i.e., any variable with a restricted scope). But it is more common to just call those variables "local" that are declared within a function.
Some say that it is bad style to use any global variables at all. The main reason is that it is hard to debug or verify what a code does if it relies on write access to global variables.
Despite this, it is common practice to use global variables in script languages.
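The following sketch shows read and write access to a global variable in Python, where "everywhere" effectively means everywhere within one module. The names counter, read_counter, and increment are hypothetical.

```python
counter = 0   # global variable at module level


def read_counter():
    # Read access to a global name needs no special declaration.
    return counter


def increment():
    # Write access requires the 'global' statement; without it, the
    # assignment would create a new local variable instead.
    global counter
    counter += 1


increment()
increment()
print(read_counter())
```

The write access inside increment is the kind of side effect that makes code relying on global variables harder to debug or verify.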

Hypothesis

nn hypotese m.

nb hypotese m.

Definition: In machine learning, a hypothesis is a function y = f(x₀, x₁, …) that predicts an outcome variable y on the basis of values of one or multiple independent variables x₀, x₁, …

In this sense, hypothesis means the same as model or model function, where the independent variables in the hypothesis are the arguments of the model function; it can also be called a correlation or a regression.
A hypothesis space is a kind of model, or a model class, with free parameters (i.e., model parameters) that can be adjusted to optimize quantitative agreement with the data. For example, with two independent variables x₀ and x₁, the hypothesis space for linear regression is given by the space of functions that have the form f(x₀, x₁) = ax₀ + bx₁ + c, with three adjustable parameters: a, b, and c.
Model parameterization, or the process of computing a regression, means to solve an optimization problem where the hypothesis space is the parameter space, and some measure for model quality is the optimization objective.
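The linear hypothesis space from the example above can be sketched as follows. The helper names make_hypothesis and rmsd are assumptions made for this illustration, the data are made up, and the root mean square deviation is only one possible measure for model quality.

```python
import math


def make_hypothesis(a, b, c):
    """Return one member of the hypothesis space f(x0, x1) = a*x0 + b*x1 + c."""
    def f(x0, x1):
        return a * x0 + b * x1 + c
    return f


def rmsd(f, data):
    """Root mean square deviation between predicted and observed outcomes."""
    squared = [(f(x0, x1) - y) ** 2 for x0, x1, y in data]
    return math.sqrt(sum(squared) / len(squared))


# Made-up data points (x0, x1, y), generated from y = 2*x0 - x1 + 1.
data = [(0, 0, 1), (1, 0, 3), (0, 1, 0), (1, 1, 2)]

good = make_hypothesis(2, -1, 1)   # parameter values that match the data
poor = make_hypothesis(0, 0, 0)    # another point in the same parameter space

print(rmsd(good, data), rmsd(poor, data))
```

Each choice of (a, b, c) selects one hypothesis; model parameterization would search this three-dimensional parameter space for the values with the best model quality.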

Individual (entity, object)

nn individ n. (entitet m., objekt n.)

nb individ n. (entitet m., objekt n.)

Definition: Anything about which it can be meaningfully asked what concepts it instantiates is an individual.

In knowledge graphs, the individuals correspond to the nodes of the graph.
All individuals are instances of the top concept, namely, the concept "individual." In OWL, this concept is called owl:Thing.
Following Quine, the domain of quantification consists of individuals; i.e., all the possible values that could be assigned to a free variable are individuals. Accordingly, anything that exists is an individual.
In object-oriented programming, an individual is usually called an object; in entity-relationship diagrams, it is called an entity.

Influence diagram

(under construction)

Knowledge base

nn kunnskapsbase m.

nb kunnskapsbase m.

Definition: A knowledge base, given by K = (T, A), consists of an ontology T, describing universals, and a set of assertions A describing concrete instances of these universals.

See also: Agent, knowledge graph, ontology, resource.

Knowledge graph (ABox)

Definition left open for discussion on 18th, 21st or 22nd August.

Object-oriented programming

nn objektorientert programmering f.

nb objektorientert programmering m., f.

Definition left open for discussion on 17th August.

Object reference

nn objektreferanse m.

nb objektreferanse m.

Definition: A reference is an alias for data stored at a certain memory address. An object reference is a reference to an object; the memory address remains hidden from the programmer, who can use the reference as if it were the object itself.

Object-oriented programming languages usually distinguish between classes and elementary data types, and consequently between object variables (and their values, which are usually objects) and elementary variables (and their values, which are usually elementary data items). That distinction is absent in Python, where the elementary data are also objects.
In programming languages that use references (and/or similar constructs such as pointers), there is typically a distinction between references/pointers, which hold a memory address as their value (even though that may be hidden from the programmer), and "normal" variables which hold "normal" values. That distinction is also absent in Python, where every variable is an object reference.
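A short sketch of how object references behave in Python; the variable names a, b, and c are arbitrary.

```python
a = [1, 2, 3]
b = a            # b is a second reference to the same object
c = list(a)      # c refers to a new object with equal content

b.append(4)      # the change is visible through a, but not through c

same_object = a is b             # both names refer to one object
copy_is_distinct = a is not c    # the copy is a different object

# Elementary data items are objects as well.
number_is_object = isinstance(5, object)

print(a, c, same_object, copy_is_distinct, number_is_object)
```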

Ontology (TBox)

nn ontologi m. (t-boks m.)

nb ontologi m. (t-boks m.)

Definition: An ontology is a semantic artefact that formulates a conceptual scheme; it specifies, for a certain domain of knowledge and according to a certain paradigm within that domain, what kinds of entities there can be and how they can relate to each other.

Optimization objective

nn optimaliseringsmål n.

nb optimaliseringsmål n.

Definition: An optimization objective is a quantity that is used to formulate preferences for the outcome of a decision-making scenario. In case of a maximization objective, greater values are preferred, and in case of a minimization objective, smaller values are preferred.

An optimization objective can also be called an optimization criterion or a key performance indicator (KPI). If it is a minimization objective, it can also be called cost, and if it is a maximization objective, it can also be called utility.
In multicriteria optimization (MCO), multiple conflicting optimization objectives are used simultaneously. In this case, there is a multidimensional objective space; the dimension of the objective space is given by the number of optimization objectives.
The function f(x) that maps points in parameter space to points in objective space is called the objective function; in case of maximization, it is also referred to as a utility function, and in case of minimization, as a cost function.

See also: Pareto optimality, rationality.

Optimization parameter

nn optimaliseringsparameter m.

nb optimaliseringsparameter m.

Definition: An optimization parameter is a quantity over which the decision maker has direct control; a parameter value (or parameterization) is selected in order to obtain the best possible outcome for the optimization objective(s).

In multivariate optimization, there are multiple optimization parameters; accordingly, the parameter space is multidimensional.
If an optimization problem with multiple parameters is formulated adequately, it should be possible to vary all optimization parameters independently. If that is not the case and one of the parameters can be expressed as a function of the others, the problem needs to be reformulated, eliminating redundant parameter(s).

Pareto optimality

nn Pareto-optimalitet m.

nb Pareto-optimalitet m.

Definition: Within the framework of multicriteria optimization (MCO), a point in objective space is Pareto optimal if it is accessible and no other accessible point in objective space dominates it.

The Pareto front consists of all the Pareto optimal points in objective space.
A point y in objective space is accessible if there is a point x in parameter space such that f(x) = y, where f(x) is the objective function (utility function in case of maximization objectives, cost function in case of minimization objectives).
A point y in objective space dominates another point y' if there is at least one objective for which y is better than y', whereas there is no objective for which y' is better than y. If that is the case, there is no possible compromise between the objectives that would lead a rational agent to prefer y' over y. Therefore, if y is accessible, y' cannot be Pareto optimal.
By extension, a point x in parameter space can also be called Pareto optimal (e.g., a Pareto optimal solution, parameterization, or design choice) if y = f(x) is Pareto optimal, i.e., if the point y in objective space is on the Pareto front.
It is a common technique in AI-driven decision support to compute the Pareto front and the associated Pareto optimal design choices, presenting them to decision makers. All the other possible solutions can be discarded since they cannot correspond to a rational compromise between the objectives.
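The dominance relation and the Pareto front can be sketched as follows, assuming that all objectives are minimization objectives and that the set of accessible points is finite and already known. The function names and the example points are hypothetical.

```python
def dominates(y, y_other):
    """True if y dominates y_other (all objectives are to be minimized)."""
    # There is no objective for which y_other is better than y ...
    no_worse = all(a <= b for a, b in zip(y, y_other))
    # ... and at least one objective for which y is better than y_other.
    better_somewhere = any(a < b for a, b in zip(y, y_other))
    return no_worse and better_somewhere


def pareto_front(points):
    """Return the points that are not dominated by any other given point."""
    return [y for y in points
            if not any(dominates(other, y) for other in points)]


# Made-up accessible points in a two-dimensional objective space.
accessible = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]
front = pareto_front(accessible)
print(front)
```

Here (3, 4) is dominated by (2, 3), and (5, 2) is dominated by (4, 1), so only the remaining three points are on the Pareto front. The pairwise comparison is quadratic in the number of points, which is acceptable for a small illustration.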

Persistent identifier

(under construction)

Procedural programming

nn prosedyreprogrammering f.

nb prosedyreprogrammering m., f.

Definition: Procedural programming is the programming paradigm where procedures are employed as the highest-level device for structuring code and the program control flow.

In the above definition, procedures are understood to be contiguous blocks of code that provide an interface for argument passing from outside and can serve as the scope of local variables.
The procedures are called functions in many programming languages, including C/C++ and Python.
The paradigm is nonetheless called "procedural programming," not "functional programming" - that exists as well, but is a different programming paradigm.

p value

(under construction)

Rationality

nn rasjonalitet m.

nb rasjonalitet m.

Definition: Tendency toward minimizing a cost function or maximizing a performance measure. In particular, rational preferences, or decisions and choices made by a rational agent, must satisfy the following constraints (Russell & Norvig 2021, p. 520):

Transitivity: If the agent prefers A over B, and B over C, then the agent also prefers A over C whenever given the choice.
Monotonicity: Assume that the agent prefers A over B. The lotteries (i.e., probability distributions) X and Y both have A and B as their only possible outcomes, where the probability of A is greater in case of the lottery X than in case of the lottery Y. Then the agent prefers X over Y.
Continuity: If the agent prefers A over B, and B over C, then there is exactly one lottery X with A and C as its only possible outcomes such that the agent is indifferent between B and X, i.e., the agent neither prefers B over X nor does the agent prefer X over B. For any other lottery Y with the two possible outcomes A and C, the agent prefers Y over B if the chance of A is greater in case of Y than in case of X; conversely, the agent prefers B over Y if the chance of A is smaller in case of Y than in case of X.

For a more complete and more mathematically oriented discussion of rational choice, cf. Russell & Norvig (2021, p. 520f.).

Regression analysis

nn regresjonsanalyse m.

nb regresjonsanalyse m.

Regression is a method or process in supervised learning. The learning problem consists in finding out how an outcome variable y (also called the dependent variable) depends on the values of one or multiple independent variables.

Regression is based on a pre-selected functional form of the permitted hypotheses (models), i.e., on a pre-selected hypothesis space. It must be known what the model function looks like and what free (i.e., adjustable) parameters it contains; e.g., cubic regression produces a hypothesis according to which the outcome y is modelled by y = ax³ + bx² + cx + d, with the model parameters a, b, c, and d. The outcome of the regression would then consist in a set of values for these parameters.
The outcome of the regression (i.e., the outcome of the learning process) is often also called regression; to avoid confusion it is advisable to refer to the outcome of the regression as a model, a model function, a hypothesis, or a correlation. Unfortunately, all these terms can have other meanings as well.
On the origin of the term, Russell & Norvig (2021, p. 670) comment that the name regression for this problem and methodology is "admittedly obscure - a better name would have been function approximation or numeric prediction. But in 1886 Francis Galton wrote an influential article on the concept of regression to the mean (e.g., the children of tall parents are likely to be taller than average, but not as tall as the parents). Galton showed plots with what he called 'regression lines,' and readers came to associate the word 'regression' with the statistical technique of function approximation rather than with the topic of regression to the mean."
In Python, the statsmodels library can be used for regression analysis.

Regression analysis can refer to a discussion of regression methodology (e.g., ordinary least squares fits based on the root mean square deviation) or to analysing the outcome of a regression, such as assessing the confidence in the model. Standardized techniques and concepts for analysing the regression outcome are particularly widespread for linear regression.
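As a minimal sketch of how a regression produces parameter values, the following code carries out an ordinary least squares fit for the simplest case, a straight line y = ax + b, using the closed-form solution. This is a simplification compared to the cubic example above; the function name fit_line and the data are made up, and only the standard library is used instead of statsmodels.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and mixed deviations from the respective means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b


# Made-up data that follow y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

a, b = fit_line(xs, ys)
print(a, b)
```

The outcome of the regression is the pair of parameter values (a, b), which selects one hypothesis from the pre-selected hypothesis space of straight lines.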

Relation (object property, relationship type)

Definition left open for discussion on 17th August.

See also: Concept, relationship, resource.

Relationship

(under construction)

Reproducibility

(Reserved for discussion on 25th August.)

Residual quantity

(under construction)

Resource

nn ressurs m.

nb ressurs m.

Definition: In RDF, there are three kinds of resources, namely, concepts, relations, and individuals.

Scope

nn gyldigheitsområde n.

nb gyldighetsområde n.

Definition: The scope of a name (e.g., for the name of an object reference) is the region within the source code within which that name can be resolved.
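A small sketch of a name with restricted scope in Python; the names area and pi_approx are hypothetical.

```python
def area(radius):
    pi_approx = 3.14159           # local name: its scope is the function body
    return pi_approx * radius ** 2


result = area(1.0)

# Outside the function, the local name cannot be resolved.
try:
    pi_approx
    resolved_outside = True
except NameError:
    resolved_outside = False

print(result, resolved_outside)
```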

Script language (interpreted language)

nn skriptspråk n. (fortolka språk n.)

nb skriptspråk n. (fortolket/-a språk n.)

Definition: A script language is a language that is most typically used for writing scripts, i.e., programs that require a run-time environment, such as an interpreter or shell, going through the code and executing it step by step.

In compiled programming languages, a compiler (and linker) is used to generate an executable binary file from the code. In interpreted programming languages, the code is not translated into binary/executable form, but instead processed by another program, the interpreter, every time that it is run.
Programming languages cannot be perfectly classified into compiled and interpreted languages. Most script languages, including Python, can be compiled; it is just not how they are most typically used. Java uses a run-time environment, but programs need to be compiled to be ready for that environment. Logic programs (e.g., in Prolog) use reasoners, sharing some features with typical interpreters, but they don't necessarily proceed step by step in a fixed order.
Typical features of script languages include dynamic typing and a greater reliance on global variables by many of their users.

See also: Dynamic typing, global variable.

Static array

nn statisk array n., m. (statisk rekke f.)

nb statisk array n., m. (statisk rekke m., f.)

Definition: An array is a variable referring to a contiguous region in memory that can hold the content of multiple elementary variables or objects. A static array is just that, without any special additional functionality, as opposed to a dynamic array which provides the additional functionality that it can be resized.

A two-dimensional static or dynamic array can also be legitimately called a table, nn tabell m., nb tabell m.

Triple

nn trippel n.

nb trippel n.

Definition: An RDF triple consists of a subject, a predicate, and an object, all of which need to be resources.
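Without relying on any RDF library, triples can be sketched as plain Python tuples. The resource names with the prefix ex: and the helper objects_of are hypothetical and only serve as an illustration.

```python
# Each triple is a tuple (subject, predicate, object).
triples = {
    ("ex:Fido", "rdf:type", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Fido", "ex:hasOwner", "ex:Alice"),
}


def objects_of(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


print(objects_of("ex:Fido", "rdf:type"))
```

In this sketch, ex:Fido and ex:Alice would be individuals, ex:Dog and ex:Animal concepts, and ex:hasOwner a relation.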

Validation and testing

In supervised learning, it is often unclear what hypothesis is the best for modelling the phenomena underlying a given data set. In that case, it is common practice to develop multiple candidate models based on different hypotheses (e.g., a linear, quadratic, and cubic model), compare them to each other by validation, and finally assess the accuracy of the selected model by testing.

For this purpose, the overall data set can be split up into three parts:

The training data are used to parameterize multiple candidate models, i.e., to adjust any free parameters in the models (such as a and b in y = ax + b), optimizing the accuracy of the model; one such method would be an ordinary least squares (OLS) fit, which minimizes the root mean square deviation between the actual data for the outcome y (using training data only) and the values obtained from the model.
The validation data are used to compare the candidate models against each other and, where appropriate, to the null hypothesis which states that the outcome y is constant. For example, the strategy may consist in selecting the model with the smallest root mean square deviation between the actual and the predicted outcome y (using validation data only).
The test data are used to provide an independent accuracy assessment for the final model. This permits statements on the margin of error, e.g., based on the root mean square deviation between the test data and the corresponding predictions for the outcome variable y. If a normal distribution is assumed for the deviation between actual and predicted data, using a margin of error given by two times the root mean square deviation (in both directions) leads to a 95.4% probability for observing a deviation that is smaller than the margin of error.

The split between training and validation data is helpful to prevent overfitting. The split between validation and test data prevents a selection bias: Since the validation data are used to choose the best hypothesis, the performance of the selected hypothesis will usually tend to be overestimated slightly if it is assessed on the validation data themselves.
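The three-way split can be sketched with two candidate models, the constant null hypothesis and a linear model. All names, the made-up data, and the interleaved split are assumptions for this illustration; in practice, the split would usually be randomized.

```python
import math


def fit_constant(xs, ys):
    """Null hypothesis: the outcome y is constant (mean of the training data)."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return lambda x: a * x + b


def rmsd(model, xs, ys):
    """Root mean square deviation between actual and predicted outcome."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Made-up data: y = 2x + 1 with a small alternating perturbation.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]

# Interleaved split into training, validation, and test data.
train = (xs[0::3], ys[0::3])
valid = (xs[1::3], ys[1::3])
test = (xs[2::3], ys[2::3])

# Parameterize the candidates on training data, select on validation data.
candidates = {"constant": fit_constant(*train), "linear": fit_line(*train)}
best_name = min(candidates, key=lambda name: rmsd(candidates[name], *valid))

# Independent accuracy assessment on test data.
test_rmsd = rmsd(candidates[best_name], *test)
margin_of_error = 2 * test_rmsd

# Probability that a normally distributed deviation stays within two
# standard deviations (assuming zero mean deviation).
coverage = math.erf(2 / math.sqrt(2))
print(best_name, margin_of_error, coverage)
```

The last quantity evaluates to approximately 0.954, in line with the 95.4% stated above.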

See also: Hypothesis, regression analysis.

Referenced literature

(Conte 2009) R. Conte, "Rational, goal-oriented agents," doi:10.1007/978-1-4614-1800-9_158, in R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, New York: Springer (ISBN 978-1-4614-1801-6), 2009.
(Galton 1886) F. Galton, "Regression towards mediocrity in hereditary stature," Journal of the Anthropological Institute of Great Britain and Ireland 15: 246-263, doi:10.2307/2841583, 1886.
(Isaac & Summers 2009) A. Isaac, E. Summers (eds.), SKOS Simple Knowledge Organization System Primer, W3C, 2009.
(Russell & Norvig 2021) S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 4th edn. (global), Harlow: Pearson (ISBN 978-1-29240113-3), 2021.
