NMBU REALTEK, DAT121, 2023 (August block): Glossary
Agent
Definition: An agent is a system that interacts with its surroundings. It receives
percepts through sensors and can carry out actions through actuators.
- Besides its sensors and actuators, an agent is characterized by
its agent function: The way in which the past and present percepts
determine or influence the present and future actions.
- A goal-oriented agent is an agent that exhibits the
tendency "to achieve a certain state of the
world" (Conte 2009, p. 2578).
Goal-orientation can emerge through a multitude of mechanisms, including biological evolution.
It does not necessarily require the agent to be
consciously aware of its goals.
- "Intelligent agents are goal-oriented agents using their
knowledge to solve problems, including taking decisions and planning
actions" (Conte 2009, p. 2578). This requires the agent to have some
kind of internal representation of its surroundings, and to store and
process information about its surroundings.
- A knowledge-based agent is an intelligent agent that uses
a knowledge base to store and process its
information about its surroundings.
- A rational agent is an intelligent agent that exhibits rationality,
i.e., a tendency toward optimizing a quantity: The performance measure
of the agent. As in the case of goal-orientation, this does not necessarily
require the agent to be aware of its performance measure.
- "Goal-directed agents are intelligent agents that have an
internal representation of the goals they [tend to] achieve" (Conte 2009, p. 2578).
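The notions of percept, action, and agent function can be illustrated by a minimal Python sketch. The thermostat scenario, the class name ThermostatAgent, and the temperature thresholds are assumptions made up for this illustration; they are not taken from the cited literature.

```python
class ThermostatAgent:
    """Hypothetical goal-oriented agent: tends to keep the temperature near a target."""

    def __init__(self, target=21.0, tolerance=1.0):
        self.target = target
        self.tolerance = tolerance
        self.percepts = []  # history of past and present percepts

    def perceive(self, temperature):
        """Sensor: store a new percept (a temperature reading in degrees Celsius)."""
        self.percepts.append(temperature)

    def act(self):
        """Agent function: map the percept history to an action for the actuator."""
        if not self.percepts:
            return "idle"
        current = self.percepts[-1]  # this simple agent only uses the latest percept
        if current < self.target - self.tolerance:
            return "heat"
        if current > self.target + self.tolerance:
            return "cool"
        return "idle"


agent = ThermostatAgent()
actions = []
for reading in [18.5, 20.5, 23.0]:
    agent.perceive(reading)
    actions.append(agent.act())
print(actions)  # ['heat', 'idle', 'cool']
```

The agent in this sketch is goal-oriented without having any explicit internal representation of its goal; the goal is only implicit in the thresholds.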
See also: Influence diagram, knowledge base, Pareto optimality, rationality.
Argument passing
- nn parameteroverføring f.
- nb parameteroverføring m., f.
Definition: Argument passing is the process of handing over data items to a procedure (e.g., a function in Python) when that procedure is called.
- Most procedural programming or object-oriented programming languages provide "pass by value" or "pass by reference," or both, as a mechanism of argument passing. "Pass by value" means that the value of a data item is handed over directly, whereas in "pass by reference," the procedure receives the memory address of that value.
- In Python, the mechanism of argument passing is called "pass by object reference," which means that an object reference is passed by value.
- The variables through which a procedure receives data from its caller are called parameters. Every time the procedure is called, data items are passed to the procedure through these variables; these data items are called arguments. In other words, parameters are the names used in the definition of the procedure, whereas arguments are the concrete values associated with the parameters at runtime.
- It is therefore more correct to speak of "argument passing" than "parameter passing," and that is also more common in English. In Norwegian the expression with the word "parameter," as given above, seems to be more common.
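The effect of "pass by object reference" can be seen in a short Python sketch. The function names append_item and rebind are hypothetical and chosen only for this illustration.

```python
def append_item(items):
    """Mutates the object that the parameter refers to; the caller sees the change."""
    items.append(4)


def rebind(items):
    """Rebinds the local name only; the caller's object is unaffected."""
    items = [0]
    return items


numbers = [1, 2, 3]      # 'numbers' holds a reference to a list object
append_item(numbers)     # argument: the reference, copied into the parameter 'items'
print(numbers)           # [1, 2, 3, 4]

result = rebind(numbers)
print(numbers)           # [1, 2, 3, 4]
print(result)            # [0]
```

Here, items is the parameter and numbers supplies the argument. Since the reference itself is passed by value, assigning a new object to items inside rebind has no effect on numbers.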
See also: Object reference, procedural programming.
Competency question
(under construction)
See also: Ontology, triple.
Concept (class, entity type)
- nn omgrep n. (klasse f., m., entitetstype m.)
- nb begrep n. (klasse m., f., entitetstype m.)
Definition: A concept is a universal that is only instantiated by individuals.
- From SKOS, a semantic artefact for organizing conceptual schemes: "Concepts are the units of thought - ideas, meanings, or (categories of) objects and events - which underlie many knowledge organization systems" (Isaac & Summers 2009).
- In many settings and use cases, including in object-oriented programming, a concept is usually called a class. In E-R diagram terminology, it is called an entity type.
- E-R terminology distinguishes between an entity type and the corresponding entity set, i.e., the set of all individuals that instantiate the entity type. In nominalist ontology, these two are the same - a universal is the set of its individual instances.
See also: Foundational ontology, individual, object-oriented programming, ontology, relation, relationship, resource.
Dark data
(under construction)
See also: Ontology, regression analysis, reproducibility.
Decorrelation time (autocorrelation time)
(under construction)
See also: Regression analysis, reproducibility.
Dictionary (hash)
Definition and translation left open for discussion on 14th/15th August.
See also: Dynamic array, object reference, static array.
Dynamic array
- nn dynamisk array n., m. (dynamisk rekke f.)
- nb dynamisk array n., m. (dynamisk rekke m., f.)
Definition: A dynamic array is a dynamic data structure; specifically, it is an array with a dynamically adjustable size, usually reserving free memory capacity at its end or beginning (or both) for additional future elements. Here, an array is a variable referring to a contiguous region in memory that can hold the content of multiple elementary variables or objects.
- In Python, the dynamic array data structure is called a list; nn liste f., nb liste m., f.
- Most programming languages require the elements of an array to be all of the same type. This does not apply to Python, however; any sort of object references can be elements of the same Python list.
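As a brief illustration, the following sketch shows a Python list that grows at runtime and holds references to objects of different types. The variable name mixed is arbitrary.

```python
mixed = [1, "two", 3.0]      # elements of different types in one list
mixed.append([4, 5])         # the list grows at its end
mixed.insert(0, None)        # elements can also be inserted at the beginning
print(len(mixed))            # 5
print([type(item).__name__ for item in mixed])
# ['NoneType', 'int', 'str', 'float', 'list']
```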
See also: Dictionary, static array.
Dynamic typing
- nn dynamisk typetildeling f.
- nb dynamisk typetildeling m., f.
Definition: In a program or programming language using dynamic typing, variables do not need a declaration (and their type does not need to be explicitly specified) before being used; instead, the type is determined at runtime.
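A minimal Python sketch of dynamic typing follows; the variable names are arbitrary. The same name is first bound to an integer and later to a string, and the type is looked up at runtime in both cases.

```python
value = 42                        # no declaration; 'value' now refers to an int
first_type = type(value).__name__

value = "forty-two"               # the same name can later refer to a str
second_type = type(value).__name__

print(first_type, second_type)    # int str
```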
See also: Object reference, script language.
Foundational ontology (top-level ontology)
(under construction)
See also: Concept, ontology.
Global variable
- nn global variabel m.
- nb global variabel m.
Definition: A global variable is a variable that can be accessed through a name with an unrestricted scope. It has a name that resolves everywhere in the code.
- The term local variable can sometimes refer to any variable that is not global (i.e., any variable with a restricted scope). More commonly, however, only those variables that are declared within a function are called "local."
- Some say that it is bad style to use any global variables at all. The main reason is that it is hard to debug or verify what a piece of code does if it relies on write access to global variables.
- Despite this, it is common practice to use global variables in script languages.
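The difference between read and write access to a global variable in Python can be sketched as follows. The names counter, read_counter, increment, and shadow are hypothetical.

```python
counter = 0  # global variable: the name resolves everywhere in this module


def read_counter():
    """Read access to a global variable needs no special syntax."""
    return counter


def increment():
    """Write access requires the 'global' statement."""
    global counter
    counter += 1


def shadow():
    """Without 'global', the assignment creates a local variable instead."""
    counter = 100
    return counter


increment()
increment()
print(read_counter())  # 2
print(shadow())        # 100
print(counter)         # 2
```

The function shadow leaves the global variable unchanged, which is one example of why code that relies on write access to global variables can be hard to verify.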
See also: Scope, script language.
Hypothesis
- nn hypotese m.
- nb hypotese m.
Definition: In machine learning, a hypothesis is a
function y = f(x_0, x_1, …) that
predicts an outcome variable y
on the basis of values of one or multiple independent
variables x_0, x_1, …
- In this sense, hypothesis means the same
as model or model function, where the independent variables in the hypothesis
are the arguments of the model function; it can also be called a correlation or a regression.
- A hypothesis space is a kind of model, or a model class, with free parameters (i.e., model parameters)
that can be adjusted to optimize quantitative agreement with the data. For example, with two independent
variables x_0 and x_1, the hypothesis
space for linear regression is given by the space
of functions that have the form f(x_0, x_1)
= a x_0 + b x_1 + c, with three adjustable parameters: a, b, and c.
- Model parameterization, or the process of computing a regression,
means to solve an optimization problem where the hypothesis space is the parameter space,
and some measure for model quality is the optimization objective.
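The relation between hypothesis space, model parameters, and model quality can be sketched in Python. The function names make_hypothesis and rmsd and the data points are made up for this illustration; the root mean square deviation is just one possible measure of model quality.

```python
import math


def make_hypothesis(a, b, c):
    """Pick one hypothesis f(x0, x1) = a*x0 + b*x1 + c from the linear hypothesis space."""
    def f(x0, x1):
        return a * x0 + b * x1 + c
    return f


def rmsd(f, data):
    """Root mean square deviation between predicted and actual outcomes."""
    squared = [(f(x0, x1) - y) ** 2 for x0, x1, y in data]
    return math.sqrt(sum(squared) / len(squared))


# Made-up data points (x0, x1, y), generated exactly by y = 2*x0 - x1 + 1
data = [(0, 0, 1), (1, 0, 3), (0, 1, 0), (2, 1, 4)]

good = make_hypothesis(2, -1, 1)
poor = make_hypothesis(1, 1, 0)
print(rmsd(good, data))                     # 0.0
print(rmsd(poor, data) > rmsd(good, data))  # True
```

Each choice of (a, b, c) is a point in the parameter space; model parameterization would search this space for the point with the smallest deviation instead of comparing two hand-picked candidates.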
See also: Residual quantity, validation and testing.
Individual (entity, object)
- nn individ n. (entitet m., objekt n.)
- nb individ n. (entitet m., objekt n.)
Definition: Anything about which it can be meaningfully asked what concepts it instantiates is an individual.
- In knowledge graphs, the individuals correspond to the nodes of the graph.
- All individuals are instances of the top concept, namely, the concept "individual." In OWL, this concept is called owl:Thing.
- Following Quine, the domain of quantification consists of individuals; i.e., all the possible values that could be assigned to a free variable are individuals. Accordingly, anything that exists is an individual.
- In object-oriented programming, an individual is usually called an object; in entity-relationship diagrams, it is called an entity.
See also: Concept, knowledge graph, object-oriented programming, object reference, persistent identifier, relationship, resource.
Influence diagram
(under construction)
See also: Agent, optimization parameter, p value, rationality, regression analysis, reproducibility, residual quantity.
Knowledge base
- nn kunnskapsbase m.
- nb kunnskapsbase m.
Definition: A knowledge base, given by K = (T, A), consists of an ontology T, describing universals, and a set of assertions A describing concrete instances of these universals.
See also: Agent, knowledge graph, ontology, resource.
Knowledge graph (ABox)
Definition left open for discussion on 18th, 21st or 22nd August.
See also: Individual, knowledge base, persistent identifier, relationship, triple.
Object-oriented programming
- nn objektorientert programmering f.
- nb objektorientert programmering m., f.
Definition left open for discussion on 17th August.
See also: Concept, individual, object reference, procedural programming.
Object reference
- nn objektreferanse m.
- nb objektreferanse m.
Definition: A reference is an alias for data stored at a certain memory address. An object reference is a reference to an object; the memory address remains hidden from the programmer, who can use the reference as if it were the object itself.
- Object-oriented programming languages usually distinguish between classes and elementary data types, and consequently between object variables (and their values, which are usually objects) and elementary variables (and their values, which are usually elementary data items). That distinction is absent in Python, where the elementary data are also objects.
- In programming languages that use references (and/or similar constructs such as pointers), there is typically a distinction between references/pointers, which hold a memory address as their value (even though that may be hidden from the programmer), and "normal" variables which hold "normal" values. That distinction is also absent in Python, where every variable is an object reference.
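That every Python variable is an object reference becomes visible when two names refer to the same object. The following sketch uses arbitrary variable names.

```python
a = [1, 2, 3]
b = a            # 'b' is a second reference to the same list object
c = list(a)      # 'c' refers to a new list object with equal content

b.append(4)      # the change is visible through 'a' as well

print(a)         # [1, 2, 3, 4]
print(a is b)    # True: same object
print(a == c)    # False: 'c' still has three elements
print(a is c)    # False: different objects

n = 7
print(isinstance(n, object))  # True: elementary data are objects as well
```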
See also: Argument passing, dictionary, dynamic typing, individual, object-oriented programming, scope.
Ontology (TBox)
- nn ontologi m. (t-boks m.)
- nb ontologi m. (t-boks m.)
Definition: An ontology is a semantic artefact that formulates a conceptual scheme; it specifies, for a certain domain of knowledge and according to a certain paradigm within that domain, what kinds of entities there can be and how they can relate to each other.
See also: Competency question, concept, dark data, foundational ontology, knowledge base.
Optimization objective
- nn optimaliseringsmål n.
- nb optimaliseringsmål n.
Definition: An optimization objective is a quantity that is used to formulate preferences
for the outcome of a decision making scenario. In case of a maximization objective, greater
values are preferred, and in case of a minimization objective, smaller values are preferred.
- An optimization objective can also be called an optimization criterion
or a key performance indicator (KPI).
If it is a minimization objective, it can also be called cost,
and if it is a maximization objective, it can also be called utility.
- In multicriteria optimization (MCO),
multiple conflicting optimization objectives are used simultaneously.
In this case, there is a multidimensional objective space;
the dimension of the objective space is
given by the number of optimization objectives.
- The function f(x) that maps points in parameter space to points in objective
space is called the objective function; in case of maximization,
it is also referred to as a utility function, and in case of minimization,
as a cost function.
See also: Pareto optimality, rationality.
Optimization parameter
- nn optimaliseringsparameter m.
- nb optimaliseringsparameter m.
Definition: An optimization parameter is a quantity
over which the decision maker has direct control;
a parameter value (or parameterization) is selected in order to
obtain the best possible outcome for the optimization objective(s).
- In multivariate optimization, there are multiple optimization parameters;
accordingly, the parameter space is multidimensional.
- If an optimization problem with multiple parameters is formulated adequately,
it should be possible to vary all optimization parameters independently.
If that is not the case and one of the parameters can be expressed as a function
of the others, the problem needs to be reformulated, eliminating redundant parameter(s).
See also: Influence diagram, Pareto optimality.
Pareto optimality
- nn Pareto-optimalitet m.
- nb Pareto-optimalitet m.
Definition: Within the framework of multicriteria optimization (MCO),
a point in objective space is Pareto optimal
if it is accessible and no other accessible point in objective space dominates it.
- The Pareto front consists of all the Pareto optimal points in objective space.
- A point y in objective space is accessible if there is
a point x in parameter space such that f(x) = y,
where f(x) is the objective function (utility function in case of
maximization objectives, cost function in case of minimization objectives).
- A point y in objective space dominates another point y'
if there is at least one objective for which y is better than y',
whereas there is no objective for which y' is better than y.
If that is the case, there is no possible compromise between the objectives
that would lead a rational agent to prefer
y' over y. Therefore, if y is accessible, y' cannot be Pareto optimal.
- By extension, a point x in parameter space can also be called Pareto optimal
(e.g., a Pareto optimal solution, parameterization, or design choice)
if y = f(x) is Pareto optimal, i.e., if the
point y in objective space is on the Pareto front.
- It is a common technique in AI-driven decision support to compute
the Pareto front and the associated Pareto optimal design choices,
presenting them to decision makers.
All the other possible solutions can be discarded since they cannot
correspond to a rational compromise between the objectives.
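The definitions of dominance and of the Pareto front translate directly into a short Python sketch. It assumes that all objectives are minimization objectives; the function names and the five points in objective space are hypothetical.

```python
def dominates(y, y_prime):
    """True if y dominates y_prime, assuming all objectives are minimized."""
    no_worse = all(a <= b for a, b in zip(y, y_prime))
    better_somewhere = any(a < b for a, b in zip(y, y_prime))
    return no_worse and better_somewhere


def pareto_front(points):
    """Return the accessible points that no other accessible point dominates."""
    return [y for y in points
            if not any(dominates(other, y) for other in points)]


# Hypothetical accessible points in a two-dimensional objective space,
# e.g., (cost, emissions) of five design choices
accessible = [(1, 9), (2, 7), (4, 4), (5, 5), (7, 2)]
print(pareto_front(accessible))  # [(1, 9), (2, 7), (4, 4), (7, 2)]
```

The point (5, 5) is discarded because (4, 4) is better in both objectives. The pairwise comparison used here scales quadratically with the number of points, which is acceptable for a sketch but not necessarily for large problems.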
See also: Agent, optimization parameter, optimization objective, rationality.
Persistent identifier
(under construction)
See also: Individual, knowledge graph.
Procedural programming
- nn prosedyreprogrammering f.
- nb prosedyreprogrammering m., f.
Definition: Procedural programming is the programming paradigm where procedures are employed as the highest-level device for structuring code and the program control flow.
- In the above definition, procedures are understood to be contiguous blocks of code that provide an interface for argument passing from outside and can serve as the scope of local variables.
- The procedures are called functions in many programming languages, including C/C++ and Python.
- The paradigm is nonetheless called "procedural programming," not "functional programming"; the latter exists as well, but it is a different programming paradigm.
See also: Argument passing, object-oriented programming, scope.
p value
(under construction)
See also: Influence diagram, regression analysis.
Rationality
- nn rasjonalitet m.
- nb rasjonalitet m.
Definition: Tendency toward minimizing a cost function or
maximizing a performance measure.
In particular, rational preferences, or decisions and choices made by a rational agent,
must satisfy the following constraints (Russell & Norvig 2021, p. 520):
- Transitivity: If the agent prefers A over B, and B over C,
then the agent also prefers A over C whenever given the choice.
- Monotonicity: Assume that the agent prefers A over B. The
lotteries (i.e., probability distributions) X and Y both have
A and B as their only possible outcomes, where the
probability of A is greater in case of the lottery X
than in case of the lottery Y. Then the agent prefers X over Y.
- Continuity: If the agent prefers A over B, and B over C,
then there is exactly one lottery X with A and C as its only possible outcomes
such that the agent is indifferent between B and X, i.e.,
the agent neither prefers B over X nor does the agent prefer X over B.
For any other lottery Y with the two possible outcomes A and C,
the agent prefers Y over B if the chance of A is greater in case of Y than in case of X;
conversely, the agent prefers B over Y if the chance of A is smaller in case of Y than in case of X.
For a more complete and more mathematically oriented discussion
of rational choice, cf. Russell & Norvig (2021, p. 520f.).
See also: Agent, influence diagram, optimization objective, Pareto optimality.
Regression analysis
- nn regresjonsanalyse m.
- nb regresjonsanalyse m.
Regression is a method or process in supervised learning. The learning
problem consists in finding out how an outcome variable y (also called
the dependent variable) depends on the values
of one or multiple independent variables.
- Regression is based on a pre-selected functional form of the permitted hypotheses (models),
i.e., on a pre-selected hypothesis space.
It must be known what the model function looks like and what free (i.e., adjustable) parameters it contains; e.g.,
cubic regression produces a hypothesis according to which the outcome y
is modelled by y = ax^3 + bx^2 + cx + d,
with the model parameters a, b, c, and d.
The outcome of the regression would then consist in a set of values for these parameters.
- The outcome of the regression (i.e., the outcome of the learning process)
is often also called regression; to avoid confusion
it is advisable to refer to the outcome of the regression as a model,
a model function, a hypothesis, or a correlation. Unfortunately, all these terms can have other meanings as well.
- On the origin of the term, Russell & Norvig (2021, p. 670) comment that the name regression
for this problem and methodology is "admittedly obscure - a better name would have
been function approximation or numeric prediction. But in 1886 Francis Galton
wrote an influential
article on the concept of regression to the mean (e.g.,
the children of tall parents are likely to be taller than average, but not as tall
as the parents). Galton showed plots with what he called 'regression lines,' and
readers came to associate the word 'regression' with the statistical technique
of function approximation rather than with the topic of regression to the mean."
- In Python, the statsmodels library
can be used for regression analysis.
Regression analysis can refer to a discussion of regression
methodology (e.g., ordinary least squares fits
based on the root mean square deviation)
or to analysing the outcome of a regression,
such as assessing the confidence in the model. Standardized techniques
and concepts for analysing the regression outcome are particularly widespread
for linear regression.
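As an illustration of how a regression produces values for the model parameters, the following sketch computes an ordinary least squares fit for the linear model y = ax + b in plain Python. It does not use statsmodels; the function names and the data are made up for this example, and only the closed-form solution for a single independent variable is covered.

```python
import math


def ols_linear_fit(xs, ys):
    """Ordinary least squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def rmsd(a, b, xs, ys):
    """Root mean square deviation between the model a*x + b and the data."""
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Made-up data, roughly following y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
a, b = ols_linear_fit(xs, ys)
print(f"y = {a:.2f} x + {b:.2f}")   # y = 1.97 x + 1.06
print(f"RMSD = {rmsd(a, b, xs, ys):.3f}")
```

The pair (a, b) is the outcome of the regression, i.e., the selected hypothesis from the linear hypothesis space.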
See also: Dark data, decorrelation time, influence diagram, p value, reproducibility, residual quantity, validation and testing.
Relation (object property, relationship type)
Definition left open for discussion on 17th August.
See also: Concept, relationship, resource.
Relationship
(under construction)
See also: Concept, individual, knowledge graph, relation, triple.
Reproducibility
(Reserved for discussion on 25th August.)
See also: Dark data, decorrelation time, influence diagram, regression analysis.
Residual quantity
(under construction)
See also: Hypothesis, influence diagram, regression analysis.
Resource
- nn ressurs m.
- nb ressurs m.
Definition: In RDF, there are three kinds of resources, namely, concepts, relations, and individuals.
See also: Concept, individual, knowledge base, relation, triple.
Scope
- nn gyldigheitsområde n.
- nb gyldighetsområde n.
Definition: The scope of a name (e.g., for the name of an object reference) is the region within the source code within which that name can be resolved.
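A minimal Python sketch of scope follows; the names outer, inner, message, and resolved are hypothetical. The name message can be resolved inside the function where it is defined and in functions nested within it, but not at module level.

```python
def outer():
    message = "visible inside outer() and in nested functions"

    def inner():
        return message  # resolved in the scope of the enclosing function

    return inner()


print(outer())

try:
    message  # outside the scope of the name 'message'
except NameError as error:
    resolved = False
    print("NameError:", error)
else:
    resolved = True
```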
See also: Global variable, object reference, procedural programming.
Script language (interpreted language)
- nn skriptspråk n. (fortolka språk n.)
- nb skriptspråk n. (fortolket/-a språk n.)
Definition: A script language is a language that is most typically used for writing scripts, i.e., programs that require a run-time environment, such as an interpreter or shell, going through the code and executing it step by step.
- In compiled programming languages, a compiler (and linker) is used to generate an executable binary file from the code. In interpreted programming languages, the code is not translated into binary/executable form, but instead processed by another program, the interpreter, every time that it is run.
- Programming languages cannot be perfectly classified into compiled and interpreted languages. Most script languages, including Python, can be compiled; it is just not how they are most typically used. Java uses a run-time environment, but programs need to be compiled to be ready for that environment. Logic programs (e.g., in Prolog) use reasoners, which share some features with typical interpreters but do not necessarily proceed step by step in a fixed order.
- Typical features of script languages include dynamic typing and a heavier reliance on global variables by many of their users.
See also: Dynamic typing, global variable.
Static array
- nn statisk array n., m. (statisk rekke f.)
- nb statisk array n., m. (statisk rekke m., f.)
Definition: An array is a variable referring to a contiguous region in memory that can hold the content of multiple elementary variables or objects. A static array is just that, without any special additional functionality, as opposed to a dynamic array which provides the additional functionality that it can be resized.
- A two-dimensional static or dynamic array can also be legitimately called a table, nn tabell m., nb tabell m.
See also: Dictionary, dynamic array.
Triple
- nn trippel n.
- nb trippel n.
Definition: An RDF triple consists of a subject, a predicate, and an object, all of which need to be resources.
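As a rough sketch, triples can be represented in Python as tuples of three strings. The "ex:" names below are hypothetical resources, and the helper function objects_of is made up for this illustration; a real application would typically use full IRIs and a dedicated RDF library instead.

```python
# Each triple is a (subject, predicate, object) tuple
triples = {
    ("ex:Ada", "rdf:type", "ex:Person"),
    ("ex:Ada", "ex:knows", "ex:Grace"),
    ("ex:Grace", "rdf:type", "ex:Person"),
}


def objects_of(subject, predicate):
    """All objects o such that (subject, predicate, o) is among the triples."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects_of("ex:Ada", "ex:knows"))  # ['ex:Grace']
print(objects_of("ex:Ada", "rdf:type"))  # ['ex:Person']
```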
See also: Competency question, knowledge graph, relationship, resource.
Validation and testing
In supervised learning, it is often unclear
what hypothesis is the best for modelling
the phenomena underlying a given data set. In that case, it is common practice
to develop multiple candidate models based on different
hypotheses (e.g., a linear, quadratic, and cubic model),
compare them to each other by validation, and finally assess the
accuracy of the selected model by testing.
For this purpose, the overall data set can be split up into three parts:
- The training data are used to parameterize multiple candidate models, i.e.,
to adjust any free parameters in the models (such as a and b in y = ax + b),
optimizing the accuracy of the model; one such method would be an ordinary least squares (OLS) fit,
which minimizes the root mean square deviation
between the actual data for the outcome y (using training data only) and the values obtained from the model.
- The validation data are used to compare the candidate models against
each other and, where appropriate, to the null hypothesis which states that
the outcome y is constant. For example, the strategy
may consist in selecting the model with the smallest
root mean square deviation between the
actual and the predicted outcome y (using validation data only).
- The test data are used to provide an independent accuracy assessment for the final model.
This permits statements on the margin of error, e.g., based on the
root mean square deviation between the test data and the
corresponding predictions for the outcome variable y.
If a normal distribution is assumed for the deviation between actual and predicted data,
using a margin of error given by two times the root mean square deviation (in both directions) leads to
a 95.4% probability for observing a deviation that is smaller than the margin of error.
The split between training and validation data helps to prevent overfitting.
The split between validation and test data prevents a selection bias: Since the validation data are
used to choose the best hypothesis, the performance of the
selected hypothesis will usually tend to be overestimated slightly if it is assessed on the validation data.
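The three-way split can be sketched in plain Python as follows. The synthetic data, the split sizes (18/6/6), the random seed, and all function names are assumptions made for this illustration; only two candidate models are compared, namely the null hypothesis (constant outcome) and a linear model.

```python
import math
import random


def fit_constant(data):
    """Null hypothesis: the outcome y is constant (mean of the training outcomes)."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_linear(data):
    """Ordinary least squares fit of y = a*x + b."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def rmsd(model, data):
    """Root mean square deviation between predicted and actual outcomes."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))


# Synthetic data: y = 2x + 1 plus Gaussian noise (made up for this sketch)
rng = random.Random(0)
data = [(x, 2 * x + 1 + rng.gauss(0, 0.5)) for x in range(30)]
rng.shuffle(data)
training, validation, test = data[:18], data[18:24], data[24:]

# 1. Training: parameterize every candidate model
candidates = {"constant": fit_constant(training), "linear": fit_linear(training)}

# 2. Validation: select the candidate with the smallest RMSD
best_name = min(candidates, key=lambda name: rmsd(candidates[name], validation))

# 3. Testing: independent accuracy assessment of the selected model
test_rmsd = rmsd(candidates[best_name], test)
margin = 2 * test_rmsd
# Probability of a deviation below two standard deviations (normal distribution)
coverage = math.erf(2 / math.sqrt(2))

print(best_name)
print(f"margin of error: +/- {margin:.2f}; coverage: {coverage:.3f}")
```

The value of coverage, about 0.954, corresponds to the 95.4% probability stated above for a margin of error of two times the root mean square deviation.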
See also: Hypothesis, regression analysis.
Referenced literature
- (Conte 2009) R. Conte, "Rational, goal-oriented agents," doi:10.1007/978-1-4614-1800-9_158, in R. A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, New York: Springer (ISBN 978-1-4614-1801-6), 2009.
- (Galton 1886) F. Galton, "Regression towards mediocrity in hereditary stature," Journal of the Anthropological Institute of Great Britain and Ireland 15: 246-263, doi:10.2307/2841583, 1886.
- (Isaac & Summers 2009) A. Isaac, E. Summers (eds.), SKOS Simple Knowledge Organization System Primer, W3C, 2009.
- (Russell & Norvig 2021) S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 4th edn. (global), Harlow: Pearson (ISBN 978-1-29240113-3), 2021.