
LAEP Glossary

Cloud created by:

Rebecca Ferguson
13 March 2016

The LAEP Glossary has been developed by the LAEP project to identify and define key terms related to learning analytics.



academic analytics

The process of evaluating and analysing organisational data from the systems of educational institutions for reporting and decision-making reasons. If a distinction is drawn with learning analytics, academic analytics are typically focused at the level of the institution or above, whereas learning analytics are typically focused at the level of the individual learner or course.


adaptive

Of a learning activity or environment: the system adapts to the characteristics or behaviours of the individual learner.


affect

Emotions or moods.

affect detection

Of a computer system, the ability to detect the emotions or moods of the learner.

affective computing

Computing that takes into account the emotional state or mood of the user.


algorithm

A process or set of rules to be followed in problem-solving operations, especially by a computer.


analytics

Processing of data to produce meaningful patterns and inferences, or individual metrics that convey information about a large dataset.


API

Application Programming Interface: the means by which software components exchange data or direct processing.

association rule

In data mining, a strong association discovered between items using methods that look for patterns where items co-occur (are associated). Association rules are distinct from sequence rules, which are the result of sequence mining: looking for patterns where one item happens after another (in sequence).
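
As an illustration, the strength of a candidate rule is often summarised by its support (how often the items co-occur) and its confidence (how often the consequent appears when the antecedent does). The sketch below assumes each learner session is recorded as a set of activity names; the function name and the data are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Sessions that contain every antecedent item
    n_ante = sum(1 for t in transactions if antecedent <= set(t))
    # Sessions that contain every antecedent and every consequent item
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

# Hypothetical sessions: which activities each learner used
sessions = [
    {"video", "quiz"},
    {"video", "quiz", "forum"},
    {"video"},
    {"forum"},
]
support, confidence = rule_metrics(sessions, {"video"}, {"quiz"})
```

Here the rule "video implies quiz" holds in two of the four sessions, and in two of the three sessions that include a video.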

Bayesian knowledge tracing

A particular way of inferring the cognitive model of the learner based on whether their answers are correct or incorrect. Typically used in cognitive tutors.
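
A minimal sketch of the update step commonly used in Bayesian knowledge tracing is given below. It assumes the usual four-parameter formulation (prior knowledge, guess, slip and learn probabilities); the parameter values are placeholders chosen for illustration, not values from any real tutor.

```python
def bkt_update(p_know, correct, p_guess=0.2, p_slip=0.1, p_learn=0.15):
    """Update the probability that a learner knows a skill after one answer."""
    if correct:
        # A correct answer comes from knowing (and not slipping) or from guessing
        num = p_know * (1 - p_slip)
        den = num + (1 - p_know) * p_guess
    else:
        # An incorrect answer comes from slipping or from not knowing (and not guessing)
        num = p_know * p_slip
        den = num + (1 - p_know) * (1 - p_guess)
    posterior = num / den
    # Allow for learning that may happen during the practice opportunity
    return posterior + (1 - posterior) * p_learn

after_correct = bkt_update(0.5, correct=True)
after_incorrect = bkt_update(0.5, correct=False)
```

A correct answer raises the estimate above the starting value of 0.5 and an incorrect answer lowers it.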

Bayesian network

A probabilistic model of the relationships between variables, typically ‘learned’ from a large dataset.

big data

A loose term for situations where the amount of data to be processed is so large that traditional approaches do not work, or for using data processing approaches that were originally developed to deal with very large datasets, such as data mining.

causal discovery

In data mining/machine learning, algorithms and techniques that seek to discover causal relationships between variables, as opposed to mere associations (for example wet pavements and open umbrellas are associated, but one does not cause the other – they share a common cause, rain).


classification

In machine learning, algorithms and techniques for determining which category an observation belongs in, based on categories developed from a ‘training set’ of data. An example would be whether a student’s learning activity is ‘on track’ or ‘in trouble’ based on a comparison with data from students from a previous instance of the same course.

cluster analysis & clustering

In data mining/machine learning, algorithms and techniques for grouping data so that each group (cluster) contains items that are more similar to each other than they are to items in the other groups.
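
One widely used clustering algorithm is k-means, shown here only as an example of the general idea. The sketch works on one-dimensional values (for instance, test scores); the starting centres and the data are assumptions made for illustration.

```python
def kmeans_1d(values, centres, iterations=10):
    """Group numbers around the nearest centre, then move each centre to its group mean."""
    centres = list(centres)
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for v in values:
            # Assign each value to the closest current centre
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Recompute each centre as the mean of its group (keep it if the group is empty)
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups

scores = [1, 2, 3, 10, 11, 12]
centres, groups = kmeans_1d(scores, centres=[0, 5])
```

The two resulting groups separate the low scores from the high ones, with centres at the mean of each group.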

cognitive modelling

The process of developing models of the cognitive processes in learners, typically for the purposes of a cognitive tutor.

cognitive tutor

A type of intelligent tutoring system in which feedback is provided to the learner based on cognitive models of the learner (typically inferred from their responses to the system) and of the knowledge domain to be learned. As a trademark, systems of this type produced commercially by Carnegie Learning.

computational linguistics

An established interdisciplinary field of study concerned with using computers to analyse human languages (natural language).

data mining

Algorithms and techniques for discovering patterns and regularities in large datasets. Often associated with big data.

data protection

The laws and rules concerning the processing of personal data, and the associated processes and procedures for ensuring that processing complies with these laws and rules. Within the EU, there is clear and relatively strict legislation aimed at ensuring privacy and fairness in the processing of personal data. Similar legislation exists in all other OECD countries, with the exception of the United States, where the law is substantially more permissive, except for personal data about children.

design research

Research into the processes of design or, more recently, research that is part of a process of design, such as design-based research that designs, tests and improves learning activities in an iterative cycle.

discourse analytics

Within learning analytics, the collective term for a wide variety of approaches to the analysis of series of communicative events, typically those that involve speech or written communication.

dynamic Bayesian networks

A Bayesian network concerned with how variables change over time: a probabilistic model of the relationships between variables at one point in time and at another.

educational data mining or EDM

An emerging discipline concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students and the settings in which they learn. In contrast with learning analytics, it is typically concerned with finer-grained detail about individual learner behaviours, and is closer to computer science as a discipline.


engagement

A broad term with a range of meanings. At one end of the range, it can mean a substantial affective investment of a learner in the process of learning (as in a deep orientation to learning). At the other, it can mean use of learner activity data to infer how long learners spent on particular activities, typically based on timestamps in log files of web page access.

evidence-centred design

A method for the design and evaluation of educational systems that has a particular focus on higher-level knowledge.

eye tracking

Determining where someone is looking (where their eyes are focused) and, typically, using this to inform design or research. Originally, this required highly specialised equipment and was used to track people’s gaze on computer screens, but advances mean that gaze can be detected in many contexts using two carefully positioned cameras.

feature engineering & feature selection

In machine learning, the often-challenging process of identifying or developing features (data that could be useful for prediction or classification) for the algorithms to work on. 

hierarchical clustering

A particular sort of cluster analysis that aims to group (cluster) data into groups (clusters) that form some sort of hierarchy.

intelligent tutor & intelligent tutoring system

Software that gives immediate, adaptive and individual responses to learners, such as instruction and feedback, generally without requiring input from a human tutor.

item response theory

The study of how learners’ responses to individual questions (items) in tests relate to their underlying abilities, typically using probabilistic approaches.
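
As an example of the probabilistic approach, the one-parameter logistic (Rasch) model is one of the simplest item response models: the chance of a correct answer depends only on the gap between the learner's ability and the item's difficulty. The sketch below assumes both are expressed on the same logit scale; the function name is hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A learner whose ability matches the item difficulty has an even chance
p_matched = rasch_probability(ability=0.0, difficulty=0.0)
# A learner one logit above the item difficulty is more likely to succeed
p_stronger = rasch_probability(ability=1.0, difficulty=0.0)
```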

knowledge tracing

Algorithms and techniques for inferring the cognitive model of the learner, typically used in cognitive tutors.

learning analytics

The measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs. In the context of this report, the term is used more broadly to cover both academic analytics and educational data mining.

learning curves

Graphs showing the amount of learning (for example, as represented by test scores) over time or over repeated attempts at a task. Analysis of such curves, typically for many learners, can inform the design of a learning task, for instance by identifying tasks that are very easy or very difficult.

log files

Computer files that contain lists of past events. For instance, in a learning environment, a log file might contain an entry for each time the learner clicked on an item, showing which item was clicked and when. Analysis of log files can be useful for tracking learner behaviour and for improving learning environments.
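
The sketch below shows one simple kind of log-file analysis: counting how often each item was clicked. The log format (timestamp, learner, event, item, separated by spaces) and the entries themselves are hypothetical; real learning environments each use their own formats.

```python
from collections import Counter

# Hypothetical log format: "<ISO timestamp> <learner id> <event> <item id>"
log_lines = [
    "2016-03-13T09:00:05 learner42 click video-1",
    "2016-03-13T09:02:40 learner42 click quiz-1",
    "2016-03-13T09:03:10 learner07 click video-1",
    "2016-03-13T09:05:00 learner07 click video-1",
]

def clicks_per_item(lines):
    """Count click events for each item in the log."""
    counts = Counter()
    for line in lines:
        timestamp, learner, event, item = line.split()
        if event == "click":
            counts[item] += 1
    return counts

item_counts = clicks_per_item(log_lines)
```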

logistic regression

In machine learning, a particular algorithm used for classification when the data is to be classified into discrete categories, such as ‘pass’ or ‘fail’.
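
A rough sketch of the idea follows: a single feature (hours of study) is used to estimate the probability of passing, and the model is fitted by plain gradient descent. The data, learning rate and number of passes are assumptions for illustration; in practice an established statistics or machine learning library would normally be used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and bias b so that sigmoid(w * x + b) estimates P(y = 1)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the average log loss with respect to w and b
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical training set: hours studied and whether the student passed (1) or failed (0)
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_pass_1h = sigmoid(w * 1 + b)
p_pass_8h = sigmoid(w * 8 + b)
```

A student is then classified as ‘pass’ when the estimated probability is above a chosen threshold, commonly 0.5.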

machine learning

The use of computer algorithms to detect patterns in data, for example through cluster analysis or predictive modelling.

massive open online course

An online course open for anyone to study without pre-requisites or charge, intended for a larger number of learners than a traditional course. 

matrix factorisation

(also known as matrix decomposition) Algorithms that take a matrix and determine two factors (i.e. two new matrices) that, when multiplied together, give or closely approximate the original matrix. Often used to develop systems that can recommend particular resources to a learner based on other learners’ behaviours or outcomes.
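
To make the idea concrete, the sketch below finds two rank-1 factors (a column vector u and a row vector v) whose product reproduces a small matrix, using simple gradient steps. The matrix, learning rate and step count are assumptions for illustration; recommender systems use larger ranks and only the observed entries.

```python
def factorise_rank1(matrix, lr=0.01, steps=5000):
    """Find vectors u and v such that u[i] * v[j] approximates matrix[i][j]."""
    u = [1.0] * len(matrix)
    v = [1.0] * len(matrix[0])
    for _ in range(steps):
        for i, row in enumerate(matrix):
            for j, target in enumerate(row):
                error = target - u[i] * v[j]
                u_i = u[i]  # keep the old value so both updates use the same error term
                u[i] += lr * error * v[j]
                v[j] += lr * error * u_i
    return u, v

# Hypothetical learner-by-resource ratings that happen to be exactly rank 1
ratings = [[2.0, 4.0],
           [3.0, 6.0]]
u, v = factorise_rank1(ratings)
rebuilt = [[u[i] * v[j] for j in range(2)] for i in range(2)]
```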

natural language processing

Within computational linguistics, algorithms and techniques for relating human languages (natural language) to computer language. Used to enable computer systems to communicate using human language (whether written or spoken), as opposed to computer languages or input methods such as keyboards, mice, controllers and buttons.


predictive

Of analytics, finding patterns in data and using those to make predictions about future events, such as whether a student will pass or fail a course.

predictive modelling

Finding patterns in data and using those to make guesses (predictions) about other data, such as whether a student will pass or fail a course.


privacy

Keeping personal data so that it is not observed by others, or by unauthorised people. An important part of data protection.

process mining

In educational data mining, looking for patterns in log files that relate to learning processes, and using the models developed for purposes such as uncovering those learning processes or improving the system.


psychometrics

An established field of study concerned with the measurement of psychological variables. In this context, typically used for the construction and validation of questionnaires and tests.


regression

A broad set of statistical tools and algorithms for modelling and analysing the relationships between variables.


retention

In universities, keeping students who have enrolled on a course until they complete that course (reducing drop-out). The retention rate is the fraction of students who started a course who complete it, as distinct from the pass rate, which is the fraction of students who passed the course’s assessment. Can apply to an individual module, semester or course, or to an entire degree programme.
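
The difference between the two rates can be shown with a small calculation. The sketch assumes that both rates use the number of students who started as the denominator, which institutions do not always do; the figures are invented.

```python
def retention_and_pass_rate(started, completed, passed):
    """Return (retention rate, pass rate), both as fractions of students who started.

    Assumption: the pass rate is measured against starters, not completers.
    """
    return completed / started, passed / started

# Hypothetical cohort: 200 started, 150 completed, 120 passed the assessment
retention_rate, pass_rate = retention_and_pass_rate(started=200, completed=150, passed=120)
```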

sequence mining

In data mining, looking for patterns where items happen in sequence (one after another), as distinct from patterns where they co-occur (are associated, as in association rules).
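
A very simple form of this is counting how often one activity is immediately followed by another. The sketch below does that for ordered activity lists; the sessions are hypothetical, and full sequence mining methods also handle gaps and longer patterns.

```python
from collections import Counter

def count_followed_by(sequences):
    """Count each (earlier, later) pair of consecutive items across all sequences."""
    counts = Counter()
    for seq in sequences:
        for earlier, later in zip(seq, seq[1:]):
            counts[(earlier, later)] += 1
    return counts

# Hypothetical ordered activity sequences, one per learner session
sessions = [
    ["video", "quiz", "forum"],
    ["video", "quiz"],
    ["quiz", "video"],
]
pair_counts = count_followed_by(sessions)
most_common_pair = pair_counts.most_common(1)[0]
```

Unlike an association rule, the order matters here: "video then quiz" and "quiz then video" are counted separately.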

social learning analytics

Within learning analytics, analytics that focus on how learners build knowledge together in their cultural and social settings.

social network analysis

Algorithms and techniques for analysing the relationships between individuals (social relationships) based on network and graph theory. The underlying model is one of ‘nodes’ (individuals or things) and ‘edges’ (relationships or interactions between them).
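
The node-and-edge model can be sketched as follows, using degree centrality (the share of other nodes each node is directly connected to) as one basic measure. The names and the forum-reply edges are invented for illustration.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return, for each node, the fraction of other nodes it is directly linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        # With fewer than two nodes there is nobody else to connect to
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

# Hypothetical edges: who replied to whom in a course forum (treated as undirected)
replies = [("ana", "ben"), ("ana", "caz"), ("ana", "dev"), ("ben", "caz")]
centrality = degree_centrality(replies)
```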


SoLAR

Society for Learning Analytics Research: an inter-disciplinary network of leading international researchers exploring the role and impact of analytics on teaching, learning, training and development.

student model
Related terms: learner modelling, user modelling

Student models represent information about a student’s characteristics or state, such as the current knowledge, motivation, meta-cognition, and attitudes.

text mining

Algorithms and techniques for finding useful patterns in text, often using natural language processing.


validity

Whether a particular method does what it is supposed to do, or measures accurately what it is intended to measure. Validity is distinct from reliability, which is whether a particular method gives the same result when given input that is essentially the same.

visual analytics

Processing of data to produce meaningful visual patterns, or individual visualisations that convey information about a large dataset.


visualisation

A graphical or visual display of information, intended to help the viewer to understand a set of data.
