Evaluate & Compare
To objectively demonstrate the value of artworks and research, it is not enough to say "it seems good" --- you need to show how experiencers felt using numbers and data. This section introduces evaluation methods from three perspectives: what to measure, how to measure it, and how to compare and dig deeper.
What to Measure
First, let's clarify what we are actually measuring.
1. UX (User Experience)
UX refers to everything a user feels when using or experiencing a product, service, or artwork. "It was easy to use," "It was interesting," "I want to experience it again" --- all of these are part of UX.
UX evaluation of interactive products is mainly divided into the following two qualities:
- Pragmatic quality --- Qualities related to achieving a goal: "Is it easy to use?" "Is it understandable?" "Is it efficient?" For a smartphone app, this means "Can I quickly reach the desired function?"
- Hedonic quality --- Qualities related to emotional and affective satisfaction: "Is it interesting?" "Is it novel?" "Is it attractive?" For a smartphone app, this means "Is it exciting to use?"
Why divide into two?
There are products that are "easy to use but boring" and products that are "hard to use but interesting." By evaluating both qualities separately, you can clearly identify what the strengths and areas for improvement of a work are.
2. Impression
When evaluating the impression received from a subject, we assess it through multiple adjectives and onomatopoeia. For example, the impression of a work is quantified on scales of opposing word pairs such as "bright <-> dark" and "soft <-> hard." A single adjective cannot capture the full impression, so by combining multiple adjective pairs, we can objectively grasp multifaceted impressions.
3. Preference
The intuitive feeling that something is good, fitting, or appropriate. Unlike impressions, preference includes a "personal value judgment." Even for the same work, preferences vary greatly between individuals.
4. Emotion
The PAD (Pleasure-Arousal-Dominance) model, whose first axis is also called "valence," is well known for measuring emotions:
| Dimension | Meaning | Example |
|---|---|---|
| Pleasure / Valence | Positive <-> Negative | Happy <-> Sad |
| Arousal | Excited <-> Calm | Thrilling <-> Relaxed |
| Dominance | Dominant <-> Submissive | In control <-> Overwhelmed |
Emotions received from content are often evaluated on two axes: valence and arousal. For example, horror movies are positioned as "unpleasant x high arousal," and healing music as "pleasant x low arousal."
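The two-axis classification above can be sketched as a small function. The thresholds and the example ratings are illustrative assumptions, not measured values:

```python
def emotion_quadrant(valence, arousal):
    """Map a (valence, arousal) pair, each in [-1, 1], to a quadrant
    of the valence x arousal plane described above."""
    v = "pleasant" if valence >= 0 else "unpleasant"
    a = "high arousal" if arousal >= 0 else "low arousal"
    return f"{v} x {a}"

# Illustrative ratings (hypothetical, not measured data)
print(emotion_quadrant(-0.7, 0.8))  # horror movie -> "unpleasant x high arousal"
print(emotion_quadrant(0.6, -0.5))  # healing music -> "pleasant x low arousal"
```

In practice each axis would come from a rating scale (e.g., SAM, introduced below) rescaled to a common range before classification.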
How to Measure
1. SD (Semantic Differential) Method
A method for objectively quantifying and analyzing the impressions people receive from artworks, products, etc.
How it works: Prepare multiple pairs of opposing adjectives (adjective pairs) such as "bright - dark" and "artificial - natural," and have respondents rate each on a 5- or 7-point scale indicating which adjective they lean toward.
Interpreting results: Connecting the mean values of each adjective pair in a line graph reveals the "impression profile" of the subject. By overlaying multiple works or conditions on the same graph, differences in impressions become immediately apparent.
Example
When comparing impressions of a particular acoustic artwork (Work A) and conventional speaker playback (Work B):
- Work A: Leans toward "fantastic," "dynamic," "comfortable"
- Work B: Leans toward "realistic," "static," "unsettling"
-> This profile suggests that Work A provides an immersive, non-everyday experience
References: Impression evaluation of AR installations, Impression evaluation of virtual forest bathing content, Graphing SD method survey results
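The mean-profile computation behind an SD graph takes only a few lines; the adjective pairs, ratings, and work names below are hypothetical placeholders:

```python
from statistics import mean

# Hypothetical 7-point SD ratings (1 = left adjective, 7 = right adjective)
# from four respondents per condition; all values are illustrative.
adjective_pairs = ["dark - bright", "static - dynamic", "unsettling - comfortable"]

work_a = {"dark - bright": [6, 5, 6, 7],
          "static - dynamic": [6, 6, 5, 7],
          "unsettling - comfortable": [5, 6, 6, 6]}
work_b = {"dark - bright": [3, 4, 3, 4],
          "static - dynamic": [2, 3, 3, 2],
          "unsettling - comfortable": [3, 2, 3, 3]}

# The impression profile is simply the per-pair mean; plotting these means
# as connected lines, one line per work, gives the profile graph described above.
for pair in adjective_pairs:
    print(f"{pair:26s}  A: {mean(work_a[pair]):.2f}  B: {mean(work_b[pair]):.2f}")
```

Overlaying the two lines on one chart makes the gap on each adjective pair visible at a glance.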
2. Semi-Structured Interview
A research method where questions are prepared in advance, but additional questions are flexibly added or modified based on the respondent's answers to dig deeper.
Difference from surveys: Surveys limit responses to "yes/no" or "5-point scales," making unexpected discoveries difficult. In interviews, you can dig deeper with questions like "Why did you feel that way?" or "At which specific moment?" allowing you to grasp the quality of experiences that numbers alone cannot reveal.
What "semi-structured" means: Completely free conversation (unstructured) tends to go off-topic, while strictly fixed questions (structured) prevent digging deeper. Semi-structured interviews are the middle ground.
3. Focus Group Interview (FGI)
A research method where one moderator conducts a roundtable-style interview with a group of approximately 4-8 people.
Advantages: Participants stimulate each other's responses, generating more diverse opinions than one-on-one interviews. Discussions come alive with reactions like "Oh, I thought so too!" or "Actually, it was the opposite for me..."
Caution: There is a risk that a dominant participant may pull everyone toward the same opinion.
4. Specific Measurement Methods (Scales & Questionnaires)
In the research world, many measurement tools (scales) with established reliability and validity already exist. Choose the appropriate one based on your purpose.
| Scale | What It Measures | Features |
|---|---|---|
| AttrakDiff | Pragmatic and hedonic quality of products | A classic UX scale evaluating both usability and attractiveness |
| UEQ / UEQ-S | Overall UX | The 8-item short version (UEQ-S) is quick and easy to administer |
| SAM | Emotion (valence, arousal) | Responses are given using illustrations (manikins) rather than words, making it language-independent |
| POMS2 | Mood states | Measures tension, depression, anger, fatigue, and confusion, plus positive dimensions such as vigor |
| Temporal Dominance of Emotions (TDE) | Temporal changes in emotion | Captures which emotion is dominant at each moment and how this shifts over the course of an experience |
| PRS | Perceived restorativeness of places | Evaluates how much psychological rest a space provides across 4 factors |
| PANAS | Positive and negative affect | A scale that evaluates positive and negative affect as two independent axes |
| ME (Magnitude Estimation) method | Sensory intensity | Respondents indicate "how many times stronger" a sensation is compared to a reference stimulus, directly quantifying sensory intensity |
Methods for Comparing & Digging Deeper
To more deeply understand evaluation results, we use the following comparison and analysis methods.
Comparison Methods
- Vary elements within the artwork --- For example, prepare a version with sound and a version without sound (a "dummy" version of the artwork) to verify the effect of sound
- Vary the medium --- Investigate how differences in presentation methods affect impressions: monitor vs. projector, headphones vs. speakers, etc.
- Compare with existing works --- By evaluating your work alongside similar existing works or products, you can highlight the distinctive features of your work
Statistical Analysis Methods
- ANOVA (Analysis of Variance) --- A method for testing whether there are statistically significant differences in the means of 3 or more groups. For example, you can test whether there are differences in impression ratings among Works A, B, and C. For comparing 2 groups, use a t-test
- Factor Analysis --- A method for organizing the many adjective-pair ratings obtained from the SD method and identifying a small number of underlying factors (categories). For example, if "bright," "vivid," and "flashy" cluster together, you might name it the "activity" factor
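A one-way ANOVA F statistic can be computed by hand to see what the test actually compares: variance between group means versus variance within groups. The three rating lists below are invented for illustration; in real analyses a library such as `scipy.stats.f_oneway` would also return the p-value:

```python
from statistics import mean

def one_way_anova_f(groups):
    """Compute the F statistic for a one-way ANOVA.

    F = (between-group mean square) / (within-group mean square).
    A large F suggests the group means differ more than within-group
    noise alone would explain; compare against an F distribution
    with (k - 1, n - k) degrees of freedom for a p-value.
    """
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total observations
    grand = mean(x for g in groups for x in g)

    # Between: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within: spread of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical 7-point impression ratings for Works A, B, C (illustrative)
ratings = [[6, 7, 6, 5, 7], [4, 5, 4, 3, 4], [5, 5, 6, 4, 5]]
print(f"F = {one_way_anova_f(ratings):.2f}")  # F ≈ 10.71 for this data
```

With only two groups the same question reduces to a t-test, as noted above (and F equals t squared in that case).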
Data Exploration Methods
- Conjoint Analysis --- A method for identifying which elements (color, material, shape, etc.) among the multiple components of a product or service, and to what degree, influence preference. It yields insights like "color has the greatest influence on preference"
- Text Mining --- A method for statistically analyzing frequently occurring words and word co-occurrences from survey free-text responses or interview transcripts. Effective for identifying trends from large volumes of text data
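A minimal conjoint-style sketch: dummy-code the attribute levels and fit ordinary least squares, whose coefficients are the part-worth utilities of each level. The profiles, attribute names, and ratings below are invented for illustration:

```python
import numpy as np

# Hypothetical profiles: (color, material) and a mean preference rating
profiles = [
    ("red",  "wood",  6.0),
    ("red",  "metal", 4.5),
    ("blue", "wood",  5.0),
    ("blue", "metal", 3.5),
]

# Dummy-code each attribute (1 if the level is present, else 0) plus an intercept
X = np.array([[1.0, c == "red", m == "wood"] for c, m, _ in profiles], dtype=float)
y = np.array([r for _, _, r in profiles])

# Ordinary least squares: coefficients are part-worth utilities
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, u_red, u_wood = coef
print(f"part-worth red:  {u_red:.2f}")   # relative to the blue baseline
print(f"part-worth wood: {u_wood:.2f}")  # relative to the metal baseline
```

Comparing the utility ranges per attribute yields the "which element matters most" insight: here the material coefficient is larger than the color one, so material drives preference more in this toy data.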
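The simplest form of text mining, a word-frequency count, needs only the standard library. The free-text responses and stop-word list below are illustrative:

```python
from collections import Counter
import re

# Hypothetical free-text survey responses (illustrative)
responses = [
    "The sound felt immersive and the space was immersive too",
    "Immersive visuals, but the sound was too loud",
    "Loved the sound design and the immersive atmosphere",
]

# Tokenize to lowercase words and drop common function words
stopwords = {"the", "and", "was", "but", "too", "a"}
words = [w for r in responses for w in re.findall(r"[a-z]+", r.lower())
         if w not in stopwords]

freq = Counter(words)
print(freq.most_common(3))
```

Real analyses add morphological analysis (essential for Japanese text), co-occurrence counts, and visualization, but the core loop of tokenize, filter, count is the same.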