Redundant Encoding Strengthens Segmentation and Grouping in Visual Displays of Data

王茂南


2021-08-07 07:04:58


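Abstract: This post covers redundant encoding. Three experiments show that redundant encoding benefits data displays: it raises recognition accuracy and produces a stronger grouping effect.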


Introduction

Data visualization helps us understand data better. One common recommendation says: "When presenting multiple sets of values to a viewer, those sets should be distinguished not just by a single feature, such as colour, but redundantly by multiple features, such as colour and shape." The paper designs experiments to test this recommendation.

The experiments yield two findings:

This practice can indeed yield a large benefit for rapidly segmenting objects within a dense display (Experiments 1 and 2) --> objects are told apart quickly.
This practice can strengthen the visual grouping of display elements (Experiment 3) --> grouping in the display is strengthened.

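The figure below gives an example of redundant encoding. To show two classes of data, there are three options:

By colour: one class is red, the other is blue;
By shape: one class is a circle, the other is a square;
By colour and shape together (redundant features): one class is red + circle, the other is blue + square.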
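
The three options can be written down as a small mapping from class to visual features. This is only an illustrative sketch: the names `ENCODINGS` and `differing_features`, the class labels "A" and "B", and the feature that is held constant in the two single-feature options are all assumptions made here, not details from the paper.

```python
# Illustrative sketch: class labels "A"/"B" and all names are hypothetical.
ENCODINGS = {
    # colour only: shape is held constant (assumed to be a circle here)
    "colour": {"A": ("red", "circle"), "B": ("blue", "circle")},
    # shape only: colour is held constant (assumed to be red here)
    "shape": {"A": ("red", "circle"), "B": ("red", "square")},
    # redundant: both colour and shape separate the two classes
    "redundant": {"A": ("red", "circle"), "B": ("blue", "square")},
}

def differing_features(encoding):
    """Return the names of the features that distinguish class A from class B."""
    (colour_a, shape_a), (colour_b, shape_b) = encoding["A"], encoding["B"]
    features = []
    if colour_a != colour_b:
        features.append("colour")
    if shape_a != shape_b:
        features.append("shape")
    return features

for name, encoding in ENCODINGS.items():
    print(name, differing_features(encoding))
```

Only the redundant option separates the two classes on both features at once, which is the property the experiments test.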


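References

Redundant Encoding: this page gives a quick tour of the experiments in the paper (it includes videos of the experimental procedure, which are well worth watching). If you do not need the experimental details, this page is enough. The corresponding paper is: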

Nothelfer, Christine, Michael Gleicher, and Steven Franconeri. "Redundant encoding strengthens segmentation and grouping in visual displays of data." Journal of Experimental Psychology: Human Perception and Performance 43, no. 9 (2017): 1667.

Experiment 1

Overview

Data visualizations often require the observer to judge the shape of the distribution of a collection, where are the outliers, clumps, and regions of greater or lower concentration.

The task required the participant to select a collection of objects holistically in order to judge the shaped envelope of the collection, by reporting the quadrant of the display that was missing elements of a given colour or shape.

Because we were interested in whether redundant encoding improves performance for "in-a-glance" decisions—as opposed to slow and serial inspection over the course of several seconds—we use a brief presentation time (around 90 ms on average).

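Method

Each generated display contains 99 objects arranged in a 9 x 11 grid. Each object is 1.0 to 1.5 degrees of visual angle in diameter. On every trial, the x and y coordinates of each object are jittered by 0.6 degrees of visual angle. Eleven targets form a partial ring embedded among 88 distractor objects. The ring is always missing 5 adjacent target elements, confined to one quadrant of the screen, and those positions are filled with objects picked at random (without replacement) from the pool of distractors.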
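
The grid layout can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' stimulus code: the grid spacing `SPACING_DEG` is not given in the text, the jitter is assumed to be uniform within plus or minus 0.6 degrees on each axis, and the function name `jittered_grid` is made up here. The placement of the target ring is left out because the text does not specify its geometry.

```python
import random

ROWS, COLS = 9, 11                  # 9 x 11 grid from the text
N_TARGETS, N_DISTRACTORS = 11, 88   # counts from the text
JITTER_DEG = 0.6                    # jitter in degrees of visual angle
SPACING_DEG = 2.0                   # ASSUMPTION: spacing is not given in the text

def jittered_grid(seed=None):
    """Return one (x, y) position per grid cell, in degrees of visual angle.

    ASSUMPTION: jitter is drawn uniformly from [-JITTER_DEG, +JITTER_DEG]
    independently for x and y.
    """
    rng = random.Random(seed)
    positions = []
    for row in range(ROWS):
        for col in range(COLS):
            x = col * SPACING_DEG + rng.uniform(-JITTER_DEG, JITTER_DEG)
            y = row * SPACING_DEG + rng.uniform(-JITTER_DEG, JITTER_DEG)
            positions.append((x, y))
    return positions

positions = jittered_grid(seed=0)
print(len(positions))  # one position per grid cell
```

The counts are consistent: 11 targets plus 88 distractors fill all 99 cells of the grid.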

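The trial procedure has three steps:

The screen shows the features of the target and the distractors;
A display then appears, and the participant inspects it to judge which quadrant contains no target elements;
The display is then masked, and the participant selects a quadrant.

Besides accuracy (whether the participant correctly reports the quadrant with no target elements), the experiment also records how long the participant views the display (the time spent in step 2 before reaching a decision), so two measures are compared.

For the example above, as the figure below shows, the third quadrant is the one with no targets.

The paper uses three different sets of redundant visual features, which gives three sub-experiments. Their results are presented in the Results section:

Experiment 1a: blue or asterisk;
Experiment 1b: blue or circle;
Experiment 1c: red or triangle.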

Results

The final results are shown below. Accuracy with redundant encoding is higher than with either single feature, so participants appear to use information from both features when judging, rather than from only one (because the actual accuracy in the redundant condition is significantly larger than either of these estimates based on separate processing of the two features, the result suggests that shape and colour are integrated).
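
The paper's own estimates are not reproduced in this post. As a rough illustration of what "separate processing" could predict, the sketch below uses a generic probability-summation baseline with a correction for guessing. The choice of this model, the function name `independent_prediction`, and the example accuracies are assumptions made here, not values or methods taken from the paper. The chance level of 0.25 follows from the task having four quadrants to choose from.

```python
CHANCE = 0.25  # four quadrants, so a pure guess is correct 1 time in 4

def independent_prediction(p_colour, p_shape, chance=CHANCE):
    """Predicted accuracy if colour and shape were processed separately.

    ASSUMPTION: a generic probability-summation baseline, not the paper's
    own estimate. Inputs are observed accuracies, expected to lie in
    [chance, 1].
    """
    # Remove the contribution of lucky guesses from each observed accuracy.
    d_colour = (p_colour - chance) / (1 - chance)
    d_shape = (p_shape - chance) / (1 - chance)
    # The quadrant is found if at least one of the two channels finds it.
    d_either = 1 - (1 - d_colour) * (1 - d_shape)
    # Add guessing back in for trials where neither channel succeeds.
    return chance + (1 - chance) * d_either

# Hypothetical single-feature accuracies, chosen only for illustration.
print(independent_prediction(0.55, 0.55))
```

Under this baseline, observed redundant-condition accuracy above the predicted value would point towards the two features being combined rather than processed separately.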

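Experiment 2

Overview

In Experiment 1, participants were told the shapes of both the target and the distractors, so they could know in advance whether a trial used redundant encoding. Experiment 2 therefore makes two changes:

In the preview, participants are told only about the target [The preview specified the color and shape of the target, omitting descriptions of the distractor objects for the upcoming trial so that participants would not know if an upcoming display would contain a redundant encoding of the target];
The target is described in text rather than by an image, for example "blue circle" [the target was described not by an image, but by printed text that identified a single shape and color (e.g., 'blue circle')].

As the figure below shows, the target is now described in text and the distractors are not described at all:

Method

The overall method is the same as in Experiment 1. There are two sub-experiments: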

Experiment 2a (target object: blue/circle)
Experiment 2b (target object: red/triangle)

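In both, the target's features are described in text during the preview.

Results

The results of the two sub-experiments are shown in the figure below. Accuracy with redundant encoding is higher than with colour alone or shape alone. The authors also ran an analysis of variance, which showed a significant difference in accuracy between the conditions:

Note that in this experiment participants had the same prior knowledge in every condition (the target was always described verbally), so the strategies available to them were also the same.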

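Experiment 3

Overview

The first two experiments show that participants perceive global shape better under redundant encoding. Experiment 3 tests whether this benefit generalizes to other tasks.

Experiment 3 uses a repetition discrimination task. A row of letters contains one repeated letter. For example, if the letters in the figure below appear in the order H A H A A H A, then A is repeated and the participant chooses A.

Objects are styled in pairs. In the figure above, the first and second letters share a style, the third and fourth letters share a style, and so on. [Pairs of adjacent objects were identical in the variable feature (e.g., luminance in luminance trials), which alternated between object pairs (e.g., two light green squares, followed by two dark green squares, followed by two light green squares).]

Note that the repeated letter can fall inside a same-style pair (Within Group) or across two differently styled pairs (Between Groups).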
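
The within/between distinction can be made concrete with a short sketch. It assumes that style pairs start at the first position, so the pairs are positions (0, 1), (2, 3), (4, 5), and so on, and that the row contains exactly one adjacent repetition. The function name `classify_repetition` is made up for this illustration.

```python
def classify_repetition(letters):
    """Return (repeated_letter, "within" or "between") for a row of letters.

    ASSUMPTIONS: exactly one adjacent repetition exists, and style pairs
    start at position 0: (0, 1), (2, 3), (4, 5), ...
    """
    for i in range(len(letters) - 1):
        if letters[i] == letters[i + 1]:
            # An even start index means both letters sit in the same style pair.
            kind = "within" if i % 2 == 0 else "between"
            return letters[i], kind
    raise ValueError("no adjacent repetition found")

print(classify_repetition("HAHAAHA"))  # ('A', 'between')
print(classify_repetition("HAHHAHA"))  # ('H', 'within')
```

Under that pairing, the example row H A H A A H A repeats A at the fourth and fifth positions, which belong to different style pairs, so it counts as a between-groups repetition.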

Method

Participants were asked to report quickly which letter was repeated [After participants viewed a fixation screen for 500 ms, the test display was presented until response. After 24 unrecorded practice trials, test trials were done in eight blocks of 36 trials, lasting 10-15 min].

Three display types appear (as in the figure above), with object pairs differing in luminance, in shape, or in both luminance and shape [Each display contained objects pairs which differed by luminance, shape, or both luminance and shape (redundant)].
Participants report whether the letter A or the letter H is repeated [Participants indicated which letter, A or H, repeated in the display, unpredictably appearing either within or between object groupings].
The researchers expected performance to be worst in the between-groups case [Performance was expected to be worse for between-groups trials. If redundant grouping cues can be combined, participants should be slowest on between-groups redundant trials (sixth row)].

Results

The final results (response times) are shown in the figure:

The graph shows the response time for each of three similarity grouping cues (luminance, shape, redundant), depending on whether the letter repetition occurred within or between object pairs.
Note that the difference between the last two bars (redundant condition) is larger than the difference between either of the first two sets of bars. --> My reading is that redundant encoding makes people bind the objects together more readily, so in the last condition the response time for a repetition between groups is much longer than for a repetition within a group.
Error bars represent within-subject standard errors of the mean.
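
The comparison described in these notes is a difference of differences: for each cue, the between-groups response time minus the within-group response time. The sketch below shows the arithmetic only. The function name `grouping_effects` is made up here, and the numbers are placeholders invented for illustration; they are not the paper's data.

```python
def grouping_effects(mean_rt_ms):
    """Map each cue to (between RT - within RT) in milliseconds.

    A larger value means a repetition across groups is slowed more,
    which is read as stronger grouping.
    """
    return {cue: rt["between"] - rt["within"] for cue, rt in mean_rt_ms.items()}

# PLACEHOLDER values in milliseconds, invented for illustration only.
# They are NOT measurements from the paper.
placeholder_rt_ms = {
    "luminance": {"within": 1000.0, "between": 1050.0},
    "shape": {"within": 1000.0, "between": 1040.0},
    "redundant": {"within": 1000.0, "between": 1100.0},
}

effects = grouping_effects(placeholder_rt_ms)
print(effects)
print(max(effects, key=effects.get))  # cue with the largest grouping effect
```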

Finally, the authors again ran an ANOVA to confirm that the conditions differ.

Summary

First, redundant encoding can benefit data displays [The present data provide the first empirical demonstration that the redundant encoding of objects can be beneficial to viewers of visual data displays].

Experiments 1 and 2 are summarized next:

What the two experiments did: Experiments 1 and 2 presented participants with a brief display designed to mimic a dense data visualization, and asked them to report the display quadrant that was missing objects of a specified colour and shape.
In both experiments, redundant encoding raised recognition accuracy: Performance was substantially better when a collection was redundantly specified by both colour and shape, regardless of whether participants knew the collection's encoding type before each trial.
Redundant encoding was also faster than single-feature encoding in response time: This advantage was echoed in response times, and was present even when comparing redundant encoding with each participant's best single dimension.

Finally, Experiment 3:

Experiment 3 shows that redundant encoding produces a stronger effect in a measure of visual grouping: Experiment 3 replicated a similar benefit by showing that redundant encoding of visual groups created stronger effects within a measure of visual grouping.
With redundant encoding, an object looks more different from the objects around it: These results apply directly to the redundant encoding design technique used in data visualization, but also more broadly to how we attend to objects in our daily environment; it is much more typical that an object will differ from its surrounding objects in multiple feature dimensions than a single feature dimension.