Stage 2: Pruning Promiscuous Interpolations
Inspiration:
Earlier work (see Stage 1: Promiscuous Interpolation) has suggested that contour interpolation proceeds in two stages.
1) In an early, promiscuous interpolation stage, interpolations are produced indiscriminately across all edge pairs that meet the relatability criteria.
2) In a second stage, those interpolations are examined with respect to other information in the scene in order to determine which interpolations are good ones (and should therefore ultimately appear in our perception) and which interpolations should be deleted or weakened in perceptual strength.
What kinds of information are taken into account in Stage 2?
Cues as to surface color and luminance
This is the question that this set of experiments seeks to answer. Previous work suggests several candidate cues worth testing. Our prior research shows that differences in surface color between to-be-connected edge fragments and the gaps that separate them can drastically reduce the subjective perception of interpolated contours in the absence of an occluder (Carrigan, Erlikhman, & Kellman, in prep; see Stage 1: Promiscuous Interpolation). Thus, it seems likely that processing in Stage 2 takes into account discontinuities in surface characteristics across pairs of edge fragments and the gaps that separate them. Two kinds of surface discontinuity that are likely to play a role in the second stage are suggested by the work of Su, He, and Ooi (2010a, 2010b). These authors demonstrated that differences between fragments in luminance contrast polarity relative to the background, and in equiluminant color contrast (see figure below), can affect the perception of interpolated edges.
Redrawing of “illusory-O” figures from Su, He, & Ooi (2010a, 2010b). Gaps between relatable edge fragments have the same color and brightness as the background. The gaps could be perceived as caused by a ring or “O”-shaped object that lies on top of the bars and has the same color as the background. In a), relatable edge fragments in the same wheel spoke have identical luminance and color, and a grey “illusory-O” is typically perceived to lie atop otherwise complete spokes. In b), relatable edge fragments have opposite luminance contrast polarities relative to the background (one fragment is white while the other is black), and the perception of the “illusory-O” is typically eliminated. c) and d) replicate these effects with stimuli that have the same color contrast (c; the “illusory-O” is perceived) or opposite color contrast (d; the “illusory-O” is not typically perceived).
Cues as to border ownership
Other candidates for consideration in Stage 2 are cues to border ownership, that is, cues that indicate which side of an edge owns that boundary. For example, a straight vertical edge could belong to an object whose surface is either to the left or to the right of that edge. Certain border ownership cues are known to affect the subjective perception of interpolated contours (Kellman & Shipley, 1991). For example, outlined fragments tend to own their own boundaries. In a) in the figure below, note the “illusory” interpolated edges that give rise to the perception of a whole white square occluding four black circles. In perceiving a white square, our visual system interprets the four corners visible in the image as belonging to a central square that partially occludes four black circles. If, instead, the visual system interpreted the edges as belonging to the black elements, we would perceive four “pacman”-like objects rather than circles, and no central square. This is just what happens in e), where the same stimulus is altered so that the elements are outlines rather than filled fragments.
Outlined fragments tend to be perceived as owning their boundaries, and this ownership tends to reduce or eliminate the perception of interpolated contours connecting separate fragments. These effects suggest that cues to border ownership are likely among the kinds of information taken into account in Stage 2.
Interpolation across multiple spatial scales
A final, interesting possibility is that Stage 2 processing may take into account information about interpolated edges produced in separate spatial frequency channels. The figure below illustrates an image broken up into relatively low frequency edges and relatively high frequency edges. It has been proposed that the visual system extracts information about edges of different spatial frequencies in separate channels and that this information about edges with different spatial frequencies must later be integrated (Marr & Hildreth, 1980; Morrone & Burr, 1988).
Because information about edges with different spatial frequencies initially exists in separate channels, it seems possible that promiscuous interpolation proceeds separately within each channel. The second stage might then provide an opportunity to integrate the interpolations produced in separate channels. In addition, corroboration of an interpolated edge across multiple channels could be taken as increased evidence in favor of the interpolation, making it more likely that the interpolation would be strengthened or maintained (rather than deleted).
The visual system picks up information about edges with different spatial frequencies in separate channels. a) shows a complete image, while b) and c) show the relatively low frequency and high frequency edges, respectively.
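The idea of separate spatial frequency channels can be illustrated with a minimal one-dimensional sketch. It is not the filter bank proposed by Marr & Hildreth (1980) or Morrone & Burr (1988): as a simplified stand-in, each "channel" here is a Gaussian-smoothed gradient at one scale, normalized by that scale. The two scale values, the detection threshold, and all function names are assumptions chosen for illustration only.

```python
import math

# Hypothetical channel scales (in samples) and detection threshold.
SCALES = {"fine": 1.0, "coarse": 8.0}
THRESHOLD = 0.2

def gaussian_kernel(sigma):
    """Normalized Gaussian weights, truncated at four standard deviations."""
    radius = int(4 * sigma)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve a 1-D luminance profile with a Gaussian, clamping at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def channel_response(signal, sigma):
    """Peak scale-normalized gradient magnitude of the smoothed profile."""
    s = smooth(signal, sigma)
    grads = [abs(s[i + 1] - s[i - 1]) / 2 for i in range(1, len(s) - 1)]
    return sigma * max(grads)

def channels_detecting(signal):
    """Names of the channels whose response exceeds the threshold."""
    return [name for name, sigma in SCALES.items()
            if channel_response(signal, sigma) > THRESHOLD]

crisp_edge = [0.0] * 100 + [1.0] * 101    # sharp luminance step
blurry_edge = smooth(crisp_edge, 8.0)     # the same step, heavily blurred

print(channels_detecting(crisp_edge))     # ['fine', 'coarse']
print(channels_detecting(blurry_edge))    # ['coarse']
```

Under these assumptions, the crisp edge is picked up in both channels, while the blurred edge survives only in the coarse one.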
Theory & Hypothesis:
Multi-stage theory
Interpolations produced promiscuously in an early stage enter a second stage in which various scene cues are taken into account to determine the final perceptual strength of each interpolation. Some interpolations will be deleted entirely, while others will take on relatively great or weak perceptual strengths.
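The two-stage flow can be sketched schematically as follows. This is an illustration of the architecture only, not a model from the experiments: the fragment representation, the always-true relatability test, the subtractive penalty rule, the penalty size, and the deletion cutoff are all hypothetical placeholders.

```python
from itertools import combinations

def stage1_promiscuous(fragments, is_relatable):
    """Stage 1: keep every pair of fragments that passes the relatability test."""
    return [(a, b) for a, b in combinations(fragments, 2) if is_relatable(a, b)]

def stage2_prune(candidates, cue_penalties, delete_below=0.2):
    """Stage 2: weaken each candidate according to scene cues; delete weak ones."""
    kept = {}
    for a, b in candidates:
        strength = 1.0
        for penalty in cue_penalties:
            strength -= penalty(a, b)          # assumed subtractive rule
        if strength >= delete_below:           # assumed deletion cutoff
            kept[(a["name"], b["name"])] = strength
    return kept

# Hypothetical fragments, described only by contrast polarity.
fragments = [
    {"name": "A", "polarity": "white"},
    {"name": "B", "polarity": "white"},
    {"name": "C", "polarity": "black"},
]

def always_relatable(a, b):
    """Placeholder: a real test would check the geometric relatability criteria."""
    return True

def polarity_mismatch(a, b):
    """Placeholder penalty for opposite luminance contrast polarity."""
    return 0.9 if a["polarity"] != b["polarity"] else 0.0

candidates = stage1_promiscuous(fragments, always_relatable)
survivors = stage2_prune(candidates, [polarity_mismatch])
print(len(candidates))   # 3
print(survivors)         # {('A', 'B'): 1.0}
```

In this toy run all three pairs are interpolated in Stage 1, and only the pair with matching polarity survives Stage 2.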
Relation to object perception
The interpolated contours that make it out of Stage 2 are those that are available to be used by object formation processes in the building up of object descriptions. Because object recognition relies on these descriptions, it therefore depends on the output of Stage 2.
Complete object descriptions of partially occluded objects necessarily include some interpolated contours. A partially occluded object should, therefore, become less recognizable when cues encourage the deletion or down-weighting of interpolations that are necessary to make that object whole.
Cues under consideration
Manipulations to the following cues, individually, are expected to produce measurable effects on the recognition of partially occluded objects:
Background luminance contrast polarity differences across edge fragments
Color contrast polarity differences across edge fragments
Border ownership cues
Interpolation across multiple spatial scales
If promiscuous interpolation does proceed separately across multiple spatial scales, those interpolations are likely integrated at the second stage. This integration could be such that the degree of corroboration across multiple spatial scales plays a role in determining an interpolation’s ultimate perceptual strength. Increased corroboration could constitute increased evidence in favor of an interpolation, resulting in a final perceptual strength that is relatively large compared to an interpolation that was supported at relatively few spatial scales.
This leads to some straightforward predictions regarding the ultimate perceptual strengths of different interpolations. For example, an interpolation connecting one fragment with relatively blurry edges and another fragment with quite crisp edges should not be perceived as strongly as an interpolation connecting two crisp edges. This is true because a relatively blurry fragment will only be picked up at relatively low spatial scales; therefore, any interpolation connecting this fragment to another one can only be produced at relatively low spatial scales. In contrast, relatively crisp edges are picked up at both relatively high and relatively low spatial scales. Therefore, an interpolation connecting two crisp fragments would be produced at both relatively high and relatively low spatial scales.
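The blurry-versus-crisp prediction can be made concrete with a toy calculation. The text above does not commit to a quantitative rule, so this sketch assumes, purely for illustration, that an interpolation's strength is the fraction of spatial scales at which both fragments are picked up. The scale names and function names are hypothetical.

```python
ALL_SCALES = ("low", "high")   # hypothetical two-channel simplification

def supporting_scales(scales_a, scales_b):
    """An interpolation can only be produced at scales where both fragments are detected."""
    return set(scales_a) & set(scales_b)

def corroboration_strength(scales_a, scales_b, all_scales=ALL_SCALES):
    """Assumed rule: strength is the fraction of scales supporting the interpolation."""
    return len(supporting_scales(scales_a, scales_b)) / len(all_scales)

crisp = {"low", "high"}   # crisp edges are picked up at both scales
blurry = {"low"}          # blurry edges are picked up only at low scales

print(corroboration_strength(crisp, crisp))    # 1.0
print(corroboration_strength(crisp, blurry))   # 0.5
```

Any monotonic rule would yield the same ordering; the proportional rule is simply the easiest to read.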
Goals:
EXPERIMENT SET 1:
Test whether each of the aforementioned cues is among those taken into account in the second stage.
EXPERIMENT SET 2:
Determine how the visual system integrates the information provided by each of these cues.
COMPUTATIONAL MODELING:
Build a computational model of how information from multiple cues is integrated.
Method:
A novel paradigm was developed based on the theorized relation to object perception described above. If object perception relies on Stage 2 output, then objects should be less recognizable when the interpolations necessary to complete them are deleted in Stage 2 or exit Stage 2 with relatively weak perceptual strengths.
The paradigm requires observers to recognize partially occluded alphanumeric characters. Cues are manipulated to encourage the deletion of interpolations in Stage 2, and recognition accuracy is measured. Presentation time is also varied to observe the effects of the cues over time.
The control stimulus shows a grey blob-like object partially occluding a background of filled black random alphanumeric character fragments (see figure below). Hidden within this background is one potentially whole alphanumeric character that can be completed by interpolating across the grey gaps created by the occluder. The use of random alphanumeric character fragments in the background ensures the task cannot be completed merely by recognizing a single part or parts of an object (recognition from partial information). Parts of many different letters and numbers are visible, but there is only one possible whole.
Hidden within a background of randomly arranged alphanumeric character fragments is one partially occluded whole number or letter. For this particular stimulus, it is the number “2”, hidden in the upper left of the image.
Altered versions of this stimulus were created to manipulate cues to background luminance contrast polarity, equiluminant color contrast, spatial frequency of visible edges, as well as several border ownership cues. Each manipulation was examined alone in one set of experiments, while another set of experiments examined the effects of combining multiple manipulations. Examples of some of those manipulations from the first set of experiments are illustrated below.
Example stimuli from Experiment set 1. From left to right, top to bottom, the stimulus images manipulate border ownership cues, luminance contrast polarity, spatial frequency of edges, and color contrast.
Predictions
Each of the manipulated cues is expected to produce a decrease in recognition accuracy. Because interpolation is known to take approximately 150 ms, effects are expected to appear at 150 ms and increase over time.
Results Summary
1) Manipulations to luminance contrast polarity, equiluminant color contrast, spatial frequency of edge information, and border ownership cues all produce significant differences, in the expected direction, in the recognition of partially occluded objects from 150 ms onward.
2) Results from experiments that combined multiple cues in a single image suggest that the combined effect is the sum of the individual effects. That is, the brain appears to integrate the information conveyed by each cue in the simplest way possible, through simple summation.
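The additive account in 2) can be written as a small prediction rule. The sketch below assumes that each cue's effect is measured as a drop in recognition accuracy relative to the control, and that the floor at zero is a reasonable guard. The accuracy values are made-up placeholders, not data from these experiments, and the function name is hypothetical.

```python
def predicted_combined_accuracy(control, single_cue_accuracies):
    """Additive prediction: subtract the sum of the individual drops from the control."""
    total_drop = sum(control - acc for acc in single_cue_accuracies)
    return max(0.0, control - total_drop)   # assumed floor; accuracy cannot go negative

# Placeholder values for illustration only (proportion correct).
control_accuracy = 0.80
cue_a_alone = 0.70    # hypothetical accuracy with one cue manipulated
cue_b_alone = 0.65    # hypothetical accuracy with a different cue manipulated

prediction = predicted_combined_accuracy(control_accuracy, [cue_a_alone, cue_b_alone])
print(round(prediction, 2))   # 0.55
```

Comparing such a prediction with the accuracy observed when both cues are manipulated together is one way to check for departures from simple summation.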
The figure below illustrates the data for the luminance contrast polarity manipulation relative to the control stimulus.
Results for the luminance contrast polarity manipulation relative to the control. Significant differences in recognition accuracy, in the expected direction, appear from 150 ms onward.
Take-Aways
1) These results support the two-stage theory of interpolation. That is, in a first stage, interpolations are produced promiscuously across every pair of edge fragments that meets the relatability criteria. In a later stage of processing, interpolations are considered in the face of other information in the scene in order to determine their ultimate perceptual strengths. Interpolations may be deleted, or they may exit Stage 2 with relatively great or weak perceptual strengths.
2) Among the kinds of information taken into account in the second stage are differences, across fragments, in: luminance contrast polarity, equiluminant color contrast, and spatial frequency of edge information. Cues as to which side of a boundary owns that boundary are also taken into account in the second stage.
3) Finally, the results support the theory that interpolation proceeds across multiple spatial frequency channels, and that these interpolations are integrated in the second stage. This integration is such that corroboration across multiple channels is taken as increased evidence in favor of an interpolation. Interpolations with increased corroboration receive greater perceptual strengths than interpolations with relatively little support across multiple channels.
References
Kellman, P. J., & Shipley, T. F. (1991). A theory of visual interpolation in object perception. Cognitive Psychology, 23, 141-221.
Marr, D., & Hildreth, E. (1980). Theory of edge detection. Proceedings of the Royal Society of London B, 207(1167), 187-217.
Morrone, M. C., & Burr, D. C. (1988). Feature detection in human vision: A phase-dependent energy model. Proceedings of the Royal Society of London B, 235, 221-245.
Su, Y. R., He, Z. J., & Ooi, T. L. (2010a). Surface completion affected by luminance contrast polarity and common motion. Journal of Vision, 10(3), 1-14.
Su, Y. R., He, Z. J., & Ooi, T. L. (2010b). Boundary contour-based surface integration affected by color. Vision Research, 50, 1833-1844.