Abstract
Purpose. In preceding studies, simulations of artificial vision were used to determine the basic parameters visual prostheses must fulfill to restore useful reading abilities. These simulations relied on a simplified procedure that reduced the information content of the stimuli by preprocessing images with a block-averaging algorithm (square pixelization). The present study examined how such a simplified algorithm affects reading performance.
Methods. Five or six volunteers with normal vision were asked to read full pages of text through a 10° × 7° viewing window stabilized in central vision. In a first experiment, reading performance with off-line and real-time square pixelization was compared at different resolutions. In a second experiment, off-line square pixelization was compared with off-line Gaussian pixelization at various degrees of overlap. In a third experiment, real-time square pixelization was compared with real-time Gaussian pixelization.
Results. The first experiment showed that real-time square pixelization required approximately 30% less information (fewer pixels) than its off-line counterpart to achieve equivalent performance. The second experiment, using off-line processing, revealed a restricted range of Gaussian widths for which performance was equivalent to, or significantly better than, that obtained with square pixelization. The third experiment demonstrated, however, that reading performance was similar in both real-time pixelization conditions.
Conclusions. This study reveals that real-time stimulus pixelization favors reading performance. Performance gains were moderate, however, and did not allow a substantial (e.g., twofold) reduction of the minimum resolution (400–500 pixels) needed to achieve useful reading abilities.
Currently, several research groups are working toward the development of visual prostheses for the blind.1–7 Despite fundamental design differences (implantation site, image acquisition, and processing techniques), these approaches share common features that lead to several major constraints on the visual percepts that can be elicited. Envisioned devices consist of a finite number of discrete stimulation contacts, will be implanted at a fixed location in the eye, and will subtend only a fraction of the entire visual field. If one expects to restore useful vision to blind patients, these constraints have to be thoroughly considered.
Our research group is part of a larger multidisciplinary research effort aiming to develop a subretinal implant. Our CMOS-Retina8–10 is built to transform light incident on the retina into electric stimulation currents “in situ.” In this context, we have developed special experimental conditions (simulations) to explore the minimum requirements for restoring useful artificial vision.
Our simulations use low-resolution (pixelized) images that are projected into a “small” viewing area, stabilized at a fixed location in the visual field. We attempt to mimic the type of visual information provided by a retinal implant using photodiode technology to transform incident light into an electric signal. With this methodological approach we explored, in a first study,11 the reading of isolated four-letter words. In central vision, accurate recognition was possible with pixelizations down to 286 pixels, distributed over a 10° × 3.5° viewing window. After a period of systematic training, comparable results were achieved with the same viewing window stabilized at 15° eccentricity in the lower visual field. In a second study,12 we explored full-page text reading under similar conditions. Tests were performed with a larger viewing window of 10° × 7°, containing 572 pixels, that moved across the page of text under the control of the subject’s eye movements. Performance was close to perfect with central vision. With eccentric vision, subjects achieved reading scores between 86% and 98% after a period of methodical training.
In earlier studies, we used a simplified technique to simulate the limited number of stimulation contacts available in a visual prosthesis. Stimulus images were decomposed into a finite number of pixels with a simple block-averaging algorithm. This resulted in a mosaic of square pixels of various gray levels, the gray level within each pixel being constant (square pixelization). However, electrophysiological research13–15 revealed that the patterns of neural activity elicited by electric stimulation of the retina depend on the strength of the stimulation current and that neural activation diminishes progressively with increasing electrode-to-neural-target distance. These findings imply that phosphenes elicited by electrical stimulation of the retina should be neither of constant luminosity nor square in shape. Furthermore, depending on the strength of the stimulation current, the percepts may develop from a collection of isolated phosphenes toward more continuous patterns with different degrees of overlap across neighboring phosphenes.
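As an illustration of this preprocessing step, a minimal sketch of such a block-averaging (square pixelization) routine is given below. The function name and the 26 × 22 grid used in the example are illustrative assumptions, not the actual experimental software.

```python
import numpy as np

def square_pixelize(image, rows, cols):
    """Block-average a grayscale image into a rows x cols mosaic of square pixels.

    Each output block takes the mean gray level of the image region it
    covers, so the gray level within each pixel is constant (square
    pixelization). Grid dimensions are illustrative only.
    """
    h, w = image.shape
    out = np.empty((h, w), dtype=float)
    row_edges = np.linspace(0, h, rows + 1).astype(int)
    col_edges = np.linspace(0, w, cols + 1).astype(int)
    for i in range(rows):
        for j in range(cols):
            r0, r1 = row_edges[i], row_edges[i + 1]
            c0, c1 = col_edges[j], col_edges[j + 1]
            out[r0:r1, c0:c1] = image[r0:r1, c0:c1].mean()
    return out

# Example (assumption): a 10° x 7° viewing window containing 572 pixels
# could correspond, for instance, to a 26 x 22 grid.
# mosaic = square_pixelize(window_image, rows=22, cols=26)
```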
One could argue that square pixelization is adequate to simulate the reduced information content of the stimuli transmitted by a retinal implant. In a given condition, the detailed shape of each pixel does not alter the overall information content of the image. However, studies on face recognition have demonstrated that recognition is considerably hampered when images are decomposed into uniform square pixels. Harmon and Julesz16 suggested that the oriented high-frequency noise introduced at block borders masks certain image features essential for recognition. Gestalt psychologists17,18 further proposed that square pixelization distorts the image to the point of modifying its intrinsic gestalt properties.19 Bachmann and Kahusk20 also suggested that the “block” constituents, or pixels, of the processed image compete for attention with the particular features of the image, thus affecting recognition. To avoid these drawbacks, square pixelization should be replaced by other types of image quantization featuring softer borders and allowing for variable amounts of overlap.
Another shortcoming of our previous studies is that the pixelization algorithm was applied off-line to the entire original image (e.g., seven lines of full-page text). Subjects were allowed to scan this preprocessed image through a viewing window containing a subset of 572 pixels, the gray level of these “frozen” pixels being independent of the point of gaze on the image. This would not be the case in artificial vision systems, since stimulation intensity at each electrode contact would depend on the exact point of gaze relative to the observed image. For retinal implants transforming light falling on the retina into stimulation currents “in situ,”4,7,10 this would happen because of eye movements. Head movements would act similarly in systems using an external head-mounted camera for stimulus generation.1–3,5,6 In the case of reading, when the subject focuses on a string of a few characters, its appearance would change with small eye (or camera) movements. Temporal cues seem to play a significant role in visual perception: the human visual system is optimized for detecting structural changes in dynamic images. A dynamic sequence of slightly different pixelized images may contain more information than one frozen pixelized image; therefore, dynamic (real-time) pixelization is likely to enhance information transmission to the visual system. Major object-identification features (such as shape or location) are extracted from different spatial patterns (such as local contrast changes or relative position changes) resulting from image motion. Improved sensitivity to moving contrast changes, compared with their static equivalents, has previously been demonstrated.21 Moreover, it has already been established that dynamic presentations lead to better performance in tasks such as facial recognition.22–24 Hence, if one wants more accurate simulations of artificial vision, pixelization should be performed in real time and the intensity of each pixel should vary dynamically, according to gaze position.
To our knowledge, psychophysical research using simulations of prosthetic vision has not been extensive so far. Reading and mobility were first studied by a group at the University of Utah.25,26 Their head-mounted experimental setup consisted of a video camera sending images to a monochrome monitor that projected to the subject’s right eye (maximum viewing angle of 1.7°). Pixelization was achieved by overlaying the monitor with opaque masks containing a variable number of square perforations (pixels). Recently, another group at The Johns Hopkins University presented a series of experiments that used simulations specifically designed to mimic percepts evoked by retinal implants.27–29 Different pixelization algorithms were used: a square pixelizing filter similar to the one presented in this article, a constant-luminosity circular pixelizing filter, and a nonoverlapping Gaussian filter. Unfortunately, no direct comparison of the different pixelizing algorithms has been reported. Moreover, all these experiments neglected a fundamental aspect of artificial vision with a retinal implant: viewing areas were not stabilized at fixed (eccentric) retinal positions. In more recent studies, the latter authors acknowledged that stabilization of the viewing area on the retina can significantly affect performance (Dagnelie G, et al. IOVS 2004;45:ARVO E-Abstract 4223; Kelley AJ, et al. IOVS 2004;45:ARVO E-Abstract 5436), especially in visually demanding tasks such as reading.
To validate our previous studies as well as to improve our simulation methods for future studies, we decided to investigate specifically the influence of the spatial and temporal characteristics of stimulus pixelization on reading performance. In the present study, we report a series of three paired comparisons of the effects of different pixelization methods on full-page reading. We compared reading performance: (1) between off-line square pixelization and real-time square pixelization of the image, (2) between off-line square pixelization and off-line Gaussian pixelization of the image, and (3) between real-time square pixelization and real-time Gaussian pixelization of the image.
Off-Line Pixelization.
Real-Time Pixelization.
In this condition, only the small portion of the entire text-segment image displayed in the 10° × 7° viewing window (determined by the subject’s gaze position on the screen) was pixelized in real time. Gaze position data were used to reposition the viewing window and to display its newly pixelized content on the screen. To achieve adequate image stabilization on the retina, the maximum image-processing time (stimulus pixelization and display) was kept below 10 ms. Fulfilling this condition requires considerable processing power when large Gaussian widths are used, because of the substantial overlap across neighboring pixels. For real-time pixelization, the processing power of our equipment therefore limited us to Gaussian widths of up to 0.14 pixels.
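As an illustration, the gaze-contingent processing loop can be sketched as follows. All interface functions (eye tracker, window extraction, pixelization, display) are hypothetical placeholders supplied by the caller; they are not the actual experimental software, and the 10-ms budget check simply mirrors the constraint described above.

```python
import time

def realtime_pixelization_loop(get_gaze, extract_window, pixelize, display,
                               still_reading, budget_s=0.010):
    """Sketch of the gaze-contingent (real-time) pixelization loop.

    Only the portion of the full-page text image lying inside the
    viewing window around the current gaze position is pixelized and
    shown. All callables (eye-tracker, image, and display interfaces)
    are hypothetical placeholders passed in by the caller.
    """
    while still_reading():
        t0 = time.perf_counter()
        gx, gy = get_gaze()                  # current point of gaze on the screen
        window = extract_window(gx, gy)      # image content under the 10° x 7° window
        stimulus = pixelize(window)          # square or Gaussian pixelization
        display(stimulus, gx, gy)            # reposition the window and show new content
        # Processing (pixelization + display) must stay below ~10 ms to
        # keep the image adequately stabilized on the retina.
        if time.perf_counter() - t0 > budget_s:
            raise RuntimeError("processing exceeded the 10-ms budget")
```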
Six normal subjects (26, 29, 29, 33, 34, and 41 years of age) participated in the second experiment. Pixelizations with six different Gaussian widths (σ of 0.036, 0.071, 0.143, 0.286, 0.571, and 1.143 pixels) were tested and compared with square pixelization. The effect of varying the Gaussian width σ on image pixelization is illustrated in Figure 5. In all conditions, the 10° × 7° viewing window contained 572 pixels (a resolution shown to provide enough information for useful full-page text reading12). Each subject had to read an article of approximately 250 words (i.e., 10 consecutive text segments) per condition. Three subjects started the experiment with Gaussian pixelization at the smallest σ value, progressed toward larger Gaussian widths, and finished with square pixelization. The remaining three subjects performed the conditions in the reverse order.
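A minimal sketch of such a Gaussian pixelization is given below: each grid point contributes a 2-D Gaussian weighted by the local mean gray level, so that overlap between neighboring pixels grows with σ. Interpreting σ as a fraction of the inter-pixel distance, the 26 × 22 grid, and the normalization are assumptions made for illustration only.

```python
import numpy as np

def gaussian_pixelize(image, rows, cols, sigma_pixels):
    """Render an image as a weighted sum of 2-D Gaussians on a rows x cols grid.

    Each Gaussian is centered on one grid point, weighted by the local
    mean gray level, and has a standard deviation of sigma_pixels times
    the inter-pixel distance; larger sigma therefore produces more
    overlap (smoothing) between neighboring pixels. Grid layout and
    normalization are illustrative assumptions.
    """
    h, w = image.shape
    step_y, step_x = h / rows, w / cols                   # inter-pixel distances
    centers_y = (np.arange(rows) + 0.5) * step_y
    centers_x = (np.arange(cols) + 0.5) * step_x
    sig_y, sig_x = sigma_pixels * step_y, sigma_pixels * step_x

    yy, xx = np.mgrid[0:h, 0:w]
    out = np.zeros((h, w))
    norm = np.zeros((h, w))
    for cy in centers_y:
        for cx in centers_x:
            # local mean gray level of the block around this grid point
            y0, y1 = int(cy - step_y / 2), int(cy + step_y / 2)
            x0, x1 = int(cx - step_x / 2), int(cx + step_x / 2)
            level = image[y0:y1, x0:x1].mean()
            g = np.exp(-((yy - cy) ** 2 / (2 * sig_y ** 2)
                         + (xx - cx) ** 2 / (2 * sig_x ** 2)))
            out += level * g
            norm += g
    return out / np.maximum(norm, 1e-12)

# Example (assumed 26 x 22 grid): overlap between pixels grows with sigma.
# smooth = gaussian_pixelize(window_image, rows=22, cols=26, sigma_pixels=0.286)
```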
Mean reading performance versus Gaussian function width (σ) is shown in Figure 6 and compared with the results obtained with square pixelization. Four Gaussian widths (σ = 0.071, 0.143, 0.286, and 0.571 pixels) resulted in reading scores above 94% correctly read words. These scores were very close to those obtained with square pixelization (Fig. 6a). Mean reading scores with σ = 0.143 and 0.286 pixels were significantly better than those obtained with square pixelization (P = 0.04 and 0.009, respectively). Reading scores declined markedly, to below 80%, for the two extreme Gaussian widths tested (σ = 0.036 and 1.143 pixels).
Mean reading rates showed a similar pattern. A maximum reading rate of 70 words/min was achieved at σ = 0.286 pixels. This value is significantly higher (P < 0.001) than the reading rate of 57 words/min achieved with square pixelization. Reading rates with σ = 0.143 and 0.571 pixels were not significantly different from those obtained with square pixelization. For σ = 0.036, 0.071, and 1.143 pixels, reading rates declined markedly (to below 40 words/min).
Taken together, these data reveal that Gaussian pixelization can lead to slightly but significantly better reading performance than its square counterpart. This suggests that some degree of image smoothing resulting from overlap between neighboring pixels can be beneficial for reading. This benefit is, however, observed only for a restricted range of overlap.
The results of the second experiment demonstrated that off-line Gaussian pixelization could lead to significantly better reading performance than off-line square pixelization. A third experiment was thus dedicated to extending this comparison to real-time mode.
For this evaluation we would ideally have used the “optimal” Gaussian width (σ = 0.286 pixels) determined in the second experiment. However, the total processing time needed to simulate this condition turned out to be too long to ensure adequate image stabilization on the retina. Using the second-best condition (σ = 0.143 pixels) allowed us to keep processing time below 10 ms. The same six normal volunteers who had participated in the second experiment were asked to read 10 text segments in each of two conditions: (1) real-time Gaussian pixelization at σ = 0.143 pixels and (2) real-time square pixelization. In both conditions, the 10° × 7° viewing window contained 572 pixels. Three subjects started with real-time square pixelization and then switched to real-time Gaussian pixelization. The remaining three subjects performed the experiment in the reverse order.
The results of this experiment are summarized in Table 1. No significant difference in performance was recorded between the two types of pixelization. However, reading scores and reading rates tended to be slightly higher with square pixelization. Comparing these real-time scores with their off-line counterparts gathered in the second experiment reveals that both real-time conditions yielded better performance. This performance gain was significant for square pixelization (reading scores: P = 0.003; reading rates: P = 0.008), but not for Gaussian pixelization (reading scores: P = 0.12; reading rates: P = 0.25).
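The excerpt does not state which statistical test produced these P values. As a purely illustrative sketch, a paired comparison of per-subject rau scores between two conditions could be computed as follows; the test and the numbers below are assumptions, not the published analysis or data.

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject reading scores (rau), one value per subject and
# condition -- illustrative numbers only, not the published data.
realtime_square = np.array([118.0, 115.5, 120.1, 116.8, 117.9, 114.9])
offline_square  = np.array([108.2, 105.9, 112.3, 107.4, 110.1, 106.6])

# Paired comparison across the six subjects; the test actually used in
# the study is not specified in this excerpt.
t, p = stats.ttest_rel(realtime_square, offline_square)
print(f"paired t = {t:.2f}, P = {p:.3f}")
```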
The exact characteristics of the electrophysiological response of the retina to patterned electrical stimulation remain undetermined to date. However, the use of 2-D Gaussian functions for stimulus pixelization is certainly a more physiologically pertinent approach than the use of square pixels: pixel borders are smoother, and overlap between neighboring pixels is possible. As soon as the results of electrophysiological experiments on retinal tissue become available, the parameters of such 2-D Gaussian (or more appropriate) functions should be adapted accordingly. Our experiments also revealed that Gaussian width is an important factor for readability, suggesting that stimulation current strength and electrode spacing might have to be further “tuned” (within safe and comfortable limits) to achieve the most efficient image transmission possible.
Real-time processing also allows for more realistic simulations of the visual information provided by retinal prostheses. Our results demonstrated that it yields significantly better performance than its off-line counterpart. However, this benefit was relatively moderate and did not allow a substantial reduction (e.g., by a factor of two) of the number of stimulation points. Most probably, this advantage will be even less important in visual prostheses with external head-mounted cameras, since head movements are larger and less frequent than eye movements. Recurring head movements could also result in an abnormal vestibulo-ocular reflex.
The first visual prosthesis prototypes have recently been implanted in humans, with encouraging results.5–7 Yet several important challenges still need to be overcome before these devices can provide benefits similar to those of cochlear implants in cases of deafness. The basic notion of patterned vision resulting from the continuous stimulation of several electrodes has not been fully confirmed. An appropriate method of selective stimulation eliciting the adequate psychophysical response has not yet been developed. Another major problem is to achieve efficient electrical stimulation within safe charge-density limits.32 To reduce the total electrical charge injected into the retina, the use of relatively large stimulation electrodes (fundamentally limiting interelectrode spacing) as well as alternative solutions (such as inverted polarity, interleaved stimulation, and/or increasing the total area of the retinal array within feasible limits) may be mandatory. A substantial research effort is therefore still needed to solve these and other open issues before the level of electrode integration suggested by our studies can be realized.
In conclusion, these results demonstrate that the spatial and temporal characteristics of image pixelization play a role in artificial vision simulations. Equivalent performance could be reached with a resolution reduced by approximately 30%, if stimulation parameters were adequate. This effect is not strong enough, however, to fundamentally change the minimum requirements determined in our previous studies on the basis of simplified processing11,12: four to five hundred contacts covering a 2 × 3-mm² retinal area are necessary to transmit sufficient visual information for full-page text reading. Reading is particularly important because it is strongly associated with vision-related estimates of quality of life and represents one of the main goals of low-vision patients seeking rehabilitation.33–35 It is thus important to be aware of such minimal conditions when developing visual prostheses, even if less sophisticated devices might already bring some clinical benefit to patients.
Supported by Swiss National Foundation for Scientific Research Grants 3100-61956.00 and 3152-063915.00 and by the ProVisu Foundation.
Submitted for publication October 4, 2004; revised February 14, May 24, and June 2, 2005; accepted August 1, 2005.
Disclosure: A. Pérez Fornos, None; J. Sommerhalder, None; B. Rappaz, None; A.B. Safran, None; M. Pelizzone, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked “advertisement” in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Jörg Sommerhalder, Ophthalmology Clinic, Geneva University Hospitals, 24 rue Micheli-du-Crest, 1211 Geneva 14, Switzerland; [email protected].
Table 1. Mean Reading Performances with Real-Time Stimulus Pixelization in Six Normal Subjects
|  | Gaussian Pixelization | Square Pixelization | P |
| --- | --- | --- | --- |
| Mean reading scores (rau ± SEM) | 115.8 ± 3.6 (99.6%) | 117.2 ± 3.4 (99.8%) | 0.22 (ns) |
| Mean reading rates (words/min ± SEM) | 69 ± 12 | 74 ± 15 | 0.35 (ns) |
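Reading scores in Table 1 are expressed in rationalized arcsine units (rau), following Studebaker's transform (ref. 30). A sketch of that transform, assuming the commonly published constants (the exact values used in the study are not given in this excerpt):

```python
import math

def rau(correct, total):
    """Rationalized arcsine transform of a proportion-correct score.

    Uses Studebaker's two-term arcsine transform with the commonly
    published rescaling constants (slope ~46.47, offset -23), mapping
    scores onto a scale from about -23 (0% correct) to about +123
    (100% correct). Treat the constants as an approximation here.
    """
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return 46.47 * theta - 23.0

# Example: 249 of 250 words read correctly (99.6%) gives about 116 rau.
# print(rau(249, 250))
```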
The authors thank Andrew Whatham, PhD, for insightful contributions and a critical review of the manuscript.
References

1. Rizzo JF, Wyatt J. Prospects for a visual prosthesis. Neuroscientist. 1997;3:251–262.
2. Normann RA, Maynard EM, Rousche PJ, Warren DJ. A neural interface for a cortical vision prosthesis. Vision Res. 1999;39:2577–2587.
3. Dobelle WH. Artificial vision for the blind by connecting a television camera to the visual cortex. ASAIO J. 2000;46:3–9.
4. Zrenner E. Will retinal implants restore vision? Science. 2002;295:1022–1025.
5. Humayun MS, Weiland JD, Fujii GY, et al. Visual perception in a blind subject with a chronic microelectronic retinal prosthesis. Vision Res. 2003;43:2573–2581.
6. Veraart C, Wanet-Defalque MC, Gerard B, Vanlierde A, Delbeke J. Pattern recognition with the optic nerve visual prosthesis. Artif Organs. 2003;27:996–1004.
7. Chow AY, Chow VY, Packo KH, Pollack JS, Peyman GA, Schuchard R. The artificial silicon retina microchip for the treatment of vision loss from retinitis pigmentosa. Arch Ophthalmol. 2004;122:460–469.
8. Lecchi M, Marguerat A, Ionescu A, et al. Ganglion cells from chick retina display multiple functional nAChR subtypes. Neuroreport. 2004;15:307–311.
9. Linderholm P, Bertsch A, Renaud P. Resistivity probing of multi-layered tissue phantoms using microelectrodes. Physiol Meas. 2004;25:645–658.
10. Ziegler D, Linderholm P, Mazza M, et al. An active microphotodiode array of oscillating pixels for retinal stimulation. Sensors and Actuators A: Physical. 2004;110:11–17.
11. Sommerhalder J, Oueghlani E, Bagnoud M, Leonards U, Safran AB, Pelizzone M. Simulation of artificial vision: I. Eccentric reading of isolated words, and perceptual learning. Vision Res. 2003;43:269–283.
12. Sommerhalder J, Rappaz B, de Haller R, Pérez Fornos A, Safran AB, Pelizzone M. Simulation of artificial vision: II. Eccentric reading of full-page text and the learning of this task. Vision Res. 2004;44:1693–1706.
13. Weiland JD, Humayun MS, Dagnelie G, De Juan E, Greenberg RJ, Iliff NT. Understanding the origin of visual percepts elicited by electrical stimulation of the human retina. Graefes Arch Clin Exp Ophthalmol. 1999;237:1007–1013.
14. Stett A, Barth W, Weiss S, Haemmerle H, Zrenner E. Electrical multisite stimulation of the isolated chicken retina. Vision Res. 2000;40:1785–1795.
15. Rizzo JF, Wyatt J, Loewenstein J, Kelly S, Shire D. Perceptual efficacy of electrical stimulation of human retina with a microelectrode array during short-term surgical trials. Invest Ophthalmol Vis Sci. 2003;44:5362–5369.
16. Harmon LD, Julesz B. Masking in visual recognition: effects of two-dimensional filtered noise. Science. 1973;180:1194–1197.
17. Bachmann T. Identification of spatially quantised tachistoscopic images of faces: how many pixels does it take to carry identity? Eur J Cogn Psychol. 1991;3:87–107.
18. Uttal WR, Baruch T, Allen L. A parametric study of face recognition when image degradations are combined. Spat Vis. 1997;11:179–204.
19. Leeuwenberg E. Miracles of perception. Acta Psychol (Amst). 2003;114:379–396.
20. Bachmann T, Kahusk N. The effects of coarseness of quantisation, exposure duration, and selective spatial attention on the perception of spatially quantised (‘blocked’) visual images. Perception. 1997;26:1181–1196.
21. Lappin JS, Tadin D, Whittier EJ. Visual coherence of moving and stationary image changes. Vision Res. 2002;42:1523–1534.
22. Christie F, Bruce V. The role of dynamic information in the recognition of unfamiliar faces. Mem Cognit. 1998;26:780–790.
23. Lander K, Christie F, Bruce V. The role of movement in the recognition of famous faces. Mem Cognit. 1999;27:974–985.
24. Thornton IM, Kourtzi Z. A matching advantage for dynamic human faces. Perception. 2002;31:113–132.
25. Cha K, Horch KW, Normann RA. Mobility performance with a pixelized vision system. Vision Res. 1992;32:1367–1372.
26. Cha K, Horch KW, Normann RA, Boman DK. Reading speed with a pixelized vision system. J Opt Soc Am A. 1992;9:673–677.
27. Humayun MS. Intraocular retinal prosthesis. Trans Am Ophthalmol Soc. 2001;99:271–300.
28. Hayes JS, Yin VT, Piyathaisere D, Weiland JD, Humayun MS, Dagnelie G. Visually guided performance of simple tasks using simulated prosthetic vision. Artif Organs. 2003;27:1016–1028.
29. Thompson RW, Barnett GD, Humayun MS, Dagnelie G. Facial recognition using simulated prosthetic pixelized vision. Invest Ophthalmol Vis Sci. 2003;44:5035–5042.
30. Studebaker GA. A “rationalized” arcsine transform. J Speech Hear Res. 1985;28:455–462.
31. Costen NP, Parker DM, Craw I. Spatial content and spatial quantisation effects in face recognition. Perception. 1994;23:129–146.
32. Brummer SB, Robblee LS, Hambrecht FT. Criteria for selecting electrodes for electrical stimulation: theoretical and practical considerations. Ann N Y Acad Sci. 1983;405:159–171.
33. Wolffsohn JS, Cochrane AL. The changing face of the visually impaired: the Kooyong low vision clinic’s past, present, and future. Optom Vis Sci. 1999;76:747–754.
34. Hazel CA, Petre KL, Armstrong RA, Benson MT, Frost NA. Visual function and subjective quality of life compared in subjects with acquired macular disease. Invest Ophthalmol Vis Sci. 2000;41:1309–1315.
35. McClure ME, Hart PM, Jackson AJ, Stevenson MR, Chakravarthy U. Macular degeneration: do conventional measurements of impaired visual function equate with visual disability? Br J Ophthalmol. 2000;84:244–250.