May 2011
Volume 52, Issue 6
Visual Psychophysics and Physiological Optics  |   May 2011
Chinese Character Recognition Using Simulated Phosphene Maps
Author Affiliations & Notes
  • Ying Zhao
    Department of Biomedical Engineering, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  • Yanyu Lu
    Department of Biomedical Engineering, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  • Chuanqing Zhou
    Department of Biomedical Engineering, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  • Yao Chen
    Department of Biomedical Engineering, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  • Qiushi Ren
    Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, China
  • Xinyu Chai
    Department of Biomedical Engineering, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  • *Each of the following is a corresponding author: Xinyu Chai, Department of Biomedical Engineering, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; xychai@sjtu.edu.cn. Qiushi Ren, Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, 100871, China; renqsh@coe.pku.edu.cn
  • Footnotes
    2  These authors contributed equally to the work presented here and should therefore be regarded as equivalent authors.
Investigative Ophthalmology & Visual Science May 2011, Vol.52, 3404-3412. doi:10.1167/iovs.09-4234

      Ying Zhao, Yanyu Lu, Chuanqing Zhou, Yao Chen, Qiushi Ren, Xinyu Chai; Chinese Character Recognition Using Simulated Phosphene Maps. Invest. Ophthalmol. Vis. Sci. 2011;52(6):3404-3412. doi: 10.1167/iovs.09-4234.



      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose. A visual prosthetic device may produce phosphene maps in which individual phosphene characteristics can be altered. This study investigated the ability of normally sighted subjects to recognize Chinese characters (CCs) presented on altered simulated phosphene maps.

Methods. Thirty volunteers with normal or corrected visual acuity of 20/20 were recruited. CC recognition accuracy and response time were investigated while one parameter was changed (distortion, pixel dropout percentage, pixel size variability, or pixel gray level) or different combinations of three parameters were used. Five hundred CCs consisting of 1 to 16 strokes were used for the character sets.

Results. CC recognition accuracy decreased and response time increased as distortion, dropout, and pixel size variability increased. Gray levels did not significantly affect the results, except when eight levels were used. To maintain 80% accuracy, the distortion (irregularity) index k should be no more than 0.2, pixel dropout no more than 20%, and the pixel size range no wider than 1 to 16 mm (7–112 min arc). Among the combinations tested, only k = 0.1 with a dropout of 10% and a pixel size range of 1.33 to 12 mm (9.3–84 min arc) achieved the goal of ≥80% accuracy.

Conclusions. Distortion, dropout percentage, and pixel size variability have a significant impact on pixelated CC recognition. Although at present the visual ability of prosthesis users is limited, it should be possible to extend this to CC recognition and reading in the future. The results will help visual prosthesis researchers determine the effects of altering phosphene maps and improve outcomes for patients.

Visual prostheses have the potential to restore partial function to individuals by electrically stimulating different parts of the visual pathway (retina, optic nerve, or cortex) and have become an increasingly prominent topic in the field of neural prosthetics. 1–8 A prosthesis may provide useful visual percepts in the form of spots of light called phosphenes. 9 These phosphenes form rudimentary building blocks in prosthetic vision that can be used to realize more complex patterns representing visual scenes. The possibility of restoring vision by using multiple simultaneously elicited phosphenes with a microelectronic prosthesis is the foundation for current clinical trials and is the general assumption made in prosthetic visual simulations. 10
Human clinical trials have reported different phosphene characteristics. Brindley and Lewin 11 implanted an 80-electrode device in the visual cortex of a 52-year-old blind woman that elicited only 40 phosphenes, consisting of spots of white light or a small cloud of spots. Increasing the duration of each radio pulse from 0.1 to ∼0.6 ms with constant frequency and strength could make the phosphenes appear brighter. Veraart et al. 12 chose a blind volunteer with retinitis pigmentosa and implanted a self-sizing spiral cuff electrode around the optic nerve. The phosphenes were a set of spots ranging from 8 to 43 min arc (1–5.5 mm at a distance of 0.45 m) with nine levels of brightness. The perceived attributes of a phosphene, such as brightness, color, size, organization, and position, changed with different stimulating pulse durations and/or train frequencies. Occasionally, adjacent phosphenes overlapped. Humayun et al. 13 implanted an intraocular epiretinal 4×4-electrode array in a completely blind subject. The subject described the phosphenes as round spots of light or rings with a bright center and a black surround. The size and brightness of the spots (10 levels) increased with higher stimulation currents. Although the location of the electrode array and percept in most cases corresponded retinotopically, there were several electrodes that did not cause phosphenes or caused phosphenes that did not show retinotopic correspondence. Other groups have reported similar phosphene characteristics (Wilke R, et al. IOVS 2006;47:ARVO E-Abstract 3202). 14,15 All clinical trials show that the size, brightness, dropout, and arrangement of the phosphenes change under different implant and stimulation conditions. 
Based on visual prosthesis trials in blind people, some investigators have studied simulated prosthetic vision with normal subjects. 16–27 Although testing simulated phosphenes in normal subjects can only be a rough approximation of the situation of blind subjects with a retinal disease, simulated prosthetic vision can help in determining requirements for visual tasks, as well as in exploring and understanding future reports by prosthesis wearers and finding solutions for their problems; conveying the prosthetic experience to clinicians and the public; and designing rehabilitation programs for future prosthesis recipients. 28 Cha et al. 21 measured English reading speed using a pixelated visual presentation. The results indicated that, by using 625 pixels, reading speed could reach 170 words/min with scrolling text and 100 words/min with fixed text. Sommerhalder et al. 25–27 discussed eccentric reading of isolated French words and perceptual learning and indicated that it would be necessary to use more than 300 retinal stimulation contacts to restore some reading ability and a minimum of 600 for useful full-page text reading; different pixel shapes also affected performance. Dagnelie et al., 29 using five parameters to simulate adequate English reading (dot size and spacing, grid size, random dropout percentage, and gray-scale resolution), showed that a 3 × 3-mm² prosthesis with 16 × 16 electrodes could allow paragraph reading if distinct phosphenes were perceived. Similarly, Chai et al. 18 explored minimum requirements for efficient recognition of pixelated Chinese characters (CCs) and reported that recognition accuracy could reach 100% when the number of pixels reached 12 × 12, although the complexity and number of strokes composing the isolated character affected accuracy. A stroke is the basic structural unit of CCs, and several reports have shown that the number of strokes within a character can affect its recognition. 30–34 Cai et al. 35 investigated prosthetic visual acuity by altering the geometric irregularity of simulated phosphene maps and by using two different down-sampling schemes. Their findings showed that the irregularity of simulated phosphene maps had a negative effect on visual acuity.
Despite such encouraging results, most of the previous research was based on regular simulated phosphene maps and focused on visual acuity or words written in the Latin alphabet. We have investigated CC recognition performance in normal-sighted individuals based on simulated phosphene maps that were formed by changing four parameters: distortion, percentage of pixel dropout, pixel size variability, and the gray level. Recognition accuracy and reaction time during single parameter and combination experiments were measured. 
Methods
Choice of Experimental Chinese Characters
A character library was chosen from The Frequency Statistics for Commonly Used Modern Chinese Characters issued by the State Language Work Committee and the State Education Commission in 1989. 36 To avoid the effect of unfamiliar or infrequently used CCs, we selected the first 500 characters from the statistical table. These CCs consist of 1 to 16 character strokes, have a frequency of use ranging from 0.042% to 3.593%, and could provide nearly 80% of daily reading information. 
We transformed CCs (Song typeface) to 12 × 12 regular simulated phosphene maps to build a test character library (TCL) according to the CC Guo Biao code (Chinese National Standard) with their corresponding grapheme (see also Chai et al. 18). Changes in distortion, dropout, the range of pixel sizes, or gray scale were made to the full 144-pixel grid, and then the pixels were chosen that corresponded to the particular CC (Figs. 1, 2).
Figure 1.
 
Example of a basic pixelated Chinese character. Left: the original CC (stroke number, 8; pronunciation, de); middle: the full regular 12 × 12 array; and right: the rendered CC constructed from the 12 × 12 simulated gray level 1 phosphene map.
Figure 2.
 
Flow chart showing the formation of the altered characters.
Subjects
The 30 volunteer students (14 women and 16 men) recruited from Shanghai Jiao Tong University were native Chinese speakers 21 to 28 years of age, with a normal or corrected visual acuity of 20/20. They all passed a Standard Chinese Proficiency Test (PSC Test, Grade B, second level) and reached 100% recognition accuracy on an assessment program containing 200 CCs randomly chosen from the TCL. Before the experiments, the purpose and procedures of the study were explained to every subject, and the research adhered to the tenets of the Declaration of Helsinki. 
Apparatus
All trials were performed in a quiet, dark room. The test platform consisted of a personal computer (2.8 GHz CPU; Dell Inc., Austin, TX), a 17-in. CRT screen (1024 × 768 at 85 Hz, 110 MHz; EMC; Proview Technology Co. WuHan, China), a light-shielding device (40 × 40 × 50 cm) to avoid the influence of stray light around the test screen, a head rest, and a microphone to record verbal responses. 
The computer was used for experimental control and was connected via a VGA distributor to the CRT screen. Experimental software, developed in-house with Visual C++ language, was run on the Windows XP operating system (Microsoft, Redmond, WA), displayed the CCs one at a time for a fixed duration, and recorded the recognition times and verbal responses of each subject. The head rest was used to position the subject's head and maintain the visual angle on the center of the screen. 
Previous research 37 has shown that the optimal visual angle for CC recognition is 10° to 11°; therefore, the CRT resolution was set to 800 × 600 and the subject's head was positioned 50 cm from the screen so that each CC occupied 10° of visual angle.
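The relation between on-screen size, viewing distance, and visual angle used throughout the Methods can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the only values taken from the text are the 50-cm viewing distance, the 10° character size, and the 4-mm regular spot diameter.

```python
import math

VIEWING_DISTANCE_MM = 500.0  # 50 cm, as stated in the text


def size_to_arcmin(size_mm: float, distance_mm: float) -> float:
    """Visual angle (minutes of arc) subtended by an object of the given size."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm))) * 60.0


def angle_to_size_mm(angle_deg: float, distance_mm: float) -> float:
    """On-screen size (mm) that subtends the given visual angle (degrees)."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)


# Width of a character spanning 10 degrees at 50 cm: about 87.5 mm.
cc_width_mm = angle_to_size_mm(10.0, VIEWING_DISTANCE_MM)

# A 4-mm spot at 50 cm subtends about 27.5 min arc, i.e., roughly the
# 28 min arc (about 7 min arc per mm) quoted for the regular spot diameter.
spot_arcmin = size_to_arcmin(4.0, VIEWING_DISTANCE_MM)

print(f"{cc_width_mm:.1f} mm, {spot_arcmin:.1f} min arc")
```

The millimeter-to-arc-minute pairs quoted in the text appear to use the rounded linear factor of 7 min arc per mm, which is a close approximation at these small angles.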
Experiment 1: Effect of CC Distortion
Blind subjects have reported that phosphenes appear distorted with respect to the retinotopic position of the regular stimulus phosphene map (i.e., a positional uncertainty or retinotopic displacement). 12,13,38 A two-dimensional Gaussian distribution with zero mean and mutually independent components was used to mimic this situation (see also Cai et al. 35). The probability distribution function is

$$p(\Delta x, \Delta y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{\Delta x^{2} + \Delta y^{2}}{2\sigma^{2}}\right) \qquad (1)$$

where Δx and Δy are the deviations from the regular matrix in the horizontal and vertical directions 39 and σ is the standard deviation, which indicates the degree of positional uncertainty of the phosphenes. In the regular array, each pixel obeys equation 1 with the same σ, where

$$\sigma = k \cdot S \qquad (2)$$

and S is the center-to-center distance between two neighboring pixels in the array. The irregularity index, k, measures positional uncertainty and is an index of the geometric array deformation. In this experiment, trials were performed in which k = 0.1, 0.2, 0.3, and 0.4. In each trial, the location of each simulated phosphene in the original 12 × 12 array was distorted according to equation 1. Then the CCs were mapped onto the distorted array to locate the pixels necessary to form the altered CC (Figs. 2, 3). Thus, the phosphenes in the map show the same information that would have been shown in an undistorted map, but each at its new displaced location. The 500 distorted CCs at each distortion level formed the distortion libraries (DisL)-0.1, -0.2, -0.3, and -0.4. One hundred CCs randomly selected from the corresponding DisL were used in each trial as the character set (CS). This CS was then used for all subjects. To ensure uniformity, the four CSs had similar stroke number distributions and character frequencies.
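The displacement step can be sketched as follows. This is a minimal illustration, not the authors' software: it assumes that the standard deviation of the positional jitter is the irregularity index times the pixel spacing (σ = k·S), that horizontal and vertical offsets are drawn independently, and that positions are expressed in units of S. The function name `distorted_grid` and its parameters are hypothetical.

```python
import random


def distorted_grid(n=12, spacing=1.0, k=0.1, seed=None):
    """Return n*n (x, y) phosphene centers displaced from a regular grid.

    Each center receives independent zero-mean Gaussian offsets in x and y
    with standard deviation sigma = k * spacing (assumption stated above).
    """
    rng = random.Random(seed)
    sigma = k * spacing
    centers = []
    for row in range(n):
        for col in range(n):
            x = col * spacing + rng.gauss(0.0, sigma)
            y = row * spacing + rng.gauss(0.0, sigma)
            centers.append((x, y))
    return centers


# One distorted 12 x 12 map per irregularity level used in experiment 1.
maps = {k: distorted_grid(k=k, seed=0) for k in (0.1, 0.2, 0.3, 0.4)}
```

Because each character is rendered by selecting the same grid indices as in the regular map, the displaced centers carry the same pixel information at new locations, as described in the text.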
Figure 3.
 
Examples of CCs formed by different simulated phosphene maps. The gray level 1 sample shows the regular and standard CC (pronunciation, de).
Experiment 2: Effect of the Pixel Dropout Percentage
If electrodes contact degenerating cells or fibers, or if the stimulus intensity does not reach the threshold needed to evoke phosphenes, there will be missing points in the visual field. 11,13 To simulate this configuration, trials were performed by using four dropout percentages (10%, 20%, 30%, and 40%). For example, at the 20% level, 20% of the 144 pixels in the 12 × 12 array were randomly removed. Dropout libraries (DroL)-10%, -20%, -30%, and -40% were formed (as in experiment 1), and a CS of 100 CCs was randomly selected from each DroL.
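Random dropout of this kind can be illustrated with a short sketch. The text does not say how a fractional pixel count (e.g., 20% of 144 = 28.8) was handled, so rounding to the nearest integer is an assumption here, and the function name `dropout_mask` is hypothetical.

```python
import random


def dropout_mask(n_pixels=144, dropout=0.2, seed=None):
    """Return a list of booleans: True = pixel kept, False = pixel dropped.

    The number of dropped pixels is round(n_pixels * dropout); the rounding
    rule is an assumption, since the source does not specify it.
    """
    rng = random.Random(seed)
    n_drop = round(n_pixels * dropout)
    dropped = set(rng.sample(range(n_pixels), n_drop))
    return [index not in dropped for index in range(n_pixels)]


# One mask per dropout level used in experiment 2.
masks = {level: dropout_mask(dropout=level, seed=0) for level in (0.1, 0.2, 0.3, 0.4)}
```

A character is then rendered only from those of its pixels whose mask entry is True.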
Experiment 3: Effect of Pixel Size Variation within a CC
Because of technical limitations, each electrode may contact a variable number of nerve cells as a result of the asymmetrical distribution of the cells/fibers or pathologic changes, thereby resulting in a range of phosphene sizes. 40,41 Furthermore, changes in stimulus current can result in different phosphene sizes. 13 We used four different pixel size ranges with a total range from 0.166 to 6 times the regular spot diameter of 4 mm (28 min arc) (i.e., pixel size range index 3: 0.33–3 × 4 mm = 1.33–12 mm [9.3–84 min arc]; pixel size range index 6: 0.67–24 mm [4.7–168 min arc]). The size interval of all ranges was 0.4 mm (2.8 min arc). The altered CCs were processed to form size libraries (SL)-3, -4, -5, and -6. Trials were performed using 100 CCs randomly selected from each SL.
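The size ranges can be tabulated with a small calculation. The sketch assumes that range index n spans 4/n to 4n mm; this rule reproduces the two ranges given explicitly above (indexes 3 and 6) and the 1 to 16 mm range quoted elsewhere in the paper, but it is an inference for index 5. The conversion factor of 7 min arc per mm follows from the quoted 4 mm = 28 min arc. The function names are hypothetical.

```python
BASE_DIAMETER_MM = 4.0   # regular spot diameter, from the text
ARCMIN_PER_MM = 7.0      # 28 min arc / 4 mm, as quoted in the text


def size_range_mm(index):
    """Smallest and largest spot diameter (mm) for a size range index.

    Assumes the range is base/index to base*index (see lead-in).
    """
    return BASE_DIAMETER_MM / index, BASE_DIAMETER_MM * index


def size_range_arcmin(index):
    """Same range expressed in minutes of arc."""
    low, high = size_range_mm(index)
    return low * ARCMIN_PER_MM, high * ARCMIN_PER_MM


for index in (3, 4, 5, 6):
    low_mm, high_mm = size_range_mm(index)
    low_am, high_am = size_range_arcmin(index)
    print(f"index {index}: {low_mm:.2f}-{high_mm:.0f} mm "
          f"({low_am:.1f}-{high_am:.0f} min arc)")
```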
Experiment 4: Effect of Gray Level
Veraart et al. 12 and Humayun et al. 13 reported that subjects using a visual prosthesis could identify various levels of brightness in response to electrode stimulation. In this regard, others 23,29,41 have used the gray level as another factor in their simulated visual prosthesis trials. We used four options with differing numbers of gray-scale values (1 = 255; 4 = 63, 127, 191, and 255; 6 = 43, 86, 129, 172, 215, and 255; and 8 = 31, 63, 95, 127, 159, 191, 223, and 255). For example, option 4 meant that the gray-scale value of each simulated phosphene in the array was randomly assigned one of four values (63, 127, 191, or 255). The altered CCs formed the gray level libraries (GL)-1, -4, -6, and -8 (see Fig. 2). GL-1 also formed the TCL; 100 CCs randomly selected from each GL were used in each trial (see Fig. 3).
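The random gray-level assignment can be sketched as below. The gray values are copied from the text; the uniform random choice per pixel and the names `GRAY_OPTIONS` and `assign_gray_levels` are assumptions made for illustration.

```python
import random

# Gray-scale values for each option, as listed in the text.
GRAY_OPTIONS = {
    1: [255],
    4: [63, 127, 191, 255],
    6: [43, 86, 129, 172, 215, 255],
    8: [31, 63, 95, 127, 159, 191, 223, 255],
}


def assign_gray_levels(option, n_pixels=144, seed=None):
    """Assign each pixel one gray value drawn uniformly from the chosen option."""
    rng = random.Random(seed)
    values = GRAY_OPTIONS[option]
    return [rng.choice(values) for _ in range(n_pixels)]


gray_map = assign_gray_levels(4, seed=0)  # one value per pixel of the 12 x 12 array
```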
It is expected that each prosthetic user will have a phosphene map that is specific to that individual, due to differences in retinal disease, retinal ganglion cell survival, and electrode array placement. The randomized presentation of the simulated maps attempts to mimic the variability in the phosphene maps experienced by prosthetic users. 
Experiment 5: Effect of Combined Parameters
Subjects with an implanted visual prosthesis have indicated that different visual percepts were elicited by different combinations of the stimulus parameters; for example changes in current intensity versus frequency. 12,13 Therefore, it is important to investigate the effects caused by combined parameters. 
Based on the results of the four single-parameter experiments, two levels of three parameters (distortion, k: 0.1, 0.2; dropout: 10%, 20%; and pixel size range index: 3, 4) were combined (Fig. 4). The recognition accuracy of the single parameters used to form the combinations was between 75% and 95%. Trials were performed using eight character sets of 100 CCs with a random combination of the parameters. 
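Two levels of each of three parameters give 2 × 2 × 2 = 8 combinations, which can be enumerated as in the sketch below. The ordering produced here is only a convenience of the enumeration; apart from the first entry (k = 0.1, 10% dropout, size index 3), which the paper calls combination 1, it is not known to match the authors' numbering.

```python
from itertools import product

DISTORTION_K = (0.1, 0.2)   # irregularity index levels
DROPOUT = (0.10, 0.20)      # fraction of pixels removed
SIZE_INDEX = (3, 4)         # pixel size range index

# Every combination of the two levels of the three parameters.
combinations = list(product(DISTORTION_K, DROPOUT, SIZE_INDEX))

for number, (k, dropout, size_index) in enumerate(combinations, start=1):
    print(f"{number}: k={k}, dropout={dropout:.0%}, size index={size_index}")
```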
Figure 4.
 
Examples of simulated phosphene maps of CCs formed by different combinations of the parameters (pronunciation, de).
Procedure
A training program was performed before the pretest and formal experiments, to help subjects adapt to the dark experimental environment and to familiarize them with the setup and procedure. The training CS was randomly selected from DisL, DroL, SL, and GL and contained 200 examples of CC simulated phosphene maps. 
Optimal display duration was determined with a pretest by using the perceptually more challenging levels of the four parameters (i.e., distortion, k: 0.3, 0.4; pixel dropout percentage: 30%, 40%; pixel size range index: 5, 6; gray level: 6, 8). Ten subjects were randomly selected from the pool of volunteers. Although the duration of the CC presentation was not fixed in the pretest, the subjects were encouraged to identify and read aloud the CC as soon as possible and then press the keyboard for the next CC. In the case of nonrecognition, the subjects responded with “pass” and pressed “page down” for the next CC. Statistical analysis confirmed that 4 seconds was adequate for both the CC presentation and response (standard presentation duration for 100 CCs/trial = 400 seconds). 
In the formal experiments, the four parameters and the combined parameters were presented in five independent experiments. Twenty subjects participated in experiments 1 to 4, and 10 participated in experiment 5. All subjects viewed the same CS for a given parameter value or for combined parameters.
The subjects were encouraged to read aloud a recognized CC as soon as possible and to avoid guessing. Although a fixed presentation of 4 seconds was used in formal experiments, the subjects could move on to the next CC (as in the pretest) if the task was completed in less time, or a “pass” response was indicated. Failure to respond (in 4 seconds) resulted in an automatic advance to the next CC and delayed responses (>4 seconds) were recorded as a failure, irrespective of whether the answer corresponded to the next CC. To reduce any learning effect, each trial began with the more challenging levels. 
The verbal responses were counted and analyzed for recognition accuracy after each experiment. The data points represent the mean number of correct responses (percentage ± SD) for combined data from all subjects. Recognition time in each trial was recorded by a special time-count program and represented the total amount of time used to recognize the 100 CCs in each trial. The data points represent the mean value (seconds ± SD) for combined data from all subjects. Data were analyzed with ANOVAs and t-tests (two-tailed; a Bonferroni correction was applied to multiple comparisons) performed with commercial software (Statistical Product and Service Solutions [SPSS], ver. 16.0 for Windows; SPSS, Inc., Chicago, IL), and P ≤ 0.05 (after correction) was considered significant. 
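The summary statistics and the multiple-comparison correction described above can be illustrated in a few lines. This sketch is not the SPSS analysis: it assumes the sample standard deviation (the text does not state which SD was used), applies the Bonferroni correction by multiplying each P value by the number of comparisons, and uses made-up numbers that are not data from the study. The function names are hypothetical.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample SD) of per-subject scores."""
    return mean(scores), stdev(scores)


def bonferroni(p_values):
    """Bonferroni-adjusted P values: each P times the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Made-up per-subject accuracies (%) for illustration only.
example_scores = [90, 95, 100, 85]
avg, sd = summarize(example_scores)

# Made-up raw P values from three pairwise comparisons.
adjusted = bonferroni([0.01, 0.04, 0.5])
significant = [p <= 0.05 for p in adjusted]
```

With three comparisons, a raw P of 0.04 is no longer significant after correction, which is the purpose of the adjustment.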
Results
Experiment 1: Effect of CC Distortion
Recognition accuracy decreased and recognition time increased with increasing amounts of distortion (Fig. 5A). Recognition accuracy was slightly but significantly affected, even at the k = 0.1 distortion level (98.5% ± 1.8%), when compared with the standard test CC (t = 2.714, P = 0.014; n = 20; distortion k = 0, 0% dropout, spot size 4 mm, gray level 255). Recognition accuracy also decreased significantly between levels (P < 0.001); the decrease appeared to be greatest between k = 0.2 (89.9% ± 6.2%) and k = 0.3 (58.8% ± 9.8%). Although total recognition response time at each level was less than 400 seconds, it increased significantly (P < 0.001) at the higher distortion levels to a maximum of 270 ± 73 seconds at k = 0.4 (level 0.1 vs. level 0.2, NS).
Figure 5.
 
Recognition accuracies and total response times after presentation of simulated phosphene maps with (A) distorted CCs, (B) different pixel dropout percentages, (C) different ranges of pixel sizes, and (D) different gray levels. Note that the task was completed in less than the required 4-second presentation limit and the value for gray level 1 indicated the highest attained training level (i.e., control baseline for all parameters: distortion, k = 0; dropout, 0%; spot size, 4 mm [28 min arc]; gray level, 255; n = 20). ***P < 0.001, compared with the standard test CC.
Experiment 2: Effect of Pixel Dropout
Figure 5B shows that changing the percentage of pixel dropout had an effect on both accuracy and response time similar to that seen with image distortion. Accuracy was significantly reduced compared with the TCL presentation (cf. gray level 1; P < 0.001) and between different dropout levels (P < 0.001). A 40% dropout level reduced accuracy to below 50% (48.7% ± 6.8%). Increasing the number of missing pixels also significantly increased (P < 0.001) the total response time for the higher levels, reaching a maximum of 220 ± 46 seconds at the 40% level. A dropout of 10% to 20% significantly reduced accuracy but not response time, compared with distortion at equivalent levels, but a k = 0.4 distortion level clearly had a greater effect on accuracy and response time compared with a 40% pixel dropout (36.8% ± 7.0% vs. 48.7% ± 6.8% and 270 ± 73 seconds vs. 220 ± 46 seconds, respectively; P < 0.001). 
Experiment 3: Effect of Pixel Size Variation within a CC
Changing the range of pixel sizes reduced accuracy to lower values than those seen after changes to the dropout percentage at all levels, although the reduced accuracy was only significant for size indexes 3 and 6 (Fig. 5C). Size variability at levels 3 and 4 had a greater effect on accuracy than did distortion, but this was not maintained at levels 5 and 6, which had a lesser effect than distortion (P < 0.001). Response time significantly increased as the amount of size variability increased (P < 0.001), but was not significantly different from response times recorded for the dropout trials. Response times were significantly different when compared with TCL CC responses and between different pixel size ranges (P < 0.001). Similar to the comparison between distortion and dropout, a k = 0.4 distortion level resulted in significantly lower accuracy and increased total response times compared with pixel size range index 6 (size range, 0.67–24 mm [4.7–168 min arc]; 36.8% ± 7.0% vs. 42.6% ± 9.1% and 270 ± 73 seconds vs. 217 ± 38 seconds, respectively; P < 0.001).
Experiment 4: Effect of Gray Level
Altering the gray levels had a very small, albeit significant, effect on the accuracy and response times when the CC pixels contained eight gray levels (P < 0.001; Fig. 5D). However, recognition accuracies were all higher than 95%, showing clearly that gray level had less effect than the other parameters. 
Effects of CC Stroke Number
CCs with fewer strokes are more easily recognized; therefore, we binned the CC responses in each trial according to stroke number (1–5, 6–10, and 11–16) and calculated the recognition accuracy for each bin (Fig. 6). Note that “response time” represents the total response time for the 100 CCs; thus, the response time for an individual CC was not analyzed.
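The binning step can be sketched as follows. The bin boundaries are taken from the text; the data layout (a list of stroke-count and correctness pairs) and the function names are hypothetical, and the example responses are made up.

```python
from collections import defaultdict

# Stroke-number bins used in the analysis, from the text.
STROKE_BINS = ((1, 5), (6, 10), (11, 16))


def stroke_bin(strokes):
    """Return the bin label (e.g., '6-10') for a character's stroke count."""
    for low, high in STROKE_BINS:
        if low <= strokes <= high:
            return f"{low}-{high}"
    raise ValueError(f"stroke count {strokes} is outside 1-16")


def accuracy_by_bin(responses):
    """Percentage of correct responses per stroke bin.

    `responses` is an iterable of (stroke_count, is_correct) pairs.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for strokes, is_correct in responses:
        label = stroke_bin(strokes)
        totals[label] += 1
        correct[label] += int(is_correct)
    return {label: 100.0 * correct[label] / totals[label] for label in totals}


# Made-up responses for illustration only.
example = [(3, True), (4, False), (8, True), (12, False)]
per_bin = accuracy_by_bin(example)
```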
Figure 6.
 
The recognition accuracy of CCs with a different number of strokes formed by different simulated phosphene maps. (A) Distortion k, (B) dropout, (C) pixel size range, and (D) gray level (n = 20). Recognition accuracy values obtained for CCs with one to five strokes at the lowest level in each parameter were compared with CCs that had a higher number of strokes (significance shown in parentheses). Other comparisons were between CCs with the same number of strokes at higher levels (NS, nonsignificant; ***P < 0.001). Accuracy was significantly affected for all comparisons with the standard test CC, except for changes in gray level (see also Fig. 5).
At the lowest level of distortion (k = 0.1), the number of strokes did not significantly affect CC recognition, and neither did a k = 0.2 distortion level for CCs with 1 to 5 strokes; however, successively higher levels had a significantly greater impact, which was more pronounced as the number of strokes increased (P < 0.001 for values both within and between levels; Fig. 6A). Note that for a k = 0.4 level of distortion almost all CCs with >10 strokes were unrecognizable (mean, 1.7% ± 5.1%). Distortion levels k = 0.3 and 0.4 reduced recognition accuracy to a greater extent than the other parameters did. Dropout (Fig. 6B) had a similar nonsignificant effect at the lowest levels for all CCs, but recognition of CCs with six strokes or more was significantly reduced at higher levels, in particular at a dropout of 40%. Dropout levels greater than 30% had a smaller effect on recognition than distortion did. A 40% dropout level still attained response levels equivalent to a k = 0.3 distortion level for CCs with >10 strokes. Increasing the size variability of the pixels significantly (P < 0.001) reduced the accuracy between the character groups (at size index level 3, the difference between 1–5 and 6–10 strokes was not significant). In contrast to the other parameters, there was less variability between levels and number of strokes; for example, the accuracy for CCs with a higher stroke number at a lower level was not significantly different from the accuracy for CCs with a lower stroke number at a higher level (i.e., recognition of CCs with 11–16 strokes at size index 4 was not significantly different from that of CCs with 6–10 strokes at index 5). Although largely similar in effect to dropout, alteration of the range of spot sizes (Fig. 6C) had a greater negative effect than dropout on recognizing CCs with >10 strokes, but an index of 5 or 6 did not adversely affect recognition to the same degree as distortion did.
Recognition accuracies for CCs with one to five strokes were generally similar when different parameters were compared at the same level, although a size index level of 6 had a greater effect than the other parameters. Given the results shown in Figure 5D, it is not surprising that changing the gray level did not significantly alter response accuracy, except at gray level 8 and CCs with >10 strokes (P < 0.001; Fig. 6D). 
Experiment 5: Effect of Combined Parameters
If an acceptable prosthetic performance is defined as a CC recognition rate ≥80%, the above experiments suggest the following parameters: an amount of nonretinotopic phosphene incongruity equivalent to a k ≤ 0.2 distortion level, no more than 20% pixel dropout, and pixel sizes within the range of 1.0 to 16 mm (7–112 min arc). Gray-scale levels appear not to be a significant factor in recognition, but may ultimately provide additional clues. Experiment 5 examined the effect of different combinations of these three parameters (Fig. 2). 
Recognition accuracy decreased with all combinations used, compared with the standard test CC (P < 0.001, n = 10) and ranged from a maximum of 90.6% ± 3.1% (distortion, k, 0.1; dropout, 10%; and size range index, 3) to a low of 46.3% ± 6.4% (k, 0.2; dropout, 20%; and size range index, 4). Only combination 1 (k, 0.1; dropout, 10% and size range index 3) achieved ≥80% accuracy and provided significantly higher accuracy rates compared with the other combinations (P < 0.001); however, maintaining a low dropout level (10%) appeared to be somewhat more beneficial in maintaining an accuracy level near 70% (Fig. 7A). 
Figure 7.
 
Recognition accuracy (A) and response time (B) to CCs formed by different combinations of the parameters within the simulated phosphene map (n = 10). All comparisons versus the standard CC test were significant (P < 0.001). Nonsignificant comparisons between parameter conditions are indicated by NS; all other comparisons were significant at P < 0.05.
Consistent with Figure 5, there was an inverse relationship between response accuracy and total response time (Fig. 7B). Total response time increased from 130 ± 28 seconds (distortion, k 0.1; dropout, 10%, size range index, 3) to a maximum of 248 ± 46 seconds (distortion, k 0.2; dropout, 20%, size range index, 4). Similarly, combination 1 had significantly lower response times than did other combinations (P < 0.001). 
Discussion
Changing the amount of character distortion, the percentage of pixel dropout, or the range of pixel sizes within a CC significantly decreased recognition accuracy and increased response time when subjects viewed simulated phosphene maps. Changing the distortion index, k (irregularity), to values greater than 0.2, increasing pixel dropout to more than 20%, or using a pixel size range greater than 1.0 to 16 mm (7–112 min arc) decreased CC recognition rates to less than 80%. When all three parameters were changed at the same time, only the combination with a distortion index of 0.1, dropout of 10%, and pixel size range of 1.33 to 12 mm (9.3–84 min arc) achieved ≥80% CC recognition accuracy. Recognition accuracy was clearly affected as the number of strokes increased. Although recognition of CCs with one to five strokes was generally similar across the different parameters, CC recognition significantly decreased when CCs with a higher number of strokes were combined with higher distortion levels. When small changes are made to the map, the parameters can be loosely ranked in the following order of importance: size variability, dropout, distortion, and gray level. With large deviations, this order changes to distortion, size variability, dropout, and gray level.
CC Distortion
A lack of retinotopic correspondence between the stimulation site and the perceived location in visual space has been reported in several studies. 12,13,38 In our study, low levels of distortion (nonretinotopic correspondence) were less likely to be problematic in CC recognition, but distortion became a more important issue as the number of noncorresponding phosphenes and/or strokes increased. By the same token, these findings indicate that recognition accuracy can be improved for prosthesis wearers by distortion correction. Since the phosphene locations depend purely on the location of the electrodes and the tissues with which they interact, and appear to be relatively stable, 13 it should be possible to adjust the stimulus characteristics of the array either to remove the nonretinotopic phosphene (a dropout of stimulation at the specified electrode) or to reduce its size (reduced stimulus current and/or frequency). Alternatively, adjustments can be made to the stimulus pattern of the sampling array so that the image pattern coincides with the actual phosphene map. 10,42 However, as shown in this study, a dropout strategy should be used with caution, because excessive dropout of phosphenes, including retinotopic phosphenes that may be associated with the same stimulation site, will also reduce the chances of CC recognition. 
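The second correction strategy, adjusting the sampling pattern so that the image coincides with the actual phosphene map, can be illustrated with a toy example. This is a sketch under stated assumptions, not the procedure used in the cited work: each electrode is described by a hypothetical pair of nominal grid position and perceived phosphene position, and the image is a small binary grid.

```python
def pixel_at(image, y, x):
    """Image value at the pixel nearest to (y, x), clamped to the image bounds."""
    r = min(max(round(y), 0), len(image) - 1)
    c = min(max(round(x), 0), len(image[0]) - 1)
    return image[r][c]


def nominal_activation(image, electrodes):
    """Drive each electrode from the pixel at its nominal grid cell."""
    return [image[r][c] for (r, c), _ in electrodes]


def remapped_activation(image, electrodes):
    """Drive each electrode from the pixel where its phosphene is perceived."""
    return [pixel_at(image, y, x) for _, (y, x) in electrodes]


def off_stroke_count(image, electrodes, activation):
    """Number of lit phosphenes perceived on background rather than on a stroke."""
    return sum(
        1
        for (_, (y, x)), on in zip(electrodes, activation)
        if on and pixel_at(image, y, x) == 0
    )


# Toy 4 x 4 image: a single vertical stroke in column 1.
image = [[0, 1, 0, 0] for _ in range(4)]

# Hypothetical map: every phosphene is perceived at its nominal position,
# except two neighbouring electrodes in row 2 whose percepts are swapped.
electrodes = [((r, c), (float(r), float(c))) for r in range(4) for c in range(4)]
electrodes[2 * 4 + 1] = ((2, 1), (2.0, 2.0))
electrodes[2 * 4 + 2] = ((2, 2), (2.0, 1.0))

nominal = nominal_activation(image, electrodes)
remapped = remapped_activation(image, electrodes)
print(off_stroke_count(image, electrodes, nominal))
print(off_stroke_count(image, electrodes, remapped))
```

In this toy case, sampling by perceived location keeps every lit phosphene on the stroke without dropping any electrode, which is the advantage over a pure dropout strategy that the paragraph above points to.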
Pixel Dropout
Dagnelie et al. 29 have shown that English reading accuracy with optimal character size fell from ∼80% to below 60% when the dropout rate went from 30% to 70%. Our results agree with theirs; however, recognition of CCs was more sensitive to pixel dropout in our experiment: accuracy fell below 80% with dropout greater than 20% and below 50% with 40% dropout. Dagnelie et al. used a paragraph reading task and therefore provided word context, whereas the lower recognition accuracy in our study may be partly attributable to a lack of context; many CCs depend on context to determine character meaning. Our ongoing research will examine Chinese paragraph reading by simulated prosthetic vision to investigate the effect of context. 
Pixel Size Variation within a CC
Changing the stimulation current may result in different phosphene sizes 13 because of the differences in the number of cells or fibers stimulated, their thresholds and retinotopic location, and the pathologic changes in the ganglion cells. These factors and individual differences between patients make it difficult to adjust all phosphenes to a homogeneous size when using the prosthetic device. Our findings indicated that variability in spot size was an important and significant factor and that 80% recognition accuracy could be maintained if pixel size variability was kept within the range of 1.0 to 16 mm (7–112 min arc). Changes in the variability of pixel size were only marginally more disruptive to the recognition process than pixel dropout. Small changes in pixel size had a greater effect across CCs of different stroke number (particularly at the lower levels of change), whereas at higher levels, distortion decreased CC recognition to a greater extent, in particular for CCs with six or more strokes. Each CC consisted of a mixture of large and small spots, and so it remains to be determined which size range (large or small) has more impact on CC recognition. It could be argued that very large spots have an effect similar to that of distortion, in that the basic shape of the CC is obscured by spot overlap, whereas very small spots may fall below the perceptual threshold, similar to a degree of pixel dropout. 
Gray Levels
Overall, changes in the brightness or gray levels of the simulated phosphenes did not appear to greatly affect CC recognition, in agreement with previous work. 29 Although maps containing eight gray levels achieved recognition accuracies above 90%, accuracy was significantly decreased for CCs with more than 10 strokes compared with that of the standard test. 
Alteration of Multiple Parameters
Altering the stimulus parameters at specific electrodes and/or the simultaneous stimulation of electrode combinations can result in changes in the perception of individual phosphenes within the map. This change may be due to the effect of stimulus strength, duration, and frequency in different ganglion cell populations and their thresholds, and the recruitment of various numbers of cells or passing fibers (e.g., large fibers from the α ganglion cell population have lower thresholds). The resulting changes to phosphene location, size, and possible dropout may then be critical to CC recognition. Experiment 5 showed that recognition accuracies higher than 65% could be achieved with five of the eight combinations. The results suggest that keeping dropout to a minimum is an important consideration when adjusting a prosthetic device. However, changes in stimulation that result in small changes in phosphene size may also dramatically reduce recognition of CCs with a high stroke number. When combining parameters (distortion, dropout, and size) it is clear that maintaining low levels of all three parameters is the most helpful to the subject in achieving a higher CC recognition rate and faster response times. 
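To make the interaction of the three parameters concrete, the following is a minimal sketch of how a combined map could be generated. It rests on several assumptions that are not taken from this section: distortion is modeled as a uniform random offset of at most k grid spacings, dropout as an independent per-phosphene probability, and size as a uniform draw from the stated range. The function name and the demonstration character are hypothetical.

```python
import random


def simulate_phosphene_map(char_grid, k=0.1, dropout=0.10,
                           size_range=(1.33, 12.0), seed=0):
    """Return a list of (x, y, size_mm) phosphenes for the 'on' cells of char_grid.

    Assumed model (not the authors' definition):
      - distortion: uniform offset in [-k, k] grid spacings on each axis
      - dropout: each phosphene is removed independently with probability `dropout`
      - size: drawn uniformly from `size_range` (mm)
    """
    rng = random.Random(seed)
    phosphenes = []
    for r, row in enumerate(char_grid):
        for c, on in enumerate(row):
            if not on:
                continue
            if rng.random() < dropout:
                continue  # this phosphene is dropped from the map
            x = c + rng.uniform(-k, k)
            y = r + rng.uniform(-k, k)
            size = rng.uniform(size_range[0], size_range[1])
            phosphenes.append((x, y, size))
    return phosphenes


# Hypothetical 12 x 12 test pattern: a cross through row 6 and column 6.
demo_char = [[1 if r == 6 or c == 6 else 0 for c in range(12)] for r in range(12)]

# Parameter values of the one combination reported to reach >= 80% accuracy.
altered = simulate_phosphene_map(demo_char, k=0.1, dropout=0.10,
                                 size_range=(1.33, 12.0))
print(len(altered), "of", sum(map(sum, demo_char)), "phosphenes retained")
```

Setting k and dropout to zero and collapsing the size range to a single value reproduces a regular map, so the same function can serve as the baseline condition in such a sketch.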
Possible Sources of Reduced CC Recognition
Different CCs are formed by combining different strokes and stroke directions. A one-stroke difference between CCs may cause recognition inaccuracies. For example, [image not available] (pronounced: gan) and [image not available] (qian) are different CCs with an equal number of strokes, but in our distortion test, the upper horizontal stroke [image not available] in [image not available] was interpreted as a right-to-left falling stroke, resulting in the subject's responding with [image not available] (qian), an incorrect response. Similarly, missing or additional key strokes could cause errors. [Image not available] (shu) versus [image not available] (mu) and [image not available] (si) versus [image not available] (pi) are examples of CCs missing key strokes (i.e., the point [image not available] and vertical [image not available] stroke, respectively), whereas in [image not available] (shua) versus [image not available] (yao), the addition of a key horizontal stroke [image not available] changes the character's meaning, leading to an error. 
On the other hand, the special structure of CCs helped some subjects recognize the character by identifying its partial components. Most CCs are formed by left–right ([image not available], ming), upper–lower ([image not available], xiang), outer–inner ([image not available], yin), and half-surrounding ([image not available], jin) structures. Once subjects identified one component of the CC, they recognized the character by making assumptions about the other components. 
Simulated versus Stimulated Phosphene Maps
Our experiments present simulation data based on a highly idealized map in which all the points in the array have a possibility of representing a stimulated phosphene resulting from a 12 × 12 stimulus probe. Previous work using direct retinal or optic nerve stimulation has shown that prosthetic devices are unlikely to produce such well-organized and complete phosphene maps. Indeed, the perceived phosphene maps are more likely to appear like the simulations shown in Figure 4 using mixed parameters. In addition, subjects may have relied on a small amount of scanning when viewing the pixelated CC, even though they were asked to fixate throughout the experiment. Although patients using prosthetic stimulation also use a scanning strategy to determine basic shapes, 42 the ability to scan the CCs by our subjects differs in principle from prosthetic stimulation in which the pixelated image is stabilized on the retina after capture from an external source such as a head-mounted camera. 
The phosphene maps and phosphene characteristics created by array stimulators, while apparently stable over time, can be altered by changing the stimulus parameters in conjunction with verbal responses by the subject. 42 These changes can then be used with reference to our simulation data. For example, the simulated data indicate that increasing phosphene size in an established map through higher stimulation currents, which may also increase the number of nonretinotopic phosphenes, will reduce the subject's ability in tasks such as recognizing simple CCs or shapes. This finding establishes some basic expectations as to possible outcomes when manipulating the characteristics of the phosphenes, and it is likely to become increasingly important as new arrays are developed and tested. A device with 60 electrodes is currently in clinical trials, and arrays with 250 to 1000 electrodes are being planned. 43 
In conclusion, the results indicate that changes in the level of distortion, dropout percentage, and the amount of pixel size variability have significant impact on the recognition of CCs. At present, although reading ability is limited to recognizing capital letters in simple fonts, shapes, or three-letter words (Stanga PE, et al. IOVS 2010;51:ARVO E-Abstract 426; Humayun MS, et al. IOVS 2010;51:ARVO E-Abstract 2022; daCruz L, et al. IOVS 2010;51:ARVO E-Abstract 2023), it should be possible to extend this to CC recognition and even text reading in the near future. Our results will help visual prosthesis researchers to determine the effects of altering phosphene maps with respect to CC recognition and to improve the end result for visual prosthesis users. 
Footnotes
 Supported by National Basic Research Program of China, Program 973, 2011CB7075002/3; National Natural Science Foundation of China Grants 60871091, 31070895, and 31070981; National High Technology Research and Development Program of China, Program 863, 2009AA04Z326; National Key Technology R&D Program Grants 2007BAK27B04 and 2008BAI65B03; and the 111 Project Grant B08020 from the Ministry of Education of China.
 Disclosure: Y. Zhao, None; Y. Lu, None; C. Zhou, None; Y. Chen, None; Q. Ren, None; X. Chai, None
The authors thank Thomas FitzGibbon for comments on earlier drafts of the manuscript. 
References
1. Liu W, Sivaprakasam M, Singh PR, Bashirullah R, Wang G. Electronic visual prosthesis. Artif Organs. 2003;27:986–995.
2. Zrenner E. Will retinal implants restore vision? Science. 2002;295:1022–1025.
3. Weiland JD, Liu W, Humayun MS. Retinal prosthesis. Annu Rev Biomed Eng. 2005;7:361–401.
4. Javaheri M, Hahn DS, Lakhanpal RR, Weiland JD, Humayun MS. Retinal prostheses for the blind. Ann Acad Med Singapore. 2006;35:137–144.
5. Winter JO, Cogan SF, Rizzo JF 3rd. Retinal prostheses: current challenges and future outlook. J Biomater Sci Polym Ed. 2007;18:1031–1055.
6. Maynard EM. Visual prostheses. Annu Rev Biomed Eng. 2001;3:145–168.
7. Chai X, Li L, Wu K, Zhou C, Cao P, Ren Q. C-sight visual prostheses for the blind. IEEE Eng Med Biol Mag. 2008;27:20–28.
8. Veraart C, Duret F, Brelen M, Oozeer M, Delbeke J. Vision rehabilitation in the case of blindness. Expert Rev Med Devices. 2004;1:139–153.
9. Dagnelie G. Psychophysical evaluation for visual prosthesis. Annu Rev Biomed Eng. 2008;10:339–368.
10. Chen SC, Suaning GJ, Morley JW, Lovell NH. Simulating prosthetic vision, I: visual models of phosphenes. Vision Res. 2009;49:1493–1506.
11. Brindley GS, Lewin WS. The sensations produced by electrical stimulation of the visual cortex. J Physiol. 1968;196:479–493.
12. Veraart C, Raftopoulos C, Mortimer JT. Visual sensations produced by optic nerve stimulation using an implanted self-sizing spiral cuff electrode. Brain Res. 1998;813:181–186.
13. Humayun MS, Weiland JD, Fujii GY. Visual perception in a blind subject with a chronic microelectronic retinal prosthesis. Vision Res. 2003;43:2573–2581.
14. Schmidt EM, Bak MJ, Hambrecht FT, Kufta CV, O'Rourke DK, Vallabhanath P. Feasibility of a visual prosthesis for the blind based on intracortical microstimulation of the visual cortex. Brain. 1996;119:507–522.
15. Dobelle WH. Artificial vision for the blind by connecting a television camera to the visual cortex. ASAIO J. 2000;46:3–9.
16. Legge GE, Pelli DG, Rubin GS, Schleske MM. Psychophysics of reading, I: normal vision. Vision Res. 1985;25:239–252.
17. Legge GE, Ross JA, Luebker A, LaMay JM. Psychophysics of reading, VIII: The Minnesota Low-Vision Reading Test. Optom Vis Sci. 1989;66:843–853.
18. Chai X, Yu W, Wang J, Zhao Y, Cai C, Ren Q. Recognition of pixelized Chinese characters using simulated prosthetic vision. Artif Organs. 2007;31:175–182.
19. Cha K, Horch K, Normann RA. Simulation of a phosphene-based visual field: visual acuity in a pixelized vision system. Ann Biomed Eng. 1992;20:439–449.
20. Cha K, Horch KW, Normann RA. Mobility performance with a pixelized vision system. Vision Res. 1992;32:1367–1372.
21. Cha K, Horch KW, Normann RA, Boman DK. Reading speed with a pixelized vision system. J Opt Soc Am A. 1992;9:673–677.
22. Hayes JS, Yin VT, Piyathaisere D, Weiland JD, Humayun MS, Dagnelie G. Visually guided performance of simple tasks using simulated prosthetic vision. Artif Organs. 2003;27:1016–1028.
23. Thompson RW Jr, Barnett GD, Humayun MS, Dagnelie G. Facial recognition using simulated prosthetic pixelized vision. Invest Ophthalmol Vis Sci. 2003;44:5035–5042.
24. Perez Fornos A, Sommerhalder J, Pittard A, Safran AB, Pelizzone M. Simulation of artificial vision, IV: Visual information required to achieve simple pointing and manipulation tasks. Vision Res. 2008;48:1705–1718.
25. Fornos AP, Sommerhalder J, Rappaz B, Safran AB, Pelizzone M. Simulation of artificial vision, III: do the spatial or temporal characteristics of stimulus pixelization really matter? Invest Ophthalmol Vis Sci. 2005;46:3906–3912.
26. Sommerhalder J, Rappaz B, de Haller R, Fornos AP, Safran AB, Pelizzone M. Simulation of artificial vision, II: eccentric reading of full-page text and the learning of this task. Vision Res. 2004;44:1693–1706.
27. Sommerhalder J, Oueghlani E, Bagnoud M, Leonards U, Safran AB, Pelizzone M. Simulation of artificial vision, I: eccentric reading of isolated words, and perceptual learning. Vision Res. 2003;43:269–283.
28. Dagnelie G. Visual prosthetics 2006: assessment and expectations. Expert Rev Med Devices. 2006;3:315–325.
29. Dagnelie G, Barnett D, Humayun MS, Thompson RW Jr. Paragraph text reading using a pixelized prosthetic vision simulator: parameter dependence and task learning in free-viewing conditions. Invest Ophthalmol Vis Sci. 2006;47:1241–1250.
30. Ai W. Problems about Chinese Characters and Words (in Chinese). Beijing: Chinese Press; 1949.
31. Cai D, Chi C-F, You M. The legibility threshold of Chinese characters in three-type styles. Int J Ind Ergonom. 2001;27:9–17.
32. Chuen T, Shen Y. The recognition of Chinese characters under tachistoscopic condition in primary school children. Acta Psychol Sinica. 1963;203–213.
33. Zheng Z. The Process of Recognition of Chinese Characters and Words. Hong Kong: Wenhe Publishing Company; 1982.
34. Zhang W, Feng L. A study on the unit of processing in recognition of Chinese characters. Acta Psychol Sinica. 1992;24:379–385.
35. Cai S, Fu L, Zhang H, Hu G, Liang Z. Prosthetic visual acuity in irregular phosphene arrays under two down-sampling schemes: a simulation study. Conf Proc IEEE Eng Med Biol Soc. 2005;5:5223–5226.
36. Modern Chinese Characters Commonly Used in the Frequency Statistics. Beijing: The State Language Work Committee; 1989.
37. Huang X, Cai Z, Chen L. The effect of visual angle on the recognition of Chinese characters. Psychol Sci. 2004;27:770–773.
38. Delbeke J, Oozeer M, Veraart C. Position, size and luminosity of phosphenes generated by direct optic nerve stimulation. Vision Res. 2003;43:1091–1102.
39. Hogg RV, Craig A. Introduction to Mathematical Statistics. 5th ed. Englewood Cliffs, NJ: Prentice Hall; 1994.
40. Shou T. Brain Mechanisms of Visual Information Processing. Shanghai, China: Shanghai Scientific and Technological Education Publishing House; 1997.
41. Dagnelie G, Keane P, Narla V, Yang L, Weiland J, Humayun M. Real and virtual mobility performance in simulated prosthetic vision. J Neural Eng. 2007;4:S92–S101.
42. Brelen ME, Duret F, Gerard B, Delbeke J, Veraart C. Creating a meaningful visual perception in blind volunteers by optic nerve stimulation. J Neural Eng. 2005;2:S22–S28.
43. Chader GJ, Weiland J, Humayun MS. Artificial vision: needs, functioning, and testing of a retinal electronic prosthesis. Prog Brain Res. 2009;175:317–332.
Figure 1.
 
Example of a basic pixelated Chinese character. Left: the original CC (stroke number, 8; pronunciation, de); middle: the full regular 12 × 12 array; and right: the rendered CC constructed from the 12 × 12 simulated gray level 1 phosphene map.
Figure 2.
 
Flow chart showing the formation of the altered characters.
Figure 3.
 
Examples of CCs formed by different simulated phosphene maps. The gray level 1 sample shows the regular and standard CC (pronunciation, de).
Figure 4.
 
Examples of simulated phosphene maps of CCs formed by different combinations of the parameters (pronunciation, de).
Figure 5.
 
Recognition accuracies and total response times after presentation of simulated phosphene maps with (A) distorted CCs, (B) different pixel dropout percentages, (C) different ranges of pixel sizes, and (D) different gray levels. Note that the task was completed in less than the required 4-second presentation limit and the value for gray level 1 indicated the highest attained training level (i.e., control baseline for all parameters, distortion, k = 0, dropout 0%, spot size = 4 mm [28 min arc], gray level 255; n = 20). ***P < 0.001, compared with the standard test CC.
Figure 6.
 
The recognition accuracy of CCs with a different number of strokes formed by different simulated phosphene maps. (A) Distortion k, (B) dropout, (C) pixel size range, and (D) gray level (n = 20). Recognition accuracy values obtained for CCs with one to five strokes at the lowest level in each parameter were compared with CCs that had a higher number of strokes (significance shown in parentheses). Other comparisons were between CCs with the same number of strokes at higher levels (NS, nonsignificant; ***P < 0.001). Accuracy was significantly affected for all comparisons with the standard test CC, except for changes in gray level (see also Fig. 5).
Figure 7.
 
Recognition accuracy (A) and response time (B) to CCs formed by different combinations of the parameters within the simulated phosphene map (n = 10). All comparisons versus the standard CC test were significant (P < 0.001). Nonsignificant comparisons between parameter conditions are indicated NS; all other comparisons were significant at P < 0.05.