Abstract
Purpose.
A variety of approaches to developing visual prostheses are being pursued: subretinal, epiretinal, via the optic nerve, or via the visual cortex. This report presents a method of comparing their efficacy at genuinely improving visual function, starting at no light perception (NLP).
Methods.
A test battery (a computer program, Basic Assessment of Light and Motion [BaLM]) was developed in four basic visual dimensions: (1) light perception (light/no light), with an unstructured large-field stimulus; (2) temporal resolution, with single versus double flash discrimination; (3) localization of light, where a wedge extends from the center into four possible directions; and (4) motion, with a coarse pattern moving in one of four directions. Two- or four-alternative, forced-choice paradigms were used. The participants' responses were self-paced and delivered with a keypad.
Results.
The feasibility of the BaLM was tested in 73 eyes of 51 patients with low vision. The light and time test modules discriminated between NLP and light perception (LP). The localization and motion modules showed no significant response for NLP but discriminated between LP and hand movement (HM). All four modules reached their ceilings in the acuity categories higher than HM.
Conclusions.
BaLM results systematically differed between the very-low-acuity categories NLP, LP, and HM. Light and time yielded similar results, as did localization and motion; still, for assessing the visual prostheses with differing temporal characteristics, they are not redundant. The results suggest that this simple test battery provides a quantitative assessment of visual function in the very-low-vision range from NLP to HM.
Several groups worldwide are currently using various approaches in the development of visual prostheses. Retinal prostheses have been designed to restore lost visual function due to degenerative retinal diseases, such as retinitis pigmentosa (RP) and age-related macular degeneration (AMD).1–13 These conditions cause a gradual loss of photoreceptor cells, yet a substantial fraction of the neural pathways from the retina to the visual cortex remain functional. Approaches involving optic nerve electrode cuffs or cortical electrode arrays are aimed at restoring visual function along the retinal processing stream, which can be lost due to many different blinding conditions.14–16
The current arsenal of visual assessment protocols is not effective for quantitatively evaluating the efficacy of visual prostheses for two main reasons: (1) first-phase human trials can include only patients for whom no other means of prevention or therapy is available and whose visual function is below counting fingers (CF) and/or hand movement (HM) detection; and (2) even after successful interventions, we cannot foresee to what extent and how well these devices will restore vision in an individual patient. There is therefore a need to test various visual functions, from mere light perception (LP) through crude temporal and spatial resolution to combined spatiotemporal functions such as motion recognition. The test battery Basic Assessment of Light, Location, Time, and Motion (BaLM) was designed to meet these requirements. This report presents the implementation details and results in patients in the very-low-vision range.
Since blind subjects are usually good at hearing and somatosensing, BaLM communicates with sounds, and the patients use a standard numerical keypad to input responses themselves. Feedback tones can also relay information about response accuracy. A test sequence, once started, runs automatically (self-paced) to reduce operator influence. At the end of each run, results are compiled and documented by direct printout or PDF filing.
BaLM was implemented in a commercial programming environment (Flash with the ActionScript 2 language; Adobe Systems, Inc., San Jose, CA).
Common Aspects of All Test Modules.
The basic procedure after subject training is as follows: One of the four test modules is selected. The stimulus is presented, accompanied by an auditory cue. The participant responds by pressing a key, and an auditory signal provides feedback on the correctness of the response. After an intertrial interval of 1.5 seconds, the next trial follows. When the preset number of trials (typically 24) is reached, a gong sound announces the end of that run. Then, the next BaLM module is started.
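The run structure described above can be sketched as a small simulation. This is an illustrative sketch only, not the BaLM implementation (which was written in ActionScript): the function name `simulate_run` and the parameter `p_see` (a hypothetical probability that the observer truly perceives the stimulus) are assumptions introduced here, and auditory cues, feedback tones, and the intertrial interval are omitted.

```python
import random


def simulate_run(n_alternatives, n_trials=24, p_see=0.0, seed=0):
    """Simulate one run of forced-choice trials; return the number of hits.

    p_see is a hypothetical per-trial probability that the observer
    perceives the stimulus. On all other trials the observer must guess,
    as required by the forced-choice paradigm.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        target = rng.randrange(n_alternatives)        # stimulus chosen at random
        if rng.random() < p_see:
            response = target                          # stimulus perceived
        else:
            response = rng.randrange(n_alternatives)   # forced guess
        hits += response == target
    return hits


# A simulated observer who always perceives the stimulus scores 24 of 24.
print(simulate_run(n_alternatives=4, p_see=1.0))
```

With `p_see=0.0` the simulated observer guesses on every trial, so the expected hit rate equals the chance rate of the module (1/2 for 2AFC, 1/4 for 4AFC).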
All test results (hit rates, reaction times, test parameters, subject and examiner identification, and time and date) are accumulated across all tests and can be printed out on paper or saved to PDF for documentation. A “clear results” button deletes this history, so as to prepare for a new session.
The modules are arranged roughly in a hierarchical order of visual difficulty. As an obvious example, when a patient fails the light module, it is unlikely that the time module will be passed, because in the time module, the patient must discriminate between one or two of the flashes involved in the light module. The next step, the module for localization of light, is also regarded as more difficult in the clinical examination of low vision.
Light Module.
This test module, the simplest module of all four, tests basic light perception. The subjects' task is to decide whether they see light appear after the warning tone or not. The main parameters that can be manipulated are luminance (via neutral density filters) and flash duration (default, 200 ms). When a large range (more than one order of magnitude) is targeted, it is best achieved with neutral-density filters in a trial frame (or in front of the computer projector, which we used in place of a visual display unit). This luminance parameter applies to all four test modules. A two-alternative, forced-choice (2AFC) scheme is used. As some visual prosthetic devices may perform better with patterned stimulation, the flashes can be set to a pattern rather than being spatially homogeneous.
Time Module.
Location Module.
Motion Module.
Calibration.
Subjects.
Response Box.
To prevent errors in communication, we trained the subjects to enter their responses via a keypad, which could be operated by blind participants or in darkness. The keypad was a commercial USB-connected 3 × 3 numerical entry pad. The central key (5) initiated a test run, and four keys at left, right, top, and bottom coded the test response. In the time module, the left key coded one flash and the right key coded two flashes. For location and motion, the keys corresponded spatially to the direction observed. If the subject was uncomfortable with the keypad, he or she could respond verbally, and the investigator then entered the responses.
Procedure.
BaLM can be performed by stimulus presentation (1) on a computer screen or (2) via computer projector, which provides a larger stimulus-intensity range. The latter setup was used in this first clinical application. The subjects sat in a chair that was adjustable in height to allow for a comfortable fit in a chin- and headrest. With appropriate correction, they viewed a projection screen at 57 cm distance (at this distance, 1° = 1 cm), on which a computer projector, mounted above the subjects' head, projected the stimuli. One hand rested on the response keypad. A computer with a double video output was used. Most current laptops can mirror their built-in display on a video-out socket, which was connected to the projector. Since the main BaLM screen is laid out in dark gray colors, there was no need to switch off the projector for operator entry.
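The rule of thumb that 1° corresponds to 1 cm at a viewing distance of 57 cm follows from basic trigonometry. The short sketch below checks it; the function name `size_on_screen_cm` is an illustrative assumption, not part of BaLM.

```python
import math


def size_on_screen_cm(angle_deg, distance_cm=57.0):
    """Extent on a flat screen that subtends angle_deg at the given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# At 57 cm, 1 degree of visual angle spans very nearly 1 cm.
print(round(size_on_screen_cm(1.0), 3))
```

The approximation holds well for small angles; for large stimuli the tangent grows faster than the angle, so the on-screen size per degree increases slightly toward the periphery.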
The number of trials was set to 24 for all the tests. The light and time test modules had 2AFC. The location and motion test modules had four response alternatives (4AFC).
Table 2 describes the psychometric characteristics depending on the number of alternatives. As usual, the positive test outcome criterion was set at the steepest point of the psychometric function, which is the center between chance rate and 100% correct. As can be seen, the probability of reaching or exceeding the criterion by chance in 24 trials is ∼1% for 2AFC and minuscule for 4AFC and 8AFC.
Table 2. Psychometric Characteristics of Two-, Four-, and Eight-Alternative, Forced-Choice Tasks
Alternatives (n) | Chance Rate (1/n × 100%) | Criterion [(100 − Chance Rate)/2 + Chance Rate] | Probability of Reaching or Exceeding Criterion by Chance, 24 Trials | Probability of Reaching or Exceeding Criterion by Chance, 30 Trials |
2 | 50% | 75% | 1.1% | 0.26% |
4 | 25% | 62.5% | 0.011% | 0.001% |
8 | 12.5% | 56.25% | 0.000013% | 0.000001% |
Probabilities are cumulative binomial, depending on n, chance rate, and criterion.
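The chance probabilities in Table 2 can be recomputed from the cumulative binomial distribution. The following is a minimal sketch under one stated assumption: "reaching or exceeding the criterion" is read as scoring at least the smallest whole number of hits whose rate is at or above the criterion. The function names are illustrative and do not come from the BaLM software.

```python
from fractions import Fraction
from math import ceil, comb


def criterion(n_alternatives):
    """Criterion halfway between the chance rate and 100% correct."""
    chance = Fraction(1, n_alternatives)
    return chance + (1 - chance) / 2


def p_reach_criterion_by_chance(n_alternatives, n_trials):
    """Probability that pure guessing reaches or exceeds the criterion."""
    chance = Fraction(1, n_alternatives)
    # Assumption: smallest hit count whose rate is at or above the criterion.
    min_hits = ceil(criterion(n_alternatives) * n_trials)
    tail = sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(min_hits, n_trials + 1)
    )
    return float(tail)


for n_alt in (2, 4, 8):
    for trials in (24, 30):
        p = p_reach_criterion_by_chance(n_alt, trials)
        print(f"{n_alt}AFC, {trials} trials: {100 * p:.6f}%")
```

Exact rational arithmetic (`Fraction`) is used so that the minimum hit count is not affected by floating-point rounding when the criterion falls exactly on a whole number of hits, as it does for 2AFC and 4AFC with 24 trials.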
The subjects were instructed to look straight at the middle of the screen. At the beginning of each session, they were guided to touch the middle of the screen to obtain a tactile impression of the screen's position. Before the start of each module, the subjects were informed about that module's function and the valid choices. They were encouraged to respond within the time limit, even when uncertain. No head or eye tracking was used during the testing to account for fixation.
Each session always began with the light test module. This module started with the densest filter (d = 4.6) in front of the projector, resulting in a luminance of 0.13 cd/m². When the criterion was not reached, the filter density was reduced in roughly half-log units until the criterion was reached or no filter remained. The last filter level was recorded as the threshold in the light module. For the three subsequent test modules (time, location, and motion), we reduced the threshold filter by 0.5 log units so that the tests operated in the suprathreshold range.
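The descending filter procedure can be sketched as follows. This is a simplified illustration, not the study protocol itself: it assumes exact 0.5 log unit steps (the study used roughly half-log steps with physical filters), and the names `find_light_threshold`, `suprathreshold_density`, and the `passes_criterion` callback are hypothetical.

```python
def find_light_threshold(passes_criterion, start_density=4.6, step=0.5):
    """Lower the filter density until the criterion is reached.

    passes_criterion is a hypothetical callback that runs the light module
    at a given filter density and reports whether the criterion was met.
    Returns (last density tested, whether the criterion was reached).
    """
    density = start_density
    while True:
        if passes_criterion(density):
            return density, True
        if density == 0.0:              # no-filter condition already tested
            return density, False
        density = max(0.0, round(density - step, 1))


def suprathreshold_density(threshold_density, offset=0.5):
    """Filter density for the time, location, and motion modules."""
    return max(0.0, round(threshold_density - offset, 1))


# Simulated eye that reaches criterion only at densities of 3.0 or less.
threshold, reached = find_light_threshold(lambda d: d <= 3.0)
print(threshold, reached, suprathreshold_density(threshold))
```

Because a neutral density filter of density d attenuates light by a factor of 10^d, lowering the density by 0.5 raises the stimulus luminance by about a factor of 3.2.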
All four test modules reached their ceiling at HM. Thus, none of the modules discriminated between HM, CF, and CF+. The light and time modules discriminated between NLP and LP. For the location and motion test modules, no NLP eye exceeded the 75% criterion, but both modules discriminated between LP and HM (and better). There was a marked rise in the median from chance response at LP to nearly 100% performance at HM and better, whereas the quartile boxes indicate that there was some overlap in performance between some LP and HM eyes.
The software implementation environment, Flash, was chosen in 2003 and proved adequate, one major advantage being its platform independence. One disadvantage is that there is no means of synchronizing graphic changes to the cyclic screen update (be it 60 Hz for LCDs or higher rates with CRTs); thus, the temporal granularity is limited to approximately 20 ms. Given the current state of the art in visual prostheses, this limitation did not prove to be a problem.
To ensure adaptability for unexpected findings with patients wearing prostheses, we included many free parameters in the test battery for each of the four test modules. The Appendix lists starting values, which we arrived at after several pilot trials.
BaLM was designed to quantitatively assess visual function in patients with very low vision (below CF). We validated BaLM by testing 73 eyes in 51 patients in the following visual categories, arranged in order of increasing visual function: NLP, LP, HM, CF, and CF+.
Of the four test modules in BaLM, the light and time test pair behaved similarly and showed the steepest rise in detection between categories NLP and LP. The test pair location and motion also behaved similarly and showed the steepest rise in detection between the categories LP and HM. In terms of a sensitivity/specificity analysis (Table 3), both location and motion showed 100% specificity to detect vision above NLP. In other words, there are no false positives in detecting vision above NLP. The observation that the time test's specificity was only 88% and not 100% suggests that the patients with NLP had some residual vision, which is also evident from the light test results. A completely blind subject (e.g., simulated by switching off the display) would perform at chance level, which has a mere 0.8% chance of exceeding the criterion in 24 trials (see Table 1). By increasing the number of trials to, for instance, 40, this chance can be reduced to below 0.1%. In fact, the 4AFC modules location and motion possess such a chance rate, below 0.01% with 24 trials.
Table 3. Sensitivity/Specificity Analysis of the Three Test Modules with a Fixed Criterion
Test Module/Detecting Vision | Sensitivity (%) | Specificity (%) |
Time | | |
>NLP | 77.0 (67.2–85.7) | 88.2 (73.7–100) |
>LP | 97.0 (90.3–100) | 62.8 (50.1–74.5) |
>HM | 100 (100–100) | 45.2 (36.5–57.6) |
Location | | |
>NLP | 45.0 (33.3–55.2) | 100 (100–100) |
>LP | 73.0 (59.1–86.1) | 93.0 (86.0–98.0) |
>HM | 85.0 (66.7–100) | 76.7 (68.3–86.0) |
Motion | | |
>NLP | 48.0 (36.8–58.8) | 100 (100–100) |
>LP | 87.0 (75.0–96.4) | 97.7 (93.2–100) |
>HM | 92.0 (77.8–100) | 75.0 (66.1–84.5) |
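For readers unfamiliar with the quantities in Table 3, the sketch below shows how sensitivity and specificity are obtained for one module and one category boundary at a fixed criterion. The eye counts in the example are made up purely for illustration and are not study data; the function name is an assumption, and the confidence intervals reported in Table 3 are not reproduced here.

```python
def sensitivity_specificity(eyes):
    """Compute (sensitivity, specificity) for one module and one boundary.

    eyes is a list of (above_category, passed_criterion) boolean pairs:
    above_category is the clinical classification (e.g., vision above NLP),
    passed_criterion is whether the module's hit rate met the criterion.
    Assumes at least one eye in each clinical group.
    """
    true_pos = sum(1 for above, passed in eyes if above and passed)
    false_neg = sum(1 for above, passed in eyes if above and not passed)
    true_neg = sum(1 for above, passed in eyes if not above and not passed)
    false_pos = sum(1 for above, passed in eyes if not above and passed)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts, NOT study data: 4 eyes above the boundary
# (3 passed the criterion) and 4 eyes below it (none passed).
example = [(True, True)] * 3 + [(True, False)] + [(False, False)] * 4
print(sensitivity_specificity(example))
```

A specificity of 100%, as reported for location and motion at the NLP boundary, corresponds to no eye below the boundary passing the criterion.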
The light and time modules, and likewise the location and motion modules, yielded very similar results in our patients. We do not view the time and motion modules as superfluous, however; they are more challenging to the visual system because they additionally assess a different dimension of vision, namely the temporal domain. Time discrimination and temporal resolution are important and, in principle, are possible without pattern vision. The motion module requires both spatial and temporal resolution. These aspects will be useful to assess actual visual prostheses, as these often invoke time multiplexing, thus limiting temporal resolution and possibly resulting in temporal aliasing. In line with these considerations, preliminary results from patients wearing subretinal visual prostheses indicate that in these patients there may be larger discrepancies between the results of the light and time modules and between those of the location and motion modules.
In addition, if BaLM is to mimic clinical examination with the advantage of providing quantitative data while avoiding examiner bias, the motion module must be regarded as the equivalent of HM. We noted a ceiling effect in all four test modules for acuity categories above HM. In other words, the tests become too easy for subjects with vision approaching normal. We do not view this ceiling effect as a shortcoming, because starting with HM and above, other acuity tests can take over. For instance, the FrACT25,27 is also a computerized automated test that can reproducibly estimate visual acuity starting with HM (2.3 logMAR),19,26 and ETDRS (Early Treatment of Diabetic Retinopathy Study) charts can be applied starting with CF.26 Spatial resolution of gratings without requiring fixation ability can be assessed by using BaGA,22 for example.
All subjects in this feasibility study performed all tests. For applications assessing implant efficacy, failure to reach criterion on the light module renders the rest of the test battery unnecessary. This recommendation is based on our observation in this study that the patients with NLP, almost none of whom passed the light criterion, performed at chance levels in the remaining test modules.
Our results indicate that the test battery allows quantitative assessment of visual function in the targeted very-low-vision range from NLP over LP to HM. We are not aware of any other test that systematically and quantitatively discriminates among NLP, LP, and HM, and we have already found it useful in patients wearing visual prostheses.
Supported by German Federal Ministry of Education and Research (BMBF) Grants 01IN502A-D and 01KP0008-12, the Kerstan-Foundation, and Retina Implant AG, Reutlingen, Germany (MB).
Disclosure: M. Bach, Retina Implant AG (C); M. Wilke, None; B. Wilhelm, Retina Implant AG (I); E. Zrenner, Retina Implant AG (I); R. Wilke, None
The authors thank Walter-Gerhard Wrobel (CEO, Retina Implant AG), for financing the development of the test battery BaLM, which Retina Implant AG now offers as a commercial product; Wilhelm Durst for technical support; and Carolin Kuttenkeuler, Thomas Zabel, Anna Bruckmann, Johannes Koch, and Katarina Porubska for contributing to patient examinations in the first pilot trial of a subretinal implant.