The Brain-Image Database Project:
Lesion-Deficit Analysis Research



Determining associations between the structure and the function of the human brain has been one of the main goals of the Human Brain Project. Lesion-deficit analysis has been greatly facilitated by the advent of new methods for obtaining data from large cohorts; examples of such studies, which we designate as image-based clinical trials (IBCTs), include the Cardiovascular Health Study and the Baltimore Longitudinal Study of Aging. Despite these and other advances in data acquisition, as well as impressive progress in 3D image registration, IBCTs have been hindered to varying degrees by a broad range of data-processing problems, particularly those related to segmentation and statistical analysis of 3D MR data sets for large cohorts. Solving these problems will enable researchers to integrate, manipulate, and analyze large volumes of clinical and image data, and will thus enable them to conduct large-scale epidemiological trials of stroke, trauma, multiple sclerosis, psychiatric disorders such as schizophrenia, and other afflictions. Although we focus herein on the development of techniques that facilitate lesion-deficit analysis, the methods we discuss are applicable to other neuroinformatics projects.

Registered binary images, in which each voxel is either normal or abnormal (i.e., part of a lesion), combine with functional variables to form the data for each subject. For the purposes of discussion, assume that there are on the order of f = 10^2 functional variables per encounter, and that a typical brain image contains on the order of i = 10^7 voxels. Analytic methods span a range of resolutions, from examination of image data at the voxel level to examination of a set of possibly overlapping structures, such as the caudate nucleus and basal ganglia, that are assumed to correlate with functional variables. Assuming that there are at least 10^2 structures of interest, the number of variables ranges from 10^2 + 10^2 (structure level) to 10^7 + 10^2 (voxel level). To the degree that these structural variables reflect functional units, statistical power will increase; this is the concept of functional segregation [9]. If, for example, there is an association between certain functional variables and lesions in the left internal capsule, model-based analysis of the left internal capsule would be more sensitive to this association than voxel-based analysis. However, absent such knowledge, application of the wrong atlas could group the internal capsule with other, unrelated structures and thereby produce misleading results.

Voxel-Based Analysis

We implemented voxel-based analysis by defining a sphere, choosing logit(functional variable) as the dependent variable, and selecting logit(fraction of the sphere that contains lesions) as the independent variable. We then optimized the least-squares fit by varying the center and radius of the sphere, using simulated annealing. This method yields results that correspond to those found with atlas-based analysis (ref, ref). Because a single sphere cannot represent non-contiguous (e.g., bilateral) lesions, the method could be made more general by increasing the number of spheres being optimized, at the expense of greater computational requirements.
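The single-sphere search described above can be illustrated with a minimal sketch. It is not the BRAID implementation: it assumes that each subject's lesion is stored as a set of integer voxel coordinates, that the functional variable has been scaled into (0, 1) so that its logit is defined, and that fractions are clipped away from 0 and 1. The function names, the Gaussian proposal steps, and the geometric cooling schedule are all hypothetical choices made for illustration.

```python
import math
import random


def logit(p, eps=1e-3):
    """Logit with clipping (an assumption) so that 0 and 1 stay finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sphere_voxels(center, radius, shape):
    """Set of integer grid coordinates lying inside the sphere."""
    cx, cy, cz = center
    r2 = radius * radius
    return {
        (x, y, z)
        for x in range(shape[0])
        for y in range(shape[1])
        for z in range(shape[2])
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2
    }


def sse_of_fit(center, radius, lesions, scores, shape):
    """Residual sum of squares of logit(score) ~ a + b * logit(lesioned fraction)."""
    sphere = sphere_voxels(center, radius, shape)
    if not sphere:
        return float("inf")
    xs = [logit(len(sphere & lesion) / len(sphere)) for lesion in lesions]
    ys = [logit(s) for s in scores]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def anneal(lesions, scores, shape, steps=300, t0=1.0, cooling=0.98, seed=0):
    """Simulated annealing over sphere center and radius (hypothetical schedule)."""
    rng = random.Random(seed)
    state = ([s / 2 for s in shape], min(shape) / 4)
    cost = sse_of_fit(state[0], state[1], lesions, scores, shape)
    best, best_cost = state, cost
    temp = t0
    for _ in range(steps):
        # Propose a nearby sphere, kept inside the image volume.
        center = [min(max(c + rng.gauss(0, 1), 0), s - 1)
                  for c, s in zip(state[0], shape)]
        radius = min(max(state[1] + rng.gauss(0, 0.5), 1.0), max(shape) / 2)
        cand_cost = sse_of_fit(center, radius, lesions, scores, shape)
        # Metropolis rule: always accept improvements, sometimes accept worse fits.
        if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / temp):
            state, cost = (center, radius), cand_cost
            if cost < best_cost:
                best, best_cost = state, cost
        temp *= cooling
    return best, best_cost
```

Calling `anneal(lesions, scores, shape)` returns the best `(center, radius)` pair found and its residual sum of squares. Extending the state to several spheres, as suggested above, would enlarge the search space accordingly.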

Atlas-Based Analysis

A key feature that distinguishes neuroanatomic knowledge from geologic or meteorological spatial knowledge is the existence of atlases. To the extent that one believes that there is functional segregation in the brain, an atlas-based approach may greatly increase statistical power by reducing the number of independent variables for a given number of subjects. For example, we could designate each structure in a given atlas (e.g., Brodmann) as normal if no lesions intersect the structure and as abnormal otherwise, and then perform chi-square analysis of the cross-product of functional and (binary) structural variables. We have implemented this method in BRAID, and have applied it to a subset of the Cardiovascular Health Study data; the resulting 1,260 analyses yielded several clinically meaningful structure-function associations (ref). The usual means of evaluating each contingency table is computation of the chi-square statistic. Alternatively, one may apply the hypergeometric distribution (i.e., the Fisher exact test). Historically, most statisticians considered the exact test to be unnecessarily complicated, because the Gaussian approximation is adequately accurate for sufficiently large n, i.e., np(1 - p) ≥ 10, where p is the probability of a structure or finding being normal in the population. However, widespread access to statistical software largely obviates such heuristics, and in fact the chi-square test is probably often used inappropriately [26].
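The pairwise test described above can be sketched in a few lines of standard-library Python. This is an illustration, not BRAID's code: the helper names are hypothetical, lesions and atlas structures are assumed to be sets of voxel coordinates, and the two-sided Fisher p-value uses the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def structure_is_abnormal(lesion_voxels, structure_voxels):
    """A structure is abnormal if any lesion voxel intersects it."""
    return not lesion_voxels.isdisjoint(structure_voxels)


def contingency_table(structure_abnormal, deficit_present):
    """2x2 counts: rows = structure abnormal/normal, columns = deficit present/absent."""
    a = b = c = d = 0
    for s, f in zip(structure_abnormal, deficit_present):
        if s and f:
            a += 1
        elif s:
            b += 1
        elif f:
            c += 1
        else:
            d += 1
    return [[a, b], [c, d]]


def chi_square_statistic(table):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum every table at least as extreme (no more probable) than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))
```

For a small table such as [[3, 1], [1, 3]], the chi-square statistic is 2.0 and the exact two-sided p-value is 34/70, which shows how the two approaches are computed from the same counts; with cells this small the exact test is the safer choice.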

Regardless of the statistic computed, pairwise analysis has two principal limitations. First, such analyses may not detect multivariate interactions; a simple example is a deficit that occurs only after two distinct structures are lesioned. Second, even if one were to restrict oneself to the detection of pairwise associations, examination of all possible pairs of n_s structures and n_f findings would require that one compute n_s x n_f statistics, leading to the multiple-comparison problem [25, chapter 12]. If the tests are independent of each other, one can apply the Bonferroni correction, in which the significance level is divided by the number of tests. This correction is too conservative if the tests are not independent; this is the case for lesion-deficit data, in which overlapping atlas structures (e.g., occipital lobe versus optic radiations) are tested against somewhat redundant functional variables (e.g., right hemiparesis versus right upper extremity weakness). Statisticians have designed heuristic modifications of the Bonferroni correction [27], but none of these heuristics is guaranteed to adjust the statistics correctly for a given data set.
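As a small worked illustration of the Bonferroni correction described above, the sketch below divides the significance level by the number of structure-finding pairs. The function names and the example counts are hypothetical; only the rule itself (alpha divided by the number of tests) comes from the text.

```python
def bonferroni_alpha(alpha, n_structures, n_findings):
    """Per-test significance level when all structure-finding pairs are tested."""
    n_tests = n_structures * n_findings
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that survives the Bonferroni-corrected threshold."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]
```

With, say, 10 structures and 5 findings, a nominal level of 0.05 becomes 0.001 per test. For correlated tests such as those in lesion-deficit data, this threshold is stricter than necessary, which is the conservatism noted above.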

Multivariate analysis avoids the multiple-comparison problem. Log-linear models allow multivariate generalization of chi-square contingency-table analysis [28]. However, this approach does not provide a normative means for deciding among alternative models; for example, it is not always clear whether to select a simple model that has a slightly higher p-value than a more complex model. This is particularly true because the asymptotic equivalence of the chi-square statistic to the hypergeometric distribution becomes poorer as the number and order of interactions increase (i.e., as the expected cell counts decrease). Furthermore, such approaches offer only relatively simple means of generating candidate models, usually relying on modifications of greedy search, such as stepwise model selection.

Multivariate analysis of contingency tables [29-31] allows for exact multivariate tests of conditional independence among discrete variables, obviating heuristic corrections, and does not rely on approximations, in contrast to log-linear analysis. This approach has been applied to standard databases for testing machine-learning algorithms, with favorable results when compared with other statistical methods, such as neural networks and decision trees [30].

The principal limitation of atlas-based lesion-deficit analysis is its very dependence on an atlas: the results are only as good as the atlas. Important, strong lesion-deficit associations cannot be detected if the structures corresponding to the clinical deficits are not delineated in the atlas used for analysis. For example, our analysis of associations between the distribution of post-traumatic brain lesions and subsequent development of attention-deficit hyperactivity disorder demonstrated interesting associations among lesions in subcortical structures. However, we were unable to analyze cortical structures because they are represented as surfaces 1 or 2 millimeters thick [4]. Early neuroanatomists demonstrated that white matter just deep to cortex within a gyrus consists of axons arising from cortical neurons [32]. Thus, we would expect that an augmented gyral atlas, which includes a thin layer of subcortical white matter, would have greater sensitivity (i.e., more voxels that are associated with a particular set of functions) in the context of lesion-deficit analysis than would a standard gyral atlas.
