Last week, an interesting news article about a research paper was forwarded to me by mail, and I took a few minutes to read it. What is interesting is not just the content of the paper (which is actually very relevant in terms of selection bias) but most of all the reactions it provokes!
In short, the article by Vul et al. describes 50+ (high-profile) social neuroscience papers in which high correlations are reported between imaging data and behavioral measures. It divides the articles into two categories: green (good) and red (probably bad).
Vul and his co-authors say they wrote the paper because they were concerned by what they considered to be the “implausibly high correlations” reported between brain activation and particular forms of behaviour, and the lack of methodological details provided. So they selected 54 papers in social neuroscience and sent a brief questionnaire to the authors requesting details of their analyses.
They concluded that in a ‘red list’ of 31 cases — often in high-profile journals, including Nature and Science — the authors made fundamental errors in data handling and statistics.
They particularly criticize a ‘non-independence error’, in which bias is introduced by selecting data using a first statistical test and then applying a second non-independent statistical test to those data. This error, they say, arises from selecting small volumes of the brain, called voxels, on the basis of their high correlation with a psychological response, and then going on to report the magnitude of that correlation. “At present, all studies performed using these methods have large question marks over them,” they write.
A swift response to this article was written by former colleague Jabbi and current scientific director Keysers, and it was emailed to us only a few hours after the previous email. Note: offended scientists are to be handled with care….
The essence of the article is about statistical analysis methods, about good practice in statistics and selection bias. Selection bias can occur in neuroimaging if the voxels tested are selected on the basis of criteria that depend on the test. That is, if the voxels that survive a certain statistical threshold (whether or not it is corrected for the multiple-comparisons problem) are correlated with the very criteria that were used to select them, you are bound to find high correlations. A better option (as recommended by the authors) is to make the selection based on independent criteria, e.g. based on anatomical knowledge, or to make use of a separate localizer task.
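To see how non-independent selection inflates correlations, here is a minimal simulation sketch (not from the paper itself; the data are pure noise and the 0.6 threshold is an arbitrary illustrative choice): voxels are selected because they correlate highly with a behavioral score, and then the correlation is reported on the same data. In an independent dataset, those same voxels show no effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 20, 10_000

def corr_with_behavior(data, score):
    """Pearson correlation of each voxel (column) with the behavioral score."""
    data_c = data - data.mean(axis=0)
    score_c = score - score.mean()
    num = data_c.T @ score_c
    den = np.sqrt((data_c ** 2).sum(axis=0) * (score_c ** 2).sum())
    return num / den

# Pure noise: voxel activations and behavior are completely independent.
voxels = rng.standard_normal((n_subjects, n_voxels))
behavior = rng.standard_normal(n_subjects)
r = corr_with_behavior(voxels, behavior)

# Non-independent analysis: pick voxels that pass a correlation threshold,
# then report the correlation magnitude in those same voxels (biased upward
# by construction -- this is the 'non-independence error').
selected = np.abs(r) > 0.6
print("voxels selected:", selected.sum())
print("mean |r| in selected voxels (same data):", np.abs(r[selected]).mean())

# Independent check: measure the SAME voxels in fresh data.
# The inflated correlation vanishes, as expected under the null.
voxels2 = rng.standard_normal((n_subjects, n_voxels))
behavior2 = rng.standard_normal(n_subjects)
r2 = corr_with_behavior(voxels2, behavior2)
print("mean |r| in same voxels (new data):", np.abs(r2[selected]).mean())
```

Even with no true effect anywhere, the first number comes out above 0.6 simply because that was the selection criterion, while the replication on independent data falls back to chance level. This is the same logic behind using an independent localizer task: the voxel selection must not depend on the statistic you intend to report.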
Whether Vul et al. and the response of some of our colleagues (Jabbi and Keysers) are talking about the same issue remains questionable, but both advocate good statistical practice, and it would be strange if they didn't.
Whatever the final conclusion may be, there is a vivid discussion going on, even before the article is published, and it seems to provoke a lot of comments. At the very least, the division of articles into a green list and a red list causes some disturbance and evokes some reactions. Here is the link to the site of Vul et al. for the updates….
Who said science is dull?