Nevins and Potti Respond To Perez’s Questions and Worries
We regret that you have decided to terminate your fellowship in the group here and that your research experience did not turn out in a way that you found to be positive. We also appreciate your concerns about the nature of the work and the approaches taken to the problems. While we disagree with some of the measures you suggest should be taken to address the issues raised, we do recognize that there are some areas of the work that were less than perfect and need to be rectified. We thought it would perhaps be best to summarize our view, and also the steps we have decided to take, in relation to several of the problems you cite.
1. Concerning the use of various forms of the BinReg algorithm and the fact that validations have not always been adequately performed when switching from one version to the next.
As we think you know, we have struggled with the use of BinReg and the question of what works best in various settings. This reflects not so much the nature of the program but rather the reality of doing these studies in an imperfect world, with datasets of different characteristics being predicted from training sets of varying characteristics. While we would very much like for all of the samples that we use to be perfectly compatible, this is virtually never the case, and that necessitates measures to adjust for and accommodate the differences. The two versions of BinReg try to accomplish these goals in different ways, and we are frankly still evaluating what might be optimal in different circumstances. That said, whenever we present analyses performed with a different version of the program, we have tried to be careful to confirm that the results are valid. We suspect that we likely disagree about what constitutes adequate validation.
2. Concerning the methods for developing a predictor that involve feature selection.
We recognize that you are concerned about some of the methods used to develop predictors. As we have discussed, the reality is that there are often challenges in generating a predictor that necessitate trying various methods to explore the potential. Some instances are very straightforward, such as the pathway predictors, since we have complete control of the characteristics of the training samples. Other instances are not so clear and require various approaches to explore the potential of creating a useful signature, including in some cases using information from initial cross validations to select samples. If that were all that was done in each instance, there would certainly be a danger of overfitting and of obtaining overly optimistic prediction results. We have therefore tried in all instances to make use of independent samples for validation, which then puts the predictor to a real test. This has been done in most such cases, but we do recognize that there are a few instances where there was no such opportunity. It was our judgment that, since the methods used were essentially the same as in other cases that were validated, it was reasonable to move forward. You clearly disagree and we respect that view, but we do believe that our approach is reasonable as a method of investigation.
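The overfitting risk described above can be illustrated with a minimal, hypothetical sketch. This is not BinReg or the group's actual method; it is a generic nearest-centroid classifier on invented pure-noise data, showing that when discriminating features are selected using all samples (including those later used for testing), cross-validated accuracy can look impressive even though no real signal exists, whereas selecting features inside each cross-validation fold, or validating on truly independent samples, returns accuracy toward chance.

```python
import random

random.seed(0)

N, P, TOP = 40, 1000, 10  # samples, features, number of features selected

# Pure-noise expression matrix: the labels carry no real signal.
X = [[random.gauss(0, 1) for _ in range(P)] for _ in range(N)]
y = [i % 2 for i in range(N)]

def correlation_rank(rows, labels):
    """Rank features by absolute difference in class means; keep the top TOP."""
    idx0 = [i for i, l in enumerate(labels) if l == 0]
    idx1 = [i for i, l in enumerate(labels) if l == 1]
    scores = []
    for j in range(P):
        m0 = sum(rows[i][j] for i in idx0) / len(idx0)
        m1 = sum(rows[i][j] for i in idx1) / len(idx1)
        scores.append((abs(m0 - m1), j))
    return [j for _, j in sorted(scores, reverse=True)[:TOP]]

def centroid_classify(train_rows, train_labels, test_row, feats):
    """Nearest-centroid prediction using only the selected features."""
    def centroid(lbl):
        idx = [i for i, l in enumerate(train_labels) if l == lbl]
        return [sum(train_rows[i][j] for i in idx) / len(idx) for j in feats]
    c0, c1 = centroid(0), centroid(1)
    d0 = sum((test_row[j] - c) ** 2 for j, c in zip(feats, c0))
    d1 = sum((test_row[j] - c) ** 2 for j, c in zip(feats, c1))
    return 0 if d0 < d1 else 1

def loocv_accuracy(select_inside):
    """Leave-one-out CV; 'leaky' variant selects features once, on ALL data."""
    feats_all = None if select_inside else correlation_rank(X, y)
    correct = 0
    for i in range(N):
        tr_rows = X[:i] + X[i + 1:]
        tr_labels = y[:i] + y[i + 1:]
        feats = correlation_rank(tr_rows, tr_labels) if select_inside else feats_all
        if centroid_classify(tr_rows, tr_labels, X[i], feats) == y[i]:
            correct += 1
    return correct / N

leaky = loocv_accuracy(select_inside=False)
honest = loocv_accuracy(select_inside=True)
print(f"selection outside CV: {leaky:.2f}, inside CV: {honest:.2f}")
```

On noise data the leaky estimate is typically well above chance while the honest one hovers near 0.5, which is why an independent validation set is the stronger test.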
3. Concerning discrepancies in datasets that have been used for validation and that were posted on our web pages.
In one instance, you made note of the fact that an adriamycin response dataset contained a number of duplications or triplications of samples. It turns out, upon discussion with the individuals at St. Jude who provided this data to us, that the duplications and triplications were generated by them when they assembled the data to provide to us. This was unfortunate and clearly something we wish had been recognized before now. In retrospect, it would have been a good idea to perform a data quality check upon first receiving the data, which might have uncovered the duplicated and triplicated samples. Unfortunately, this was not done, and we only recognized the issue once you pointed it out. We are grateful to you for identifying this issue, and we are in the process of correcting the dataset on the web page, along with a notation to users of the site to alert them to this change. We have also examined the consequence of this error on the predictive accuracy for adriamycin response. The original accuracy reported in the paper was 81%; when we eliminate the repeated samples, the accuracy is 76%. As such, the conclusion that the signature developed to predict adriamycin sensitivity does predict clinical response is still valid.
You also make note of a second instance of data duplication in one of our datasets; this involves data from the thrombosis study reported in Blood in 2006. You noted that several samples were clearly duplicated in the database. We have now reviewed this data and realize that indeed several samples received different names in the process of generating the final data. This did not involve using the same samples multiple times in the assays reported in the paper; rather, it represents duplicate entries of the samples when the final table was assembled. Thus, it has no effect on the results reported in the paper. We have now corrected this database on the web page and, again, we appreciate that you have made note of the error.
Given these two instances, we have now decided to go back through each and every dataset that we have posted in relation to various publications to ensure that there are no errors. As you might imagine, this is a laborious process that requires a great deal of checking to ensure that what is reported is accurate, but we do believe it is important and, in the end, in everyone’s best interest. The reality is that these errors do occur, and no degree of quality control will likely completely eliminate the problem. In most instances they are corrected when someone trying to use the data, as was perhaps the case for you, notices problems of this sort and points them out to us. We then respond by making the corrections. But we are sure there are likely other cases where people encounter problems, get frustrated, and then give up without contacting us, and that would be unfortunate. So, in the end, we believe that trying to make these sources of information as accurate as possible is in everyone’s best interest, including ours. We appreciate that you have pointed out these mistakes to us. We do wish to emphasize, however, that we have never misrepresented data or methods in the web page material, as you seem to suggest in the initial draft statement to HHMI. We may have neglected to include necessary information or, as described above, we may have inadvertently introduced mistakes into some of the data, but this was in no way intentional. When problems, errors, or the need for additional information have been reported to us by other investigators, we have always responded promptly and made the changes or provided the information. This happens continually and is part of the normal scientific process.
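The intake quality check discussed above amounts to flagging samples whose profiles are identical before any analysis is run. A minimal sketch, with invented names and toy data, might look like the following; real expression data would likely need a tolerance or correlation-based comparison rather than exact matching.

```python
# Hypothetical sketch of a duplicate-sample check; names and values are invented.
def find_duplicate_profiles(samples):
    """Group sample names that share an identical expression profile.

    samples: dict mapping sample name -> tuple of expression values.
    Returns a list of name groups whose profiles match exactly.
    """
    seen = {}
    for name, profile in samples.items():
        seen.setdefault(tuple(profile), []).append(name)
    return [sorted(names) for names in seen.values() if len(names) > 1]

# Toy dataset: "s1" and "s3" are the same profile entered under two names.
toy = {
    "s1": (0.1, 2.3, 1.7),
    "s2": (0.4, 1.1, 0.9),
    "s3": (0.1, 2.3, 1.7),
    "s4": (2.2, 0.5, 3.1),
}
print(find_duplicate_profiles(toy))  # [['s1', 's3']]
```

Run once on receipt of a dataset, a check of this sort would have surfaced the duplicated and triplicated entries before any predictor was evaluated.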
We recognize that these responses are likely only partially satisfactory to you and that in some instances, such as the nature of the validations appropriate for use of a signature, you remain in disagreement. We understand that position and respect it; in no way would we want to force you into a circumstance that you regard as inappropriate. But, at the same time, we believe it is important to recognize that many of these cases are judgment calls and that others might have a point of view or standard for the science different from your own. We do not ask you to condone an approach that you disagree with, but we do hope that you can understand that others might have a different point of view that is not necessarily wrong.
Finally, we would like to say once again that we regret this circumstance. We wish that this had worked out differently, but at this point it is important to move forward.