There was a discussion on the Facebook PsychMap group today about removing outliers from our datasets. My opinion is that this should never be done unless we are 100 percent certain that there is a good reason to do so (the computer crashed, for instance). Otherwise, we must keep the datapoints.
It is up to us, as researchers, to monitor our data collection to avoid ambiguities.
We want more and more data, and to get it we let our research subjects run themselves (alone in a lab room with a computer, on MTurk, wherever). As we move farther from our research subjects, our data get less reliable.
If a researcher doesn’t know that the participant sent a couple of texts in the middle of the experiment (I’d be surprised if it didn’t happen a lot), well, that is indeed a problem.
We should strive for greater quality of data, even at the expense of quantity.