Data cleaning; is it time to stop sweeping it under the carpet? An example from the Dogslife project.

Even with careful study design and extensive validation, large datasets are often heterogeneous and require cleaning prior to analysis to prevent losses in research validity, quality and statistical power. Many publications report that data was ‘cleaned’ but few studies document the process reproducibly and values identified as ‘outliers’ are commonly deleted without reporting the possible […]