In a repeated-measures design, if extremely low scores improve over time (or extremely high scores decrease over time), which threat to internal validity should we be worried about?