8.3: Background: Why Do We Reject Artifacts?


Let’s start by asking why we reject epochs containing artifacts. You might think this is a dumb question. Obviously, we don’t want any artifacts in the epochs that we will be using to make our averaged ERPs! However, every single time point in every scalp EEG recording in human history contains artifactual activity. That is, the scalp EEG signal is always a mixture of brain activity, non-neural biological signals (e.g., skin potentials, EMG), and non-biological signals (e.g., line noise from nearby electrical devices). If we rejected every epoch containing an artifact, we wouldn’t have any data left.

We therefore reject epochs that contain problematic artifacts, defined as artifacts that interfere with the fundamental goal described in the previous chapter: accurately answering the scientific question that the experiment was designed to address. There are three common ways in which artifacts can be problematic from this perspective:

Reduced Statistical Power. Artifacts add noise to the data, reducing the signal-to-noise ratio (SNR) of our averaged ERPs. This makes our amplitude and latency measurements less precise, which in turn decreases our statistical power. However, when we reject epochs containing artifacts, we have fewer epochs in our averages, and that also makes the averages noisier and decreases our power. As a result, we need to balance the need to eliminate epochs with large artifacts with the need to include as many epochs as possible.
Systematic Confounds. Artifacts can produce systematic confounds in our studies. For example, if participants blink more in response to deviant stimuli than in response to the standards, we will see a difference between deviants and standards in the averaged ERPs that is due to EOG activity rather than to brain activity. As we will see in one of the exercises in this chapter, this is not just a theoretical possibility.
Sensory Input Problems. In visual experiments, EOG artifacts can indicate a problem with the sensory input. For example, if a blink occurs just before or during the stimulus presentation, this means that the stimulus wasn’t actually seen by the participant. Similarly, a deflection in the horizontal EOG can mean that the eyes weren’t pointed at the center of the display. The first exercises in this chapter will use data from an auditory experiment so that we won’t need to deal with this issue initially. However, we’ll switch to a visual experiment in the last part of the chapter to examine how ocular artifacts might alter the sensory input.
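
To make the trade-off in the first point concrete, here is a minimal sketch. It assumes that the noise is independent from epoch to epoch and has a mean of zero, so the variance of the average is the sum of the single-epoch variances divided by the square of the number of epochs. The function name and all of the voltages are made up for illustration; they are not taken from a real dataset or from ERPLAB.

```python
import math

def noise_of_average(single_epoch_sds):
    """SD of the noise in an average of independent, zero-mean epochs.

    The variance of the average is sum(sd_i ** 2) / N ** 2.
    """
    n = len(single_epoch_sds)
    return math.sqrt(sum(sd ** 2 for sd in single_epoch_sds)) / n

# Hypothetical session: 90 ordinary epochs with 10 µV of noise each,
# plus 10 epochs that contain an artifact.
clean = [10.0] * 90

for artifact_sd in (12.0, 50.0):
    contaminated = [artifact_sd] * 10
    keep_all = noise_of_average(clean + contaminated)
    reject = noise_of_average(clean)
    print(f"artifact SD {artifact_sd} µV: keep all = {keep_all:.2f} µV, "
          f"reject 10 = {reject:.2f} µV")
```

With these made-up numbers, rejecting the 10 contaminated epochs makes the average cleaner when the artifacts are large (50 µV), but it makes the average slightly noisier when the artifacts are only a little larger than the ordinary noise (12 µV). In the second case, the cost of losing 10 epochs outweighs the benefit of removing them.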

Artifact correction can be much better than artifact rejection for addressing the problem of reduced statistical power, because we get to keep all of our epochs. Correction can also help with systematic confounds, but only to the extent that the correction fully removes the artifacts and doesn’t produce any new artifacts. For example, if correction reduces the blinks by 99%, the remaining blink activity would still be 1-2 µV in the frontal channels (because uncorrected blinks are typically 100-200 µV in these channels). That might be enough to produce a significant confound. Artifact correction doesn’t help at all with sensory input problems. For example, if participants are looking leftward in one condition and rightward in another condition, correcting for the EOG voltage produced by the eye movements doesn’t eliminate the confound of a different sensory input in the two conditions.

For these reasons, I recommend combining artifact correction and artifact rejection for most experiments. You can use correction to minimize the noise produced by blinks (and certain other artifacts, as discussed in Chapter 6 in Luck, 2014). And then you can use rejection to eliminate epochs with blinks or eye movements near the time of the stimulus (for visual experiments) and to eliminate epochs that contain large artifacts that are not easily corrected (e.g., occasional EMG bursts).

When we’re using rejection to deal with reduced statistical power, we would ideally have an algorithm that determines which epochs should be removed to best balance the benefits of eliminating noisy epochs with the cost of reducing the number of epochs that are included in our averages. There are several methods that take this approach (e.g., Jas et al., 2017; Nolan et al., 2010; Talsma, 2008). However, they try to optimize the signal-to-noise ratio in a generic sense, which may not actually maximize statistical power for the specific amplitude or latency measurement that you will be using to answer your scientific question.

The standardized measurement error (SME) was specifically designed to quantify the data quality for your specific amplitude or latency measure and is directly related to statistical power (Luck et al., 2021). The SME can therefore be used to determine the artifact detection parameters that will lead to the best power. At this moment, ERPLAB doesn’t include an automated approach for determining which trials should be rejected to minimize the SME, but you can manually check the SME to compare different artifact detection parameters. We’ll use this approach in several of the exercises later in this chapter.
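
The manual comparison can be sketched in a few lines of code. This is only an illustration, not ERPLAB's implementation. It assumes that the score is a mean amplitude, for which the SME can be estimated analytically as the standard deviation of the single-epoch scores divided by the square root of the number of epochs (other scores, such as peak latency, require bootstrapping). It also assumes a simple rejection rule based on each epoch's peak-to-peak voltage. The function names and all of the values are hypothetical.

```python
import math
import statistics

def analytic_sme(scores):
    """Analytic SME for mean-amplitude scores: SD across epochs / sqrt(N)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

def sme_for_threshold(scores, peak_to_peak, threshold):
    """Reject epochs whose peak-to-peak voltage exceeds threshold, then
    return the SME of the remaining epochs and how many were kept."""
    kept = [s for s, p in zip(scores, peak_to_peak) if p <= threshold]
    return analytic_sme(kept), len(kept)

# Made-up single-epoch values (µV), for illustration only.
mean_amplitudes = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 25.0, -15.0]
peak_to_peak = [40, 55, 60, 45, 50, 70, 180, 220]

results = {}
for threshold in (50, 100, 200, 300):
    sme, n_kept = sme_for_threshold(mean_amplitudes, peak_to_peak, threshold)
    results[threshold] = sme
    print(f"threshold {threshold:3d} µV: kept {n_kept} epochs, SME = {sme:.2f} µV")

# The candidate threshold with the lowest SME for this toy dataset.
best_threshold = min(results, key=results.get)
print("lowest SME at threshold:", best_threshold)
```

In this toy dataset, the lenient thresholds let two extreme epochs into the average and inflate the SME, whereas the strictest threshold throws away so many epochs that the SME goes back up. The intermediate threshold gives the lowest SME, which is the kind of pattern you would look for when comparing parameters by hand.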

Keep in mind, however, that low noise isn’t the only consideration. For example, imagine that a participant blinked on every trial. This would be very consistent, which would lead to a low SME (because the SME reflects the amount of trial-to-trial variability in the data). However, the resulting averaged waveforms would mainly contain blink activity instead of ERPs, which could lead to completely incorrect conclusions. So, you need to consider potential confounds as well as the data quality when selecting artifact detection parameters.
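
Here is a small sketch of that thought experiment. It assumes an idealized blink that adds exactly the same voltage to the score on every trial (real blinks vary from trial to trial) and uses the analytic SME for a mean-amplitude score. All of the numbers are made up.

```python
import math
import statistics

def analytic_sme(scores):
    """Analytic SME for mean-amplitude scores: SD across epochs / sqrt(N)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

# Hypothetical single-trial mean amplitudes (µV) without any blinks.
neural_scores = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0]

# Idealized blink: the same 80 µV is added on every single trial.
blink_voltage = 80.0
contaminated_scores = [s + blink_voltage for s in neural_scores]

print("SME without blinks:", round(analytic_sme(neural_scores), 3))
print("SME with blinks:   ", round(analytic_sme(contaminated_scores), 3))
print("mean without blinks:", statistics.mean(neural_scores))
print("mean with blinks:   ", statistics.mean(contaminated_scores))
```

Adding a constant doesn't change the trial-to-trial variability, so the two SME values are the same. The averaged amplitude, however, is now dominated by the blink rather than by brain activity.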

You should also keep in mind that the SME quantifies the data quality of the averaged ERPs (which is, of course, influenced by the noise in the EEG). As a result, the SME depends on the number of trials being averaged. That’s a good thing, because the number of trials can have a big impact on your statistical power (Baker et al., 2020).
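
For a mean-amplitude score, this dependence is easy to see, because the SME can be estimated as the standard deviation (SD) of the single-trial scores divided by the square root of the number of trials (N). As a worked example with a made-up single-trial SD of 10 µV:

\[ \mathrm{SME} = \frac{\mathrm{SD}}{\sqrt{N}}, \qquad \frac{10\ \mu\mathrm{V}}{\sqrt{25}} = 2\ \mu\mathrm{V}, \qquad \frac{10\ \mu\mathrm{V}}{\sqrt{100}} = 1\ \mu\mathrm{V} \]

In other words, if the single-trial variability stays the same, you need four times as many trials to cut the SME in half.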