Preproc - Standard artefact rejection

Preproc - Standard artefact rejection

This example script shows how to perform basic artefact detection and rejection. The main input required is an MEEG object.

Contents

A key part of most analysis pipelines is some form of artefact detection and rejection. Aside from ICA (and indeed before running ICA) it can be important to identify bad channels, epochs, or trials. This functionality is provided in OSL by osl_detect_artefacts - if you are already familiar with OSL, this is essentially the same as osl_detect_badevent but a bit cleaner and more versatile.

There are broadly two kinds of artefacts we might want to identify

  • Bad channels - where an entire channel should be rejected. Rejection is performed by setting D.badchannels
  • Bad times - periods of time in the recording that should be rejected. For continuous recordings, this is performed by setting D.badsegments. For epoched recordings, this is performed by setting D.badtrials.

To start with, let's load in an MEEG object. We will remove all existing bad channels

D = spm_eeg_load(fullfile(osldir,'example_data','roinets_example','subject_1'));
D = D.badchannels(1:D.nchannels,0);

First, we can run artefact detection for both bad channels and bad times

D2 = osl_detect_artefacts(D);
Detecting artefacts in channel types: MEGGRAD
Channel 113 (MEGGRAD - MLT31) is bad
Channel 161 (MEGGRAD - MRF14) is bad
Channel 238 (MEGGRAD - MRT21) is bad
Channel 245 (MEGGRAD - MRT31) is bad
Channel 252 (MEGGRAD - MRT41) is bad

Under the hood, the continuous recording is epoched into dummy trials, outliers in those trials are used to identify bad segments, and then those bad segments are marked in the original recording (and the temporary epoched data is deleted). If you pass in an MEEG object that is already epoched, then these epochs/trials will be used instead. This is determined by whether D.type is continuous or not.

osl_detect_artefacts has a number of different options. Firstly, you may only want to detect bad channels - for example, if you have a set of trial data that hasn't been epoched yet, and want to just mark bad channels first (and reject bad trials later)

D2 = osl_detect_artefacts(D,'badtimes',false);
Detecting artefacts in channel types: MEGGRAD
Channel 113 (MEGGRAD - MLT31) is bad
Channel 161 (MEGGRAD - MRF14) is bad
Channel 238 (MEGGRAD - MRT21) is bad
Channel 245 (MEGGRAD - MRT31) is bad
Channel 252 (MEGGRAD - MRT41) is bad

Alternatively, you may have already marked bad channels and now want to identify only bad times

D2 = osl_detect_artefacts(D,'badchannels',false);
Detecting artefacts in channel types: MEGGRAD
Bad times - rejected 9.00s (20%) in modality MEGGRAD

Note that any bad segments, trials, or channels that are already marked will be carried forward in osl_detect_artefacts - that is, you will not lose anything that has already been marked.

Another important parameter you may wish to vary is the artefact rejection threshold. This is specified as a signifiance level for the underlying outlier test (which is a Generalized ESD test). The default level is 0.05. To detect more artefacts, we can raise this value. You can set thresholds for bad channels (channel_significance) and bad times (event_significance) independently. To demonstrate the effect of the threshold, compare we can run the artefact detection for bad channels with different threshold values.

D2 = osl_detect_artefacts(D,'badtimes',false,'channel_significance',0.01);
Detecting artefacts in channel types: MEGGRAD
Channel 113 (MEGGRAD - MLT31) is bad
Channel 161 (MEGGRAD - MRF14) is bad
Channel 245 (MEGGRAD - MRT31) is bad

If we raise the threshold, we find more artefacts

D2 = osl_detect_artefacts(D,'badtimes',false,'channel_significance',0.05);
Detecting artefacts in channel types: MEGGRAD
Channel 113 (MEGGRAD - MLT31) is bad
Channel 161 (MEGGRAD - MRF14) is bad
Channel 238 (MEGGRAD - MRT21) is bad
Channel 245 (MEGGRAD - MRT31) is bad
Channel 252 (MEGGRAD - MRT41) is bad

And this trend continues

D2 = osl_detect_artefacts(D,'badtimes',false,'channel_significance',0.2);
Detecting artefacts in channel types: MEGGRAD
Channel 30 (MEGGRAD - MLF14) is bad
Channel 113 (MEGGRAD - MLT31) is bad
Channel 161 (MEGGRAD - MRF14) is bad
Channel 238 (MEGGRAD - MRT21) is bad
Channel 245 (MEGGRAD - MRT31) is bad
Channel 246 (MEGGRAD - MRT32) is bad
Channel 252 (MEGGRAD - MRT41) is bad

You can limit the maximum number of channels that are marked bad if you wish

D2 = osl_detect_artefacts(D,'badtimes',false,'channel_significance',0.05,'max_bad_channels',2);
Detecting artefacts in channel types: MEGGRAD
options.max_bad_channels was reached for MEGGRAD, there may be additional bad channels present
Channel 161 (MEGGRAD - MRF14) is bad
Channel 245 (MEGGRAD - MRT31) is bad

You can view the results of your artefact detection using oslview. Check osl_detect_artefacts.m for a full list of options. You can also display summary plots of the output for bad channels

D2 = osl_detect_artefacts(D);
report.bad_channels(D2);
Detecting artefacts in channel types: MEGGRAD
Channel 113 (MEGGRAD - MLT31) is bad
Channel 161 (MEGGRAD - MRF14) is bad
Channel 238 (MEGGRAD - MRT21) is bad
Channel 245 (MEGGRAD - MRT31) is bad
Channel 252 (MEGGRAD - MRT41) is bad

or for bad segments

D2 = osl_detect_artefacts(D,'badchannels',false);
report.bad_segments(D2);
Detecting artefacts in channel types: MEGGRAD
Bad times - rejected 9.00s (20%) in modality MEGGRAD

Note that in this example, the bad segments are caused by bad channels, so if the bad channels are removed, then no bad segments are detected. For epoched data, use report.bad_trials instead of report.bad_segments.

Modality-specific artefacts

The artefacts that you detect are specific to a particular modality. This means that artefacts are detected separately in each sensor modality - for example, MEGMAG and MEGPLANAR could have different bad times. In contrast, bad trials are detected across all sensors. You can verify this in oslview by changing the modality/sensor type. When you specify artefacts manually in oslview, they only apply to the selected modality. If you specify bad segments in artefact channels such as EOG or ECG, they will be excluded when computing artefact channel correlations in AFRICA.

Selecting good samples

It's important in your analysis to exclude the artefact times. When you mark an artefact, the information is stored in the events field of the MEEG object. However, all of the data is still present. While SPM generally automatically excludes bad data, you need to do this manually when writing your own code. This can be accomplished using the good_samples function to identify good samples. The function takes in an MEEG object, and optionally a list of channels. It then returns the times when all samples were good. For example

figure
plot(D.time,good_samples(D2))
xlabel('Time')
ylabel('Good sample')
set(gca,'YLim',[-0.1 1.1])

Each channel has its own list of bad samples, depending on the modality and whether or not the channel was bad. By default, good_samples will exclude all bad channels, but will aggregate over all remaining channels regardless of their modality. In other words, the usage for good_samples matches the channels indexed from the MEEG. For example

chans = D.indchantype('ALL','GOOD'); % Retrieve all good channels
dat = D(:,:);
dat = dat(chans,good_samples(D)); % Keep only the good samples

It might be more common to wish to retrieve only MEG data, in which case you could use

chans = D.indchantype('MEGANY','GOOD'); % Retrieve all good channels
dat = D(:,:);
dat = dat(chans,good_samples(D,chans)); % Keep only the good samples

Or equivalently (with less memory consumption)

chans = D.indchantype('MEGANY','GOOD'); % Retrieve all good channels
dat = D(chans,:);
dat = dat(:,good_samples(D,chans)); % Keep only the good samples

Note that if any bad channels are present in the channel list passed to good_samples, all samples will be shown as bad. If you have a trial structure, good_samples will return a matrix where the third dimension corresponds to each trial.

good_samples vs badsamples

Both the good_samples function and D.badsamples read events whose type starts with 'artefact' and returns a matrix with whether or not the artefact was present. However, there are three key differences between the good_samples function and D.badsamples

  • good_samples returns True for good samples, whereas badsamples returns True for bad samples - so they are logically inverted.
  • good_samples aggregates over channels, whereas D.badsamples does not. This means that D.good_samples returns a single vector for all channels, whereas D.badsamples returns a matrix with one row for each channel. This aggregation makes good_samples considerably faster than D.badsamples
  • good_samples automatically maps modalities across online montages, whereas D.badsamples does not. For example, if you perform beamforming, the sensor data channel type might be MEGPLANAR but the beamformed channels will have channel type LFP. Artefacts marked in sensor-space will apply to MEGPLANAR but because this modality is different to LFP, D.badsamples will not recognize any of them as applying to the beamformed data. This is an extremely serious issue, because it results in artefacts remaining in the analysis. In contrast, good_samples will read the online montage, and see that the LFP channels are computed from MEGPLANAR, so it will account for artefacts in both LFP and MEGPLANAR.

It is recommended to always use good_samples instead of D.badsamples.