11.10: Exercise- Loops
In this exercise, we’ll introduce the concept of loops, which will make your life much easier by allowing you to automate the processing of multiple participants. There are several kinds of loops, but the most common is called a for loop. It allows you to repeat the same sequence of steps multiple times, but with one variable changing (e.g., a variable that indicates which participant should be processed). We’re going to go through several examples in this exercise, starting simple and working our way up to a script that processes the data from multiple participants.
As usual, let’s start by quitting EEGLAB, typing clear all, and restarting EEGLAB. Then open Script3.m by double-clicking on it in the Current Folder pane. Run the script (e.g., by clicking the Run button) and see what it does. You should see that it prints a list of numbers between 1 and 10. In a later script, these will be the ID numbers of our participants. Note that the number 5 is missing. This is meant to indicate that Subject 5 is being excluded from the final analyses (because that subject had too many artifacts).
Now take a look at the script. The main body of the script is this:
for subject = [ 1 2 3 4 6 7 8 9 10 ]
    display(subject);
end
In Matlab, a for loop begins with for and ends with end. The lines of code between the for and end lines are the body of the loop. This body will be executed multiple times, once for each element specified in the array on the for line. The for line defines a variable (which we’ve named subject in this example) and specifies an array of values that will be stored in this variable as we go through the loop ([ 1 2 3 4 6 7 8 9 10 ] in this example).
In Script3.m, the body of the loop is a single line consisting of the display(subject) command, which just prints the value of the variable named subject that we specified on the for line. Note that it’s conventional to indent the body of a loop using tabs. The tabs are ignored by Matlab, but they make it easier to see the structure of a script. Not required, but highly recommended!
Each time we go through the loop, the variable named subject is set to a new value in the array of values following the equals sign. When the loop starts, subject will be set to 1 (because 1 is the first value in the array). The display(subject) line will then execute, and it will display a value of 1 because that’s the value of the subject variable. Then the end line occurs, telling Matlab to go back to the start of the loop and set subject to the next value in the array (2). The display(subject) line will then execute, but this time it will display a value of 2 because that’s now the value of the subject variable.
Matlab will keep repeating the loop, setting subject to 3, then to 4, then to 6, etc. There is no 5 in our array of values, so subject will never take on that value. The array is just a list of values, and any sequence of values will work. For example, we could use [ 3 1 5 9 ] and then subject would take on that set of values in that order (3, then 1, then 5, then 9). We could even use non-integer numbers (e.g., [ 5.23 6.1 -5.442 10021.2 ]) or text labels stored in a cell array (e.g., { 'S1' 'S2' 'S3' 'S5' }, in which case the loop variable is a one-element cell on each pass). Note that joining quoted text with square brackets, as in [ 'S1' 'S2' ], concatenates it into the single character array 'S1S2', so the loop would step through one character at a time. Matlab is much more flexible than most programming languages in this regard. Spend some time playing with the array in the script so that you get a good sense of how it works.
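To see how the loop treats different kinds of arrays, here is a minimal sketch you could paste into the Command Window. It is not part of the example scripts, and all of the variable names (values, visited, joined, labels, and the counters) are made up for illustration.

```matlab
% Non-integer values: the loop variable takes each value in order.
values = [5.23 6.1 -5.442 10021.2];
visited = [];                      % collects the values the loop sees
for x = values
    visited(end+1) = x;
end

% Square brackets concatenate character arrays into one long one,
% so this loop runs once per character, not once per label.
joined = ['S1' 'S2' 'S3' 'S5'];    % same as 'S1S2S3S5'
num_char_passes = 0;
for c = joined
    num_char_passes = num_char_passes + 1;
end

% A cell array keeps the labels separate. Each pass yields a 1x1 cell,
% so use {1} to pull out the text inside it.
labels = {'S1' 'S2' 'S3' 'S5'};
num_label_passes = 0;
for label = labels
    fprintf('Label: %s\n', label{1});
    num_label_passes = num_label_passes + 1;
end
```

The character-array loop makes eight passes (one per character), whereas the cell-array loop makes four (one per label), which is usually what you want when the IDs are text.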
Now close Script3.m and open Script3b.m, which is a slightly more sophisticated version of the same set of ideas. Run the script to see what it does. You should see a set of lines starting with this:
Processing Subject 1
Processing Subject 2
Processing Subject 3
Now look at the script. You’ll notice two main changes from the first script. First, instead of providing an explicit list of subject IDs in the for line, we’ve defined a variable named SUB that stores this array:
SUB = [ 1 2 3 4 6 7 8 9 10 ]; %Array of subject IDs
We then specify this variable as our array in the for statement:
for subject = SUB
This approach embodies my #1 principle of writing good code: All values used by a script should be defined as variables at the top of the script. I’ll say more about this principle later in the chapter.
The second change to the script is that it uses the fprintf command instead of the display command to print the value of the subject variable. The fprintf command is much more powerful and flexible. It takes a little time to learn to use it, but it’s well worth the time. Here’s the specific version used in our script:
fprintf('Processing Subject %d\n', subject);
The first parameter in the fprintf command is a formatting string. It contains plain text that is printed by the routine, and it also includes formatting statements for variables that appear as subsequent parameters. The %d tells the command to print a whole number (the value of the subject variable), and the \n tells it to print a newline (a return). You can do much more than this with fprintf, but we’re keeping it simple for now (see the fprintf documentation for details).
Once you understand how this script works, close it and open Script3c.m. If you run it, you’ll see that the output has more information than provided by the previous script. For each iteration of the loop, it indicates how many times we’ve gone through the loop, plus the subject’s ID. This allows us to see that the fifth subject has an ID of 6, not 5.
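The following sketch puts the pieces described so far together and shows two other common format specifiers. The loop mirrors the lines quoted from the script above; the sprintf calls, the variable names msg and line2, and the values passed to them are assumptions added purely for illustration.

```matlab
SUB = [1 2 3 4 6 7 8 9 10];   % Array of subject IDs (from the text)

for subject = SUB
    % %d prints a whole number; \n ends the line
    fprintf('Processing Subject %d\n', subject);
end

% sprintf accepts the same formatting string but returns the text
% instead of printing it, which makes the result easy to inspect.
msg = sprintf('Processing Subject %d\n', 6);

% Other common specifiers (hypothetical values for illustration):
% %s inserts text, and %.2f prints a number with two decimal places.
line2 = sprintf('Subject %s scored %.2f', 'S6', 0.9375);
```

Using sprintf to build the text first can be handy when you want to check a formatting string before relying on it inside a long loop.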
Look at the script to see how it works. Near the top, you’ll see that we use a Matlab function called length to define a variable named num_subjects that stores the number of subjects in the SUB array:
num_subjects = length(SUB);
We’ve then used this new variable to define the array of values for the loop, which we’ve defined as 1:num_subjects. Type 1:num_subjects on the command line (after running the script, so that num_subjects exists in the workspace). You’ll see that it is equivalent to the list of integers from 1 to num_subjects (1 through 9 here, rather than 1 through 10 with 5 skipped). This means we’re no longer looping through the subject IDs themselves but through positions in the SUB array, so I’ve changed the name of the variable in the for statement to subject_index.
Each time through the loop, we get the subject ID by finding the element in SUB that corresponds to subject_index and store it as a text string in a variable named ID. For example, when subject_index is 5, we get the 5th element of SUB, which is 6 (because SUB skips subject 5). SUB is an array of numbers, but as you’ll see in the next script, it’s useful to store the ID as a text string. We therefore use a Matlab function called num2str to convert the number to a string before storing it in ID. Note that the format string for the fprintf command uses %s to indicate that this command should print a string variable for ID.
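The logic described above can be sketched roughly as follows. This is a reconstruction from the description rather than a copy of Script3c.m, so the exact wording of the printed message is an assumption.

```matlab
SUB = [1 2 3 4 6 7 8 9 10];     % Array of subject IDs
num_subjects = length(SUB);     % Number of subjects (9 here)

for subject_index = 1:num_subjects
    % Look up the ID at this position in SUB and convert it to text
    ID = num2str(SUB(subject_index));
    % %d prints the loop counter; %s prints the ID string
    fprintf('Iteration %d, Subject %s\n', subject_index, ID);
end
```

Looping over positions rather than IDs gives you both pieces of information on every pass: subject_index tells you where you are in the list, and SUB(subject_index) tells you which participant that is.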
Why it Pays to Include Good Comments and Meaningful Variable Names in Your Scripts
When you’re in the middle of writing a script to process the data for an experiment, you will get very focused on getting the job done. That is, you just want the script to work so that you can get to the next step of the project (and ultimately to the point of submitting a paper for publication). However, the fastest route to a goal is not always the straightest: If you focus too much on the immediate goal of getting the script to work, you may actually slow your progress toward the final goal of getting the paper submitted. It really pays to take your time when writing a script and write the code in a way that will be optimal in the long run.
In practice, this means following good coding practices that reduce the likelihood of errors, like defining all important values as variables at the top of the script. Errors can really slow you down if you don’t realize the error until you’re near the point of submitting the paper and now need to repeat all the analyses, change all the figures, and update the text of your paper. It’s also important to realize that you will probably need to come back to your script many months after you’ve written it (e.g., when you’re writing your Method section or you realize you need to reanalyze your data), and you will save yourself a lot of time if you write your code in a way that’s easy to read later.
There are two straightforward ways of making your code more readable. The first is to use variable names that have an obvious meaning. For example, I could have used something like ns as the name of the variable that holds the number of subjects, but instead I used num_subjects. The second is to add lots of comments. For examples, take a look at the example scripts I created for this chapter. Of course, these scripts were designed to be read by other people, so I put more work into the comments than I might have for a regular script that I wasn’t planning to share. But I still include tons of comments in scripts that I don’t plan to share. It’s a gift to my future self, because I know I will probably need to come back to those scripts after months or even years, and the comments will make my life much easier then. And when I come back to a script with good comments, I always try to thank my past self for the gift.
You should also keep in mind that it’s becoming more and more common to make your data and scripts available online when you publish a paper. This means that you’re never really writing scripts just for yourself. Other researchers are going to be looking at your scripts, so you don’t want the scripts to be embarrassing, and you want the other researchers to be able to understand your code. If you make the code easy to understand, this increases the likelihood that the other researchers will follow up on your research, which means that your research will have a larger impact. And aren’t you doing research so that it has an impact?
I’ve found that people often spend a huge amount of time polishing their scripts (making them more logical and adding lots of comments) right before they’re going to submit the paper for publication. They often find mistakes, and then they end up having to change their figures and the statistics in their Results sections. It’s really inefficient. It makes much more sense to write your scripts with the intent of sharing them—including clear logic and lots of comments—right from the beginning. This takes a lot of discipline, because when you’re writing the script you just want to get the job done. But this approach will save you a lot of time and agony in the long run.