Defining the dependent variable (Reisberg, methods, ch. 14) (e.g., inter-rater reliability for ‘creativity’)

In our very first methods essay, we discussed the importance of testable hypotheses, that is, hypotheses that are framed in a way that makes it clear what evidence will fit them and what evidence will not. Sometimes, though, it's not obvious how to phrase a hypothesis in testable terms. For example, in Chapter 14 of the textbook, we discuss research on creativity, and, within this research, investigators often offer hypotheses about the factors that might foster creativity, or perhaps undermine it. Thus one hypothesis might be: "When working on a problem, an interruption (to allow incubation) promotes creativity." To test this hypothesis, we would of course have to specify what counts as an interruption (five minutes of working on something else? an hour?). But then we'd also need some way to measure creativity; otherwise, we couldn't tell if the interruption was beneficial or not.

For this hypothesis, creativity is the dependent variable: that is, the measurement that, according to our hypothesis, might "depend on" the thing being manipulated. The presence or absence of an interruption would be the independent variable, the factor that, according to our hypothesis, influences the dependent variable.

In many studies, it's easy to assess the dependent variable. For example, consider this hypothesis: "Context reinstatement improves memory accuracy." Here the dependent variable is accuracy, and this is simple to check, for example, by counting up the number of correct answers on a memory test. In this way, we would easily know whether a result confirmed the hypothesis or not. Likewise, consider this hypothesis: "Implicit memories can speed up performance on a lexical decision task." Here the dependent variable is response time, and so, again, is simple to measure, allowing a straightforward test of the hypothesis.

The situation is different, though, for our hypothesis about interruptions and creativity. In this case, people might disagree about whether a particular problem solution (or poem, or painting, or argument) is creative or not. This will obviously make it difficult to test our hypothesis.

Psychologists generally solve this problem by recruiting a panel of judges to assess the dependent variable. In our example, the judges would review each participant's response and evaluate how creative the response was, perhaps on a 1-to-5 scale. By using a panel of judges, rather than just one, we can check directly whether different judges have different ideas about what creativity is. More specifically, the researcher can calculate the inter-rater reliability among the judges: the degree to which they agree with each other in their assessments. If they disagree with each other, then it would appear that the assessment of creativity really is a subjective matter and cannot be a basis for testing hypotheses. But if the judges agree to a reasonable extent, then the investigator can be confident that their assessments are neither arbitrary nor idiosyncratic, and so can be used for testing hypotheses.

In the same way, consider this hypothesis: "College education improves the quality of critical thinking." Or: "Time pressure increases the likelihood that people will offer implausible problem solutions." These hypotheses, too, involve complex dependent variables, and might also require that we use a panel of judges to obtain measurements we can take seriously. But by using these panels, we can measure things that seem at the outset to be unmeasurable, and in that way appreciably broaden the range of hypotheses we can test.