An Investigation into Problem-Based Recall Comparing Gut Instinct v. Untimed Deliberation EMRA

The specialty of emergency medicine requires frequent, rapid, critical decision making. Some decisions must be made with minimal data, while others can be deliberated over as time allows. In the education process leading to this career path, physicians are tasked with acquiring a significant amount of knowledge. In a test-taking scenario, two primary modes of analysis can be used to answer a question. As explained by Daniel Kahneman, Nobel laureate in economics and best-selling author, these modes can be categorized as System 1 and System 2. System 1 is automatic and impulsive, while System 2 is a slower, more deliberate, data-driven process.¹

In this study, we investigated the relationship between System 1 (gut instinct) and System 2 (thinking slowly) in the context of board-style test taking. On an ordinary examination, residents are free to use either system. Here, residents were limited to one system at a time so that the potential benefits of System 1 could be compared with those of System 2.

The purpose of this pilot study was to determine whether the medical competence of emergency medicine (EM) residents differed between System 1 and System 2 thinking styles. This was assessed using correlations between In-Training Examination (ITE) scores and performance under each mode of thought.

METHODS

The present study utilized an educational experimental research approach. EM residents were asked to schedule a session with a member of the research team to answer 20 EM-focused questions. The questions, designed by author IS, were created to test resident knowledge of patient care fundamentals and EM concepts. Ten questions were open-ended and ten were multiple choice. Authors IS and MH independently ranked the questions by difficulty, and the questions were then divided into two sets balanced by question type (open-ended versus multiple choice) and difficulty level. This was done to limit differences attributable to question difficulty and to provide more insight into System 1 versus System 2 thinking.

The research team met to review the questions and plan the testing sessions. Authors PG and MC then led the sessions with the residents. Residents were assigned to two groups based on PGY level and most recent test scores (ITE scores for residents, and Step 2 Clinical Knowledge scores for interns) to balance each group’s test-taking ability. During each session, the residents were given clinical scenarios or cases similar to those they would typically encounter in the emergency room or on a board examination. For ten questions, residents were required to answer on gut instinct (System 1) and had five seconds to answer each question. For the other ten questions, the residents had as much time as they needed to think through their response (System 2). Data collection began in the spring of 2020 but was paused due to COVID-19. Data collection was completed in October 2021.

The primary outcome was whether residents performed better on questions answered using System 1 thinking than on questions answered using System 2 thinking. The secondary outcome was whether System 1 or System 2 scores were associated with previous ITE scores among residents who took the exam in 2021.

Data are presented as means (with standard deviations) and medians (with interquartile ranges) for System 1 versus System 2 scores by residency year. Pearson correlations were conducted to examine relationships between study scores and the ITE. Additionally, a linear regression was conducted to control for other variables when examining each relationship between scores. The regression results are presented with coefficients and 95% confidence intervals (CIs) for the coefficients. Statistical significance was set at p < 0.05. Data analysis was completed using IBM SPSS Version 27.
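
As a rough illustration of the descriptive statistics named above, the following Python sketch computes a mean, sample standard deviation, median, and quartiles for one group of scores. The study itself used SPSS; the function name `summarize`, the example scores, and the choice of the "exclusive" quartile method are all assumptions made for illustration and are not taken from the study data or its analysis files.

```python
import statistics

def summarize(scores):
    """Return mean, sample SD, median, and (Q1, Q3) for one group's scores."""
    # The "exclusive" method places quartiles at (n + 1) * p positions.
    # Assumption: this is only one of several common quartile definitions.
    q1, _, q3 = statistics.quantiles(scores, n=4, method="exclusive")
    return {
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),      # sample SD (n - 1 denominator)
        "median": statistics.median(scores),
        "iqr": (q1, q3),
    }

# Hypothetical scores out of 10 for a ten-resident group (NOT study data).
example_scores = [4, 5, 5, 6, 6, 7, 7, 8, 8, 9]
summary = summarize(example_scores)
print(summary)
```

Different software packages define quartiles differently, so values recomputed this way may not match the reported interquartile ranges exactly.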

RESULTS

Twenty-nine residents completed the study (14 assigned to group A and 15 assigned to group B). The sample included 10 PGY-3, 9 PGY-2, and 10 PGY-1 residents. All residents were MDs or DOs between the ages of 26 and 33. The highest score (out of 20) was 18, achieved by one individual from group A and two from group B. Two individuals achieved a perfect score (10/10) on gut instinct questions and one individual achieved a 10/10 on thinking slowly questions.

Differences across PGY levels in thinking fast (System 1), thinking slow (System 2), and total scores are provided in Table 1. PGY-3s scored highest on both System 1 and System 2 questions, and each PGY level performed slightly better on System 2 than on System 1; however, all of these differences were within one correct response. Mean scores on both System 1 and System 2 questions rose with each residency year.

	PGY-1 (N=10)		PGY-2 (N=9)		PGY-3 (N=10)
	Mean [SD]	Median [IQR]	Mean [SD]	Median [IQR]	Mean [SD]	Median [IQR]
Gut Instinct	6.3 [2.2]	6.5 [4.5, 8]	6.9 [1.5]	7 [6, 8]	7.2 [1.9]	7 [5.75, 9]
Thinking Slowly	6.7 [1.5]	7 [6, 8]	7.1 [1.5]	7 [6, 8]	7.7 [0.9]	8 [7, 8.25]
Total	13 [2.5]	14 [10, 15]	14 [2.3]	14 [12.5, 15.5]	14.9 [2]	15 [13, 16.5]

Table 1. Thinking Fast Versus Thinking Slowly Scores by Residency Year

Gut Instinct and Thinking Slowly Scores out of 10 possible points; total out of 20. SD = Standard Deviation; IQR = Interquartile Range

Figure 1. Emergency Medicine Residents’ Differences in Thinking Fast Versus Slow

Cutline: Figure 1 shows residents’ (N=29) differences in scores between rapid (System 1) and untimed (System 2) thinking. Positive scores indicate a resident performed better with the slower, analytical approach (System 2). Negative scores indicate a resident performed better while thinking fast (gut instinct, System 1). A value of zero indicates the resident performed the same under both.

Figure 2. Scatterplots of Thinking Fast Versus Slow and Emergency Medicine In-Training Examination

Cutline: Figure 2 shows the relationship between residents’ (N=19) In-Training Examination Scores and gut instinct (10 questions) versus thinking slowly (10 questions) scores.

Figure 1 provides a histogram of differences in scores between System 1 and System 2. The differences ranged from four points in favor of System 1 to four points in favor of System 2, with all other residents falling within this range.
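
The difference scores behind a histogram like Figure 1 can be sketched as follows, using the sign convention from the Figure 1 cutline (System 2 score minus System 1 score). The function names and the five pairs of scores are hypothetical and serve only to show the arithmetic; they are not study data.

```python
from collections import Counter

def difference_scores(system1, system2):
    """System 2 minus System 1: positive means better when thinking slowly."""
    if len(system1) != len(system2):
        raise ValueError("each resident needs one score per system")
    return [s2 - s1 for s1, s2 in zip(system1, system2)]

def tally(differences):
    """Count residents at each difference value, including empty bins."""
    counts = Counter(differences)
    low, high = min(differences), max(differences)
    return {d: counts.get(d, 0) for d in range(low, high + 1)}

# Hypothetical scores out of 10 for five residents (NOT study data).
system1 = [6, 8, 5, 9, 7]
system2 = [8, 7, 9, 5, 7]

diffs = difference_scores(system1, system2)
histogram = tally(diffs)
for value, count in histogram.items():
    print(f"{value:+d}: {'#' * count}")
```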

Correlation outputs were examined to determine if higher scores on System 1 were associated with higher scores on System 2. There was no statistically significant correlation between scores on the two thinking modes (r = 0.043, p = 0.826). Next, for the residents who had previous ITE scores (N=19), we examined correlations of System 1, System 2, and total scores with ITE scores. There were no statistically significant associations between System 2 and ITE (r = 0.106, p = 0.667) or between total score and ITE (r = 0.371, p = 0.117). There was a statistically significant association between System 1 and ITE (r = 0.493, p = 0.032).
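
For readers who want to see how a Pearson r and a sample size translate into a p-value, the sketch below converts each reported correlation into a t statistic with n − 2 degrees of freedom and integrates the t density numerically. This is a standard textbook relationship, not the SPSS procedure itself; the function names and the Simpson's-rule integration are choices made for this illustration. Because the reported r values are rounded, recomputed p-values should only be expected to agree approximately with those above.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and |t|
    return 1 - 2 * central_half

# (label, r, n) as reported in the text above.
reported = [
    ("System 1 vs System 2", 0.043, 29),
    ("System 2 vs ITE", 0.106, 19),
    ("Total vs ITE", 0.371, 19),
    ("System 1 vs ITE", 0.493, 19),
]

p_values = {}
for label, r, n in reported:
    t = t_from_r(r, n)
    p_values[label] = two_sided_p(t, n - 2)
    print(f"{label}: t = {t:.3f}, df = {n - 2}, p = {p_values[label]:.3f}")
```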

Figure 2 provides scatterplots between ITE scores and thinking slow (System 2) versus gut instinct (System 1).

A linear regression was conducted to see if ITE performance could be predicted by System 1 or System 2 scores after controlling for other variables. After adjusting for residency year and System 1 scores, System 2 scores were not a statistically significant predictor of ITE performance, B = 0.08 (95% CI of B: -1.30, 1.46), p = 0.908. After adjusting for residency year and System 2 scores, System 1 scores were a statistically significant predictor of ITE performance, B = 1.90 (95% CI of B: 0.05, 3.75), p = 0.045.
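
The reported coefficients and confidence intervals can be checked against each other with a small calculation: a 95% CI is B ± t_crit × SE, so the standard error can be backed out from the interval width. The sketch below assumes 15 residual degrees of freedom (19 residents, an intercept, and one term each for residency year, System 1, and System 2), which the text does not state; the critical value 2.131 is the standard two-sided 95% t value for 15 degrees of freedom. The function names are illustrative.

```python
# Assumption: 19 residents minus 4 estimated parameters = 15 residual df.
T_CRIT_DF15 = 2.131  # two-sided 95% critical value of t with 15 df

def se_from_ci(lower, upper, t_crit):
    """Back out the standard error from a symmetric confidence interval."""
    return (upper - lower) / (2 * t_crit)

def midpoint(lower, upper):
    """Centre of the interval, which should equal the coefficient B."""
    return (lower + upper) / 2

# Values reported in the regression paragraph above: (B, CI lower, CI upper).
coefficients = {
    "System 1": (1.90, 0.05, 3.75),
    "System 2": (0.08, -1.30, 1.46),
}

t_stats = {}
for name, (b, lower, upper) in coefficients.items():
    se = se_from_ci(lower, upper, T_CRIT_DF15)
    t_stats[name] = b / se
    significant = abs(t_stats[name]) > T_CRIT_DF15
    print(f"{name}: midpoint = {midpoint(lower, upper):.2f}, "
          f"SE = {se:.3f}, t = {t_stats[name]:.2f}, p < 0.05: {significant}")
```

Under that degrees-of-freedom assumption, the System 1 interval excludes zero and its t statistic exceeds the critical value, while the System 2 interval spans zero, which is in line with the reported p-values.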

DISCUSSION

Overall, there were no statistically significant differences in performance of System 1 vs System 2 thinking on resident surveys. However, there was a significant difference in how System 2 correlated positively to ITE performance. Additionally, there was no significant correlation between high System 1 performance and ITE score. Overall, these results suggest that there is an advantage to System 2 thinking that does not exist with System 1 in the context of ITE board-style examinations.

Based on this pilot study, residents who take a more data-driven, methodical approach to test-taking may see improved test performance compared with answering questions primarily on gut instinct. Limitations of this study include its single-site design, its restriction to emergency medicine residents, and a sample size that left it underpowered.

The results of our study apply more directly to standardized test-taking than to clinical decision making. System 1 and System 2 have been analyzed in a clinical decision-making capacity, and System 1 has been shown to favor bias⁵ and to compromise optimal outcomes. A similar conclusion may be applied to test-taking. Methodical, consistent study over time has been shown to provide optimal test results;³ in line with these results, we support the proposal that test taking using the same approach will yield better test scores when compared to reacting primarily on gut instinct.

CONCLUSION

In the context of ITE examinations, answering questions using a System 2 based approach shows a correlation to higher exam scores when compared to answering based primarily using System 1. Current residents can tailor their test-taking strategies to favor System 2 processing for a potential increase in ITE board scores.