In evaluating professional learning or the deployment of key instructional initiatives, we often use self-report surveys to measure teachers' use or implementation of key instructional practices. This is a sound practice, but too often the way we define, measure, and analyze these data undermines the meaning and narrative we hope to draw from them.
To illustrate, let’s look at a typical case that presents two options for measuring the implementation of key instructional practices in mathematics.
Measuring Frequency of Implementation of Instructional Practices in Mathematics
For this illustration, let’s pretend that we have just provided professional learning to teachers on how to use number talks to engage students in meaningful mathematical discussions, and how to use the “five practices” to synthesize key learning concepts at the end of a math lesson (Smith & Stein, 2011).
One typical way to measure teacher use or implementation of these key practices is a frequency scale: teachers report whether they "always," "often," "sometimes," "rarely," or "never" use each instructional practice in their classroom. If we took this route, the survey results could be visualized by reporting the distribution of scores for each response category, like this:
Frequency of Use of Key Instructional Practices in Math
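To make this concrete, here is a minimal sketch of how such a chart might be produced in Python with pandas and matplotlib. The teacher counts are hypothetical, chosen only to be roughly consistent with the percentages quoted in the interpretation below; they are not the actual survey results.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical end-of-year counts of teachers (out of 50) choosing each
# frequency response; illustrative only, not the actual survey data.
counts = pd.DataFrame(
    {
        "Number talks": [13, 20, 10, 5, 2],
        "Five practices (lesson synthesis)": [7, 13, 15, 10, 5],
    },
    index=["Always", "Often", "Sometimes", "Rarely", "Never"],
)

# Convert counts to percentages so each practice's bar sums to 100%.
percentages = counts.div(counts.sum(axis=0), axis=1) * 100

# One stacked horizontal bar per practice, segmented by response category.
ax = percentages.T.plot(kind="barh", stacked=True, figsize=(8, 3))
ax.set_xlabel("Percent of teachers")
ax.set_title("Frequency of Use of Key Instructional Practices in Math")
ax.legend(title="Reported frequency", bbox_to_anchor=(1.02, 1), loc="upper left")
plt.tight_layout()
plt.show()
```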
What exactly do we learn about teacher implementation of these two instructional practices from this chart? We may interpret the data by saying:
- Most teachers (66%) affirm that they “always” or “often” use number talks by the end of the year.
- The percent of teachers who “always” or “often” used both practices grew from the beginning to the end of the year.
- The percent of teachers who "always" or "often" used number talks exceeded the percent who "always" or "often" used the five practices for lesson synthesis by the end of the year.
We may be able to conclude that teachers don’t use the five practices to synthesize lessons as frequently as they use number talks. But what would we do with this information?
If you are like most people who have sat in front of charts like these, you are probably feeling a familiar sense of confusion. You may already know that the five practices are a more challenging competency to master than number talks, and think that these results "confirm your hunch." But beyond this, you may not feel confident about how to act upon these data, or you may sense that you are still missing information to help you plan next steps in professional learning, coaching, or other supports.
Let’s review another way of measuring teacher self-reported use or implementation of these teaching practices.
Measuring Integrity of Implementation of Instructional Practices in Mathematics
While we may care deeply that teachers frequently implement a particular instructional practice, we often care more that they use the practice with "integrity." Implementation integrity is best defined as "doing what matters most and works best while accommodating local needs and circumstances" (LeMahieu, 2011). To define what integrity looks like, we should design instruments that clearly describe the key features of the practice that are essential for addressing student challenges, many of which can still be adapted to a variety of contexts without abandoning the true intent of the practice. In many cases, frequency of use may be one of these features, but other characteristics are just as important, or more so. Frequency is subsumed under integrity; it is not the main quality we should measure to understand how to plan, support, and empower teachers.
What if, instead of focusing on frequency, we used a mini-rubric (a term coined by Davidson, 2014) to describe the progression for each of these two instructional practices? An implementation rubric may look something like this:
Implementation of Key Instructional Practices in Math
In order to gather teacher self-reports on these indicators, we can embed these progressions into a survey tool and ask teachers to select the descriptor that sounds most like their use of the practice at this time. Then we can use a retrospective pretest design and ask them to reflect upon which descriptor best characterized their practice at the beginning of the year, or before a particular initiative began (Danks, 2019).
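As a rough sketch of what the underlying survey data might look like, the snippet below encodes a few hypothetical teacher records, each holding a current self-rating and a retrospective "beginning of year" rating on an assumed five-level rubric. Only "implementing" and "innovating" are named in this article; the other level names, field names, and records are placeholders, not the actual instrument.

```python
from collections import Counter

# Assumed rubric levels, ordered from least to most developed; only
# "implementing" and "innovating" appear in the article, the rest are placeholders.
LEVELS = ["not yet using", "beginning", "developing", "implementing", "innovating"]

# Each hypothetical record captures one teacher's self-rating for one practice,
# both now and retrospectively for the start of the year (retrospective pretest).
records = [
    {"practice": "number talks", "start_of_year": "beginning", "end_of_year": "implementing"},
    {"practice": "number talks", "start_of_year": "developing", "end_of_year": "innovating"},
    {"practice": "five practices", "start_of_year": "not yet using", "end_of_year": "developing"},
]

def tally(records, practice, time_point):
    """Count how many teachers chose each rubric level for a practice at a time point."""
    chosen = Counter(r[time_point] for r in records if r["practice"] == practice)
    return {level: chosen.get(level, 0) for level in LEVELS}

print(tally(records, "number talks", "start_of_year"))
print(tally(records, "number talks", "end_of_year"))
```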
To analyze the results, we can still report the distribution of teachers who rated themselves at each level of the progression. The graph may look something like the following chart. (Note that it reports the exact same distribution of scores as the previous chart, but on the rubric scale instead of the frequency scale.)
Implementation of Key Instructional Practices in Math
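Since the underlying scores are identical, one way to picture this is to reuse the hypothetical counts from the earlier frequency sketch and simply relabel the scale with assumed rubric levels; a minimal sketch:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same hypothetical counts as in the frequency sketch above, relabeled with
# assumed rubric levels in place of frequency categories.
counts = pd.DataFrame(
    {
        "Number talks": [13, 20, 10, 5, 2],
        "Five practices (lesson synthesis)": [7, 13, 15, 10, 5],
    },
    index=["Innovating", "Implementing", "Developing", "Beginning", "Not yet using"],
)

percentages = counts.div(counts.sum(axis=0), axis=1) * 100
ax = percentages.T.plot(kind="barh", stacked=True, figsize=(8, 3))
ax.set_xlabel("Percent of teachers")
ax.set_title("Implementation of Key Instructional Practices in Math")
ax.legend(title="Self-rated rubric level", bbox_to_anchor=(1.02, 1), loc="upper left")
plt.tight_layout()
plt.show()
```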
The data have the same shape, but carry a very different meaning. What exactly do we learn about teacher use of these two instructional practices from this chart this time around? We could interpret the data as follows:
- Most teachers affirm that they are "innovating" or "implementing" number talks by the end of the year. Based upon how we measured these levels on the rubric, this means that 40% of teachers are:
  - Using number talks each time they are specified in the curriculum.
  - Making sure that each student shares their strategies with a partner.
- And at least 26% of all teachers are:
  - Attempting to integrate number talks into other parts of the lesson.
  - Learning how to structure number talks to have the greatest impact for all students.
- Only about 40% of teachers are "innovating" or "implementing" the five practices, which means they are, at a minimum:
  - Anticipating which responses students may share, and monitoring students to identify whom to call on to share strategies that connect to the key learning concepts.
- The percent of teachers who are "innovating" or "implementing" each practice grew from the beginning to the end of the year. The ways in which they grew can be observed by reviewing the descriptors and noticing differences in scores.
- The percent of teachers who are "innovating" or "implementing" number talks exceeded the percent who are "innovating" or "implementing" the five practices for lesson synthesis by the end of the year.
From two survey questions, we now glean so much more information!
Now, what would we do with this information? We could take steps to focus on the elements that help teachers confidently rate themselves at the "implementing" level for either indicator, but we may prioritize the five practices if they are a key strategic effort. Based upon how we measured "implementing" on the rubric, we may wish to focus on helping teachers anticipate which responses students may share, monitor students to identify whom to call on to share their strategies, and incorporate strategies that connect to the key learning concepts. The year after that, we could move to the next level of the progression, focusing instead on sequencing which students to call on first in order to connect student strategies to key learning concepts. Suddenly our trajectory for supporting learning is clearer and easier to communicate to all stakeholders.
Wrap-Up
By asking teachers to slow down and reflect upon their practices against indicators of implementation integrity, instead of racing through long lists of frequency indicators, we can better understand what their practices may look like in their classrooms. We can embed frequency language into a rubric if it is an important attribute of our initiative, but frequency of use is seldom the most important thing we care about as we seek to implement with "integrity."
Shifting our measurement practices from focusing on frequency of implementation to key elements of implementation integrity provides more meaningful information for planning professional learning, coaching, or other supports.
References
Danks, S. (2019, November). The Ultimate Measurement Mash-up: Retrospective Rubrics for Measuring Complex Change. TD Magazine.
Davidson, J. (2014). Minirubrics. Genuine Evaluation.
LeMahieu, P. (2011). What We Need in Education is More Integrity (and Less Fidelity) of Implementation. Carnegie Foundation for the Advancement of Teaching, Carnegie Commons Blog.
Smith, M. S., & Stein, M. K. (2011). 5 Practices for Orchestrating Productive Mathematics Discussions. National Council of Teachers of Mathematics.