Forms and Supporting Materials

Interpreting Student Survey Results

Pilot Year Guidance for Evaluators and Teachers

The Minnesota student engagement surveys used in this pilot were developed to align with the definition of “engagement” used in the Model, which refers to behavioral, cognitive, affective and academic engagement. Survey design experts partnered with MDE to create the surveys, and Minnesota educators reviewed items and provided feedback along the way.

A primary vision for this Model is to embed professional learning throughout the system. Teachers can learn a great deal about their students from surveys and can use perception data to engage in reflection, identify strengths and opportunities for growth, and plan accordingly.

As with any source of data or evidence, multiple interpretations are possible. The purpose of this document is to help teachers and evaluators a) guard against false interpretations and b) understand the considerations and variables so that more accurate interpretations can be made. Put plainly, the following are some “Dos and Don’ts” for using the survey.


1. Do NOT use the percentiles to determine performance ratings (unsatisfactory, development needed, effective, and exemplary in the Model).

While the percentiles give a general indication of where a teacher’s average item scores fall in relation to other teachers, they are NOT a straightforward indicator of performance. It is easy to draw false conclusions based on percentile rankings: when many teachers have similar averages, large differences in percentiles might reflect only small differences in overall average survey responses.

The purpose of this survey is not to rank teachers. Percentiles are offered only as one of several ways to examine the data and are inappropriate for summative evaluation.
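The point about percentiles can be illustrated with a short sketch. The teacher averages below are invented for illustration only and are not pilot data; the `percentile_rank` helper is a hypothetical name, and it uses one simple definition of a percentile (the share of teachers with a strictly lower average), which may differ from the method used in actual survey reports.

```python
def percentile_rank(value, all_values):
    """Percent of values that fall strictly below `value` (one simple definition)."""
    below = sum(1 for v in all_values if v < value)
    return 100.0 * below / len(all_values)

# Hypothetical averages for 20 teachers, clustered near 4 on the 1-to-5 scale.
# These numbers are made up for illustration; they are not survey results.
teacher_averages = [3.6, 3.8, 3.85, 3.9, 3.9, 3.92, 3.95, 3.95, 3.98, 4.0,
                    4.0, 4.02, 4.05, 4.05, 4.08, 4.1, 4.1, 4.15, 4.2, 4.4]

low = percentile_rank(3.9, teacher_averages)
high = percentile_rank(4.1, teacher_averages)

print(f"Average 3.9 -> percentile {low:.0f}")
print(f"Average 4.1 -> percentile {high:.0f}")
print(f"Gap in averages: {round(4.1 - 3.9, 2)} points; gap in percentiles: {high - low:.0f}")
```

In this made-up group, two teachers whose averages differ by only 0.2 points on a 5-point scale sit 60 percentile points apart, simply because most averages are bunched together.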

2. Do NOT equally distribute teachers across performance ratings. (Do NOT expect a traditional “bell curve” either.)

“Equal distribution” means that, if you had 100 teachers, 25 would receive each of the Model’s four performance ratings. This is NOT appropriate for this or any teacher effectiveness measure.

According to the MET project and other studies using student survey results, most teachers receive positive ratings from student surveys, with only a very few teachers receiving unsatisfactory or exemplary ratings at the “tails.” If results from the Minnesota surveys mirror other surveys, the mean scores for teachers will be close to 4 (on a possible scale from 1 to 5), while very few, if any, teachers will have a mean score below 3.

3. Do NOT make too much of a small difference.

As you look at comparisons of subgroups (gender, race/ethnicity, class periods, etc.), the differences in mean (average) responses may or may not be significant. Consider this example:

The average female response for an item is 3.8 and the average male response for the item is 4.0.

The difference between female and male responses probably does not have any practical importance because it is too small. Since this survey is new and being piloted, we do not have enough data to determine statistically what counts as a “significant difference.” Reviewing differences can be valuable for reflection and planning (see the list of “DO” items below), but in most cases reliable, quantitative judgments cannot be made. When working with individual teacher data, subdividing into small groups reduces reliability considerably, so any interpretation must be based on very large subgroup differences and be regarded as tentative.
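A rough sketch of why small subgroups are unreliable follows. It uses the textbook formula for the standard error of a difference between two group means and assumes, purely for illustration, that individual responses to an item have a standard deviation of 1.0 point. That spread, the group sizes, and the `se_of_difference` helper are all assumptions, not figures or methods from the pilot, and the sketch is not a significance test for these surveys.

```python
import math

def se_of_difference(sd, n_a, n_b):
    """Standard error of the difference between two group means,
    assuming both groups share the same response spread `sd`."""
    return sd * math.sqrt(1.0 / n_a + 1.0 / n_b)

ASSUMED_SD = 1.0                      # assumption: spread of individual responses
observed_gap = round(4.0 - 3.8, 2)    # the male/female example above

# Hypothetical group sizes: one class, several classes, a large pooled sample.
for n in (12, 50, 200):
    se = se_of_difference(ASSUMED_SD, n, n)
    print(f"{n:>3} students per group: gap {observed_gap} vs. typical noise of about {se:.2f}")
```

Under these assumptions, with 12 students per group the chance variation (about 0.41) is roughly twice the observed 0.2 gap, so the gap could easily arise by chance. Only with much larger groups does the noise shrink below the gap, which is why subgroup comparisons for a single teacher should be treated as tentative.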