Performance Standards for Music: Assessment Strategies for Music


Description of the Assessment Strategies

In this publication, one sample assessment strategy is provided for each achievement standard appearing under the nine voluntary national content standards for music for grades K–12 as well as under the four content standards for prekindergarten instruction. In addition, a description of characteristics of students’ responses is provided for basic, proficient, and advanced levels of achievement.

Like the achievement standards, the assessment strategies for grades K–4, 5–8, and 9–12 are designed for students in grades 4, 8, and 12, respectively. The PreK assessment strategies are intended for four-year-olds. With suitable adjustments, however, many of the strategies can be used in a developmentally appropriate manner at earlier stages as well.

The assessment strategies are designed for use with individuals rather than groups, except where a standard specifically refers to groups. Some of the strategies can be modified so as to be usable with groups when necessary. Most strategies that require singing or playing instruments must be administered individually, though strategies requiring written responses may often be administered in groups.

Several of the assessment strategies do not call for all of the skills and knowledge specified in the achievement standards on which they are based because some of the achievement standards include diverse skills and knowledge that require diverse assessment strategies. In every case, the sample assessment strategy provided is based on skills and knowledge considered fundamental to the achievement standard. If a student can demonstrate these skills, he or she may be able to demonstrate the other skills called for as well. In order to be certain, however, it is necessary to devise parallel assessment strategies based on the other skills and knowledge called for in the achievement standard. In several strategies emphasis is placed on skills or knowledge not assessed in other strategies. In a few cases, where there are distinct and equally important components in an achievement standard, two tasks are specified: task A and task B.

The description of response does not necessarily include every criterion that should be considered. For example, a strategy designed to assess expressive performance or diversity of repertoire may not specify that the performance should be in tune and in rhythm. In any strategy requiring performance, however, pitch and rhythm, as well as other elements of performance that are emphasized in other assessment strategies, are understood to be valid considerations even though they are not explicitly mentioned.

Although the expectation is not always explicit, achievement at the proficient level is intended to imply meeting all of the criteria for the basic level and meeting additional criteria as well. Similarly, achievement at the advanced level is intended to imply meeting all of the criteria for the proficient level and meeting additional criteria as well.

A student is expected to meet all of the numbered criteria for a given level (i.e., basic, proficient, or advanced) before he or she is considered to have achieved that level. If, for example, a student meets all of the criteria for the proficient level and most, though not all, of the criteria for the advanced level, he or she is considered to have met only the proficient level until the remaining criteria for the advanced level have been met. The assessor, however, is assumed to have discretion in this matter, particularly when the most important criteria have been met and when the student demonstrates major strengths that are not addressed directly in the criteria. Some of the criteria are obviously more important than others and they are not intended to be weighted equally.
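The default rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `achieved_level` and the dictionary layout (one list of true/false results per level) are hypothetical, and the sketch deliberately ignores the assessor's discretion and the unequal importance of criteria mentioned in the same paragraph.

```python
# Levels in ascending order, as named in the performance standards.
LEVELS = ["basic", "proficient", "advanced"]


def achieved_level(criteria_met):
    """Return the highest level whose criteria, and those of every lower
    level, are all met. Return None if the basic level is not met.

    criteria_met is a hypothetical record such as
    {"basic": [True, True], "proficient": [True, False]},
    with one True/False entry per numbered criterion at each level.
    """
    achieved = None
    for level in LEVELS:
        results = criteria_met.get(level, [])
        # A level counts only if it has criteria and every one is met.
        if results and all(results):
            achieved = level
        else:
            break  # an unmet criterion blocks this level and all higher ones
    return achieved


# A student who meets every proficient criterion and most, but not all,
# advanced criteria is recorded at the proficient level.
student = {
    "basic": [True, True],
    "proficient": [True, True, True],
    "advanced": [True, False, True],
}
print(achieved_level(student))
```

An assessor exercising the discretion described above would treat this result as a starting point, not a final judgment.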

Within each numbered criterion, a student response may meet some expectations but not others. If the purpose of the assessment is diagnostic or analytical, this detailed information may be helpful. If the purpose of the assessment is to draw generalizations, the assessor must judge whether or not, all things considered, the student can be said to have met the essence of the criterion. Again, some expectations within each criterion are more important than others, and they are not intended to be weighted equally. Further, there are many possible ways in which the various characteristics of student responses may be combined other than those addressed specifically in the descriptions of response. Here, too, the assessor must judge which level of achievement, all things considered, the student response most closely approximates.

Some of the criteria cited are irrelevant in certain circumstances. For example, if the student is playing a keyboard instrument, it is not relevant to judge pitch. Tone quality is not a relevant criterion in assessing performance on some mallet percussion instruments, nor is it possible to sustain tones for their full value if the instrument the student chooses is a guitar.

The procedures described in these performance standards allow teachers considerable flexibility in administering assessment strategies and in interpreting results. It is important to remember that teachers can affect the results to a very considerable extent, not only through the judgments they make of student responses but also by the selection of assessment materials that are complex or simple, familiar or unfamiliar, difficult or easy. Any changes or inconsistencies in the assessment or the scoring procedures reduce the reliability of the assessment. Allowing prompts or suggestions by teachers also has that effect, although the impact may be lessened if the intent is to help all students equally.

The sample assessment strategies include no examples of multiple-choice, matching, or other objective tests. These techniques, however, can be used efficiently to assess many aspects of the notation and terminology skills, listening skills, and cognitive learning called for in, for example, standards 5, 6, 8, and 9.

The assessment strategies suggested here tend not to be specific or detailed enough to ensure high reliability, though teachers can increase reliability by applying them in a consistent manner for all students. Some teachers believe that minor losses in reliability may be acceptable in low-stakes assessment when there is an opportunity to teach and assess simultaneously. However, when results are to be compared across schools or districts, it is necessary to establish strict procedures that ensure uniformity in administering and scoring assessment exercises.

The voluntary national standards for music say nothing about how they are to be achieved. That is left to the states, local districts, and individual teachers. Because assessment procedures must be based on instructional procedures, differences in assessment procedures are expected, as well as differences in methodology. Teachers should feel free to devise alternative assessment procedures that will work in their situations. Regardless of the different approaches taken by their teachers, however, students should be expected to achieve the skills and knowledge called for in the standards for the specified grade levels.

The performance standards, like the voluntary national content and achievement standards, are intended for use with all students, including students with disabilities and students with limited English proficiency. In some cases, special accommodations may be necessary. In other cases, it may be impossible for students to meet the expectations set forth. But insofar as possible, the goals of quality and equity should be pursued with equal vigor for all students.

Assessment Procedures

Assessment procedures should be designed to provide students with an opportunity to demonstrate their capabilities in a fair and accurate manner in an authentic setting that is integrated into the instructional process insofar as possible. As in any well-managed classroom, there should be a nurturing environment in which all students are motivated to do their best and to focus on the task at hand. They should not fear negative consequences for any performance above or below that of the group as a whole.

The administration of assessment strategies requires fairness and consistency. Every student should be given the same opportunity for success. Instructions should be clear and understandable. All musical examples should be reproduced with good fidelity and should be clearly audible to every student. Each student should have the same amount of time. All necessary materials, instruments, and equipment should be available and in good working order. The environment should be free of extraneous noise, distractions, and interruptions. The teacher should make every effort to ensure that the student is as comfortable as possible.

Students must be able and willing to perform the tasks called for in the assessment strategies. They must have been involved in similar tasks frequently during the instructional process. They must be motivated. Some of the assessment strategies suggested in this publication call for behaviors that some teachers might think their students are unwilling to engage in individually. This should not be true if the students have had sufficient prior experience. When the assessment task is presented within a context that is familiar to the student, it should not cause undue anxiety or concern.

Prior to administering any assessment strategies, the assessor must establish expectations for judging success. If students are to be assigned to one of the three levels of achievement used in these performance standards (basic, proficient, and advanced), the assessor must determine clearly what behaviors correspond to each level. Written descriptions for the various levels of achievement are a useful first step, but to achieve satisfactory reliability in scoring assessment exercises, it will usually be necessary for written descriptions to be supplemented by sample student responses for each task at the various levels.

For those strategies involving music performance or improvisation, the samples, or exemplars, should consist of tape recordings by students representing the basic, proficient, and advanced levels for each assessment task. For strategies involving composition, the samples should be student compositions representing each level. For other strategies, they should be samples of students’ written responses. In judging the responses obtained in the assessment, the sample responses serve as illustrations of the benchmark responses described in this publication. They help to ensure that all students are assessed by the same standards.

In the interest of fairness, accurate and well-kept records of student assessment are necessary. Such records are also necessary for answering any subsequent questions from parents or school administrators concerning the bases for the student’s placement or grade. Results of assessment may be recorded by means as simple as a list of students on a clipboard with a brief checklist or rating scale beside each name: the teacher quickly enters a check to indicate the performance of that student. Or results may be entered directly into a computer database. In any case, the recording of results should be done quickly and accurately. (One of the most promising recent developments in assessment is the use of handheld electronic devices that make it possible for the teacher to move around the classroom and record immediately a score or rating for each student. The results are later downloaded and compiled.)

Some assessment is most easily carried out with only the teacher and the student present, though this may be difficult or impossible in many elementary school general music classes and other large groups. Prior experience on the part of the student in working alone with the teacher will help to ensure that the student is comfortable performing in this setting.

Ideally, when the assessment strategy calls for the student to sing, play instruments, or move, the student’s response should be audiotaped or videotaped for subsequent scoring. That allows the scorer to better control the conditions under which the scoring is done and makes possible subsequent confirmation of the scoring if desired.

Some assessment strategies call for recording the student’s performance during a rehearsal. That can be accomplished by using neck microphones and multiple tape recorders or a large, multichannel tape recorder. It may also be accomplished by using multiple small handheld tape recorders or by having the teacher move around the room listening to each student. Teachers may assist one another in assessing their students’ performances, and students may assist teachers in making the tapes. Students unaccustomed to these procedures may be uncomfortable at first, but when a procedure becomes routine, it will no longer arouse anxiety. Students may also record their own performances at home or in a practice room.

Reporting Assessment Results

Assessment results may be reported in any number of ways. The question, “How well is the student doing with respect to this standard?” may be answered either by a single score or by a profile showing the student’s progress on various assessment tasks related to the standard. Similarly, the larger question, “How well is the student doing in music?” may be answered either by a single score or by a profile showing the student’s progress with respect to various standards.

The purpose of a student profile is to identify and display both strengths and aspects needing improvement. A student may perform at the basic level with regard to one criterion and at the proficient level or the advanced level with regard to other criteria for the same assessment task. If the purpose of the assessment is to plan effective follow-up instruction, the most helpful reporting format may be a detailed profile. If, on the other hand, the purpose of the assessment is to generalize about the student’s achievement, the most helpful reporting format may be a single, holistic score.

Scores may be combined mathematically in a variety of ways. For example, the teacher may assign a score of “1” for any assessment strategy in which the student meets the basic level, a score of “2” for any strategy in which the student meets the proficient level, and a score of “3” for any strategy in which the student meets the advanced level. It would then be possible to report the student’s progress toward meeting a given content or achievement standard by calculating a mean score for all of the assessment strategies related to that standard. As many or as few strategies as desired could be included. It may be important to assign varying weights to the strategies in making the calculation because usually some assessment strategies are more important than others.
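The calculation described above might look like the following minimal Python sketch. The 1, 2, 3 scoring comes from the text; the function name `weighted_mean_score`, the input layout, and the example weights are assumptions made for illustration. The text assigns no score to a response below the basic level, so the sketch assumes every strategy included was met at the basic level or higher.

```python
# Scores per level, as suggested in the text.
LEVEL_SCORES = {"basic": 1, "proficient": 2, "advanced": 3}


def weighted_mean_score(results):
    """Compute a weighted mean over the strategies related to one standard.

    results is a list of (level, weight) pairs, one per assessment
    strategy. Weights are the teacher's own judgment of importance;
    using 1 for every weight gives the plain mean.
    """
    total_weight = sum(weight for _, weight in results)
    if total_weight <= 0:
        raise ValueError("at least one strategy with a positive weight is required")
    weighted_sum = sum(LEVEL_SCORES[level] * weight for level, weight in results)
    return weighted_sum / total_weight


# Three strategies for one standard; the third is judged twice as important.
results = [("basic", 1), ("proficient", 1), ("advanced", 2)]
print(weighted_mean_score(results))  # (1*1 + 2*1 + 3*2) / 4 = 2.25
```

A mean of this kind is meaningful only if the levels represent comparable achievement across the strategies being combined.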

Assessment results for individuals can be combined to create either a profile or a holistic score for the class. If assessment tasks have been administered and scored in the same manner for every student, the results can be further combined to form a profile or a holistic score for the school district, the state, or the nation. Such composite results may be of great interest to parents, school administrators, and the public. Results showing how well students are doing with respect to each standard can be helpful to students and teachers by confirming successes and suggesting where additional work is needed.

Time Constraints

Assessment takes time. Some of the assessment strategies suggested here may appear to require more time than is available. If so, there are ways in which the amount of time required can be reduced. Some of these timesaving techniques will likely result in a loss of reliability, but a slightly less reliable assessment may be better than no assessment at all. For example, teachers may save time by checking fewer samples of the work of each student; they may assess less frequently; they need not listen to an entire piece, but may stop the student as soon as his or her level of achievement becomes clear; they may divide the class into small groups and assess several individuals simultaneously; or when the purpose of assessment is to draw inferences about the group, they may assess a random sample of individuals from the group.
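The last timesaving technique, assessing a random sample when the goal is to draw inferences about the group, can be sketched with Python's standard `random` module. The function name `sample_students`, the roster, and the sample size are hypothetical; how large a sample is adequate is a judgment the text leaves to the teacher.

```python
import random


def sample_students(roster, k, seed=None):
    """Pick k distinct students at random from the roster.

    Passing a seed makes the selection repeatable, which can help when
    the choice needs to be documented or checked later.
    """
    if k > len(roster):
        raise ValueError("sample size cannot exceed the size of the roster")
    rng = random.Random(seed)
    return rng.sample(roster, k)


# Hypothetical class of 30; assess 8 students chosen at random.
roster = [f"student_{i}" for i in range(1, 31)]
print(sample_students(roster, 8, seed=1))
```

Because every student has the same chance of being chosen, results from the sample can be used to generalize about the group, though not about any individual who was not assessed.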

In some cases, if the task is described appropriately to the student, a single assessment strategy may be used to assess progress toward two or more achievement standards. For example, the same performances or tapes may be used to assess two standards. Similarly, the same works of music may be used for several instructional and assessment purposes: a single movement of a Mozart symphony, for example, can be used to teach (and to assess) many things.

Normally, the student should be given sufficient time to complete every task. In some tasks speed is necessary, but usually it is more important to know whether the student can complete the task than to know how quickly he or she can complete it.

Students may be taught to assess tapes of their own performances or the performances of other students. Assessment by students will likely be less reliable than assessment by teachers, but the ability to assess one’s own work is an important outcome of education. Teaching self-assessment is a particularly important aspect of the instructional process.

Assessment of the performance skills of individual students is an important aspect of music assessment. Individual assessment is more time-consuming and labor-intensive than assessment in groups, but it is often necessary. When faced with the practical difficulties of individual assessment in singing and playing instruments, states or school districts sometimes give up and limit their music assessment to those skills assessable by paper-and-pencil testing. Any comprehensive assessment of music learning must include assessing the ability to perform and create music as well as the ability to perceive and analyze it. Such assessment is worth doing, and it is worth doing well.


In some of the assessment strategies, it is difficult to make meaningful distinctions between the basic and proficient levels or, in other cases, between the proficient and advanced levels. For a few strategies, no meaningful distinction between proficient and advanced levels is identifiable unless the strategy is repeated with more complex materials and the student provides a more sophisticated response.

There may be some strategies for which more than three meaningful levels could be constructed. Or there may be very simple strategies for which there is no meaningful basic level: the student can either demonstrate the competence or cannot. Assessment procedures using more or fewer than three levels of response are legitimate, of course, although adjustments must be made when procedures using different numbers of levels of response are combined to generate a student profile or a holistic score.

Further, it cannot be assumed that the various descriptions of responses represent comparable levels of achievement across assessment strategies. If, for example, the proficient level for one strategy does not represent a level of achievement comparable to that represented by the proficient level for another strategy, it would be spurious to calculate a mean score including the two values. These issues can be resolved only by further study.

In some cases, the ability of the student to meet performance standards may depend on the willingness or ability of the teacher or the school to provide appropriate learning experiences. For example, a student in a performing group will likely be unable to meet a standard with respect to diversity of repertoire or familiarity with major works unless the teacher selects suitable repertoire.

One of the most fundamental problems in establishing performance standards is the difficulty of describing differences in quality using words rather than examples. Differences in quantity can usually be described much more easily than differences in quality. But quality is usually more important than quantity. To help in judging quality, a range of sample student responses, or exemplars, should be assembled for use as illustrations of the benchmark responses provided in this publication. The sample responses should consist of tape recordings, compositions, and other written responses by students representing the basic, proficient, and advanced levels for each assessment task.


The standards movement has altered the landscape of education substantially by bringing assessment to the center of the stage and giving it high visibility. When the curriculum is based on activities students engage in, meaningful assessment is often impossible. When the curriculum is based on standards students are expected to meet, assessment becomes possible. But standards do more than make assessment possible: they make it necessary.

The time has come to take assessment seriously. Music educators can no longer be ambivalent toward assessment. Developing and implementing standards-based curricula and finding effective ways to assess student learning in music may be the supreme challenges facing music education at the end of the twentieth century.
