Running Head: RATING DIFFERENCES IN MULTI-RATER FEEDBACK
Rating Differences in Multi-Rater Feedback: A New Look at an Old Issue
Abstract

"Three hundred and sixty-degree feedback" ("360") is the popular name for performance feedback collected from multiple raters. In the typical 360 process, supervisor(s), subordinates, peers, and (less frequently) internal or external customers provide feedback on performance for each target ratee, using some type of standardized instrument (London & Smither, in press; Tornow, 1993a). The ratee then is expected to use the data, along with his/her self-ratings, to make appropriate behavioral changes to improve performance (London & Smither, in press).

Multi-rater feedback has assumed a substantial role in U.S. organizations in the last decade (O'Reilly, 1994). In spite of its rapid growth, however, much remains to be learned about its application and interpretation (London & Smither, in press). While there have been a number of studies documenting the lack of agreement among rater groups (London, 1995), few have focused at the level of dimensions rather than overall ratings. Additionally, few have attempted to determine why a lack of agreement exists (Cardy & Dobbins, 1994). The purpose of this research, therefore, is to contribute to a broader understanding of 360 and rating differences among different rater groups.

This paper will begin by exploring the 360 process. The research on the level of agreement that has been exhibited among rater groups will be presented. Finally, a study to explore rater agreement at the dimension level will be reviewed.

360 is a complex, multi-step process. Based upon a literature review, a model (Figure 1) was developed to illustrate key components of the 360 process, and to provide structure to the broad spectrum of literatures with relevance to 360. This process model will now be explored.

Purpose. The first component in Figure 1's process model is the purpose for which multi-rater feedback is to be used. 360 has been used by researchers and practitioners to address a variety of individual and organizational goals. They include:

1. To improve the subjective measurement of performance. Although supervisory ratings are widely used for measuring performance, they are subject to a variety of intentional and unintentional errors (Cardy & Dobbins, 1994). 360 has

Abstract

Understanding rater disagreement in multi-rater (360) feedback efforts is important to both scientists and practitioners. This study used structural equations modeling to test for the presence of (a) construct definition differences and (b) rating scale point differences in each of sixteen performance dimensions. Data used were from the Center for Creative Leadership's "Benchmarks®" …
Similar resources
A Study of Raters’ Behavior in Scoring L2 Speaking Performance: Using Rater Discussion as a Training Tool
The studies conducted so far on the effectiveness of resolution methods including the discussion method in resolving discrepancies in rating have yielded mixed results. What is left unnoticed in the literature is the potential of discussion to be used as a training tool rather than a resolution method. The present study addresses this research gap by exploring the data coming from rating behavi...
Many-Facet Rasch Measurement
This chapter provides an introductory overview of many-facet Rasch measurement (MFRM). Broadly speaking, MFRM refers to a class of measurement models that extend the basic Rasch model by incorporating more variables (or facets) than the two that are typically included in a test (i.e., examinees and items), such as raters, scoring criteria, and tasks. Throughout the chapter, a sample of rating d...
Rating leniency and halo in multisource feedback ratings: testing cultural assumptions of power distance and individualism-collectivism.
This study extends multisource feedback research by assessing the effects of rater source and raters' cultural value orientations on rating bias (leniency and halo). Using a motivational perspective of performance appraisal, the authors posit that subordinate raters followed by peers will exhibit more rating bias than superiors. More important, given that multisource feedback systems were premi...
Iranian Non-native English Speaking Teachers’ Rating Criteria Regarding the Speech Act of Compliment: An Investigation of Teachers’ Variables
Among topics in the field of pragmatics, some seem to be in a more rigorous need of investigation. Pragmatic assessment and specifically the issue of pragmatic rating are among issues which deserve more thorough consideration. The purpose of this study was to examine rater criteria and its consistency and variability in the assessment of Iranian EFL learners’ production of compliments based on ...
Toward Using Text Summarization for Essay-Based Feedback
We empirically study the impact of using automatically generated summaries in the context of electronic essay rating. Our results indicate that 40% and 60% discourse-based essay summaries improve the performance of the topical analysis module of e-rater. E-rater is a system that electronically scores GMAT essays. We envision using automatically generated essay summaries for instructional feedba...