Therefore, the joint probability of agreement will remain high even in the absence of any "intrinsic" agreement among the raters. A useful inter-rater reliability coefficient is expected (a) to be close to 0 when there is no "intrinsic" agreement and (b) to increase as the "intrinsic" agreement rate improves. Most chance-corrected agreement coefficients achieve the first objective; the second, however, is not achieved by many well-known chance-corrected measures.[4]

Bland and Altman[15] expanded on this idea by plotting, for each item, the difference between the two ratings, the mean difference, and the limits of agreement on the vertical axis against the average of the two ratings on the horizontal axis. The resulting Bland-Altman plot shows not only the overall degree of agreement but also whether the agreement is related to the underlying value of the item. For instance, two raters might agree closely when estimating the size of small items but disagree about larger ones.

Krippendorff's alpha[16][17] is a versatile statistic that assesses the agreement achieved among observers who categorize, evaluate, or measure a given set of objects in terms of the values of a variable. It generalizes several specialized agreement coefficients: it accepts any number of observers, applies to nominal, ordinal, interval, and ratio levels of measurement, handles missing data, and is corrected for small sample sizes.
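As an illustration, the sketch below computes Krippendorff's alpha for the nominal case. It assumes ratings are supplied as one list of observed values per unit, with missing ratings simply left out; the function name krippendorff_alpha_nominal is illustrative and not taken from any particular library.

```python
from collections import Counter

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data (illustrative sketch).

    `units` is a list of units of analysis; each unit is the list of values
    the observers assigned to it, with missing ratings simply left out.
    """
    # Only units with at least two ratings contribute pairable values.
    units = [u for u in units if len(u) >= 2]

    # Coincidence counts o[(c, k)]: ordered pairs of values found within the
    # same unit, each unit weighted by 1 / (m_u - 1).
    o = Counter()
    for u in units:
        m_u = len(u)
        counts = Counter(u)
        for c in counts:
            for k in counts:
                pairs = counts[c] * (counts[c] - 1) if c == k else counts[c] * counts[k]
                o[(c, k)] += pairs / (m_u - 1)

    # Marginal totals per category and total number of pairable values.
    n_c = Counter()
    for (c, _k), v in o.items():
        n_c[c] += v
    n = sum(n_c.values())

    # Nominal metric: any pair of differing categories counts as disagreement.
    d_obs = sum(v for (c, k), v in o.items() if c != k) / n
    d_exp = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n * (n - 1))
    return 1.0 - d_obs / d_exp

# Four units rated by up to three observers; the last unit has one missing rating.
print(krippendorff_alpha_nominal([[1, 1, 1], [2, 2, 2], [3, 3, 1], [2, 2]]))
```

Ordinal, interval, and ratio variants replace the 0/1 disagreement in the last step with a distance function appropriate to the level of measurement.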

In the absence of scoring guidelines, ratings are increasingly affected by experimenter's drift, that is, a tendency of the ratings to drift towards what the rater expects. In processes involving repeated measurements, rater drift can be addressed by periodic retraining to ensure that raters understand the guidelines and the measurement goals.

There are several operational definitions of "inter-rater reliability", reflecting different views of what constitutes reliable agreement between raters.[1] There are three operational definitions of agreement. The joint probability of agreement is the simplest and least robust of the measures: it is estimated as the percentage of the time the raters agree in a nominal or categorical rating system, and it ignores the fact that agreement may happen solely by chance. There is some question as to whether chance agreement needs to be "corrected" at all; some suggest that, in any case, any such adjustment should be based on an explicit model of how chance and error affect raters' decisions.[3]

Later extensions of the approach included versions that could handle "partial credit" and ordinal scales.[7] These extensions converge with the family of intra-class correlations (ICCs), giving a conceptually related way of estimating reliability at each level of measurement, from nominal (kappa) to ordinal (ordinal kappa or ICC) to interval (ICC or ordinal kappa) and ratio (ICC).
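To make the contrast concrete, the following sketch places the joint probability of agreement next to Cohen's kappa, a standard chance-corrected coefficient for two raters on a nominal scale. The function names and the toy data are illustrative only.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Joint probability of agreement: share of items labelled identically."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement for two raters on a nominal scale."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    # Expected agreement if the two raters labelled items independently,
    # each following their own marginal category frequencies.
    marg_a, marg_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(marg_a[c] * marg_b[c] for c in marg_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two raters assigning one of three categories to ten items.
a = ["x", "x", "y", "y", "z", "x", "y", "z", "z", "x"]
b = ["x", "y", "y", "y", "z", "x", "x", "z", "z", "x"]
print(percent_agreement(a, b))  # raw agreement
print(cohens_kappa(a, b))       # lower, once chance agreement is removed
```

Kappa is 0 when the observed agreement is exactly what the raters' marginal distributions would produce by chance, which is the first property required of a useful coefficient above.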