
participants fell outside the acceptable range (4-8), which meant that they had to be left out of the study. They were not told about this, however, and went through the remaining steps without their performance affecting the study.
In addition to these seven, some twenty-five subjects failed to take part in all the required stages of the experiment. Since the experiment period was relatively long (two academic semesters, i.e. one academic year), some left one or the other of the two courses unfinished, some were excessively absent during the terms, some failed to turn up at the scheduled times for the tests, etc. As a result, the attrition rate was high, and seventy participants remained in the study: thirty-five in the control group and thirty-five in the experimental group.

Table 4.1 GE Test Scores for Experimental Subjects
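| Subject | Score | Subject | Score |
|---|---|---|---|
| S1 | 6 | S19 | 5.5 |
| S2 | 5 | S20 | 5 |
| S3 | 4.5 | S21 | 7 |
| S4 | 4.5 | S22 | 6.5 |
| S5 | 5.5 | S23 | 4 |
| S6 | 5 | S24 | 4.5 |
| S7 | 6 | S25 | 4 |
| S8 | 4 | S26 | 6.5 |
| S9 | 5 | S27 | 6 |
| S10 | 5 | S28 | 5 |
| S11 | 4.5 | S29 | 6.5 |
| S12 | 7 | S30 | 4.5 |
| S13 | 5.5 | S31 | 5.5 |
| S14 | 6.5 | S32 | 6.5 |
| S15 | 5 | S33 | 6.5 |
| S16 | 4.5 | S34 | 6 |
| S17 | 6.5 | S35 | 6.5 |
| S18 | 5.5 | | |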

Table 4.2 GE Test Scores for Control Subjects
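| Subject | Score | Subject | Score |
|---|---|---|---|
| S1 | 5.5 | S19 | 6 |
| S2 | 5.5 | S20 | 4 |
| S3 | 4.5 | S21 | 5 |
| S4 | 4 | S22 | 4.5 |
| S5 | 6 | S23 | 4 |
| S6 | 7 | S24 | 4.5 |
| S7 | 5.5 | S25 | 5 |
| S8 | 4 | S26 | 4.5 |
| S9 | 4.5 | S27 | 5 |
| S10 | 4 | S28 | 7 |
| S11 | 4 | S29 | 6 |
| S12 | 6 | S30 | 6 |
| S13 | 4 | S31 | 6 |
| S14 | 4.5 | S32 | 5.5 |
| S15 | 6.5 | S33 | 6 |
| S16 | 5.5 | S34 | 6.5 |
| S17 | 5 | S35 | 6 |
| S18 | 7 | | |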

Tables 4.1 and 4.2 present the GE test scores obtained by the participants in the experimental and control groups respectively. It can be seen from the tables that while the minimum score and maximum score for both groups were the same (minimum = 4 and maximum = 7), the mean for the scores obtained by experimental subjects (5.47) turned out to be slightly higher than that of the control subjects (5.27). The negligible difference between the two groups indicates that the mean values were close enough to ensure homogeneity among the participants in the study.
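
The descriptive figures quoted above can be re-derived from Tables 4.1 and 4.2. The following is a minimal sketch in Python; the variable and function names are illustrative and not part of the original study, and the scores are transcribed from the two tables in subject order (S1 to S35).

```python
from statistics import mean

# GE test scores transcribed from Table 4.1 (experimental) and
# Table 4.2 (control), listed in subject order S1..S35.
experimental = [6, 5, 4.5, 4.5, 5.5, 5, 6, 4, 5, 5, 4.5, 7, 5.5, 6.5, 5, 4.5, 6.5, 5.5,
                5.5, 5, 7, 6.5, 4, 4.5, 4, 6.5, 6, 5, 6.5, 4.5, 5.5, 6.5, 6.5, 6, 6.5]
control = [5.5, 5.5, 4.5, 4, 6, 7, 5.5, 4, 4.5, 4, 4, 6, 4, 4.5, 6.5, 5.5, 5, 7,
           6, 4, 5, 4.5, 4, 4.5, 5, 4.5, 5, 7, 6, 6, 6, 5.5, 6, 6.5, 6]


def describe(scores):
    """Return (n, minimum, maximum, mean rounded to two decimals)."""
    return len(scores), min(scores), max(scores), round(mean(scores), 2)


print("experimental:", describe(experimental))
print("control:", describe(control))
```

Running the sketch gives a quick check that both groups share the same score range and that the two means differ only in the first decimal place.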

4.3 SI Test Scores
The present study primarily focused on how interpreting trainees could handle the task of simultaneous interpreting before and after the experiment. In this section, the test results will be presented, appropriate statistical procedures will be applied, and the findings will be discussed.
Before any statistical procedures could be applied to the SI test scores, the question of inter-rater reliability had to be addressed.

4.3.1 Inter-Rater Reliability
As stated earlier, to make sure the pretest and posttest scores could be relied on, three raters were asked to score the subjects’ performance on the SI tests. There were one hundred and forty SI sessions (seventy subjects, each taking a pretest and a posttest) to be scored by three judges, producing a total of four hundred and twenty scores (140 × 3), which were tabulated for analysis. Tables 4.3, 4.4, 4.5, and 4.6 below show all the scores given by the three judges to the SI sessions of the experimental and control groups in the pretest and posttest.

Table 4.3 Three Raters’ Scores for Control Subjects on SI Pretest
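| Subject | R1 | R2 | R3 | Subject | R1 | R2 | R3 |
|---|---|---|---|---|---|---|---|
| S1 | 15 | 5 | 10 | S19 | 70 | 75 | 55 |
| S2 | 20 | 15 | 10 | S20 | 5 | 5 | 10 |
| S3 | 5 | 10 | 5 | S21 | 25 | 20 | 10 |
| S4 | 10 | 5 | 5 | S22 | 15 | 10 | 20 |
| S5 | 15 | 25 | 30 | S23 | 10 | 5 | 5 |
| S6 | 35 | 25 | 15 | S24 | 5 | 15 | 10 |
| S7 | 25 | 15 | 5 | S25 | 15 | 10 | 5 |
| S8 | 10 | 5 | 5 | S26 | 10 | 5 | 15 |
| S9 | 15 | 5 | 5 | S27 | 10 | 5 | 5 |
| S10 | 5 | 10 | 5 | S28 | 30 | 45 | 35 |
| S11 | 5 | 10 | 5 | S29 | 45 | 35 | 40 |
| S12 | 30 | 15 | 20 | S30 | 50 | 45 | 35 |
| S13 | 5 | 10 | 5 | S31 | 5 | 15 | 20 |
| S14 | 5 | 10 | 5 | S32 | 5 | 15 | 10 |
| S15 | 35 | 20 | 30 | S33 | 5 | 15 | 20 |
| S16 | 25 | 35 | 15 | S34 | 40 | 30 | 20 |
| S17 | 20 | 15 | 35 | S35 | 5 | 10 | 15 |
| S18 | 30 | 20 | 45 | | | | |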

Table 4.4 Three Raters’ Scores for Experimental Subjects on SI Pretest
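| Subject | R1 | R2 | R3 | Subject | R1 | R2 | R3 |
|---|---|---|---|---|---|---|---|
| S1 | 20 | 20 | 25 | S19 | 15 | 10 | 5 |
| S2 | 10 | 5 | 10 | S20 | 5 | 5 | 5 |
| S3 | 10 | 5 | 5 | S21 | 65 | 50 | 45 |
| S4 | 5 | 10 | 10 | S22 | 55 | 65 | 75 |
| S5 | 10 | 15 | 5 | S23 | 5 | 5 | 5 |
| S6 | 5 | 10 | 5 | S24 | 5 | 5 | 5 |
| S7 | 20 | 15 | 40 | S25 | 10 | 5 | 5 |
| S8 | 10 | 5 | 5 | S26 | 45 | 25 | 35 |
| S9 | 15 | 25 | 30 | S27 | 25 | 20 | 10 |
| S10 | 30 | 20 | 10 | S28 | 20 | 15 | 30 |
| S11 | 10 | 15 | 15 | S29 | 30 | 40 | 55 |
| S12 | 10 | 15 | 5 | S30 | 5 | 5 | 10 |
| S13 | 10 | 10 | 15 | S31 | 45 | 30 | 25 |
| S14 | 35 | 15 | 25 | S32 | 50 | 25 | 30 |
| S15 | 20 | 15 | 15 | S33 | 35 | 45 | 20 |
| S16 | 10 | 5 | 15 | S34 | 35 | 15 | 40 |
| S17 | 40 | 35 | 20 | S35 | 45 | 20 | 30 |
| S18 | 5 | 10 | 5 | | | | |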

Table 4.5 Three Raters’ Scores for Control Subjects on SI Posttest
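| Subject | R1 | R2 | R3 | Subject | R1 | R2 | R3 |
|---|---|---|---|---|---|---|---|
| S1 | 15 | 25 | 35 | S19 | 80 | 75 | 80 |
| S2 | 45 | 35 | 20 | S20 | 5 | 5 | 10 |
| S3 | 15 | 10 | 25 | S21 | 20 | 10 | 15 |
| S4 | 10 | 5 | 15 | S22 | 10 | 15 | 25 |
| S5 | 35 | 40 | 25 | S23 | 10 | 5 | 5 |
| S6 | 35 | 55 | 40 | S24 | 5 | 10 | 5 |
| S7 | 15 | 15 | 25 | S25 | 25 | 10 | 10 |
| S8 | 5 | 10 | 5 | S26 | 15 | 10 | 20 |
| S9 | 15 | 10 | 15 | S27 | 10 | 5 | 15 |
| S10 | 10 | 5 | 5 | S28 | 55 | 45 | 40 |
| S11 | 5 | 10 | 5 | S29 | 55 | 60 | 40 |
| S12 | 25 | 35 | 40 | S30 | 70 | 55 | 65 |
| S13 | 10 | 5 | 5 | S31 | 45 | 20 | 35 |
| S14 | 5 | 10 | 5 | S32 | 45 | 35 | 25 |
| S15 | 30 | 45 | 35 | S33 | 30 | 40 | 55 |
| S16 | 45 | 30 | 25 | S34 | 95 | 65 | 80 |
| S17 | 35 | 25 | 40 | S35 | 55 | 40 | 30 |
| S18 | 50 | 35 | 45 | | | | |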

Table 4.6 Three Raters’ Scores for Experimental Subjects on SI Posttest
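| Subject | R1 | R2 | R3 | Subject | R1 | R2 | R3 |
|---|---|---|---|---|---|---|---|
| S1 | 60 | 50 | 70 | S19 | 50 | 60 | 45 |
| S2 | 30 | 45 | 50 | S20 | 20 | 35 | 25 |
| S3 | 50 | 40 | 35 | S21 | 60 | 55 | 75 |
| S4 | 40 | 30 | 35 | S22 | 80 | 60 | 70 |
| S5 | 45 | 40 | 60 | S23 | 20 | 30 | 25 |
| S6 | 50 | 35 | 40 | S24 | 30 | 25 | 20 |
| S7 | 80 | 70 | 60 | S25 | 30 | 25 | 20 |
| S8 | 30 | 35 | 25 | S26 | 70 | 60 | 80 |
| S9 | 40 | 55 | 35 | S27 | 40 | 50 | 35 |
| S10 | 70 | 55 | 65 | S28 | 60 | 50 | 75 |
| S11 | 50 | 75 | 60 | S29 | 55 | 60 | 75 |
| S12 | 45 | 55 | 60 | S30 | 30 | 25 | 20 |
| S13 | 40 | 30 | 50 | S31 | 65 | 70 | 75 |
| S14 | 75 | 60 | 70 | S32 | 55 | 60 | 70 |
| S15 | 35 | 45 | 30 | S33 | 55 | 65 | 70 |
| S16 | 30 | 40 | 50 | S34 | 55 | 65 | 60 |
| S17 | 75 | 65 | 70 | S35 | 55 | 65 | 70 |
| S18 | 25 | 35 | 20 | | | | |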

Since every subject’s final score was the average of the three raters’ scores, we needed to check that the raters were consistent in scoring the tests. To this end, the following formula (Hatch & Lazaraton, 1991, p. 533) was used:

$$ r_{tt} = \frac{n \, r_{ABC}}{1 + (n - 1) \, r_{ABC}} $$

In the above formula, $r_{tt}$ stands for the reliability of all the judges’ scores, $n$ is the number of judges, and $r_{ABC}$ is the average correlation among the three raters.
To find the value of r_tt, we had to calculate the average correlation among our three raters. Table 4.7 presents the Pearson correlation matrix for the three raters:

Table 4.7 Pearson Correlation for Raters

| | R1 | R2 | R3 |
|---|---|---|---|
| R1 | | 0.883 | 0.860 |
| R2 | | | 0.878 |
| R3 | | | |

These correlation values needed to be corrected using the Fisher Z transformation table (Hatch & Lazaraton, 1991, p. 606).

Table 4.8 Z Transformation for Data

| | R1 | R2 | R3 |
|---|---|---|---|
| R1 | | 1.376 | 1.293 |
| R2 | | | 1.333 |
| R3 | | | |

The average for these correlations would be:

$$ \operatorname{Average}\left(r_{R_1 R_2},\, r_{R_1 R_3},\, r_{R_2 R_3}\right) = \frac{1.376 + 1.293 + 1.333}{3} = 1.334 $$
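
The table look-up and the averaging step can be checked numerically, since Fisher's z transformation is the inverse hyperbolic tangent of r. The sketch below is illustrative only: the names are not from the source, and the truncation of each correlation to two decimals before the transformation is an assumption, made because it reproduces the values printed in Table 4.8; the source does not describe how the look-up was done.

```python
import math

# Pairwise Pearson correlations from Table 4.7.
pearson = {("R1", "R2"): 0.883, ("R1", "R3"): 0.860, ("R2", "R3"): 0.878}


def fisher_z(r):
    """Fisher z transformation: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def truncate2(r):
    """Truncate a three-decimal correlation to two decimals.

    Assumption: a printed z table is indexed by two-decimal r values.
    Integer arithmetic is used to avoid floating-point floor surprises.
    """
    return (round(r * 1000) // 10) / 100


# z value for each rater pair, rounded to three decimals as in Table 4.8.
z_values = {pair: round(fisher_z(truncate2(r)), 3) for pair, r in pearson.items()}

# Average of the three transformed correlations.
average_z = round(sum(z_values.values()) / len(z_values), 3)

print(z_values)
print(average_z)
```

Applying the transformation to the untruncated correlations would give slightly larger z values, so the printed table entries are the ones carried forward here.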

Then we were able to calculate the overall reliability as follows:

$$ r_{tt} = \frac{n \, r_{ABC}}{1 + (n - 1) \, r_{ABC}} = \frac{3 \times 1.334}{1 + 2 \times 1.334} = \frac{4.002}{3.668} = 1.091 $$

Transformed back to a Pearson correlation, 1.091 corresponds to a value between 0.79 and 0.80. According to the table of critical values for the Pearson Product-Moment Correlation (Hatch & Lazaraton, 1991, p. 604), with the level of significance (p) set at 0.05 and more than 100 degrees of freedom (df, the number of pairs minus 2), the value of r must be equal to or greater than 0.1946. The value of r obtained for the scores was well above this threshold, indicating that the scores were reliable and that the raters were consistent in their scoring.
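
The reliability computation and the back-transformation can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the function names are illustrative, the input 1.334 is the average z value reported in the text, and the inverse Fisher transformation is taken to be the hyperbolic tangent (the source uses a printed table instead).

```python
import math


def pooled_reliability(n_raters, avg_corr):
    """Reliability of n raters' combined scores: n*r / (1 + (n - 1)*r)."""
    return (n_raters * avg_corr) / (1 + (n_raters - 1) * avg_corr)


def z_to_r(z):
    """Inverse Fisher transformation: r = tanh(z)."""
    return math.tanh(z)


average_z = 1.334                            # average z value reported in the text
r_tt_z = pooled_reliability(3, average_z)    # reliability on the z scale
r_tt = z_to_r(r_tt_z)                        # back on the Pearson r scale

critical_r = 0.1946  # critical value quoted in the text (p = 0.05, df > 100)

print(round(r_tt_z, 3))
print(round(r_tt, 3))
print(r_tt > critical_r)
```

The back-transformed value falls in the interval the text reports, and it exceeds the quoted critical value by a wide margin.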
To show how well the three raters’ scores for each SI session corresponded to one another, the following four figures are presented. They depict the extent of correspondence among the three judges’ scores schematically. Since a total of one hundred and forty SI sessions were rated by the judges, plotting them all in a single graph would be hard to read. Therefore, Figures 4.1, 4.2, 4.3, and 4.4 below depict, respectively, the results for the control subjects’ pretest scores, the experimental subjects’ pretest scores, the control subjects’ posttest scores, and the experimental subjects’ posttest scores.

Figure 4.1 Inter-Rater Reliability Diagram for Control Subjects’ Pretest Scores

Figure 4.2 Inter-Rater Reliability Diagram for Experimental Subjects’ Pretest Scores

Figure 4.3 Inter-Rater Reliability Diagram for Control Subjects’ Posttest Scores

Figure 4.4 Inter-Rater Reliability Diagram for Experimental Subjects’ Posttest Scores

In all four graphs above, it is easy to see that the three lines representing the three judges’ scores correspond closely and move, more or less, in the same direction, signifying a high level of inter-rater reliability. This lets us move on more confidently to the subsequent phases of statistical analysis. The average of the three scores was taken to be the final score
