Nine French laboratories took part in the experiments, together with 18 laboratories from other European countries. Two of the French laboratories were chosen at random and included with the laboratories from the other countries to give a "European" experiment involving 20 laboratories. The data from the nine French laboratories were treated as a separate "French" experiment, although this number of laboratories is too small to give reliable estimates of the precision of the test method.
The laboratories have been given numerical codes that will be used in all the cross-testing experiments in the current year of the programme. For the purposes of this report they have also been assigned letter-codes (because single-character codes are needed in the histograms and Mandel plots).
Samples of three materials were prepared and distributed by Partner 4. The three materials were chosen to give flakiness indices of approximately 10%, 30% and 50%. The materials used for the two lower levels of flakiness were production materials. The material used for the highest level of flakiness was a crushed flint gravel, and was taken from underneath the conveyor belt that carries the aggregate away from the crusher. (The flaky particles tend to stick to the underside of the conveyor belt, and are carried back a little way towards the crusher, whereas the more rounded particles are thrown out towards the middle of the stockpile.)
The samples were prepared, for each level of the experiment, as if they were laboratory samples all taken from one bulk sample, and the participants were required to prepare and test duplicate test portions from each sample. Hence the measures of repeatability and reproducibility given by the experiment are consistent with the definitions of r1 and R1 given above.
Fractional shovelling was used to divide each bulk sample into 40 laboratory samples. A bulk sample was tipped into a line (about ten metres long and half a metre wide), and 40 containers were placed around the line of aggregate. A small flat-bottomed shovel was used to scoop up aggregate from the end of the line, placing a shovelful into each container in turn, until laboratory samples of the required mass were obtained. The size of the shovel was chosen so that at least 30 shovelfuls went into each laboratory sample.
The same three materials were used in the cross-testing experiment involving the determination of the shape index that was carried out at the same time as the experiment reported here on the flakiness index test, and a number of the laboratories tested their samples by both methods. The results therefore allow a check to be carried out to see if fractional shovelling produced samples that did not differ between the laboratories. Figure 1 shows results obtained by the Shape Index test method plotted against results obtained by the Flakiness Index test method. Each point in this figure represents the results obtained by one laboratory on one laboratory sample at one level. (The code numbers for the laboratories are used in this figure because they are the same in the reports on the two experiments.)
When the results for the three levels are viewed together, it is possible to see a positive correlation between the Flakiness Index and Shape Index test results. If the samples prepared for a level of the experiment varied between laboratories, then it would also be possible to see evidence of a positive correlation between the two methods when looking at just the results for that level. However, this is not the case, so it appears that the method of fractional shovelling has produced satisfactory samples.
Figure 1. Comparison of results from the Flakiness Index and Shape Index tests.
(The numbers are numerical codes for the laboratories.)
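The check described above, a correlation across the levels taken together but none within a single level, can be sketched in Python as follows. The figures are invented for illustration and are not results from the experiment; the names `pearson`, `results_by_level`, `pooled_r` and `within_r` are likewise hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented (Flakiness Index, Shape Index) results, one pair per laboratory,
# grouped by level of the experiment. NOT data from the report.
results_by_level = {
    1: [(9.8, 12.1), (10.4, 11.5), (10.1, 12.4), (9.6, 11.9), (10.6, 12.0)],
    2: [(29.5, 33.0), (30.6, 32.2), (30.0, 34.1), (31.0, 33.1), (29.4, 33.1)],
    3: [(49.0, 55.4), (51.2, 54.8), (50.3, 56.3), (49.9, 54.5), (50.6, 55.5)],
}

# Correlation with all three levels viewed together.
pooled = [pair for pairs in results_by_level.values() for pair in pairs]
pooled_r = pearson([fi for fi, _ in pooled], [si for _, si in pooled])

# Correlation within each level separately.
within_r = {
    level: pearson([fi for fi, _ in pairs], [si for _, si in pairs])
    for level, pairs in results_by_level.items()
}

print(f"all levels together: r = {pooled_r:.3f}")
for level, r in within_r.items():
    print(f"level {level} only: r = {r:.3f}")
```

With data of this kind the pooled coefficient is close to 1 because the levels differ greatly, while the within-level coefficients stay small. A clearly positive within-level coefficient would suggest that the laboratory samples themselves differed.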
Where a participant failed to report a test result, the missing result is shown as "---" in the data tables.
The Flakiness Index test method requires test results to be rounded to the nearest whole number. However, for the purpose of the cross-testing experiment, the test results were recorded to the nearest 0.1%. This was to prevent rounding of the data affecting the assessment of the repeatability and reproducibility of the test method.
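The effect of rounding on the between-test-portion ranges can be illustrated with a short Python sketch. The duplicate results below are invented, and the helper `ranges` is a hypothetical name introduced only for this example.

```python
# Invented duplicate test results (%) for two laboratories. NOT report data.
duplicates = [(10.4, 10.6), (10.6, 11.4)]

def ranges(pairs, ndigits):
    """Between-test-portion range after rounding each result to ndigits places."""
    return [round(abs(round(a, ndigits) - round(b, ndigits)), 1) for a, b in pairs]

ranges_recorded = ranges(duplicates, 1)  # results recorded to the nearest 0.1%
ranges_rounded = ranges(duplicates, 0)   # results rounded to whole numbers

print(ranges_recorded)  # [0.2, 0.8]
print(ranges_rounded)   # [1.0, 0.0]
```

In this example rounding to whole numbers inflates a range of 0.2 to 1 and collapses a range of 0.8 to 0, which is the kind of distortion that recording to 0.1% is intended to avoid.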
Laboratory averages are used to calculate the reproducibility of the test method, and to assess laboratory biases. Between-test-portion ranges are used to calculate the repeatability of the test method, and to assess the repeatability of tests from individual laboratories. The averages and ranges are shown in the histograms, and the averages are plotted in the Mandel plots. Because only nine laboratories took part in the French experiment, these graphs show the points for the French laboratories (plotted as lower-case letters) superimposed on the points for the laboratories from the other countries in the European experiment. This is done to allow the French laboratories to see where their results fall in relation to the results from all the participants.
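As an illustration of how the averages and ranges feed into the precision estimates, the following Python sketch works through one level of an experiment. It assumes the usual ISO 5725-2 calculation for a balanced experiment with two test results per laboratory and no missing values; the data and all variable names are invented for the example and are not taken from this report.

```python
from math import sqrt
from statistics import mean, variance

# Invented duplicate results (%) for five laboratories at one level.
duplicates = {
    "A": (10.2, 10.6),
    "B": (9.8, 10.0),
    "C": (10.9, 10.5),
    "D": (10.1, 10.3),
    "E": (10.4, 10.2),
}

# Laboratory averages and between-test-portion ranges.
averages = {lab: mean(pair) for lab, pair in duplicates.items()}
lab_ranges = {lab: abs(pair[0] - pair[1]) for lab, pair in duplicates.items()}

p = len(duplicates)  # number of laboratories

# Repeatability variance: for duplicates each within-laboratory variance
# equals range**2 / 2, and these are averaged over the p laboratories.
var_r = sum(w ** 2 for w in lab_ranges.values()) / (2 * p)

# Between-laboratory variance: variance of the laboratory averages less the
# repeatability contribution (var_r / 2 for duplicates), floored at zero.
var_L = max(variance(averages.values()) - var_r / 2, 0.0)

# Reproducibility variance.
var_R = var_L + var_r

s_r, s_R = sqrt(var_r), sqrt(var_R)
print(f"repeatability s.d. = {s_r:.3f}, reproducibility s.d. = {s_R:.3f}")
```

The floor at zero covers the case where the scatter of the laboratory averages is no larger than repeatability alone would produce.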
The averages and ranges are also used to test for stragglers and outliers. Where these have been found, they are marked throughout with a single question mark (?) for a straggler and a double question mark (??) for an outlier.
Standardised values of the averages and ranges are shown in the Mandel plots. These figures are used to identify laboratories that give rise to large laboratory biases, or large between-test-portion ranges, in more than one level of an experiment. The horizontal broken lines in these graphs show the critical values of the "h" and "k" statistics at the 5% and 1% significance levels, taken from the revised ISO standard on precision (ISO 5725-2:1994, Accuracy (trueness and precision) of measurement methods and results, Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method).
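The standardised values can be sketched in Python as follows, assuming the definitions of Mandel's "h" and "k" statistics in ISO 5725-2 applied to one level with duplicate results. The data and names are invented for illustration. The critical values are not reproduced here; they should be read from the tables in the standard.

```python
from math import sqrt
from statistics import mean, stdev

# Invented duplicate results (%) for five laboratories at one level.
duplicates = {
    "A": (10.2, 10.6),
    "B": (9.8, 10.0),
    "C": (10.9, 10.5),
    "D": (10.1, 10.3),
    "E": (10.4, 10.2),
}

labs = list(duplicates)
averages = [mean(duplicates[lab]) for lab in labs]
lab_ranges = [abs(duplicates[lab][0] - duplicates[lab][1]) for lab in labs]
p = len(labs)

# h: deviation of each laboratory average from the mean of the averages,
# divided by the standard deviation of the averages (p - 1 divisor).
grand_mean = mean(averages)
sd_averages = stdev(averages)
h = {lab: (avg - grand_mean) / sd_averages for lab, avg in zip(labs, averages)}

# k: each within-laboratory standard deviation divided by the pooled
# (root-mean-square) value. For duplicates the standard deviation is
# range / sqrt(2), so the sqrt(2) cancels and the ranges can be used directly.
rms_range = sqrt(sum(w ** 2 for w in lab_ranges) / p)
k = {lab: w / rms_range for lab, w in zip(labs, lab_ranges)}

for lab in labs:
    print(f"{lab}: h = {h[lab]:+.2f}, k = {k[lab]:.2f}")
```

By construction the h values for a level sum to zero and the k values have a mean square of one, so a laboratory stands out only relative to the other participants at that level.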
It will be of interest to compare the precision of the Flakiness Index test with that of the Shape Index test. The repeatability and reproducibility standard deviations, or limits, for different test methods cannot be compared directly because they relate to different scales of measurement. Sensitivity ratios are dimensionless, so they do not suffer from this disadvantage, and they may be used to compare the precision of different tests. From the formula used to calculate them, it will be seen that they involve the average results for the materials used in the cross-testing experiments, so it is essential that different test methods are assessed using the same materials.