Checking our incoming raw-material quality often involves rating nonmetallic inclusions, particularly when heat treating to high hardnesses. We want to know that the steel we are heat treating is clean. Learn why obtaining consistent results is a challenge for more than one reason.


Over the years, ASTM Committee E-4 on Metallography has conducted interlaboratory test programs to evaluate the precision and bias associated with measurements of microstructure using proposed and existing test methods. ASTM decided in the late 1970s that all test methods that generated numerical data must have a precision and bias section defining the repeatability and reproducibility of the method.

Defining bias associated with a test method is difficult unless there is an absolute known value for the quantity being measured, and this is not possible when microstructural features are being measured. This paper presents the results of an interlaboratory test using Method A (worst-field ratings) of ASTM E45 to rate inclusions in steels. The results from nine people, all reported to be qualified, regular users of the method, revealed consistent problems of inclusion-type misclassification and a wide range of severity ratings for each specimen.


Created in 1942, ASTM E45 was based on an earlier[1,2] chart developed by Jernkontoret in Sweden. The charts were designed to determine the size, distribution, number and types of indigenous inclusions (naturally occurring particles that form before or during solidification due to limited solid solubility for O and S) in steels.

Originally, E45 included three charts – Plates I, II and III – but now there are two, Plates 1r and II. Plate 1r replaced Plates I and III after these charts were measured[3] and corrected during development of the image-analysis method for making E45 JK inclusion ratings,[4,5] which was published as E1122 in 1992 and incorporated into E45 in 2006.

The JK chart – the original Plate I – categorized indigenous inclusions as sulfides (type A), aluminates (type B), silicates (type C) and globular oxides (type D), although the classification was stated to be only by morphology. Each type had thin and thick categories based on thickness (or diameter for the D types), and the severity ratings varied in whole increments from 1 to 5. Plate III was similar, but the severity limits were in 0.5 increments from 0.5 to 2.5.

Inclusion Rating Challenges

Basing the categorization of A and C types on morphology alone creates inherent confusion in ratings because both elongated, malleable inclusion types look similar. The charts do not show the gray-level difference between gray sulfides and darker, blackish, glassy-looking silicates. Some raters seem to regularly confuse the two types. Obviously, sulfides and silicates have markedly different effects on steel products.

Similar charts for rating inclusions have been developed by numerous countries and companies over the years and by ISO.[3]

At least two such charts depict sulfides as being lighter than silicates. The tips of sulfides also appear to be more rounded than those of silicates, which appear to be sharper, but these differences may be difficult to see in the chart pictures, which usually depict the inclusions at 100X magnification.

These chart ratings were all done qualitatively until E1122, which uses an image analyzer to make the ratings, was developed. The operator scans a specific-sized area on a polished specimen and then records the worst ratings of each inclusion type and thickness observed (Method A). Alternatively, the operator can scan the area field by field and record the ratings of the inclusions in every field (Method D). Method A, of course, takes far less time to perform manually than Method D and is more commonly utilized. By image analysis, there is no real difference in time between performing Methods A and D.
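The difference between the two methods can be sketched in a few lines of Python. This is only an illustration, not the procedure text of E45: the per-field ratings, the dictionary layout and the function names `method_a` and `method_d` are all hypothetical, and the sketch assumes each field has already been assigned a severity for every inclusion type and thickness seen in it.

```python
from collections import Counter

# Hypothetical per-field results: (inclusion type, thickness) -> severity.
# These numbers are invented for illustration only.
fields = [
    {("A", "thin"): 1.0, ("C", "thin"): 0.5},
    {("A", "thin"): 2.5, ("D", "thick"): 1.0},
    {("C", "thin"): 1.5},
]


def method_a(fields):
    """Worst-field rating: highest severity seen for each type/thickness."""
    worst = {}
    for field in fields:
        for key, severity in field.items():
            worst[key] = max(worst.get(key, 0.0), severity)
    return worst


def method_d(fields):
    """Field-by-field tally: number of fields at each severity, per type/thickness."""
    tally = {}
    for field in fields:
        for key, severity in field.items():
            tally.setdefault(key, Counter())[severity] += 1
    return tally


print(method_a(fields))
print(method_d(fields))
```

Both functions walk the same list of fields once, which is consistent with the point that an image analyzer spends essentially the same time on either method. Only the manual rater saves effort by recording just the worst field.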

The success of manual ratings in defining the inclusion content in a heat of steel hinges upon a number of factors:

  • Relevance of the billet test locations and test-plane orientation in defining the total inclusion content and the distribution of the inclusions relative to the ingot location or concast billet location
  • Quality of the specimen preparation
  • Grading of the inclusion dimensions by the chart and the relevance of the deformation pictured to the degree of hot reduction of the billets
  • Similarity of the inclusion morphologies depicted in the chart pictures to those in the steel being evaluated
  • Validity of the rating method
  • Statistical value of the chart ratings
  • Correlation of the chart ratings to other methods for assessing the inclusion content

While manual chart ratings are relatively simple to perform and the analysis time is reasonably fast, they do suffer from numerous disadvantages that substantially degrade the reliability, reproducibility and repeatability of the measurements. The E45 charts were developed based upon the effect of hot-working deformation on the inclusion length, or stringer length, going from ingot to a 4- x 4-inch billet size. Naturally, this degree of deformation is not obtained in a casting or in a large forging and will be greater for plate, bar, sheet or strip products.

Round-Robin Interlaboratory Test Program Results

The writer organized a round robin using specimens cut sequentially along billets of three steels with varying sulfur contents and melting practices, some Al-killed and some not. Nine different people analyzed the specimens using Method A (worst field) of ASTM E45. The data are summarized in Tables 1-3. The specimen used for the data in Table 1 was type S7 tool steel, which is not Al-killed and exhibits classic silicate inclusions of the C type. Its oxygen content is a bit on the high side for a 0.50% carbon tool steel (electric furnace, non-degassed). Its sulfur content is relatively normal for tool steels.

Note the wide range of severity values for all inclusion types, indicating imprecision and insensitivity in the ratings. S7 definitely has silicates and should have virtually no oxide stringers of the B type. Nevertheless, the B thin and B thick ratings ran from severities of 0 to 3 and 0 to 2.5, respectively, with averages of 0.78 and 0.89.

Meanwhile, some raters did not rate any oxides as C types, although they predominate in S7. No doubt, they rated the silicates as type A sulfides. There will be a few isolated oxides that are not elongated enough to be classified as stringers and can be rated as D types. The A ratings are a bit high for a steel with 0.017% S (compare these A ratings to those of the resulfurized steel in Table 2).

Table 2 shows E45, Method A ratings for a resulfurized 41S50 alloy steel at 0.065% S (3.8 times the sulfur content of the S7 specimens shown in Table 1). Unlike the S7 heat, the 41S50 was Al-killed and its oxygen content is a bit lower. Despite the much greater S content, the A-type sulfide ratings are not much different than for the S7 tool steel.

The A thin and A thick ratings both varied from 0 to 4, versus 1-4 (thin) and 1-3 (thick) for the S7 specimen with one-quarter as much sulfur. The averages for the nine A thin and thick ratings were 2.33 and 2.0 for the S7 steel with 0.017% S and 2.78 and 1.72 for the 41S50 specimens with 3.8 times as much sulfur. The 41S50 specimen should have B-type aluminate stringers present, but no C-type elongated silicates. However, the C thin and thick ratings for the 41S50 specimens varied from 0 to 5 and 0 to 4, with mean values of 2.0 and 0.89. These C ratings must actually be for A-type sulfides. Note that the % relative accuracy values for the 41S50 ratings are noticeably higher than for the S7 ratings.

Table 3 shows E45, Method A ratings for 52100 bearing steel made in the electric furnace and vacuum degassed, with electrodes then remelted by the electroslag remelting (ESR) process. ESR produces a very low sulfur content; 0.003% is barely above the solubility level of sulfur in Fe, so there should be no type-A sulfides at or above the 0.5 severity level. There were A thin and thick ratings from 0.5 to 1.5 and 0 to 1.0, however, with mean values of 1.12 and 0.5, respectively.

The oxide content should be very low, with an oxygen content of 37 ppm. Since 0.020% Al was present, it would be unlikely to see type-C silicates in these specimens. Also, with the very low sulfur level and the higher-than-expected A ratings, it is hard to envision sulfides being rated as silicates in these specimens.

Overall, the inclusion ratings are much lower than for the S7 and 41S50 specimens. Based upon my experience, however, the EF-ESR 52100 inclusion ratings seem to be excessively high. As would be expected, the standard deviations, 95% confidence limits and % relative accuracy values for the EF-ESR 52100, due to its lower S and O contents, are much lower (statistically better) than the data for the S7 and 41S50 specimens.
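For readers who want to reproduce the kind of statistics quoted for the tables, the following Python sketch shows one conventional way to compute them. It rests on assumptions the article does not spell out: the 95% confidence limit is taken as t·s/√n, the % relative accuracy as 100 times that limit divided by the mean, and the ratings in `hypothetical_ratings` are invented placeholders rather than round-robin data. The function name `rating_statistics` is likewise hypothetical.

```python
import math
import statistics

# Two-sided 95% Student's t value for 8 degrees of freedom (nine raters).
T_95_DF8 = 2.306


def rating_statistics(ratings, t_value=T_95_DF8):
    """Return mean, sample standard deviation, 95% CL and % relative accuracy."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    std_dev = statistics.stdev(ratings)        # sample (n - 1) standard deviation
    cl95 = t_value * std_dev / math.sqrt(n)    # half-width of the 95% confidence interval
    # Relative accuracy is undefined when the mean rating is zero.
    rel_accuracy = 100.0 * cl95 / mean if mean > 0 else float("nan")
    return {
        "mean": mean,
        "std_dev": std_dev,
        "cl95": cl95,
        "rel_accuracy_pct": rel_accuracy,
    }


# Hypothetical severity ratings from nine raters for one inclusion category.
hypothetical_ratings = [0.5, 1.0, 1.0, 1.5, 1.5, 2.0, 2.0, 2.5, 3.0]
stats = rating_statistics(hypothetical_ratings)
print(stats)
```

A smaller % relative accuracy means the raters agree more closely relative to the mean rating, which is the sense in which lower values are statistically better.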


The data from this round robin (which is in agreement with previous studies) clearly shows that ASTM E45 chart ratings are neither precise nor reproducible. Repeatability was not evaluated in this study.

The overall problem stems from a number of factors, as listed above, which make chart ratings undependable. Yes, they are fast and simple to do, but they are subjective. ASTM E1122 was developed to permit use of image analysis to perform E45 ratings. This method is much more precise and reproducible because inclusions in every field are rated using the exact same criteria as defined in the standard. Even with better image-analysis-generated E45 ratings, however, the value of the data in predicting the performance of components in the field is still dubious.

A far better approach is to use stereologically based measurements of the oxides and sulfides by ASTM E1245. The weakness here is that purchasers do not know what limits to use in purchase specifications. To date, only one commercial product standard[6] is known to this writer using E1245 data for acceptance or rejection.

This problem could be alleviated if image-analysis software produced E1245 measurements simultaneously when doing E45 chart ratings. Then the purchaser would start to understand the valuable nature of stereologically based measurements of the area fraction, number of inclusions per square mm area, average length and cross-sectional area, and spacing of oxides and sulfides using E1245 with mean values and standard deviations of the measurements for data-basing test results.
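As a rough illustration of the measurements named above, the Python sketch below computes per-field area fraction, number per square mm, mean area and mean length, then the mean and standard deviation across fields. It is a simplified sketch and not the E1245 procedure itself: the `Inclusion` record, the function names and the assumption that an image analyzer has already reported each inclusion's area and length in micrometer units are all hypothetical, and the spacing measurement is left out to keep the example short.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Inclusion:
    area_um2: float    # cross-sectional area of one detected inclusion
    length_um: float   # length of the same inclusion


def field_measurements(inclusions, field_area_mm2):
    """Per-field area fraction (%), number per mm^2 and mean size values."""
    total_area_um2 = sum(i.area_um2 for i in inclusions)
    count = len(inclusions)
    return {
        # 1 mm^2 = 1e6 um^2
        "area_fraction_pct": 100.0 * total_area_um2 / (field_area_mm2 * 1e6),
        "number_per_mm2": count / field_area_mm2,
        "mean_area_um2": total_area_um2 / count if count else 0.0,
        "mean_length_um": sum(i.length_um for i in inclusions) / count if count else 0.0,
    }


def specimen_summary(per_field, key):
    """Mean and sample standard deviation of one measurement across fields."""
    values = [field[key] for field in per_field]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical fields of 0.5 mm^2 each; the inclusion sizes are invented.
per_field = [
    field_measurements([Inclusion(100.0, 20.0), Inclusion(300.0, 40.0)], 0.5),
    field_measurements([Inclusion(150.0, 25.0)], 0.5),
    field_measurements([], 0.5),
]
print(specimen_summary(per_field, "area_fraction_pct"))
```

Because every field yields a number, including fields with no inclusions, the results carry a mean and a standard deviation that can be stored in a database and compared later.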

The mean data values for all specimens from a heat can be averaged and standard deviations calculated. Then differences between heats or variations between melting practices or vendors can be validly determined via simple statistical procedures, such as Student's t-test.
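A comparison of two heats can be sketched as follows. This is a minimal example, assuming the classic pooled-variance (equal-variance) two-sample form of the t-test. The area-fraction values for the two heats are invented for illustration, and the function name `pooled_t_statistic` is hypothetical. The critical value is the standard two-sided 95% table value for 6 degrees of freedom.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)   # sample (n - 1) variances
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, df


# Hypothetical mean area fractions (%) for specimens from two heats.
heat_a = [0.10, 0.12, 0.11, 0.13]
heat_b = [0.20, 0.22, 0.21, 0.23]

t, df = pooled_t_statistic(heat_a, heat_b)
T_CRIT_95_DF6 = 2.447  # two-sided 95% critical value for 6 degrees of freedom
significant = abs(t) > T_CRIT_95_DF6
print(f"t = {t:.2f}, df = {df}, significant at 95%: {significant}")
```

If the two heats had clearly unequal scatter, a variant of the test that does not pool the variances would be the safer choice.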


For more information: Contact George F. Vander Voort – Vander Voort Consulting LLC in Wadsworth, Ill.; tel: 847-623-7648



  1. B. Rinman et al., “Inclusion Chart for the Estimation of Slag Inclusions in Steel,” Jernkontoret, Stockholm, Sweden, Uppsala (1936), 24 pages
  2. B. Rinman et al., “Chart for the Estimation of Inclusions in Steel,” Jernkontoret Ann., Vol. 120, 1936, pp. 199-226
  3. “Inclusion Measurement,” Metallography as a Quality Control Tool, Plenum Press, NY, 1980, pp. 1-88
  4. G. F. Vander Voort and J. F. Golden, “Automating the JK Inclusion Analysis,” Microstructural Science, Vol. 10, Elsevier Science Publishing Co., NY, 1982, pp. 277-290
  5. C. Forget, “Improved Method for E1122 Image Analysis Nonmetallic Inclusion Ratings,” Micon 90: Advances in Video Technology for Microstructural Control, ASTM STP 1094, ASTM, Philadelphia, 1991, pp. 135-150
  6. AAR Specification M-107/M-208, Rev. 2009, “Wheels, Carbon Steel”
