doi:10.2204/iodp.proc.320321.203.2012

Data reduction methods for later calibration: normalized median-scaled method

Data reduction was achieved through a simple two-step method: (1) peak areas were scaled by the median shipboard-measured bulk sediment elemental composition to bring the elemental peak areas into typical ranges of sediment composition, and (2) the scaled components were then summed and normalized to 100% to eliminate variability caused by differences in porosity or by cracks. This normalized median-scaled (NMS) method shares a few similarities with that of Weltje and Tjallingii (2008) but differs from it in several respects. Weltje and Tjallingii (2008) normalize the peak areas first and then log transform them to reduce the range between major and minor XRF emitters, which plays a role similar to that of our median-scaling step. Finally, they solve a matrix of XRF element/element ratios for composition. The Weltje and Tjallingii (2008) approach has the advantage of being more global and developed from first principles, but it is complex and not easily adapted. The advantage of the NMS technique is that it can be implemented quickly, and the calibration step can be used to determine whether a more detailed approach is needed.

Sample scaling

Sample scaling is needed to better match the range of XRF peak area measurements to the range of chemical composition along the scan. Without scaling, normalized peak areas can be dominated by effects of one element.

For an elemental scaling S_e,

S_e = Med%_e × (PeakArea_e / PeakArea_{e,med}),

where Med%_e is the median weight percent of a sedimentary component (e.g., for Fe, we used the oxide Fe2O3, and for Ca, CaCO3), PeakArea_e is the measured elemental peak area in a sample, and PeakArea_{e,med} is the median peak area over the data set. There may be errors in absolute scaling because the chemical analyses are far fewer than the XRF measurements. The raw (unnormalized) scaled CaCO3 data, for example, range from 0% to ~120%. However, the normalizing step reduces the total range to between 0% and 100%, and the calibration step, which correlates the scaled data with ground-truth chemical analyses, yields a linear relationship that provides a final adjustment to the percentage data.
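
The scaling equation above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this report: the function name `median_scale`, the peak areas, and the 80 wt% median are all hypothetical values chosen to show how a scaled value can exceed 100% before normalization.

```python
from statistics import median


def median_scale(peak_areas, median_wt_pct):
    """Scale one element's peak areas: Med% * PeakArea / median(PeakArea)."""
    peak_area_median = median(peak_areas)
    if peak_area_median <= 0:
        raise ValueError("median peak area must be positive")
    return [median_wt_pct * area / peak_area_median for area in peak_areas]


# Hypothetical Ca peak areas (counts) along a scan; not measured data.
ca_peak_areas = [50000.0, 100000.0, 150000.0]

# Hypothetical median CaCO3 (wt%) from discrete chemical analyses.
median_caco3_wt_pct = 80.0

scaled_caco3 = median_scale(ca_peak_areas, median_caco3_wt_pct)
print(scaled_caco3)  # [40.0, 80.0, 120.0]
```

The sample whose peak area equals the median is assigned the median weight percent, and the sample with 1.5 times the median peak area is assigned 120%, which is why a normalization step is still needed.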

The raw peak areas were scaled because the production of characteristic X-rays by different elements does not scale linearly with elemental ratios in the sample. Scaling to the total summed peak area was rejected because it strongly overweights Ca in the carbonate-rich sediment column of Site U1338 and is a significant cause of nonlinearities in later calibration. Scaling to total peak area is effectively equivalent to scaling to Ca peak area, as is shown by comparing Ca proportions in the two scaling schemes. Median peak area of Ca is 95% of the summed total median peak area, whereas median CaCO3 is 76% of the summed shipboard chemical analyses. The raw peak area Ca/Si ratio is 38.5, whereas the ratio of median CaCO3/SiO2 from shipboard inductively coupled plasma–atomic emission spectroscopy (ICP-AES) analyses is 4.9. Scaling to summed raw peak areas thus places a burden on the calibration step, whereas scaling to median values reduces the problem.
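
The size of the Ca overweighting follows from the two ratios quoted above. The short calculation below uses only those two numbers; the variable names are ours, and the comparison of a Ca/Si peak area ratio with a CaCO3/SiO2 weight ratio follows the comparison made in the text.

```python
# Ratios quoted in the text for Site U1338.
raw_peak_ca_si = 38.5   # raw peak area Ca/Si ratio
icp_caco3_sio2 = 4.9    # median CaCO3/SiO2 from shipboard ICP-AES

# Factor by which raw peak areas exaggerate Ca relative to Si
# when compared with the shipboard component ratio.
overweight_factor = raw_peak_ca_si / icp_caco3_sio2
print(f"{overweight_factor:.1f}")  # prints 7.9
```

In other words, relative to Si, raw peak areas weight Ca roughly eight times more heavily than the shipboard chemistry does.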

We scaled each element independently to a median of the compositional data from shipboard ICP-AES analyses (see the “Site U1338” chapter [Expedition 320/321 Scientists, 2010b]). Although the shipboard compositional data set is a much smaller sample set than the XRF scan data, it is appropriate for scaling as long as it reasonably represents the total range of composition. The scaling step could also be done with a generic average for the sediment type being studied if chemical data were not available. Using a different “type” composition for the median scaling will change the resulting NMS values and may therefore change the slope of the calibration line that converts NMS to calibrated percent. It is thus important to use the same scaling values within a common calibration.
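
The sensitivity of NMS values to the choice of scaling composition can be illustrated with a toy two-component sample. Everything in this sketch is hypothetical: the helper `nms_for_sample`, the peak areas, and both sets of median weight percents are invented for illustration and are not taken from the Site U1338 data.

```python
def nms_for_sample(peak_areas, median_peak_areas, median_wt_pct):
    """Median-scale each component, then normalize the sample to 100%."""
    scaled = {
        name: median_wt_pct[name] * peak_areas[name] / median_peak_areas[name]
        for name in peak_areas
    }
    raw_sum = sum(scaled.values())
    return {name: 100.0 * value / raw_sum for name, value in scaled.items()}


# Hypothetical peak areas for one sample and the data set medians.
sample = {"CaCO3": 120000.0, "SiO2": 2000.0}
median_peaks = {"CaCO3": 100000.0, "SiO2": 2500.0}

# Two hypothetical scaling compositions (wt%).
site_medians = {"CaCO3": 75.0, "SiO2": 15.0}     # site-specific medians
generic_medians = {"CaCO3": 60.0, "SiO2": 30.0}  # generic "type" sediment

nms_site = nms_for_sample(sample, median_peaks, site_medians)
nms_generic = nms_for_sample(sample, median_peaks, generic_medians)

print(round(nms_site["CaCO3"], 1), round(nms_generic["CaCO3"], 1))
```

The same peak areas give about 88% CaCO3 with one scaling composition and 75% with the other, so NMS values from different scaling compositions should not share a calibration line.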

Normalizing sample composition to 100%

Ideally, the sum of all sediment components should be 100% if all major elements are measured and they are properly converted to the appropriate sedimentary components (e.g., Ca is represented in the sediment by the sedimentary component CaCO3, not CaO). However, the XRF-scaled sum of components is often much lower than 100% near the top of the section, where porosity is high and dry sample mass in the scan area is low. We used the following components for our data set: Al2O3, SiO2, K2O, CaCO3, TiO2, MnO, Fe2O3, and BaSO4. From the shipboard chemical analyses, these components sum to a median of 94.7 wt% and adequately represent all sediment components. In contrast, the high-porosity upper 50 m of Site U1338 has a median of 67 wt% for the raw sum of components and a range from <20% to ~100%. Clearly, the raw sum has significant noise and is affected by the sediment water content.

The normalization procedure is basic: multiply each component by 100/(raw sum) to bring the total sum of components to 100%, or

NMS_c = C × 100/(raw sum),

where NMS_c is the normalized median-scaled value for the component and C is the median-scaled value of the component.
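
A minimal Python sketch of this normalization follows. The function name `normalize_to_100` and the median-scaled values are hypothetical; the values were chosen so that the raw sum is 67 wt%, similar to the median raw sum reported for the high-porosity upper section.

```python
def normalize_to_100(scaled_components):
    """Multiply each median-scaled component by 100 / (raw sum)."""
    raw_sum = sum(scaled_components.values())
    if raw_sum <= 0:
        raise ValueError("raw sum of components must be positive")
    return {name: value * 100.0 / raw_sum
            for name, value in scaled_components.items()}


# Hypothetical median-scaled values (wt%) for one sample; raw sum = 67.
scaled_sample = {
    "Al2O3": 2.0, "SiO2": 10.0, "K2O": 0.5, "CaCO3": 50.0,
    "TiO2": 0.1, "MnO": 0.2, "Fe2O3": 3.0, "BaSO4": 1.2,
}

nms_sample = normalize_to_100(scaled_sample)
print(round(nms_sample["CaCO3"], 1))        # 74.6
print(round(sum(nms_sample.values()), 6))   # 100.0
```

Every component is multiplied by the same factor, so ratios between components within a sample are unchanged by the normalization.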

Normalization effectively removes the volume versus mass XRF effect. Near the surface of the sediment column, the major cause of low sums of median-scaled data is the porosity effect. Deeper in the sediment section, however, the raw sum and the raw peak areas are often variable because the sediment is stiffer and the core surface is cracked or sufficiently uneven that the X-ray detector assembly lands imperfectly (Fig. F4). Normalization minimizes this high-frequency noise. Figure F4 shows the raw median-scaled CaCO3 and the NMS CaCO3 in a deeper section of the Site U1338 splice (Table T2). Scaling and normalization reduced what appears to be noise in the measurement and made the total range more similar to the variability in the low-resolution CaCO3 record (Lyle and Backman, submitted).
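
A short derivation shows why normalization suppresses these effects. As a simplifying assumption that is ours and is not stated in this report, suppose that porosity or an imperfect detector landing reduces every median-scaled component in a sample by the same unknown factor k (0 < k ≤ 1), so that the measured values are kC_i rather than C_i.

```latex
\mathrm{NMS}_c
  = \frac{100\,(k\,C_c)}{\sum_i k\,C_i}
  = \frac{100\,C_c}{\sum_i C_i}
```

Under that assumption the factor k cancels, and the NMS value is the same as for an unattenuated measurement. Attenuation that differs between elements would not cancel exactly.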

The scaling and normalization process in this data report provides a way to develop a quantitative estimate of sediment concentration based on XRF scans. However, one should always be aware that XRF estimates can have significant errors if the model of sediment composition used is seriously awry (e.g., if the “type” sediment composition used is significantly different from the actual sediment composition). Significant error can also occur if the model sediment components do not match those of the sediment or if a major element found in the sediments is not included in the model. Despite these issues, NMS data are significantly better suited than raw XRF peak areas for studying changes in sediment composition. Raw peak area data can have even larger relative errors resulting from significant differences in porosity between sediment layers or from technical problems landing the detector on a flat sediment surface.