
doi:10.2204/iodp.proc.320321.207.2013

Materials and methods

Core and downhole log data

The data used here consist of density core splices and downhole log data obtained at Sites U1337 and U1338. The density core splices were obtained from gamma ray attenuation (GRA) densities measured on whole-round core sections on the R/V JOIDES Resolution. The splices were painstakingly constructed by overlapping GRA density records from three holes at each site (Wilkens et al., in press). The downhole density log data were measured using the Schlumberger Hostile Environment Litho-Density Sonde (HLDS) wireline logging tool in Holes U1337A and U1338B.

The core splice and downhole log data have different depth coverages and measurement resolutions. Although the core splices span the full interval drilled, there are no downhole log data from the top of the hole. This is because the drill pipe is typically lowered to ~80 meters below seafloor (mbsf) during logging to avoid the near-seafloor interval where the borehole is enlarged and irregular. The GRA data have a nominal resolution of <1 cm (measurements were taken every 2.5 cm), whereas the downhole density log has a vertical resolution of ~20 cm. Detailed comparisons, however, show that this difference in resolution has a minor effect in practice and that the decimeter-scale sedimentary features that are most useful for correlation are usually well resolved in both data sets. More details on the GRA and log measurements are in the “Methods” chapter (Expedition 320/321 Scientists, 2010).

A major complication that prevents immediate correlation of core splice and downhole log data is that they are referred to different depth scales. The current IODP terminology (www.iodp.org/doc_download/3171-iodpdepthscaleterminologyv2) defines the depth scale of the core splices as composite core depth below seafloor (CCSF) and the depth scale of the processed downhole logs as wireline matched depth below seafloor (WMSF). The CCSF depth scale is determined starting from the seafloor by splicing the GRA density records measured in different core sections from different holes. The WMSF depth scale is determined by referring the depth measured by the wireline cable length to the seafloor (typically defined by a step change in the natural gamma radiation log). The wireline depths measured in different logging runs are then “matched” by small adjustments that align the natural gamma radiation log acquired in each run.

Reversible jump Monte Carlo sampling

The process of determining a mapping function that relates depth in one record to depth in another is illustrated in Figure F1. Solutions in the literature include the nonlinear optimization methods of Martinson et al. (1982) and Brüggemann (1992) and the exhaustive search method of Lisiecki and Lisiecki (2002). Whereas these approaches focus on obtaining an optimal result, the Monte Carlo method described here also quantifies the uncertainty of the inferred mapping function. The mapping function will have a large uncertainty in intervals where it can vary significantly while the match between the records remains poor. This measure of uncertainty is important to quantify the accuracy of the correlation.

For comparison, the two records are rescaled to zero mean and unit standard deviation and their match is measured by a residual standard deviation (the standard deviation of the difference between the records). The mapping function is defined at any depth by linear interpolation between a number of nodes (Fig. F1). The problem is then to determine sets of nodes that give a good match between the two records (i.e., a small residual standard deviation). An additional requirement is that the gradient of the mapping function should not contain large fluctuations (Brüggemann, 1992; Lisiecki and Lisiecki, 2002). In the context of this study, large changes in the gradient of the mapping function would correspond to unrealistic measured depth errors (for core-log correlations at the same site) or excessive differences in sedimentation rate (for correlations between sites).
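The misfit measure described above can be sketched in a few lines of code. This is an illustrative reconstruction, not the authors' implementation: the function and variable names (`rescale`, `residual_std`, `nodes_a`, `nodes_b`) are chosen here for clarity, and the mapping function is evaluated by piecewise-linear interpolation between nodes, as in the text.

```python
import numpy as np

def rescale(x):
    """Rescale a record to zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()

def residual_std(nodes_a, nodes_b, depth_a, rec_a, depth_b, rec_b):
    """Residual standard deviation of the match implied by a mapping function.

    The mapping function is piecewise linear between nodes: depths in
    record A (nodes_a) are mapped onto the depth scale of record B (nodes_b).
    """
    rec_a = rescale(np.asarray(rec_a))
    rec_b = rescale(np.asarray(rec_b))
    # Map the measurement depths of record A into the depth scale of record B
    mapped = np.interp(depth_a, nodes_a, nodes_b)
    # Evaluate record B at the mapped depths and difference the two records
    b_at_mapped = np.interp(mapped, depth_b, rec_b)
    return np.std(rec_a - b_at_mapped)
```

A good set of nodes yields a small residual standard deviation; a poor set yields a value approaching that of two unrelated standardized records.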

The requirements of a smooth mapping function and of a good data match can be combined in a Bayesian formulation, which defines a prior distribution and a likelihood of the mapping function. The value of the prior will be higher for mapping functions whose gradients have smaller fluctuations about their average. Mapping functions that result in closer matches between the records will have a higher value of likelihood. The posterior distribution of mapping functions is proportional to the product of prior and likelihood and balances the competing requirements of a smooth mapping function and of a close match of the records. This Bayesian formulation has been widely used in geophysical inverse problems (Tarantola and Valette, 1982; Jackson and Matsu’ura, 1985; Duijndam, 1988; Tarantola, 2005) and has been applied to cycle stratigraphy (Malinverno et al., 2010) and timescale construction (Malinverno et al., 2012).
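The Bayesian formulation can be made concrete with a minimal sketch, assuming Gaussian forms for both terms (a common choice, though the exact densities used by the authors are not spelled out here): a prior that penalizes fluctuations of the mapping function gradient about its average, with scale `sigma_g`, and a likelihood that rewards small data residuals relative to a target scale `sigma_d`. All names are illustrative.

```python
import numpy as np

def log_prior(nodes_a, nodes_b, sigma_g):
    """Gaussian log-prior penalizing fluctuations of the mapping
    function gradient about its average (smoothness requirement)."""
    grads = np.diff(nodes_b) / np.diff(nodes_a)
    return -0.5 * np.sum((grads - grads.mean()) ** 2) / sigma_g**2

def log_likelihood(residuals, sigma_d):
    """Gaussian log-likelihood: higher when the residuals are small
    relative to the target standard deviation sigma_d."""
    n = len(residuals)
    return -n * np.log(sigma_d) - 0.5 * np.sum(residuals**2) / sigma_d**2

def log_posterior(nodes_a, nodes_b, residuals, sigma_g, sigma_d):
    # Posterior is proportional to prior times likelihood,
    # so the log-posterior is the sum of the two terms
    return log_prior(nodes_a, nodes_b, sigma_g) + log_likelihood(residuals, sigma_d)
```

A mapping function with a uniform gradient has the highest possible prior value (zero log-prior); any gradient fluctuation lowers it, and any residual growth lowers the likelihood, so the posterior trades the two off exactly as described above.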

A key feature of the correlation problem in Figure F1 is that the number of nodes in the mapping function is one of the unknowns. In the apt terminology of Sambridge et al. (2006), this is a “trans-dimensional” inverse problem. Green (1995) devised a general algorithm that can be applied to these problems, called reversible jump Markov chain Monte Carlo sampling. In the correlation problem treated here, the reversible jump algorithm begins from a starting set of nodes that define an initial approximate correlation and continues as follows:

  1. Propose a “candidate” mapping function by
    • Perturbing the coordinates of an existing node,
    • Adding a new node, or
    • Deleting an existing node (except for the starting nodes).
  2. Accept or reject the candidate mapping function on the basis of its posterior probability as in the Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970). The posterior probability is higher for mapping functions that
    • Result in a better match (i.e., a smaller residual standard deviation) and
    • Have smaller fluctuations in their gradient.
  3. Repeat from step 1.

This simple algorithm asymptotically draws samples from the posterior distribution of mapping functions. For examples of reversible jump sampling applied to geophysical inverse problems, see Malinverno (2002), Malinverno and Leaney (2005), Sambridge et al. (2006), Bodin and Sambridge (2009), and Piana Agostinetti and Malinverno (2010).
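The three-move loop above can be sketched as follows. This is a deliberately simplified illustration, not the authors' code: only the mapped depth of a node is perturbed, the proposal-ratio (birth/death balancing) terms of the full reversible jump acceptance rule are omitted so that acceptance reduces to the posterior ratio, and monotonicity of the mapping is not enforced. `log_post` stands for any log-posterior of the kind described above.

```python
import numpy as np

rng = np.random.default_rng(42)

def rj_sample(log_post, nodes_a, nodes_b, n_iter=2000,
              depth_lo=0.0, depth_hi=10.0, step=0.2):
    """Reversible jump sampling of a piecewise-linear mapping function
    (simplified sketch; see caveats in the accompanying text)."""
    n_start = len(nodes_a)            # starting nodes are never deleted
    a, b = list(nodes_a), list(nodes_b)

    def lp_of(aa, bb):
        # Sort nodes by depth in record A before evaluating the posterior
        order = np.argsort(aa)
        return log_post(np.array(aa)[order], np.array(bb)[order])

    lp = lp_of(a, b)
    for _ in range(n_iter):
        ca, cb = a.copy(), b.copy()
        move = rng.choice(["perturb", "add", "delete"])
        if move == "perturb":
            i = int(rng.integers(len(cb)))
            cb[i] += rng.normal(0.0, step)     # shift one node's mapped depth
        elif move == "add":
            d = rng.uniform(depth_lo, depth_hi)
            order = np.argsort(ca)
            # New node starts near the current mapping line, plus noise
            cb.append(np.interp(d, np.array(ca)[order], np.array(cb)[order])
                      + rng.normal(0.0, step))
            ca.append(d)
        elif move == "delete" and len(ca) > n_start:
            i = int(rng.integers(n_start, len(ca)))  # non-starting nodes only
            ca.pop(i)
            cb.pop(i)
        cand_lp = lp_of(ca, cb)
        # Metropolis-Hastings accept/reject on the posterior ratio
        if np.log(rng.uniform()) < cand_lp - lp:
            a, b, lp = ca, cb, cand_lp
    order = np.argsort(a)
    return np.array(a)[order], np.array(b)[order]
```

Run against a toy posterior whose maximum is the mapping b = 2a, the chain drifts from a deliberately wrong starting correlation toward the correct one while adding and deleting nodes along the way.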

A sequence of 2000 sampling iterations of the reversible jump algorithm is shown in Figure F2. The residual standard deviation, which measures how close the match is between the two records, decreases as sampling progresses and becomes nearly constant after ~1000 iterations. The likelihood function correspondingly increases. On the other hand, the value of the prior distribution decreases, mostly because the number of nodes in the mapping function increases to achieve a better data match. The decrease in the prior distribution is compensated by a much larger increase in the likelihood, meaning that the cost of adding more nodes is much smaller than the gain from a better data match, so that the posterior probability of the mapping function increases during the sampling.

The likelihood function and the prior distribution are controlled by target standard deviations of the data residuals and of the mapping function gradient, respectively. The relative size of these target standard deviations weights the competing needs of matching the data and minimizing changes in the mapping function gradient. The target standard deviation of the data residuals is set to be one of the unknown parameters as in a hierarchical Bayes formulation (Malinverno and Briggs, 2004). This means that this target standard deviation does not need to be predetermined; the algorithm samples it by iteratively perturbing its value and effectively adjusts it to match the actual residual standard deviation achieved by the sampling (Fig. F2). Numerical experiments showed that a target standard deviation of the mapping function gradient equal to 20% of the mean gradient resulted in a reasonable trade-off between data match and smoothness of the mapping function.
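One way the hierarchical step can be realized is sketched below; this is an assumption about the mechanics, not the authors' code. The target standard deviation is perturbed by a symmetric random walk in log space and accepted or rejected against the Gaussian likelihood of the current residuals, assuming a flat prior on the log of the scale so no Jacobian term is needed.

```python
import numpy as np

rng = np.random.default_rng(1)

def update_sigma(residuals, sigma_d, step=0.1):
    """One hierarchical-Bayes update of the target residual standard
    deviation sigma_d (sketch; assumes a flat prior on log sigma_d,
    so the symmetric log-space proposal needs no correction term)."""
    n = len(residuals)
    ss = np.sum(residuals**2)

    def loglik(s):
        return -n * np.log(s) - 0.5 * ss / s**2

    cand = sigma_d * np.exp(rng.normal(0.0, step))  # log-space random walk
    if np.log(rng.uniform()) < loglik(cand) - loglik(sigma_d):
        sigma_d = cand
    return sigma_d
```

Repeated over many iterations, the sampled value equilibrates near the actual residual standard deviation, which is the self-adjusting behavior described in the text.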

The results described below were all obtained from 100 independent runs of the reversible jump algorithm. Each run consisted of 2000 sampling iterations. The mean mapping function and its standard deviation were determined from the 100 mapping functions obtained at the end of each run. This multiple run strategy minimizes the effect of sampling secondary modes of the posterior distribution. Although the Monte Carlo method will asymptotically sample the global mode of the posterior distribution, the sampling can still remain trapped in a secondary mode for a large number of iterations.
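Combining the independent runs amounts to evaluating each run's final mapping function on a common depth grid and taking pointwise statistics. A minimal sketch, with illustrative names:

```python
import numpy as np

def summarize_runs(final_maps, grid):
    """Mean and standard deviation of the mapping function across runs.

    final_maps: list of (nodes_a, nodes_b) pairs, one per independent run
                (the mapping function sampled at the end of each run).
    grid: common depths at which to evaluate the mapping functions.
    """
    # Evaluate each run's piecewise-linear mapping on the common grid
    evaluated = np.array([np.interp(grid, a, b) for a, b in final_maps])
    return evaluated.mean(axis=0), evaluated.std(axis=0)
```

The pointwise standard deviation is the uncertainty measure discussed above: it is small where all runs agree on the correlation and large where the data leave the mapping poorly constrained.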