BASIC TOMO: an educational tool for investigating the role of controlling parameters and observation geometry in tomography problems

The paper presents an educational version of the tomography algorithm BASIC TOMO, which is aimed at demonstrating the role of different controlling parameters in inversion. The algorithm uses a simplified approximation of rays with straight lines and a model parameterization based on rectangular cells. The tomography procedure is reduced to solving a system of linear equations, which is performed using the LSQR method. This study presents several exercises showing the role of different factors in inversion, such as grid spacing, smoothing, ray configuration, and noise in the data. The calculations are performed for different types of synthetic models. All the results can be easily reproduced using the appended version of the BASIC TOMO code.


INTRODUCTION
Seismic tomography is a method that is effectively used for studying the Earth's structure on scales from centimeters to thousands of kilometers. There are several established algorithms based on various types of data, such as body and surface waves from active and passive sources. These algorithms are fundamentally different from those used in X-ray computed tomography, where continuous and homogeneous distributions of sources and receivers allow the implementation of very efficient integral-transform methods (e.g., Natterer, 2001).
In practical applications of seismic tomography, the configurations of sources and receivers are usually far from a perfect distribution, and they often provide strongly uneven ray coverage. When passive sources are used, the inversion is complemented with the additional problem of determining the source parameters (Koulakov, 2009). This and other difficulties make integral transformations unsuitable for practical seismic tomography applications. Instead, the majority of seismic tomography algorithms are reduced to the solution of a system of linear algebraic equations (e.g., Nolet, 1987). The derived matrix is usually very large and sparse, and it may contain elements corresponding to different physical parameters (such as velocity in km/s, source shift in km, and source and station time corrections in seconds). Furthermore, the data are often perturbed by a considerable amount of noise, which may make the inversion results unstable. There are several recipes to overcome these problems, but they require adding a set of controlling parameters, such as grid spacing and damping values, that strongly affect the solution. Therefore, finding optimal values of the controlling parameters is an additional problem that needs to be solved separately in most seismic tomography studies.
In the literature, there have been many attempts to find criteria that formalize the definition of the controlling parameters. For example, some authors presume that the number of parameters should be considerably smaller than the number of data, which makes the matrix better determined (e.g., Kissling et al., 2001; Husen et al., 2004). In this case, they set the grid spacing to achieve a data/unknowns ratio significantly larger than one (usually not less than 5-10).
To define the damping values, many authors use so-called trade-off curves showing the dependency of the mean solution amplitude on the residual deviation, calculated for different damping parameters (e.g., Eberhart-Phillips, 1993; Syracuse et al., 2015). The corner point of the resulting L-shaped curve is presumed to be an optimal value for the damping. At the same time, I have not found any proof in the literature that this criterion is valid for every kind of tomographic inversion. In this article, this issue will be considered based on synthetic modeling.
The problem becomes even more complicated and less obvious when multiple iterations of the tomography inversion are performed within a non-linear tomography problem.
To investigate issues related to the practical performance of tomographic inversions, I have developed a simplified algorithm, BASIC TOMO, which allows for a very easy definition of synthetic models and ray geometry. This is a convenient instrument for teaching students the basics of tomography inversion. Furthermore, it has been used in some practical applications to process experimental data, for example, for studying the microstructure of rock samples (Koulakov et al., 2013). In this article, I describe the general workflow of the BASIC TOMO algorithm and present several tests showing the role of various controlling parameters in inversion.

BASIC TOMO ALGORITHM
BASIC TOMO is a tomography code written in FORTRAN. The presented version is adapted for the Windows OS. It consists of several program blocks, each of which is executed by running an EXE file.
The calculations start from the definition of the ray geometry. In the basic version of the code, rays can be defined between source and receiver lines by setting the coordinates of the line ends and the spacing of the source and receiver points along the lines. For example, for a square area, there are six possible combinations of source and receiver lines placed along its sides: left-right, up-down, left-up, left-down, right-up, right-down. The synthetic model is defined as a superposition of a constant background, which is presumed to be known a priori, and a set of anomalies. BASIC TOMO provides three methods for the definition of anomalies (Fig. 1). Synthetic data are calculated by integration along the straight rays that connect the source-receiver points. The result of this step is the set of residuals that is used as the input dataset for the tomographic inversion. During the inversion, the model is parameterized by rectangular cells with a regular spacing along the horizontal and vertical directions. The first-derivative matrix element A_ij is calculated as the length of the i-th ray inside the j-th cell. The tomographic inversion is then reduced to the solution of a system of linear equations:

sum_j A_ij * x_j = b_i,

where x_j, j = 1,...,N is the model vector consisting of N parameters, and b_i, i = 1,...,M is the data vector corresponding to M rays.
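As an illustration of how one row of the matrix A can be obtained, the sketch below approximates the length of a straight ray inside each cell by summing short segments along the ray. This is a simplified stand-in for the exact cell-crossing computation (the function and parameter names are my own, not those of the BASIC TOMO source):

```python
import numpy as np

def ray_cell_lengths(src, rec, nx, ny, cell, n_samples=1000):
    """Approximate the length of a straight ray inside each cell of a
    regular nx x ny grid with square cells of size `cell` by summing
    short segments along the ray (a sketch, not exact geometry)."""
    src, rec = np.asarray(src, float), np.asarray(rec, float)
    total = np.linalg.norm(rec - src)        # total ray length
    ds = total / n_samples                   # length of one segment
    t = (np.arange(n_samples) + 0.5) / n_samples
    pts = src + t[:, None] * (rec - src)     # segment midpoints
    ix = np.clip((pts[:, 0] // cell).astype(int), 0, nx - 1)
    iy = np.clip((pts[:, 1] // cell).astype(int), 0, ny - 1)
    row = np.zeros(nx * ny)                  # one row of the matrix A
    np.add.at(row, iy * nx + ix, ds)         # accumulate lengths per cell
    return row

# A horizontal 40 m ray crossing a 4 x 4 grid of 10 m cells:
row = ray_cell_lengths((0.0, 5.0), (40.0, 5.0), nx=4, ny=4, cell=10.0)
```

The segment lengths per cell sum to the full ray length, so the resulting row is consistent with the travel-time integral along the ray.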
To control the stability of the inversion, the BASIC_TOMO code provides two types of regularization.
The first method is called amplitude damping, or Tikhonov regularization. It is performed by adding N trivial equations:

W_am * x_j = 0,   j = 1,...,N,

where W_am is the weight of the amplitude damping. The larger the value of this parameter, the greater the importance of these additional equations becomes, and the smaller the amplitude of the obtained solution.
The second type of regularization is damping by smoothing. In this case, the system of linear equations is complemented with a set of additional equations:

W_sm * (x_m - x_k) = 0,

where x_m and x_k are the unknown parameters in neighboring cells, and W_sm is the smoothing coefficient. The larger the value of W_sm, the smoother the resulting model.
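The construction of the augmented system with these two kinds of regularization equations can be sketched as follows. This is a generic Python illustration assuming row-major cell numbering and neighbors taken to the right and below; it is not the BASIC TOMO source code:

```python
import numpy as np
from scipy.sparse import csr_matrix, identity, lil_matrix, vstack

def regularized_system(A, b, nx, ny, w_am=0.0, w_sm=10.0):
    """Append amplitude-damping rows (W_am * x_j = 0) and smoothing rows
    (W_sm * (x_m - x_k) = 0 for neighboring cells) to the ray matrix A
    and data vector b.  A sketch under the assumptions stated above."""
    n = nx * ny
    blocks, rhs = [csr_matrix(A)], [np.asarray(b, float)]
    if w_am > 0:                               # amplitude damping rows
        blocks.append(w_am * identity(n, format="csr"))
        rhs.append(np.zeros(n))
    if w_sm > 0:                               # smoothing rows
        S = lil_matrix((2 * n, n))
        r = 0
        for j in range(ny):
            for i in range(nx):
                k = j * nx + i
                if i + 1 < nx:                 # right-hand neighbor
                    S[r, k], S[r, k + 1] = w_sm, -w_sm
                    r += 1
                if j + 1 < ny:                 # lower neighbor
                    S[r, k], S[r, k + nx] = w_sm, -w_sm
                    r += 1
        blocks.append(csr_matrix(S)[:r])
        rhs.append(np.zeros(r))
    return vstack(blocks), np.concatenate(rhs)

# Toy example: 2 rays, a 2 x 2 grid, both regularization types switched on.
G, d = regularized_system(np.ones((2, 4)), [1.0, 2.0], nx=2, ny=2,
                          w_am=1.0, w_sm=1.0)
```

Note that a constant model satisfies all the smoothing rows exactly, which is why smoothing alone does not force the solution amplitude to zero.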
Inversion of the matrix is performed using the LSQR method (Paige and Saunders, 1982; Nolet, 1987), which is specially adapted for the solution of systems with large sparse matrices.
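As a minimal illustration of this step, an LSQR implementation available in SciPy can be applied to a toy version of such a system (the values below are purely illustrative and not taken from BASIC TOMO):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import lsqr

# A toy system: 2 rays, 2 cells.  Each matrix entry is the length (m) of
# a ray in a cell; the unknowns are slowness perturbations (s/m).
A = csr_matrix(np.array([[10.0,  0.0],
                         [10.0, 10.0]]))
b = np.array([0.5, 1.5])      # travel-time residuals (s)

x = lsqr(A, b)[0]             # LSQR also scales to large sparse systems
```

Here the system is exactly determined, so LSQR recovers x = [0.05, 0.10]; for the over- and under-determined systems of real tomography, it returns the damped least-squares solution instead.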
The general workflow of the BASIC_TOMO code includes the following steps (with indications of the executable files that run the corresponding tasks):
1. Calculation of the source-receiver pairs according to the definition file "observation.dat" (0_dataset/dataset.exe).
2. Calculation of the synthetic data along the straight rays between the source-receiver pairs. The synthetic model is defined in "anomaly.dat" (1_forward_model/forward_model.exe).
3. Calculation of the first-derivative matrix corresponding to the length of the i-th ray in the j-th cell (2_matrix/matrix.exe).
4. Inversion of the resulting system of linear equations using the LSQR method.
5. Visualization of the derived tomography model (4_visual/visual.exe).
These steps can be run separately or within task packages.
Below, some results of synthetic modeling are presented, which can be easily reproduced using the appended version of the BASIC TOMO code.

EFFECT OF THE MAIN CONTROLLING PARAMETERS
The BASIC TOMO code was used to investigate the role of the major controlling parameters and data properties in the tomography inversion results. In the first series of tests, I use a free-shaped synthetic model, "SMILE_01" (Fig. 2a), and a full coverage of rays connecting sources and receivers placed on all sides of the study area. In this case, the number of rays was 2646.

Role of the grid spacing
The first task consists of finding an optimal grid spacing for the tomography model. In all cases considered in Fig. 2, damping by smoothing was used with the coefficient W_sm = 10; the amplitude damping was zero. Following the existing stereotype, in the first trial, a relatively large spacing is defined to keep the number of parameters smaller than the number of rays. In the case shown in Fig. 2b, the grid spacing was 10 m, which corresponds to 400 parameters in the model. The inversion generally restores the shapes of the main anomalies, but the details of the model that are comparable to or smaller than the parameterization cells cannot be restored. In the next run, shown in Fig. 2c, the grid spacing was 5 m, which corresponds to 1600 parameters (still fewer than the number of data). In this case, the recovery quality is higher; however, the parameterization cells are still visible and perturb the resulting model.

Fig. 2.
Example of the synthetic model recovery using different grid spacings. A: initial synthetic model "SMILE_01". B, C and D: inversion results with the use of grids having spacings of 10, 5 and 1 m, respectively. The data are noise-free in all cases.

The final result was obtained with a parameterization having a spacing of 1 m, which gives 40,000 parameters in total. In this case, the tomography almost perfectly restores the main details of the model.
This result appears to be paradoxical: the best resolution is achieved when the number of unknown parameters is much larger than the number of data. In fact, in the case of the fine grid, the smaller number of equations in the main matrix is complemented by a large number of additional equations enabling smoothing.
Thus, in summary, the number of equations in the entire system is much larger than the number of parameters.
At the same time, the parameters in individual cells are not independent but are linked with each other by the additional equations. In this case, a parameterization with cells smaller than the expected resolution can be considered continuous. If we shift the cells or slightly change their size, this does not affect the resulting images; thus, the parameterization with small cells within a smoothed inversion provides grid-independent solutions.

Role of smoothing and amplitude damping
The next series of tests in Fig. 3 presents the recovery results with noise-free synthetic data calculated in the same model "SMILE_01" as shown in Fig. 2a. In this case, I investigated the role of smoothing in the results, while the coefficient of amplitude damping was always equal to zero. Here, the results derived with four different smoothing coefficients are shown. It can be seen that with both W_sm = 0 and W_sm = 1, the recovered model exhibits some instability seen as contrasting artefacts in the background. For the case of W_sm = 100, the model looks oversmoothed and loses some details. The best recovery is achieved with W_sm = 10, for which there are no artifacts, and all the relevant anomalies look very close to the initial synthetic model. The corresponding trade-off curves are shown in Fig. 4. In this case, an optimal damping was around 10, but this value is not distinguishable as a particular point on the trade-off curve. Note that the large values of damping in Fig. 4a reduce the mean amplitude of the solution to zero, whereas in the cases of large smoothing in Fig. 4b, the model deviation tends asymptotically to a certain non-zero value (~2 %).
In another series of tests, I investigated the role of smoothing and damping in cases of noisy data. Noise with a predefined mean deviation was randomly added to the travel times of all rays. In Figures 5 and 6, I show the results for a noise level of 0.2 s, which corresponds to a signal/noise ratio of 1.43. In Figure 5, I fixed a zero level of amplitude damping and varied the smoothing. It can be seen that a smoothing value of 10, which is optimal in the case of noise-free data, does not provide a stable solution: the result looks very scattered and the target image is almost unresolvable. To achieve the recovery of the desired structure, we need to increase the smoothing. For a smoothing of 50, the image becomes visible; however, some noise remains in the background. The most appropriate recovery with the minimum of artifacts is achieved for a smoothing of 100. At the same time, we see that small details of the model are no longer recovered, and the amplitudes of the anomalies are slightly lower than those in the original model. When using a smoothing of 250, the model becomes oversmoothed, which is obviously not optimal in our case. It is interesting that the trade-off curve corresponding to this case (red curve in Fig. 4b) shows no feature that would allow us to determine an optimal solution.
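The construction of such a trade-off curve can be sketched as follows. This is a generic Python illustration on a random synthetic system (not the computation behind Fig. 4), using the Tikhonov damping built into SciPy's LSQR:

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
A = sparse_random(200, 50, density=0.1, format="csr", random_state=0)
x_true = rng.normal(size=50)
b = A @ x_true + 0.1 * rng.normal(size=200)   # synthetic data with noise

# One point of the trade-off curve per damping value: mean solution
# amplitude versus the RMS of the remaining residuals.
curve = []
for w_am in (0.01, 0.1, 1.0, 10.0):
    x = lsqr(A, b, damp=w_am)[0]              # Tikhonov damping inside LSQR
    amp = np.mean(np.abs(x))                  # mean solution amplitude
    rms = np.sqrt(np.mean((A @ x - b) ** 2))  # residual deviation
    curve.append((w_am, amp, rms))
```

Plotting amp against rms over a dense sweep of damping values reproduces the L-shaped curve discussed above: the amplitude decreases and the residual grows as the damping increases.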
The effect revealed by this series of tests should be taken into account when performing tomography with real data. The presence of noise in the data requires large damping, which causes some loss of information. Therefore, one should be very careful about the quantitative interpretation of numerical values of anomalies derived from tomography, as they may appear far from the actual values existing in the real Earth. In most real seismological experiments, when working with noisy data, the true amplitudes of anomalies are hardly achievable. Small damping causes the appearance of artifacts that worsen the model, whereas large damping overcomes the instability but makes the amplitudes of anomalies smaller than in the real Earth.
Another series of tests, shown in Fig. 6, demonstrates the reconstruction results with a fixed zero value of smoothing and variable values of the amplitude damping. We see that when using only amplitude damping for the inversion of noisy data, the results always remain strongly scattered, including the cases of overdamped solutions.
This series of tests shows that using solely the amplitude damping for inversion is not optimal; it should always be combined with smoothing to remove the scattering of anomalies. The results presented in Fig. 6 show that the best solutions, in which the target anomalies are most distinguishable, are derived with amplitude damping between 20 and 40. Note that the trade-off curve shown as the red line in Fig. 4a indicates that the most optimal value, at the corner of this L-shaped curve, is located between 10 and 25. This example demonstrates that using the trade-off curve for finding the best damping coefficients is not always appropriate and may lead to wrong solutions.

Role of the ray distributions
The next series of tests demonstrates the importance of the ray coverage and the number of data in tomographic inversion. Figure 7 demonstrates the results of the recovery of the model "SMILE_01" (the same as shown in Fig. 2a) with the use of three incomplete ray schemes: "cross-well", "tunnel-surface" and "boreholes-surface".
In all cases, I set rather dense distributions of the sources and receivers and defined noise-free data. It can be seen that in all these cases, the target structure appears strongly smeared. A further decrease of the spacing of the source and receiver distributions does not bring any considerable change in the results, which shows the fundamental incapacity of the tomography algorithm to reveal certain structures when some illumination directions are missing. The same conclusions can be drawn from another series of tests shown in Fig. 8. In this case, I used a checkerboard definition of synthetic anomalies, which can form a model with layers or with a "fault" (second and third columns in Fig. 8, respectively). It can be seen that for the "cross-well" observations (upper row), both models are recovered almost perfectly. However, for the case of rays according to the "tunnel-to-surface" scheme (lower row), the results look much poorer. In the layered medium with equal amounts of positive and negative layers, all rays have the same residual values. As a result, the recovered model looks like a completely flat distribution with a constant velocity. At the same time, if there is a lateral heterogeneity in this model, such as a vertical fault, it is resolved by our data as a singular zone with particularly strong heterogeneities.
In another set of tests, shown in Fig. 9, I explore the minimum number of rays necessary to retrieve anomalies of a certain size. Here, I use checkerboard models with anomaly sizes of 25, 50 and 100 m. I created three sets of data with 360, 121 and 35 rays, shown in the left column of Fig. 9. Note that the source and receiver spacing values used to define these ray configurations were irregular to avoid interference of the ray distributions with the periodic synthetic anomalies. In all cases, the data were noise-free. It can be seen that for the dataset with the maximum number of rays (upper row in Fig. 9), all three synthetic models are correctly reconstructed. For the case of 121 rays, only the anomalies with sizes of 50 m and 100 m are recovered. When using 35 rays, we can only retrieve a very simple model with four anomalies having sizes of 100 m. Note that in all these cases, we used a very small grid spacing of 1 x 1 m, which results in 40,000 unknowns in our system of linear equations. However, these parameters are obviously not independent, as they are linked with each other by the additional equations used for smoothing. Therefore, the actual number of independent parameters is much smaller and mainly depends on the ray distributions and data numbers. The approximate number of independent parameters in the inversion can be estimated from synthetic tests, such as those demonstrated in Fig. 9, and it corresponds to the maximum number of anomalies that can potentially be retrieved.
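A checkerboard anomaly field of the kind used in these tests can be sketched in a few lines. The function below is a hypothetical illustration of the idea (+/-1 blocks of a given size on a fine grid of cells), not the BASIC TOMO definition-file format:

```python
import numpy as np

def checkerboard(nx, ny, cell, block):
    """Checkerboard anomaly field: alternating +1 / -1 blocks of size
    `block` meters on a grid of nx x ny cells of size `cell` meters."""
    ix = (np.arange(nx) * cell // block).astype(int)   # block index in x
    iy = (np.arange(ny) * cell // block).astype(int)   # block index in y
    return (-1.0) ** (ix[None, :] + iy[:, None])       # ny x nx array

# 25 m anomalies on 1 m cells, as in the finest model of this series:
model = checkerboard(200, 200, cell=1.0, block=25.0)
```

Scaling the +/-1 field by the desired amplitude then gives the velocity anomalies used to compute the synthetic travel times.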
For example, with 360 rays, we can resolve 64 anomalies; with 121 rays, 16 anomalies; and with 35 rays, only four anomalies. This gives approximately the same data/parameter ratio of 5-10 that is presumed by many authors for tomography models. However, using large cells to satisfy this condition leads to a strong dependency of the solution on the grid geometry, which makes the solution very unstable. The approach presented in this study, which consists of defining a very small grid spacing and controlling the stability by smoothing, appears to be more appropriate for taking into account the limited conditions of data recording.

Fig. 9.
Reconstructions of three checkerboard models using three datasets with different numbers of rays. The geometries of the rays are shown in the left column. In the recovery results, the shapes of the synthetic anomalies are indicated with black lines.

DEFINITION OF THE MODEL WITH BITMAP IMAGES
Besides the checkerboard and polygon-defined synthetic models, the BASIC_TOMO code also allows using gray-scale bitmap images to define synthetic distributions of anomalies. When converting the image into the model, we can define the map limits and the amplitude range for the black-and-white scale. In Figure 10, I show an example of the model "SALVADOR" derived from a photo. In this case, we can perform the same types of exercises as those considered in the previous sections for other models. In the top-right panel, I present the best solution, derived with noise-free data corresponding to all directions of illumination and a fine grid with a spacing of 1 x 1 m. In the bottom-left panel, the result corresponds to the same dataset, but the inversion was performed using a coarser grid with 5 x 5 m spacing. Finally, the bottom-right panel presents the recovery result with the use of cross-well data.

CONCLUSIONS
BASIC_TOMO is a simple tomography code that is very convenient for teaching and for investigating several fundamental aspects of performing tomographic inversion. In particular, this article demonstrates the role of grid spacing, smoothing, amplitude damping, noise in the data, and ray distributions in the quality of the inversion. The exercises realized with this simple code lead to several fundamental conclusions that should be taken into account when processing experimental data with professional tomography codes:
1. The grid spacing used for the parameterization of the tomography model should always be smaller than the minimum size of the resolved anomalies. The resolution of the model should be controlled not by the grid spacing but by smoothing.
2. When working with noisy data, amplitude damping alone is not capable of overcoming the instability related to strong scattering in the data; it should be applied together with damping by smoothing.
3. The presented tests have shown that the trade-off curves used by many authors to determine optimal values of the damping parameters are not helpful and may lead to incorrect choices. The best way to determine an optimal damping is to perform synthetic modeling with realistic ray configurations and an appropriate noise level.
The BASIC_TOMO code, with a complete description of the programs and models, can be found in the Supplementary materials: www.ipgg.sbras.ru/ru/Files/software/basic_tomo.zip. It can be used to reproduce all the results presented in this article and to create new models and exercises.