As humans living on the Earth, we are always trying to understand the Earth itself, much of which is hidden, blurred, invisible and complex. Using our knowledge and technology, we want to reveal hidden values or phenomena so that we can take advantage of them. To make things clearer, let's answer a question: why do we use interpolation? To explain, I will give an example from mineral deposits. A mineral deposit is spatially distributed in the ground. It might be visible in some patterns at the surface, but mostly it is invisible below the ground. A mining company investigates it to estimate the amount of mineral and later to decide whether it is feasible to mine. Nobody can give an exact answer about the size of the deposit, but with a systematic approach and methodology like spatial interpolation we can at least get an answer close to the truth. Next questions: What is spatial interpolation? How does it work? How do we use it? I hope this post and tutorial can give the answers and a better explanation.
Spatial Interpolation Definition and Methods
Interpolation is a method to predict an unknown value from known values. It follows from the definition that we need some known values before we can apply any interpolation method. The known values, commonly called sampling points, can be gathered from measurements and site investigations such as drilling and surveying. Using the known values at some locations, we try to predict the value at a nearby location where nothing was measured.
There are many interpolation methods available, from simple to sophisticated. To name a few: linear interpolation, Inverse Distance Weighting (IDW) and Kriging. The methods differ from each other and can give different results. That's why it is very important to understand how a spatial interpolation method works, so we can understand how the result is produced, under what conditions to apply it, how to apply it to get a better result, what errors to expect, and so on. In this post we will discuss a spatial interpolation method called Inverse Distance Weighting (IDW). We will see how it works and how to apply it using QGIS 3.
Inverse Distance Weighting (IDW) Interpolation Method
Inverse Distance Weighted interpolation is a deterministic spatial interpolation approach that estimates an unknown value at a location as a weighted average of known values. The basic IDW formula is given in equation 1, where x* is the unknown value at the location to be determined, w_i is the weight of known point i, and x_i is the value at known point i. The weight is the inverse of the distance d from the location of x* to each known point used in the calculation, raised to a power p, as shown in equation 2. A larger p gives nearby points more influence relative to distant ones.
\[x^*=\frac{w_1x_1+w_2x_2+w_3x_3+\dots+w_nx_n}{w_1+w_2+w_3+\dots+w_n}\] eq 1. Inverse Distance Weight formula
\[w_i=\frac{1}{d_{ix^*}^p}\] eq 2. Weight formula
Figure 1 illustrates how IDW interpolation works. As can be seen in the figure, the value at position x is determined from sampling points 1, 2 and 3, whose distances to point x are d1x, d2x and d3x. Using equation 2, the weight of each sampling point is calculated, and then the value at position x is determined using equation 1.
Figure 1. Inverse Distance Weight (IDW) Interpolation
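To make equations 1 and 2 concrete, here is a minimal Python sketch of the calculation. The function name `idw_estimate`, the `(x, y, value)` sample layout and the sample numbers are my own assumptions for illustration, not part of any GIS software. It also returns the sample value directly when the target sits exactly on a sampling point, because the weight in equation 2 is undefined at zero distance.

```python
import math

def idw_estimate(target, samples, power=2.0):
    """Estimate the value at `target` (x, y) from (x, y, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # Target coincides with a sampling point: use its value as is
            return value
        weight = 1.0 / distance ** power      # eq 2
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total        # eq 1

# Hypothetical sampling points 1, 2 and 3 around a target position x
samples = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0), (0.0, 3.0, 30.0)]
estimate = idw_estimate((1.0, 1.0), samples, power=2.0)
print(estimate)
```

Because the result is a weighted average, the estimate always stays between the smallest and largest sample values used.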
IDW Interpolation Implementation in GIS Software
Now we have a basic understanding of how IDW works. Next question: how is IDW interpolation implemented in GIS software? The main problem in turning IDW into a software algorithm is deciding how many sampling points are used in the calculation. There are two approaches: a fixed number of points, or a search radius around the point to be determined (point x). With the first approach, the user defines how many points around point x are used, so the algorithm has to find that number of closest points to point x. With the second, the user specifies a radius from point x, and the algorithm selects the sampling points that fall within that radius.
What about the power (P) value? An optimum P value can be found with cross validation, by looking for the P that gives the minimum RMSE between the interpolation result and the actual values, as I explained before. If you are using Geostatistical Analyst in ArcGIS, it will automatically attempt to select an optimal value. Unfortunately, not all GIS software provides an automatic algorithm for finding an optimum P value, so you may have to do it manually.
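The two search approaches and the manual search for P can be sketched in Python as follows. This is only an illustration under my own assumptions: the function names, the `(x, y, value)` sample layout and the use of leave-one-out cross validation (each sampling point is removed in turn and predicted from the rest) are choices I made here, not what any particular GIS package does internally.

```python
import math

def select_neighbors(target, samples, max_points=None, radius=None):
    """Return (distance, value) pairs, nearest first, limited by radius and/or count."""
    candidates = sorted(
        (math.hypot(x - target[0], y - target[1]), value) for x, y, value in samples
    )
    if radius is not None:
        candidates = [c for c in candidates if c[0] <= radius]
    if max_points is not None:
        candidates = candidates[:max_points]
    return candidates

def idw_from_neighbors(neighbors, power):
    """Apply eq 1 and eq 2 to already selected (distance, value) pairs."""
    if not neighbors:
        return None                     # nothing inside the search window
    if neighbors[0][0] == 0.0:
        return neighbors[0][1]          # target sits on a sampling point
    weights = [1.0 / d ** power for d, _ in neighbors]
    return sum(w * v for w, (_, v) in zip(weights, neighbors)) / sum(weights)

def loo_rmse(samples, power, max_points=None, radius=None):
    """Leave-one-out RMSE: predict every sample from the remaining ones."""
    squared_errors = []
    for i, (x, y, actual) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        neighbors = select_neighbors((x, y), others, max_points, radius)
        predicted = idw_from_neighbors(neighbors, power)
        if predicted is not None:
            squared_errors.append((predicted - actual) ** 2)
    if not squared_errors:
        return math.inf                 # no sample could be predicted
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def best_power(samples, candidate_powers, **search):
    """Pick the candidate P with the smallest leave-one-out RMSE."""
    return min(candidate_powers, key=lambda p: loo_rmse(samples, p, **search))
```

With a list of samples, a call such as `best_power(samples, [1.0, 2.0, 3.0], max_points=8)` returns the candidate P with the lowest error for that search setting. The best P can change when the number of points or the radius changes, so both should be tested together.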
How to Perform IDW Interpolation in QGIS
Now I will show you how to do IDW interpolation in QGIS. For this tutorial I'm using QGIS 3.4.2 Madeira. If you don't have it, you can download QGIS from the official QGIS website. The installation is quite simple and straightforward. Read my post about QGIS Introduction, which explains it in more detail.
Downloading and Preparing Dataset
In this tutorial we will use the Coal dataset. It's a simulated dataset based on a real coal seam in Southern Africa, and it is a companion dataset for the Practical Geostatistics textbook by Isobel Clark. The dataset contains coal seam thickness in meters, calorific value in megajoules per tonne, and ash content and sulphur content in %. You can download the Coal data at Kriging coal dataset.
After downloading the coal data, you must do some cleaning using a text editor like Notepad or Notepad++. Delete the top two lines of the original file (figure 2) because we don't need them; they are just short information about the data. Then change the column name separator from tab to comma (,). The data must look like figure 3.
Figure 2. Original coal dataset
Figure 3. Coal dataset after editing
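If you prefer a script over a text editor, the same cleaning can be sketched in Python. The file names `coal.dat` and `coal.csv` are hypothetical, and I assume here that every tab in the remaining lines is a separator that should become a comma.

```python
from pathlib import Path

def clean_coal_text(raw_text):
    """Drop the two information lines at the top and turn tabs into commas."""
    lines = raw_text.splitlines()[2:]
    return "\n".join(line.replace("\t", ",") for line in lines) + "\n"

def clean_coal_file(source_path, output_path):
    """Read the downloaded file and write the cleaned version next to it."""
    raw_text = Path(source_path).read_text()
    Path(output_path).write_text(clean_coal_text(raw_text))

# Hypothetical usage, with file names that are assumptions:
# clean_coal_file("coal.dat", "coal.csv")
```

Open the output once in a text editor to confirm that it matches figure 3 before loading it into QGIS.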
Adding Dataset into QGIS
After the dataset is ready, let's add it to the QGIS map canvas. Open the Data Source Manager, then select Delimited Text. For the File name, browse and select the coal data. Because the dataset is separated with commas, make sure to select Comma under File Format. Next, specify the fields for the point coordinates, which are the columns X co-ordinate and Y co-ordinate for X and Y. The coordinates are in meters in a local system (I couldn't find any information about which coordinate system the dataset uses, so I assumed a local coordinate system). Because there is no information about the coordinate system, I chose the World Mercator coordinate system (EPSG: 3395). Under Sample data you can preview what the data looks like. If the data is separated correctly into columns, then everything is fine. Lastly, click the Add button to add the data into QGIS and close the Data Source Manager window.
Figure 4. Add data set into QGIS
Export Data to Shapefile Format
After the data is loaded into QGIS, export it to shapefile format. Right-click the point layer, then select Export >> Save Features As. The Save Vector Layer as... window will appear. Select ESRI Shapefile for Format. In File name, specify the name and location where the exported data will be saved. For CRS, use the same one as before, World Mercator.
Figure 5. Export coal data to shape file
Performing IDW Interpolation in QGIS
Now we are ready to do the IDW interpolation. In QGIS we can do IDW interpolation using three tools: IDW Interpolation from the QGIS Interpolation tools, v.surf.idw from GRASS, and Grid (IDW with nearest neighbor searching) from GDAL. If you type the keyword IDW, those three tools will appear in the processing toolbox as in figure 6.
Figure 6. IDW available tools in QGIS
For this tutorial I'm using the IDW interpolation tool from GDAL. From the processing toolbox, open Grid (IDW with nearest neighbor searching) under the GDAL tools as in figure 6. The GDAL IDW interpolation window will appear as in figure 7. In Point layer, make sure you select the correct point dataset to be interpolated. Then you can set some parameters: the Weighting power (the P value), Smoothing (a higher value gives a smoother result, default 0) and the Search radius in dataset units, in this case meters (I gave it 300). Don't leave the search radius at 0, because then it won't find any points even if you set a maximum number of points. Next, specify the Z value, which is the field to be interpolated. I want to make a coal calorific value map, so I chose the corresponding field. Then, at the bottom of the tool window, you can specify where the output will be saved. If you want to view the result before you save it, leave it as a temporary file. Finally, click the Run button to start the interpolation. When it is done, the result will be added to your QGIS map canvas.
Figure 7. GDAL IDW Interpolation Tool
Figure 8. Change Symbology
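The GDAL tool used above can also be driven from a script. As far as I know it wraps the `gdal_grid` command line utility with its `invdistnn` algorithm, so the sketch below builds an equivalent command in Python. The file names, the field name `calorific` and the `max_points` value are hypothetical, so replace them with the ones from your own dataset. The command is only built here, not executed.

```python
def build_gdal_grid_command(source, output, z_field, power=2.0, smoothing=0.0,
                            radius=300.0, max_points=12, min_points=0, nodata=0.0):
    """Build a gdal_grid call for IDW with nearest neighbor searching."""
    algorithm = (
        f"invdistnn:power={power}:smoothing={smoothing}:radius={radius}"
        f":max_points={max_points}:min_points={min_points}:nodata={nodata}"
    )
    return [
        "gdal_grid",
        "-a", algorithm,        # interpolation algorithm and its parameters
        "-zfield", z_field,     # attribute field to interpolate
        "-of", "GTiff",         # output raster format
        source,
        output,
    ]

# Hypothetical file and field names, for illustration only
command = build_gdal_grid_command("coal.shp", "coal_idw.tif", "calorific")
print(" ".join(command))
```

If GDAL is installed and on your path, the list can be passed to `subprocess.run(command, check=True)` to run the interpolation outside QGIS.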
Comparing The IDW Interpolation Result
Before closing this post, I want to share the results of the IDW interpolation using all three tools. The three results can be seen in figure 9.
Figure 9. IDW comparison result
Hopefully this post and tutorial about spatial interpolation using Inverse Distance Weighted (IDW) gives you a better understanding of what spatial interpolation is, how it works and how to perform the interpolation using free GIS software (QGIS). As I mentioned at the beginning of this post, there are several spatial interpolation methods available. Next time I will try to discuss a famous spatial interpolation method, Kriging. Thanks for reading, and I really appreciate your feedback. There is also a post about how to create an IDW algorithm in Python from scratch, so check it out if you're interested.