Some time ago I wrote a tutorial about
creating a coronavirus map with Python,
in which coronavirus cases are plotted on a map date by date with a moving
slider. A question followed: is it possible to create such a map using
GIS software? Yes. This tutorial will show you how to create a coronavirus map
with time series visualization, like the figure below, using the free and open source
software QGIS.
Figure 1. Coronavirus time series visualization
In this tutorial we will learn how to:
- Get coronavirus data
- Transform the data into a time series format
- Plot the data with QGIS
- Use the Time Manager plugin to visualize time series cases
Getting Coronavirus Data
The most important thing in any mapping activity is data
availability. To create the time series coronavirus map, we will get the
data from the
Johns Hopkins University (JHU) Covid-19 data repository on GitHub. On the GitHub page you will find several datasets along with the data
description. Since our mission is to create a time series map, look for the time series
data (shortcut link). The time series data page looks like figure 2. There you can find time
series datasets for confirmed, death and recovered cases, both for the US and
globally, in csv format.
Select a dataset, for example time_series_covid19_deaths_global.csv. We
will see the data in a table like figure 3.
Figure 3. Death cases data
From the table we can see columns such as
Province/State, Country/Region, Lat and Long, followed by a series of date
columns to the right. It is time series data, because there are a number
of date columns. Unfortunately, this layout won't work for a GIS
visualization using the Time Manager plugin in QGIS, because Time Manager
renders feature by feature, and each row is a single feature. A series of
date columns therefore can't be used to visualize temporal change. So, what should we do?
We have to transform the table from a series of date columns into a series of
date rows, as illustrated in figure 4.
Figure 4. Table transformation for time series data
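To make the reshaping concrete, here is a minimal sketch on a tiny in-memory table. The sample values, the `WIDE_CSV` name and the `wide_to_long` helper are made up for illustration. It assumes the first four columns are the fixed ones and everything after them is a date column. It uses Python's standard csv module, which also copes with quoted names that contain a comma.

```python
import csv
import io

# Hypothetical miniature of the wide table (values are made up for illustration).
WIDE_CSV = '''Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
,Italy,41.87,12.56,0,3
,"Korea, South",35.9,127.7,1,2
'''

def wide_to_long(wide_text, n_fixed=4):
    """Turn one row per location into one row per (location, date)."""
    reader = csv.reader(io.StringIO(wide_text))
    header = next(reader)
    dates = header[n_fixed:]  # assumption: every column after the fixed ones is a date
    long_rows = []
    for row in reader:
        province, country, lat, lon = row[:n_fixed]
        # Pair each date header with the count stored in the same column
        for date, count in zip(dates, row[n_fixed:]):
            long_rows.append([date, country, province, lat, lon, count])
    return long_rows

long_rows = wide_to_long(WIDE_CSV)
for r in long_rows:
    print(r)
```

Two locations with two date columns become four rows, one per location and date, which is the one-feature-per-row shape Time Manager needs. This toy version keeps zero counts; filtering them out is a separate choice.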
To transform the data, I wrote a short Python script, shown below. The
code transforms the original csv dataset of global death cases directly from
the GitHub raw URL. If you want to use another dataset, just change
the raw URL.
#CORONAVIRUS TIMESERIES DATA CONVERSION
#CREATED BY IDEAGORA GEOMATIS WWW.GEODOSE.COM
import urllib.request
import ssl

# HEADER
# Note: this turns off HTTPS certificate verification for the download
ssl._create_default_https_context = ssl._create_unverified_context
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]

raw_url='https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv'
file=opener.open(raw_url)
mybyte=file.read()
mystr=mybyte.decode("utf8")
file.close()

# Raw string so the backslashes in the Windows path are kept literally
output=open(r'F:\Covid\output_death_300520.csv','w') #OUTPUT PATH
data=mystr.split('\n')
head_init=data[0].split(',')
# Reverse the header (and each row below) so the date columns are indexed from the end of the line
head=head_init[::-1]

#WRITE OUTPUT
output_head='date,country,province,lat,lon,n_death\n'
output.write(output_head)
for i in range(len(head)-4): # every date column (the 4 non-date columns are skipped)
    for j in range(1,len(data)-1): # every data row (header and last element skipped)
        split_row=data[j].split(',')
        row=split_row[::-1]
        if row[i]!='0': # write only dates with a non-zero count
            date=head[i].rstrip()
            n_death=row[i].rstrip()
            lon=row[head.index('Long')]
            lat=row[head.index('Lat')]
            country=row[head.index('Country/Region')]
            province=row[head.index('Province/State')]
            # "Korea, South" contains a comma, so the plain split breaks it in two
            if country==' South"':
                province=''
                country='South Korea'
            output_row=date+','+country+','+province+','+lat+','+lon+','+n_death+'\n'
            output.write(output_row)
output.close()
print('Finished. Check the output path!')
After running the code, open the output file. The result will look like figure
5.
Figure 5. Output transformed dataset
Plotting Coronavirus Data in QGIS
After getting and transforming the data, let's plot it in QGIS with the
following steps.
Firstly, from the Layer menu, select Add Layer, then
Add Delimited Text Layer as in figure 6.
Figure 6. Add delimited text layer
The Data Source Manager window will appear, as in figure 7. Make sure
Delimited Text is selected. In File name, select the
transformed csv output file. Then, in the Geometry Definition option, select
the column names for X field and Y field (lon and lat in our output file). To make sure the data
will be parsed correctly, look at the Sample Data part. If the columns
are split correctly, then everything is fine.
Figure 7. Data Source Manager window
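For readers who prefer scripting, the same layer can probably be added from the QGIS Python console. The sketch below only builds the delimited text URI, so it runs in plain Python. The `delimited_text_uri` helper and the layer name are hypothetical, and EPSG:4326 is an assumption about the coordinates, not something the dataset description in this tutorial confirms. The PyQGIS calls are left as comments because they only work inside QGIS and have not been tested here.

```python
def delimited_text_uri(csv_path, x_field="lon", y_field="lat", crs="EPSG:4326"):
    """Build a URI for the QGIS 'delimitedtext' provider (hypothetical helper)."""
    # Normalise Windows backslashes and avoid a doubled slash after file:///
    path = csv_path.replace("\\", "/").lstrip("/")
    return f"file:///{path}?delimiter=,&xField={x_field}&yField={y_field}&crs={crs}"

# Same output path as in the conversion script
uri = delimited_text_uri(r"F:\Covid\output_death_300520.csv")
print(uri)

# Inside the QGIS Python console only (assumed usage, not run here):
# from qgis.core import QgsVectorLayer, QgsProject
# layer = QgsVectorLayer(uri, "covid_deaths", "delimitedtext")
# QgsProject.instance().addMapLayer(layer)
```

The x field is the longitude column and the y field is the latitude column, matching the choice made in the Geometry Definition option.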
After clicking the Add button, the points of coronavirus death cases
will be plotted on the QGIS map. You will see many points, just like a scatter
plot. To make it more meaningful, add a reference map or basemap such as
CartoDB Dark. You can find this basemap and add it easily to QGIS using the
Tile+ plugin. Figure 8 shows the coronavirus death locations all over the world on the CartoDB
Dark basemap.
Figure 8. Coronavirus death cases all over the world
Coronavirus Time Series Visualization
Now let's visualize the data as a time series, so we can see the change in
death cases date by date. To do the time series visualization in QGIS, we will
use the
Time Manager plugin. The plugin can be found in the Plugins menu, as shown in figure 9.
If you don't find it, install it using the
Manage and Install Plugins menu.
Figure 9. Time Manager plugin
After toggling the visibility of the Time Manager plugin, it will be docked at
the bottom of the QGIS window, as in figure 10.
Let's play with it. Click the Settings button. The Time Manager settings
window will appear. In that window, select Add Layer. The
Select layer and column(s) window will then appear, as in figure 11. Select
the csv data layer, then choose the date column for Start time. To
accumulate the cases from the start date to the end date, make sure to check the
Accumulate features option.
Figure 11. Time Manager settings
Before playing the time series visualization, make sure to set the
Time frame size to 1 day, because we want to see the change date by
date. Now let's see it in action. Push the play button and you should see
the time series visualization of coronavirus death cases, as in figure 12.
Figure 12. Coronavirus time series visualization
If you run into a date setting problem like the one in figure 13 below, open the csv
file in spreadsheet software. Then select the date column and format the cells
by selecting the Custom category and setting the type to
yyyy/mm/dd, as shown in figure 14.
Figure 13. Time Manager date problem
Figure 14. Format date column
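The same date fix can also be sketched in Python instead of a spreadsheet. This is only an illustration: it assumes the dates in the csv look like 5/30/20 (month/day/two-digit year), and the `to_ymd` helper name is made up.

```python
from datetime import datetime

def to_ymd(date_text, source_format="%m/%d/%y"):
    """Convert a date string such as '5/30/20' to 'yyyy/mm/dd'.

    source_format is an assumption about how the dates are written;
    change it if your csv uses another layout.
    """
    parsed = datetime.strptime(date_text.strip(), source_format)
    return parsed.strftime("%Y/%m/%d")

print(to_ymd("5/30/20"))
```

Applying this function to the date value before writing each output row would produce the yyyy/mm/dd form directly, so the spreadsheet step would no longer be needed.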
So far, so good. We can see the temporal change of coronavirus death cases
date by date. But let's make it prettier by scaling the dot size with the number of death
cases, so that a larger dot means more cases.
To do this, we just need to adjust the marker symbology settings.
Right click the point layer and open its properties. In the layer properties window, select the
control feature symbology icon. Select a marker, choose your favorite
color and set the Opacity to something like 30 or 40%. See figure
15.
For the marker size, we can't simply use the n_death column
directly, because in the marker's size unit the dots would be far too big. We
need to rescale it using a log scale. To do this, next to the Size option,
click edit. The Expression String Builder window will open, as in
figure 16. I used the natural logarithm (ln). For that, in the expression editor
type: ln("n_death").
Figure 16. Expression String Builder window
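To get a feel for what the log scale does to the numbers, here is a small Python sketch that mirrors the ln("n_death") expression with math.log. The `marker_size` name and the sample counts are only for illustration.

```python
import math

def marker_size(n_death):
    """Natural logarithm of the death count, mirroring ln("n_death")."""
    return math.log(n_death)

# Illustrative counts spanning several orders of magnitude
for n in (1, 10, 1000, 100000):
    print(n, round(marker_size(n), 2))
```

A count of 100000 maps to a size of about 11.5 instead of 100000, which keeps the dots readable. Note that a count of 1 maps to 0, so a location with a single death would get a zero-size marker under this expression.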
Done. Now let's play it again. You should see a visualization like figure
17. The death cases accumulate day by day, with larger dots for more cases.
Figure 17. Coronavirus time series temporal change with dynamic marker
size
That's all for this tutorial on how to create a coronavirus time series map using QGIS.
In this tutorial we learned how to get a coronavirus dataset, transform it into a
format that can be used by the Time Manager plugin, plot the cases on
the map and visualize the temporal change day by day. I hope this tutorial
is useful for you. May the pandemic soon be gone from our lovely planet.
Thanks for reading!
Watch the tutorial video!