Coronavirus has spread rapidly across the globe since its first outbreak in Wuhan, Hubei Province, China at the beginning of January 2020. By 30th January, more than 9000 people were reported to have been infected, with 213 deaths in total. The virus has been spreading outside China, and infections have been confirmed in France, Australia, Japan, Singapore, Malaysia, Germany, Italy, Sri Lanka, Cambodia, Nepal and many more countries. No one knows when it will be over; in fact, the number of confirmed cases is growing day by day.
In this post I will discuss how to create a simple application to track the spread of the Coronavirus using Python. At the end of this tutorial we will have an HTML page that shows a map of infected locations, including a slider to track the spread of the virus by date, as in figure 1. For this tutorial I'm using Python 3.7, Pandas, Plotly 4.1.0 and Jupyter Notebook.
Figure 1. Coronavirus spreading map
Importing Libraries
Let's start this tutorial by importing the Plotly and Pandas libraries, as in the code below. Before proceeding to the next step, try to run the code. If no error appears, then all required libraries are properly installed. Otherwise, check the official Plotly and Pandas pages for installation instructions and further information. If you don't have Jupyter Notebook on your own machine, I suggest using Google Colab, which provides it in the cloud.
import plotly.offline as go_offline
import plotly.graph_objects as go
import pandas as pd
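If you prefer a friendlier message than an import error, the check can be sketched as follows. This is only a minimal example using the standard library; the helper name `missing_libraries` is hypothetical and not part of Plotly or Pandas.

```python
import importlib.util


def missing_libraries(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two third-party libraries this tutorial relies on.
missing = missing_libraries(['plotly', 'pandas'])
if missing:
    print('Please install:', ', '.join(missing))
else:
    print('All required libraries are available.')
```

The check only looks up whether each package can be located; it does not import it or verify its version.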
Data Processing
The data used in this tutorial can be seen here. It is a shared Google spreadsheet that is updated with a one-day delay. Many thanks to those who compiled the data; it is really appreciated! We will read the data using the Pandas read_csv method. But first, we need to change the data link slightly, from:
https://docs.google.com/spreadsheets/d/18X1VM1671d99V_yd-cnUI1j8oSG2ZgfU_q1HfOizErA/edit#gid=0
to:
https://docs.google.com/spreadsheets/d/18X1VM1671d99V_yd-cnUI1j8oSG2ZgfU_q1HfOizErA/export?format=csv&id
The following code defines a url variable for the data link, reads the data using read_csv, and replaces blank cells (which are read as NaN values) with 0.
url='https://docs.google.com/spreadsheets/d/18X1VM1671d99V_yd-cnUI1j8oSG2ZgfU_q1HfOizErA/export?format=csv&id'
data=pd.read_csv(url)
data=data.fillna(0)
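The link change described above can also be done in code. The sketch below is a minimal helper, assuming the shared link always contains `/edit` right after the spreadsheet ID, as the link in this tutorial does. The function name `to_csv_export_url` is hypothetical; it is not part of Pandas or of any Google API.

```python
def to_csv_export_url(share_url):
    """Turn a Google Sheets '/edit...' link into the CSV export link.

    Assumption: the link contains '/edit' after the spreadsheet ID,
    like the example link used in this tutorial.
    """
    base, sep, _ = share_url.partition('/edit')
    if not sep:
        raise ValueError('expected a Google Sheets link containing /edit')
    # Same suffix as the export link used in this tutorial.
    return base + '/export?format=csv&id'


share_url = ('https://docs.google.com/spreadsheets/d/'
             '18X1VM1671d99V_yd-cnUI1j8oSG2ZgfU_q1HfOizErA/edit#gid=0')
csv_url = to_csv_export_url(share_url)
print(csv_url)
```

The resulting string can be passed to pd.read_csv in the same way as the url variable above.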
Understanding the data structure is very important in this step, because it determines the data processing approach. Try to view the data using data.head(). The first 5 rows will appear as in figure 2. At the bottom left of the figure, it can be seen that the data has 47 columns. The first five columns are: country, location_id, location, latitude and longitude. The remaining columns come in pairs of confirmedcase_dd-mm-yyyy and deaths_dd-mm-yyyy. When this tutorial was written there were 47 columns in total, which means the dataset covered (47-5)/2=21 days. If the starting date was 10-01-2020, then the end date was 30-01-2020.
Figure 2. Coronavirus data
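The column arithmetic above can be illustrated without downloading anything. The sketch below uses a short, made-up list of column names that follows the described layout (five fixed columns, then one confirmedcase/deaths pair per day); the real spreadsheet has more dates.

```python
# Hypothetical column names following the layout described above.
columns = ['country', 'location_id', 'location', 'latitude', 'longitude',
           'confirmedcase_10-01-2020', 'deaths_10-01-2020',
           'confirmedcase_11-01-2020', 'deaths_11-01-2020',
           'confirmedcase_12-01-2020', 'deaths_12-01-2020']

n_fixed = 5
n_days = (len(columns) - n_fixed) // 2

# Pair up the case/deaths columns and read the date from the deaths column.
# 'deaths_' is 7 characters long, so the date sits at positions 7 to 16.
pairs = []
for i in range(n_days):
    col_case = columns[n_fixed + 2 * i]
    col_dead = columns[n_fixed + 2 * i + 1]
    pairs.append((col_case, col_dead, col_dead[7:17]))

print(n_days)
for pair in pairs:
    print(pair)
```

With 47 columns, the same formula gives (47 - 5) // 2 = 21 days, matching the count above.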
The data is split by date in a loop. In each iteration, the pair of columns for one date is selected, locations without confirmed cases are dropped, and the result is added to the figure as a Scattergeo trace using fig.add_trace. In total, 21 traces are added to the figure's data. We can confirm this with len(fig.data). Type it in another cell; the output should be 21.
The code up to this step is shown below.
#SOME VARIABLES INITIATIONS
fig=go.Figure()
col_name=data.columns
n_col=len(data.columns)
date_list=[]
init=4
n_range=int((n_col-5)/2)

#LOOPING FOR DATA SPLITTING AND FIGURES
for i in range(n_range):
    col_case=init+1
    col_dead=col_case+1
    init=col_case+1
    df_split=data[['latitude','longitude','country','location',col_name[col_case],col_name[col_dead]]]
    #.copy() SO THE TEXT COLUMN IS ADDED TO AN INDEPENDENT DATAFRAME
    df=df_split[(df_split[col_name[col_case]]!=0)].copy()
    lat=df['latitude']
    lon=df['longitude']
    case=df[df.columns[-2]].astype(int)
    deaths=df[df.columns[-1]].astype(int)
    df['text']=df['country']+'<br>'+df['location']+'<br>'+'confirmed cases: '+ case.astype(str)+'<br>'+'deaths: '+deaths.astype(str)
    date_label=deaths.name[7:17]
    date_list.append(date_label)

    #ADDING GEOSCATTER PLOT
    fig.add_trace(go.Scattergeo(
        name='',
        lon=lon,
        lat=lat,
        visible=False,
        hovertemplate=df['text'],
        text=df['text'],
        mode='markers',
        marker=dict(size=15,opacity=0.6,color='Red', symbol='circle'),
    ))
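To see what a single pass of the loop produces, here is a small sketch for one date on toy data. The numbers and coordinates are made up purely for illustration and do not come from the spreadsheet; only the column naming follows the layout described earlier.

```python
import pandas as pd

# Toy data with made-up numbers, only to illustrate the filtering step.
data = pd.DataFrame({
    'country': ['China', 'Japan', 'France'],
    'location': ['Hubei', 'Tokyo', 'Paris'],
    'latitude': [30.97, 35.68, 48.85],
    'longitude': [112.27, 139.69, 2.35],
    'confirmedcase_10-01-2020': [41, 0, 0],
    'deaths_10-01-2020': [1, 0, 0],
})

col_case = 'confirmedcase_10-01-2020'
col_dead = 'deaths_10-01-2020'

# Keep only locations with at least one confirmed case on that date.
# .copy() gives an independent frame, so adding a column to it is safe.
df = data[data[col_case] != 0].copy()

# Build the hover text in the same format as the tutorial code.
df['text'] = (df['country'] + '<br>' + df['location']
              + '<br>confirmed cases: ' + df[col_case].astype(int).astype(str)
              + '<br>deaths: ' + df[col_dead].astype(int).astype(str))

print(df['text'].tolist())
```

Only the Hubei row survives the filter here, so the trace for that date would contain a single marker.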
Creating Slider
In this part we will add a slider to the map. The slider code is shown below.
The slider code consists of two main parts. The first is a loop that constructs the steps array, where the i-th step shows the i-th trace and hides the others. The second puts the constructed steps into the sliders object. When the slider is moved, it selects the step at the corresponding index of the steps array.
#SLIDER PART
steps = []
for i in range(len(fig.data)):
    step = dict(
        method="restyle",
        args=["visible", [False] * len(fig.data)],
        label=date_list[i],
    )
    step["args"][1][i] = True  # Toggle i'th trace to "visible"
    steps.append(step)

sliders = [dict(
    active=0,
    currentvalue={"prefix": "Date: "},
    pad={"t": 1},
    steps=steps
)]
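The visibility pattern built by the loop is easier to see on a tiny example. The sketch below builds the same kind of steps for three made-up date labels using plain Python, without a figure; the function name `build_steps` is hypothetical.

```python
def build_steps(labels):
    """Build one 'restyle' step per label; step i shows only trace i."""
    n = len(labels)
    steps = []
    for i, label in enumerate(labels):
        visible = [False] * n   # hide every trace...
        visible[i] = True       # ...except the one for this date
        steps.append(dict(method='restyle',
                          args=['visible', visible],
                          label=label))
    return steps


steps = build_steps(['10-01-2020', '11-01-2020', '12-01-2020'])
for step in steps:
    print(step['label'], step['args'][1])
```

Each step carries a list with exactly one True entry, which is why moving the slider shows one date at a time.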
Showing the Map and Save to HTML
In the last part, we show the map and save it to an HTML file. First, we set the first trace in the figure's data to be visible. Then the figure layout is updated by adding the sliders object, a title and a height. Finally, we show the map with fig.show() and save it to HTML with the go_offline.plot method. The code for this last step is shown below.
#SET FIRST FIGURE VISIBLE
fig.data[0].visible=True

#SHOW AND SAVE TO HTML
fig.update_layout(sliders=sliders,title='Coronavirus Spreading Map'+'<br>geodose.com',height=600)
fig.show()
go_offline.plot(fig,filename='F:/html/map_ncov.html',validate=True, auto_open=False)
Complete Code
The code below is the complete code to create the Coronavirus spreading map, as explained above. In the last line, don't forget to change the HTML output path to your own.
import plotly.offline as go_offline
import plotly.graph_objects as go
import pandas as pd

#READING DATA
url='https://docs.google.com/spreadsheets/d/18X1VM1671d99V_yd-cnUI1j8oSG2ZgfU_q1HfOizErA/export?format=csv&id'
data=pd.read_csv(url)
data=data.fillna(0)

#SOME VARIABLES INITIATIONS
fig=go.Figure()
col_name=data.columns
n_col=len(data.columns)
date_list=[]
init=4
n_range=int((n_col-5)/2)

#LOOPING FOR DATA SPLITTING AND FIGURES
for i in range(n_range):
    col_case=init+1
    col_dead=col_case+1
    init=col_case+1
    df_split=data[['latitude','longitude','country','location',col_name[col_case],col_name[col_dead]]]
    #.copy() SO THE TEXT COLUMN IS ADDED TO AN INDEPENDENT DATAFRAME
    df=df_split[(df_split[col_name[col_case]]!=0)].copy()
    lat=df['latitude']
    lon=df['longitude']
    case=df[df.columns[-2]].astype(int)
    deaths=df[df.columns[-1]].astype(int)
    df['text']=df['country']+'<br>'+df['location']+'<br>'+'confirmed cases: '+ case.astype(str)+'<br>'+'deaths: '+deaths.astype(str)
    date_label=deaths.name[7:17]
    date_list.append(date_label)

    #ADDING GEOSCATTER PLOT
    fig.add_trace(go.Scattergeo(
        name='',
        lon=lon,
        lat=lat,
        visible=False,
        hovertemplate=df['text'],
        text=df['text'],
        mode='markers',
        marker=dict(size=15,opacity=0.6,color='Red', symbol='circle'),
    ))

#SLIDER PART
steps = []
for i in range(len(fig.data)):
    step = dict(
        method="restyle",
        args=["visible", [False] * len(fig.data)],
        label=date_list[i],
    )
    step["args"][1][i] = True  # Toggle i'th trace to "visible"
    steps.append(step)

sliders = [dict(
    active=0,
    currentvalue={"prefix": "Date: "},
    pad={"t": 1},
    steps=steps
)]

#SET FIRST FIGURE VISIBLE
fig.data[0].visible=True

#SHOW AND SAVE TO HTML
fig.update_layout(sliders=sliders,title='Coronavirus Spreading Map'+'<br>geodose.com',height=600)
fig.show()
go_offline.plot(fig,filename='F:/html/map_ncov_slider.html',validate=True, auto_open=False)
That's all for this tutorial on how to create a Coronavirus spreading map in Python. In this tutorial we have learnt how to read data from a shared Google spreadsheet, process the data with Pandas and create a spreading map with a slider using Plotly. The output HTML page can be downloaded here. The information on the map depends on the spreadsheet updates, so every time the code is executed the map will reflect the latest information in the data. There are many opportunities to improve the map, such as adding a bar chart, an information summary, etc. Just explore it and share the result. It would be nice. Thanks for reading!