



RESEARCH ARTICLE 

Year : 2022  |  Volume : 59  |  Issue : 4  |  Page : 337-347

An investigation of the efficacy of different statistical models in malaria forecasting in the semi-arid regions of Gujarat, India
Chander Prakash Yadav^{1}, Rajendra Baharia^{2}, Ritesh Ranjha^{2}, Syed Shah Areeb Hussain^{2}, Kuldeep Singh^{2}, Nafis Faizi^{2}, Amit Sharma^{3}
^{1} ICMR-National Institute of Malaria Research, New Delhi; Academy of Scientific and Innovative Research; ICMR-National Institute of Cancer Prevention & Research, Noida, NCR, India
^{2} ICMR-National Institute of Malaria Research, New Delhi, India
^{3} ICMR-National Institute of Malaria Research; Academy of Scientific and Innovative Research; Molecular Medicine Division, International Centre for Genetic Engineering and Biotechnology, New Delhi, India
Date of Submission: 24-Jun-2022
Date of Acceptance: 23-Aug-2022
Date of Web Publication: 07-Feb-2023
Correspondence Address: Dr Chander Prakash Yadav, ICMR-National Institute of Malaria Research, Sector-8, Dwarka, New Delhi, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/0972-9062.355959
Background & objectives: Robust forecasting of malaria cases is desirable as India approaches malaria elimination. Methods that enable robust forecasting and timely case detection in areas of unstable transmission are the need of the hour. Methods: The forecasting efficacy of eight prominent statistical models based on three statistical methods, namely the Generalized Linear Model (Models A and B), smoothing (Model C), and SARIMA (Models D to H), was compared using twelve years (2008–19) of monthly malaria data from two districts (Kheda and Anand) of Gujarat state, India. Results: SARIMA Model F was found to be the most appropriate when forecasting for 2017 and 2018 using model-building data sets 1 and 2, respectively, for both districts. For 2019, Model H followed by Model C gave the best point estimates. Still, we rejected these two models because their confidence intervals were so wide that they had no forecasting utility. Model F ranked third in terms of point prediction but gave a relatively narrower confidence interval. Therefore, Model F was considered the most appropriate for 2019 as well, for both districts. Interpretation & conclusion: Model F was found to be relatively more appropriate than the other models and can be used to forecast malaria cases in both districts.
How to cite this article: Yadav CP, Baharia R, Ranjha R, Hussain SS, Singh K, Faizi N, Sharma A. An investigation of the efficacy of different statistical models in malaria forecasting in the semi-arid regions of Gujarat, India. J Vector Borne Dis 2022;59:337-47
How to cite this URL: Yadav CP, Baharia R, Ranjha R, Hussain SS, Singh K, Faizi N, Sharma A. An investigation of the efficacy of different statistical models in malaria forecasting in the semi-arid regions of Gujarat, India. J Vector Borne Dis [serial online] 2022 [cited 2023 Mar 28];59:337-47. Available from: http://www.jvbd.org//text.asp?2022/59/4/337/355959
Introduction   
Malaria is a vector-borne disease transmitted by female Anopheles mosquitoes and caused by Plasmodium parasites. Five species of Plasmodium mainly cause human malaria, of which two, Plasmodium falciparum (Pf) and Plasmodium vivax (Pv), are the prevalent contributors in India. In 2016, the Indian Government launched the ‘National Framework for Malaria Elimination (NFME) 2016–2030’^{[1]}. The goal of NFME 2016–30 is to eliminate malaria (zero indigenous cases) throughout the country by 2030, maintain malaria-free status in areas where transmission has been interrupted, and prevent the reintroduction of malaria^{[1],[2]}. In this regard, malaria forecasting plays a vital role, as projecting future cases helps to prepare, prioritize and mobilize resources. Methods that allow robust forecasting and timely case detection in areas of unstable transmission, and that work as early warning systems, are the need of the hour. Accordingly, malaria forecasting is being done in many countries, including China, which has recently eliminated malaria^{[3],[4]}.
Malaria forecasting is complex and requires a holistic approach. An essential element of accurate forecasting is determining and prioritizing the appropriate predictors among the diverse factors and mechanisms that affect malaria transmission. For example, rainfall helps to increase vector density, but excess rain may wash away the vector's breeding habitats. Ambient temperature plays a vital role in parasite development within the mosquito (the sporogonic cycle), which takes approximately 9 to 10 days at an average temperature of 28°C; when the temperature falls below 16°C, development is interrupted^{[3],[5]}. Similarly, vector survival also requires a specific range of temperature. Precipitation, which is related to rainfall, also affects vector bionomics^{[6],[7]}. Therefore, it is not enough to include the appropriate variables in a malaria forecasting model; the way they are entered into the model is also important. Defining and assessing the procedure used to feed the variables into the models is crucial. Another significant issue when dealing with ecological variables is determining the appropriate lag period^{[8],[9]}. A lag period in malaria forecasting is the presumed or inferred gap between a change in climatic factors and the corresponding change in malaria incidence. Identifying a relevant lag period is difficult because there is a paucity of scientific evidence on how to choose one. Frequently, researchers set the lag period to 1 or 2 months based on the parasite life cycle, intuition, and/or convention. However, in some parts of the world, such as Kenya and Ghana, lags of more than two months have been reported^{[10],[11]}. Because of such intricacies, many researchers prefer techniques based only on the past pattern of malaria incidence, which do not require data on extraneous variables.
Different methods^{[12],[14]} have been proposed for malaria forecasting to address the above-discussed issues. These may be broadly classified into three categories: mathematical methods^{[15],[16]}, machine learning methods^{[16],[17]}, and statistical methods^{[16],[18]}. Statistical methods are more prominent in malaria forecasting than the others because of their intuitive and robust nature. In this research, we used the eight most prominent statistical models for malaria forecasting, which can be broadly categorized into three methods: Time Series Regression / Generalized Linear Model (GLM)^{[18],[19]}, Smoothing^{[18]}, and Seasonal AutoRegressive Integrated Moving Average (SARIMA)^{[18]}. Several comparisons among these models have been made in the past to identify the best models. However, their comparative efficacy, measured as the difference between actual and forecasted malaria cases, has not been studied in detail. Therefore, in this study, we compared the forecasting accuracy of the eight statistical models (Model A to Model H) to determine the most appropriate model for future use, using monthly malaria burden data for twelve years (2008–2019) from two districts of Gujarat: Kheda and Anand.
Material & Methods   
Study site
Two adjoining districts of Gujarat, India (viz. Kheda and Anand), which experience a semi-arid climate, were chosen for this analysis [Figure 1]. Although Gujarat has the longest coastline in India, most of the state lies in the warm semi-arid BSh climatic subtype (Mid Latitude Steppe and Desert climate) of the Köppen climate classification, which is characterized by variable temperatures and relatively low rainfall^{[20]}. Northern Gujarat, comprising the Kachchh region, lies in the hot arid climatic subtype (BWh, Desert Climate), and only Southern Gujarat, which experiences the humid tropical savannah climatic subtype (Aw), receives ample rainfall^{[20]}. The districts of Anand and Kheda lie in Central Gujarat, which has the semi-arid climatic subtype, with average monthly temperatures ranging from 20 to 35°C. Annual precipitation in these districts lies between 600 and 800 mm, with July being the wettest month and up to 32 days of precipitation annually.
Figure 1: Study area. The yellow highlighted area in the country map is Gujarat (one of the states of India), and the two red highlighted areas on the right are the two study districts of Gujarat: Kheda and Anand.
Study data
Monthly malaria cases
Monthly malaria case data for 2008–2019 from two districts of Gujarat, Kheda and Anand, were obtained from the state health department of Gujarat.
Environmental data
Along with monthly malaria cases, monthly ecological data for the same period, such as average rainfall, atmospheric temperature (minimum, maximum, and average), humidity, and wind speed, were taken from the World Weather Online website^{[21],[22]} (www.worldweatheronline.com).
Study variables
To assess the role of different climatic variables in driving malaria transmission in two districts of Gujarat, a preliminary correlation analysis was conducted between various climatic predictors and malaria cases. The correlation test identified four primary variables of interest (one outcome variable and three exposure variables).
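The article does not specify which correlation test was used or how lags were handled. As a minimal sketch, assuming a Pearson correlation and simple month shifts, a lagged correlation between a climatic predictor and monthly cases could be computed as follows. The function names and the numbers are illustrative assumptions, not the study data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(predictor, cases, lag):
    """Correlate cases in month t with the predictor in month t - lag."""
    if lag == 0:
        return pearson(predictor, cases)
    return pearson(predictor[:-lag], cases[lag:])

# Made-up illustrative monthly values (January to December), NOT the study data.
rainfall = [0, 0, 5, 10, 40, 120, 300, 280, 90, 20, 5, 0]
cases = [10, 8, 9, 12, 15, 30, 80, 200, 260, 120, 40, 15]

# Correlation of cases with rainfall 0, 1 and 2 months earlier.
correlations = {lag: lagged_correlation(rainfall, cases, lag) for lag in range(3)}
for lag, r in correlations.items():
    print(f"lag {lag} month(s): r = {r:.2f}")
```

In this toy series the cases peak one month after the rain, so the correlation at a one-month lag is higher than the unlagged one. With real data, several candidate lags would be compared in the same way.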
Data bifurcation and model comparison
The data were divided into model-building and testing data sets. We created three model-building data sets, model-building data set 1 (2008–16), model-building data set 2 (2009–17) and model-building data set 3 (2010–18), and three testing data sets: testing data set 1 (2017), testing data set 2 (2018) and testing data set 3 (2019).
All eight models (Model A to Model H) were assessed thrice. First, we independently forecasted malaria cases for 2017 with all eight models using model-building data set 1 and compared the forecasts with the actual cases of 2017 (testing data set 1). Then we forecasted malaria cases for 2018 and 2019 with all the models using model-building data sets 2 and 3, respectively, and compared the forecasts with the actual values of 2018 (testing data set 2) and 2019 (testing data set 3), respectively.
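The three model-building/testing pairs form a rolling nine-year window followed by a one-year test period. A minimal sketch of how such splits could be generated is shown below; the function name `rolling_splits` is a hypothetical helper, not something taken from the study.

```python
def rolling_splits(first_year, last_year, train_years):
    """Return (model-building years, test year) pairs for a rolling window."""
    splits = []
    start = first_year
    while start + train_years <= last_year:
        train = list(range(start, start + train_years))
        test = start + train_years
        splits.append((train, test))
        start += 1
    return splits

# Nine years of model-building data, then one year of testing data.
splits = rolling_splits(2008, 2019, train_years=9)
for train, test in splits:
    print(f"build on {train[0]}-{train[-1]}, test on {test}")
```

Called with the study period, this yields the windows 2008–16, 2009–17 and 2010–18 with test years 2017, 2018 and 2019.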
Methods used
The efficacy of eight statistical models (Model A to Model H) based on three statistical methods (time series regression/Generalized Linear Model (GLM), smoothing, and SARIMA) in forecasting malaria cases in semi-arid regions was assessed in this study. These statistical methods and the models based on them are briefly described below.
Models based on Generalized Linear Model (GLM) or Time series regression
A time series is a sequence of data points recorded at regular intervals of time. When more than one series is recorded over the same period, time series regression can be used to investigate whether change in the outcome variable is explained by change in the predictor variables, assuming a linear relationship between the outcome and predictor time series. Two models (Model A and Model B) were constructed using time-series regression. In Model A, monthly malaria cases were forecasted taking rainfall, humidity, and minimum temperature as extraneous variables. In Model B, seasonality and trend components were added as extraneous variables along with these three environmental variables^{[23],[24]}.
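The article does not give the functional form of Models A and B. One plausible reading, assuming an identity link, unlagged covariates, a linear trend, and monthly dummy variables for seasonality (all of which are assumptions made here for illustration), is:

```latex
% Model A (assumed form): climate covariates only
y_t = \beta_0 + \beta_1\,\mathrm{rain}_t + \beta_2\,\mathrm{hum}_t
      + \beta_3\,\mathrm{tmin}_t + \varepsilon_t

% Model B (assumed form): adds a linear trend and monthly seasonal dummies
y_t = \beta_0 + \beta_1\,\mathrm{rain}_t + \beta_2\,\mathrm{hum}_t
      + \beta_3\,\mathrm{tmin}_t + \delta\,t
      + \sum_{m=2}^{12} \gamma_m\,D_{m,t} + \varepsilon_t
```

Here $y_t$ is the number of malaria cases in month $t$, $D_{m,t}$ equals 1 when month $t$ is calendar month $m$ and 0 otherwise (January is the reference month), and $\varepsilon_t$ is the error term. A count model such as a Poisson GLM with a log link would replace $y_t$ on the left-hand side with $\log \mathrm{E}[y_t]$.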
Model based on smoothing techniques (or Holt-Winters method)
The Holt-Winters method is a smoothing method that exponentially smooths the time series, giving progressively smaller weights to older values and larger weights to more recent ones. There are three variants: single exponential smoothing (used when the data have no trend or seasonal pattern and only the level changes over time), double exponential smoothing (used when level and trend are present but there is no seasonal component), and triple exponential smoothing (suitable when the time-series data contain all three components: level, trend and season). Using the Holt-Winters method, we constructed one model, Model C, in which triple exponential smoothing was used^{[25]}.
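To make the three components concrete, the following is a minimal sketch of additive triple exponential smoothing. The article does not state whether the additive or multiplicative form was used, how the components were initialized, or how the smoothing parameters were chosen, so the initialization heuristic and the values of `alpha`, `beta` and `gamma` below are assumptions for illustration only.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing; returns `horizon` point forecasts."""
    m = season_length
    n = len(series)
    if n < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initial values (one common heuristic among several):
    # level = mean of the first season, trend = average per-step change
    # between the first two seasons, seasonal = deviation from that level.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonal = [series[i] - level for i in range(m)]

    for t in range(m, n):
        idx = t % m
        previous_level = level
        # Level: deseasonalized observation blended with the previous projection.
        level = alpha * (series[t] - seasonal[idx]) + (1 - alpha) * (previous_level + trend)
        # Trend: latest change in level blended with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Season: latest deviation from the level blended with the old seasonal value.
        seasonal[idx] = gamma * (series[t] - level) + (1 - gamma) * seasonal[idx]

    return [level + h * trend + seasonal[(n - 1 + h) % m] for h in range(1, horizon + 1)]

# Toy series with a repeating 4-step pattern, NOT the study data.
pattern = [10, 20, 40, 30]
history = pattern * 3
forecasts = holt_winters_additive(
    history, season_length=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4
)
print(forecasts)
```

For monthly malaria counts, `season_length` would be 12. Because the toy series repeats exactly and has no trend, the forecasts reproduce the repeating pattern.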
Models based on Seasonal AutoRegressive Integrated Moving Average (SARIMA) method
SARIMA is one of the most widely accepted approaches to time series analysis, in which forecasting is based on temporal autocorrelation in the data. In this category, we have five models: Model D to Model H. The notation (p,d,q)(P,D,Q)s is used to describe the temporal pattern, where lowercase ‘p’ and ‘q’ represent the orders of the non-seasonal autoregressive and moving average parts, respectively, and lowercase ‘d’ represents the degree of non-seasonal differencing. Capitalized ‘P’, ‘Q’, and ‘D’ represent the seasonal autoregressive part, seasonal moving average part, and seasonal differencing, respectively, with ‘s’ representing the length of the seasonal period. The values of ‘p’ and ‘q’ are determined by examining the autocorrelation function (ACF) and partial autocorrelation function (PACF), while the value of ‘d’ may be determined by plotting cases against time. Of the five models, the first four (Model D to Model G) forecast malaria cases by analyzing the past pattern of the data only and do not consider extraneous variables, while Model H uses extraneous variables (rainfall, humidity, and minimum temperature in our case) as well as the past pattern of the data. Models D and E are SARIMA models on the original scale selected by the minimum AIC and BIC criteria, respectively, while Models F and G are SARIMA models on the log scale selected by the minimum AIC and BIC criteria, respectively^{[26]}.
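The preprocessing steps that precede order selection can be sketched as follows: a log transform (as in Models F and G), non-seasonal and seasonal differencing (the ‘d’ and ‘D’ terms), and the sample ACF. This is only an illustration under assumptions. Using log(count + 1) to guard against zero counts is a choice made here, the data are made up, and model fitting, the PACF and the AIC/BIC search are omitted.

```python
from math import log

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

# Toy monthly counts with a yearly pattern and mild growth, NOT the study data.
one_year = [5, 4, 6, 8, 12, 30, 70, 150, 220, 110, 35, 10]
counts = [value + 2 * year for year in range(3) for value in one_year]

log_counts = [log(c + 1) for c in counts]        # log scale; +1 guards against zero counts
seasonal_diff = difference(log_counts, lag=12)   # D = 1 with s = 12
both_diff = difference(seasonal_diff, lag=1)     # d = 1 on top of D = 1

autocorrelations = acf(log_counts, max_lag=12)
print(f"ACF at lag 12: {autocorrelations[12]:.2f}")
```

A positive autocorrelation at lag 12 in monthly data is the usual sign of a yearly seasonal pattern, which is what motivates the seasonal terms (P, D, Q) with s = 12.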
Model efficacy
Model efficacy was determined by taking the difference between actual and forecasted malaria cases and expressing it in terms of Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The lower the RMSE and MAE, the better the model^{[27],[28]}.
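Both measures follow directly from the monthly forecast errors: RMSE is the square root of the mean squared error, and MAE is the mean of the absolute errors. A minimal sketch with made-up numbers (not the study data):

```python
from math import sqrt

def rmse(actual, forecast):
    """Root Mean Square Error between actual and forecasted values."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean Absolute Error between actual and forecasted values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Made-up illustrative values, NOT the study data.
actual_cases = [10, 20, 30]
forecast_cases = [12, 18, 33]

print(f"RMSE = {rmse(actual_cases, forecast_cases):.2f}")
print(f"MAE  = {mae(actual_cases, forecast_cases):.2f}")
```

Because RMSE squares the errors before averaging, it penalizes a few large misses (for example in the peak months) more heavily than MAE does, which is why the two are usually reported together.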
Statistical software
All the statistical analyses were carried out using R 3.4.1 and Stata 15.0. The geographical map was constructed using GeoArch software.
Ethical statement
The study was approved by the Institute Ethics Committee, ICMR-National Institute of Malaria Research, New Delhi, India (Ref No.: ECR / NIMR / EC / 2018/34 dated 26/02/2018).
Results   
The primary objective of the study was to compare the forecasting efficacy of the eight most prominent statistical models in terms of accurate month-wise prediction of malaria cases for two districts in the semi-arid region of Gujarat (viz. Kheda and Anand) [Figure 1], while the secondary objective was to assess the role of epidemiological variables in malaria forecasting in areas with a semi-arid climate. As discussed in the Methods section, the forecasting efficacy of all eight models was assessed thrice using three different data sets. First, malaria cases for 2017 were forecasted and compared with the actual cases of 2017, independently for the two districts. Similarly, cases were forecasted for 2018 and 2019 using model-building data sets 2 and 3, respectively, and compared with the actual cases of 2018 and 2019.
The most appropriate model(s) for district Kheda
Relatively lower RMSE and MAE were observed for all SARIMA models than for the GLM and Holt-Winters (HW) models for 2017 and 2018, but in 2019 the HW-based model also gave lower RMSE and MAE, along with the SARIMA-based models. Amongst the SARIMA models, Models F and G (both with RMSE = 29.5 and MAE = 17.8) were found to be the most appropriate when forecasting for 2017 using model-building data set 1 (2008–16). For 2018, only Model F (RMSE = 21.1, MAE = 14.3) was found to be the most appropriate. In 2019, Model H was the most accurate in terms of the point estimate. However, its 80% confidence interval was so wide that it had no forecasting utility. Model C was the next most accurate model in terms of the point estimate, but it suffered from the same limitation: its confidence intervals were so wide that they had no forecasting utility. The model with the third-lowest RMSE and MAE was Model F, which had narrower confidence intervals and, therefore, more forecasting utility. Thus, Model F was found to be the most appropriate model for predicting malaria cases in Kheda district for all three years [Table 1].
Table 1: Comparison of actual and forecasted cases of malaria by all eight models in three consecutive years (2017, 2018 and 2019) for Kheda district of Gujarat, India.
The most appropriate model(s) for district Anand
The results for Anand district were similar to those for Kheda. Here too, the SARIMA-based models performed well in 2017 and 2018. In 2019, Model H followed by Model C had better point estimates than Model F. Still, their 80% confidence intervals were so broad that they had no forecasting utility. Therefore, Model F was also recommended for Anand district [Table 2].
Table 2: Comparison of actual and forecasted cases of malaria by all eight models in three consecutive years (2017, 2018 and 2019) for Anand district of Gujarat, India.
Forecasting by the model F
The overall distributions of actual cases and of cases forecasted by Model F are comparable in all months except September and October, which are generally considered the peak months for malaria [Figure 2].
Figure 2: Comparison of actual and forecasted cases of malaria in two districts of Gujarat, Kheda and Anand, in the years 2017, 2018, and 2019 using Model F.
Monthly distribution of environmental variables over the years and their role in malaria forecasting
In general, rainfall over the study years in these two districts was relatively stable and followed a similar pattern. The rains start in June and end in September, peaking in either July or August. Like rainfall, the humidity pattern over the study years was nearly identical in the two districts. Generally, high humidity is observed from mid-July to mid-September. The pattern of minimum temperature in different months over the study period is also comparable between the two districts [Figure 3]. The peak in malaria cases is seen in September in both districts; as the rainfall peak is observed in August, we believe there is a one-month lag between rainfall and malaria transmission [Figure 4].
Figure 3: Distribution of ecological variables (rainfall, humidity & minimum temperature) over the months and across the years for Kheda (A, B & C) and Anand (D, E, & F) districts.
 Figure 4: Distribution of malaria cases over the months and across the years for Kheda (A) and Anand (B) districts, Gujarat.
In this research, three of the eight models (viz. Model A, Model B and Model H) use environmental data as extraneous variables, which allows the role of environmental variables to be assessed. Amongst these three, only Model H was found appropriate in terms of the point estimate when forecasting for 2019. Still, the forecasting intervals from Model H were so broad that they had little practical utility in malaria forecasting. This suggests that environmental variables do have a role in malaria forecasting, but a model that makes proper autocorrelation adjustments can be more fruitful: a reasonable forecast can be made even without incorporating data on extraneous variables.
Discussion   
Malaria forecasting is a valuable tool for malaria control programme officers, as future case projections help to prepare, prioritize and mobilize resources for effective control; this gains even more importance in regions with unstable and epidemic malaria. Statistical, mathematical, and machine learning methods are the three most widely used techniques in malaria forecasting. In this work, we focused on statistical methods and compared the relative efficacy of eight different models. SARIMA-based models were found to be the most appropriate for malaria forecasting; these models have been recommended by other researchers as well^{[29],[31]}. Within the SARIMA-based models, Model F was found to be the most appropriate. It must be reiterated that Model F is a SARIMA model that uses only the past values of the series to predict future values. This means that data on other covariates (exogenous variables such as rainfall, humidity, and minimum temperature) are not needed in this model. In Kheda district, the forecasting accuracy of Model F within a range of ±20 cases was 75% for 2017 and 83% for each of the subsequent years, 2018 and 2019. In Anand district, it was 58%, 75%, and 83% for 2017, 2018, and 2019, respectively, within the same range. A forecasting accuracy of 75% or more within ±20 cases is quite reasonable, has practical utility, and may be used to forecast malaria cases in advance. Predictions from Model F are reasonably accurate for all months except September and October, when forecasted cases are markedly higher than the actual cases. However, even in these months, the 80% confidence interval captures the actual value in most scenarios. We also suggest that these models be incorporated into malaria dashboards so that programme personnel have a reasonable estimate of future cases and can modify their strategies according to the predicted caseload^{[32],[33]}.
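The accuracy figures above are not formally defined in the article. One reading that is consistent with them, assuming accuracy means the share of the twelve months in which the forecast falls within ±20 cases of the actual count, is sketched below (58%, 75% and 83% would then correspond to 7, 9 and 10 of 12 months). The helper names and all numbers are illustrative assumptions, not the study data.

```python
def share_within_tolerance(actual, forecast, tolerance=20):
    """Fraction of periods where |actual - forecast| <= tolerance."""
    hits = sum(1 for a, f in zip(actual, forecast) if abs(a - f) <= tolerance)
    return hits / len(actual)

def interval_coverage(actual, lower, upper):
    """Fraction of periods where the actual value lies inside [lower, upper]."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)

# Made-up monthly values (January to December), NOT the study data.
actual = [5, 8, 10, 12, 20, 35, 60, 110, 150, 90, 40, 15]
forecast = [7, 6, 14, 10, 25, 30, 75, 135, 210, 140, 38, 12]

accuracy = share_within_tolerance(actual, forecast, tolerance=20)
print(f"months within ±20 cases: {accuracy:.0%}")

# Coverage of a forecast interval, shown on two toy months.
coverage = interval_coverage([5, 50], lower=[0, 60], upper=[10, 80])
print(f"interval coverage: {coverage:.0%}")
```

In the toy data, the three misses fall in August to October, mirroring the pattern described above in which errors concentrate in the peak transmission months.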
The forecasted cases in the malaria transmission season, i.e., September and October, were much higher than the actual cases in all the models, including Model F. This overestimation may be understood in the context of the models' underlying assumptions. All these models presume that past patterns will remain unchanged in the future, that is, that no new intervention(s) will be implemented in the forecast period. Hence, the model-building time series does not account for new interventions. However, biological control and the use of Alphacypermethrin as an insecticide were introduced in Gujarat in 2005 and 2008, respectively, and the coverage and use of these two interventions changed (improved) during the study period. Two other major interventions introduced during the study period were the distribution of Long-Lasting Insecticide-treated Nets (LLINs) and the use of ACTs, introduced in 2011 and 2013, respectively; both are still in operation. After the introduction of these two interventions, a significant reduction in malaria cases was observed in subsequent years in both districts, with a relatively higher impact in Kheda district. After the introduction of LLINs, malaria cases declined continuously in the three subsequent years (viz. 2012, 2013 and 2014), but rose again in 2015 and 2016. Arguably, this rise could have led to stronger on-field control measures, particularly in the malaria season, resulting in fewer actual cases on the ground, especially in Kheda district.
The role of ecological variables such as rainfall, humidity, and temperature cannot be ignored in malaria forecasting. However, we observed that if past patterns in the data are analyzed appropriately, a reasonable forecast can be made for regions with a semi-arid climate without incorporating environmental variables. Given the intricacies of dealing with a multitude of such variables and the paucity of appropriate mechanisms to feed them into a model, this is an important advantage. For instance, the minimum temperature for a particular month comprises around 30 values (one for each day of the month). Given that rains tend to reduce temperatures, there can be wide differences among these values. However, we analyze a single average value representing that month, which is not ideal considering that the minimum temperature is associated with the parasite's sporogonic cycle in the mosquito, which affects transmission.
Similarly, the average rainfall for a month ignores subtle differences, such as three or four rains of differing intensity during the month. In such situations, the first rainfall may increase vector density, while subsequent high-intensity rainfall may wash away the breeding habitats. Statistical models that take monthly averages as inputs are thus unable to account for such differential impacts. Such operational difficulties reduce the likelihood of detecting an association between malaria cases and environmental data.
Limitation   
The study was conducted in an area with a stable and predictable pattern of rainfall that does not reach extremes, which limits its external validity for regions with unstable or extreme rainfall patterns. Climatic extremes brought about by climate change can significantly affect the magnitude of outbreaks, and accurately forecasting these outbreaks for early response and prevention is often a challenge. However, to our knowledge, this is the first study comparing the utility and appropriateness of different models for malaria forecasting in semi-arid regions. Given malaria's potential to reappear, forecasting has a vital role in its elimination.
Abbreviations   
SARIMA: Seasonal AutoRegressive Integrated Moving Average; GLM: Generalized Linear Model; RMSE: Root Mean Square Error; MAE: Mean Absolute Error; A: Actual cases; _{L}P_{U}: Point forecast (P) with the lower bound (L) and upper bound (U) of the 80% CI; Model A: Generalized Linear Model / Time Series Regression taking rainfall, humidity, and minimum temperature as covariates; Model B: Generalized Linear Model / Time Series Regression taking rainfall, humidity, minimum temperature, seasonality, and trend as covariates; Model C: Holt-Winters method with triple exponential smoothing; Model D: SARIMA model based on the lowest AIC criterion; Model E: SARIMA model based on the lowest BIC criterion; Model F: SARIMA model based on the lowest AIC criterion on log-transformed data; Model G: SARIMA model based on the lowest BIC criterion on log-transformed data; Model H: SARIMA regression with rainfall, humidity, and minimum temperature as exogenous variables.
Conflict of interest: None
Acknowledgements   
We are very thankful to the State authority of Gujarat, India for providing the data and to ICMR-NIMR for all logistic support. We are very grateful to Mr. Sanjeev Gupta, STO, ICMR-NIMR, for his help in GIS mapping.
References   
1.  
2.  Home :: National Vector Borne Disease Control Programme (NVBDCP). 2021; published online Aug 24. https://nvbdcp.gov.in/ (Accessed on August 24, 2021). 
3.  Alemu A, Abebe G, Tsegaye W, Golassa L. Climatic variables and malaria transmission dynamics in Jimma town, South West Ethiopia. Parasit Vectors 2011; 4: 30. 
4.  Okunlola OA, Oyeyemi OT. Spatiotemporal analysis of association between incidence of malaria and environmental predictors of malaria transmission in Nigeria. Sci Rep 2019;9. 
5.  Craig MH, Snow RW, le Sueur D. A climate-based distribution model of malaria transmission in sub-Saharan Africa. Parasitol Today Pers Ed 1999; 15: 105–11. 
6.  Okunlola OA, Oyeyemi OT. Spatiotemporal analysis of association between incidence of malaria and environmental predictors of malaria transmission in Nigeria. Sci Rep 2019;9. 
7.  Lindblade KA, Walker ED, Wilson ML. Early Warning of Malaria Epidemics in African Highlands Using Anopheles (Diptera: Culicidae) Indoor Resting Density. J Med Entomol 2000; 37: 664–74. 
8.  Wu Y, Qiao Z, Wang N, et al. Describing interaction effect between lagged rainfalls on malaria: an epidemiological study in southwest China. Malar J 2017; 16: 53. 
9.  
10.  Hashizume M, Terao T, Minakawa N. The Indian Ocean Dipole and malaria risk in the highlands of western Kenya. Proc Natl Acad Sci 2009; 106: 1857–62. 
11.  Klutse NAB, Aboagye-Antwi F, Owusu K, Ntiamoa-Baidu Y. Assessment of Patterns of Climate Variables and Malaria Cases in Two Ecological Zones of Ghana. Open J Ecol 2014; 4: 764–75. 
12.  Yu CM, Abraham WT, Bax J, et al. Predictors of response to cardiac resynchronization therapy (PROSPECT)-study design. Am Heart J 2005; 149: 600–5. 
13.  Lauderdale JM, Caminade C, Heath AE, et al. Towards seasonal forecasting of malaria in India. Malar J 2014; 13: 310. 
14.  Zinszer K, Kigozi R, Charland K, et al. Forecasting malaria in a highly endemic country using environmental and clinical predictors. Malar J 2015; 14: 245. 
15.  MacDonald G. The epidemiology and control of malaria. London: Oxford University Press, 1957. 
16.  Zinszer K, Verma AD, Charland K, et al. A scoping review of malaria forecasting: past work and future directions. BMJ Open 2012; 2. 
17.  James A. Anderson. An introduction to neural networks. Cambridge, MA: The MIT Press, 1995. 
18.  RJ Hyndman, G Athanasopoulos. Forecasting: Principles and Practice. otexts, 2013. 
19.  
20.  Beck HE, Zimmermann NE, McVicar TR, Vergopolan N, Berg A, Wood EF. Present and future Köppen-Geiger climate classification maps at 1-km resolution. Sci Data 2018; 5: 180214. 
21.  
22.  
23.  
24.  
25.  
26.  
27.  
28.  
29.  Twumasi-Ankrah S, Pels WA, Nyantakyi K, Addo DK. Comparison of Statistical Techniques for Forecasting Malaria Cases in Ghana 2019; 4: 102. 
30.  Anokye R, Acheampong E, Owusu I, Isaac Obeng E. Time series analysis of malaria in Kumasi: Using ARIMA models to forecast future incidence. Cogent Soc Sci 2018; 4: 1461544. 
31.  Wang M, Pan J, Li X, et al. ARIMA and ARIMA-ERNN models for prediction of pertussis incidence in mainland China from 2004 to 2021. BMC Public Health 2022; 22: 1447. 
32.  Yadav CP, Sharma A. National Institute of Malaria Research-Malaria Dashboard (NIMR-MDB): A digital platform for analysis and visualization of epidemiological data. Lancet Reg Health - Southeast Asia 2022; 100030. 
33.  Rahi M, Sharma A. For malaria elimination India needs a platform for data integration. BMJ Glob Health 2020; 5: e004198. 
