Reference evapotranspiration (ETo) is a key component of the hydrological process, and its estimation is a complex, nonlinear problem. In this study, the utility of multilayer perceptron regression, a popular tool for predicting the outcome of complex problems, is investigated for estimating the ETo of Jodhpur city, India, which has a hot arid climate. Four multilayer perceptron regression-based models are created and compared. Each model has a different architecture, in which the size (number of neurons) of the input and hidden layers is decided by the strength of the correlation between the meteorological attributes and the ETo obtained with the Food and Agriculture Organization Penman-Monteith method (FAO-PM56). The model with five meteorological inputs (high and low temperature, solar radiation, wind speed at 2 m, and humidity) and nine neurons at the hidden layer achieved high predictive accuracy, with a mean absolute error (MAE) of 0.08, mean squared error (MSE) of 0.01, root mean squared error (RMSE) of 0.10, Pearson correlation (r) of 0.99, and coefficient of determination (r²) of 0.99. The finding of this study is that multilayer perceptron regression-based models with at least three meteorological inputs (temperature, solar radiation, and wind speed) can be used effectively to estimate ETo and may be of interest to agriculturists, engineers, and researchers for irrigation scheduling, water resource management, crop production enhancement, drought area prediction, etc.


Introduction
Efficient water management techniques are required in agriculture, industry, and the energy sector because each of them needs large amounts of fresh water. Crop-water requirements can be estimated using evapotranspiration methods. Evapotranspiration is a prime element of the hydrological process and combines two processes that are hard to separate, namely evaporation from surfaces and transpiration from plants and crops. When it is calculated for a reference grass surface it is known as reference evapotranspiration (ETo), which is subsequently used to estimate crop evapotranspiration (ETc). ETo expresses the evaporating capacity of a particular region over a particular period. It is affected by climatic factors, namely temperature (both high and low), solar radiation, atmospheric humidity, wind speed at 2 m height, and sunshine hours, and also by geographical factors, namely latitude, longitude, and elevation. ETo is measured in millimeters of water depth per unit of time. The Penman-Monteith method is the standard empirical method recommended by the Food and Agriculture Organization of the United Nations (FAO) [1] and is commonly termed FAO-PM56. The concept of temperature-based estimation of ETo was introduced by Hargreaves et al. [2] and requires only solar radiation and temperature parameters.
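For orientation, the FAO-PM56 equation for daily ETo is reproduced below in the form usually quoted from FAO Irrigation and Drainage Paper 56. The symbols are those of the FAO publication and are not taken from this paper, which does not print the equation.

```latex
ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}
            {\Delta + \gamma\,(1 + 0.34\,u_2)}
% ET_o  : reference evapotranspiration [mm day^{-1}]
% R_n   : net radiation at the crop surface [MJ m^{-2} day^{-1}]
% G     : soil heat flux density [MJ m^{-2} day^{-1}]
% T     : mean daily air temperature at 2 m height [deg C]
% u_2   : wind speed at 2 m height [m s^{-1}]
% e_s, e_a : saturation and actual vapour pressure [kPa]
% \Delta: slope of the vapour pressure curve [kPa degC^{-1}]
% \gamma: psychrometric constant [kPa degC^{-1}]
```

The equation needs temperature, radiation, wind speed, and humidity data together, which is why models that work with fewer inputs are attractive where some of these measurements are unavailable.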
Since the beginning of this century, many methods based on artificial neural networks and machine learning techniques have been proposed and applied by various researchers. These methods have demonstrated their predictive power for estimating ETo and have been studied extensively.
Multilayer perceptron regression is a popular tool used to predict the results of complex problems. A strength of the multilayer perceptron is that its architecture can easily be customized for different input sizes. The goal of this investigation is to examine the utility of multilayer perceptron regression-based models for estimating the reference evapotranspiration of Jodhpur city, India. Four models (Ann1, Ann2, Ann3, and Ann4) are proposed and compared. Each model has a different architecture, in which the size of the input and hidden layers depends on the strength of the correlation between the meteorological parameters and the observed ETo (FAO-PM56). The performance of these models is evaluated on statistical parameters, namely Pearson correlation (r), coefficient of determination (r²), mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE). The models' performance is also represented visually with regression plots, residual plots, and box plots.
The rest of the paper is organized as follows. Section 2 describes the methodology adopted in the present study to estimate the ETo of Jodhpur city. It covers the study site and data collection, model formation, and model evaluation parameters. Section 3 presents the results obtained by the proposed models and their comparative performance analysis. Finally, Section 4 summarizes the conclusions and findings of the study along with future work.

Data collection
The proposed multilayer perceptron regression-based models are examined here using climatic data of Jodhpur city, Rajasthan, India. A total of 8,036 samples of daily climatic data covering 22 years (2000-2021) were obtained from the India Meteorological Department (IMD), Pune. The collected data include the daily climatic parameters highest and lowest temperature (T_high and T_low) in degrees centigrade (℃), relative humidity (R_hum) in percent (%), wind speed at 2 m (u_ws) in meters per second (m s⁻¹), and solar radiation (R_sr) in megajoules per square meter per day (MJ m⁻² day⁻¹). The climate of Jodhpur city is hot and arid, with a large variation between winter and summer temperatures (from 1.67 ℃ to 47.65 ℃). The average elevation of Jodhpur city is 231 m, its latitude is 26.2971° N, and its longitude is 72.97725° E. The climatic data of Jodhpur city are described in Table 1. A matrix of correlation coefficients between ETo (from FAO-PM56) and the climatic parameters is shown in Table 2 and represented graphically by a heatmap in Figure 1. This matrix shows that temperature (both high and low), solar radiation, and wind speed have a positive correlation with ETo, whereas humidity has a negative one. In other words, ETo increases as temperature, solar radiation, and wind speed increase, and decreases as humidity increases. The daily variation in climate at Jodhpur city from 2000 to 2021 is shown in Figure 2.
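A correlation matrix of this kind can be computed as in the minimal sketch below. It uses synthetic data as a stand-in for the IMD records, which are not reproduced here; the variable names, the distributions, and the coefficients used to generate the artificial ETo column are all illustrative assumptions and say nothing about the values in Table 2.

```python
import numpy as np

# Synthetic stand-in for the daily records (assumed distributions, not IMD data).
rng = np.random.default_rng(42)
n = 1000
t_high = rng.normal(34.0, 8.0, n)            # highest temperature
t_low = t_high - rng.normal(13.0, 2.0, n)    # lowest temperature
r_sr = rng.normal(20.0, 5.0, n)              # solar radiation
u_ws = rng.normal(2.5, 0.8, n)               # wind speed at 2 m
r_hum = rng.normal(45.0, 15.0, n)            # relative humidity

# Artificial response: rises with temperature, radiation and wind, falls with humidity.
eto = 0.15 * t_high + 0.10 * r_sr + 0.5 * u_ws - 0.05 * r_hum + rng.normal(0.0, 0.3, n)

names = ["T_high", "T_low", "R_sr", "u_ws", "R_hum", "ETo"]
data = np.column_stack([t_high, t_low, r_sr, u_ws, r_hum, eto])

# Pearson correlation matrix; columns are variables, rows are daily samples.
corr = np.corrcoef(data, rowvar=False)

# Last column holds the correlation of every variable with ETo.
eto_corr = dict(zip(names, corr[:, -1]))
for name, value in eto_corr.items():
    print(f"{name:>6}: {value:+.2f}")
```

Ranking the predictors by the absolute value of their correlation with ETo gives one way of deciding which inputs to add first when building models of increasing size.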
In this study, simulated results are compared with the ETo obtained from the FAO-PM56 equation. Many developing and underdeveloped countries lack the resources to obtain accurate meteorological data. Machine learning-based models may be a suitable alternative for estimating ETo accurately when meteorological data are limited. This motivates the creation of multilayer perceptron models with different input sizes and the comparison of their performance.

Multilayer perceptron
The effective and accurate data processing capability of the multilayer perceptron attracts researchers from different domains who need to model and solve complex nonlinear problems. The multilayer perceptron can be regarded as a directed graph in which nodes (neurons) are processing elements and weighted edges link the neurons. It is a popular feed-forward artificial neural network. The simplest version has an input layer and an output layer. More complex architectures can be created by inserting hidden (middle) layers between the input (first) and output (last) layers to solve more difficult problems. The size of each layer may vary. There is no specific procedure for deciding the architecture of an artificial neural network, and an optimal architecture is usually found by trial and error. The multilayer perceptron can solve both classification and function approximation problems. The size of the input layer is determined by the input attributes fed to the network. In this study, highest temperature (T_high), lowest temperature (T_low), wind speed (u_ws), solar radiation (R_sr), and relative humidity (R_hum) are the input attributes, so the input layer can have at most five neurons. Since the estimation of reference evapotranspiration is a function approximation problem, the output layer (last layer) requires a single neuron. The most crucial task is to find the size (number of neurons) of the hidden layer. Many rules of thumb for setting the size of the hidden layer have been given in the literature. In this investigation, it is taken to be less than twice the size of the input layer. The ability of multilayer perceptron regression to learn makes it an intelligent system. Among the various learning algorithms, gradient descent is the most straightforward: it modifies the weights iteratively to minimize the error of the network.
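The structure described above (input layer, one hidden layer, single output neuron) can be sketched as a plain forward pass. This is an illustrative sketch only: the function names and the random weights are assumptions, and ReLU is used in the hidden layer because it is the activation the study reports; a trained network would have weights fitted by the learning algorithm.

```python
import numpy as np

def relu(z):
    """Rectified linear unit applied element-wise."""
    return np.maximum(z, 0.0)

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> ReLU hidden layer -> one linear output neuron."""
    hidden = relu(x @ w_hidden + b_hidden)
    return hidden @ w_out + b_out

# Five inputs and nine hidden neurons (less than twice the input size).
n_inputs, n_hidden = 5, 9
rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(n_inputs, n_hidden))  # untrained, random weights
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(size=(n_hidden, 1))
b_out = np.zeros(1)

x = rng.normal(size=(4, n_inputs))   # four hypothetical, already normalized samples
eto_hat = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(eto_hat.shape)
```

Each row of `x` yields one output value, which is why a regression network of this kind needs only a single neuron in its output layer.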
In this study, the Quasi-Newton method is used, which is more efficient than gradient descent and works well for data sets of intermediate size. The adopted multilayer perceptron regression architecture is given in Figure 3, and the parameters required for this architecture are summarized in Table 3.
It can be seen in Table 3 that each of these models has a single input layer, a single hidden layer, and a single output layer. The sizes (numbers of neurons) of the input and hidden layers are customized for each model. In the first model (Ann1), two inputs (T_low, T_high) are applied, so two neurons are needed at the input layer. The number of neurons at the hidden layer of this model (from 2 to 3) is optimized using the grid search method. In the second model (Ann2), three inputs (T_low, T_high, R_sr) are applied, so three neurons are needed at the input layer, and the number of neurons at the hidden layer (from 3 to 5) is optimized. In the third model (Ann3), four inputs (T_low, T_high, R_sr, u_ws) are applied, so four neurons are needed at the input layer, and the number of neurons at the hidden layer (from 4 to 7) is optimized. Similarly, in the fourth model (Ann4), five inputs (T_low, T_high, R_sr, u_ws, R_hum) are applied, so five neurons are needed at the input layer, and the number of neurons at the hidden layer (from 5 to 9) is optimized. The Quasi-Newton algorithm is used to train these four models, modifying the weights so as to minimize the error between observed and predicted outcomes. The rectified linear unit (ReLU) is used as the activation function (transfer function) during the learning process.
Figure 3. The architecture of the adopted artificial neural network (multilayer perceptron regression): an input layer of size n, a hidden layer whose size varies from n to 2n-1, and an output layer.
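The hidden-layer search described above can be sketched with scikit-learn as follows. This is a sketch under stated assumptions, not the authors' code: the synthetic data, the three-fold cross-validation, the scoring metric, and `max_iter` are illustrative choices, and `solver="lbfgs"` is scikit-learn's quasi-Newton optimizer, assumed here to correspond to the Quasi-Newton method named in the text. The helper `hidden_sizes` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

def hidden_sizes(n_inputs):
    """Candidate single-hidden-layer sizes from n to 2n-1 (hypothetical helper)."""
    return [(h,) for h in range(n_inputs, 2 * n_inputs)]

# Synthetic, already normalized data with three predictors (an Ann2-like setup).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 1.5 * X[:, 0] + 0.8 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=300)

search = GridSearchCV(
    MLPRegressor(activation="relu", solver="lbfgs", max_iter=500, random_state=0),
    param_grid={"hidden_layer_sizes": hidden_sizes(X.shape[1])},
    scoring="neg_mean_squared_error",  # assumed selection criterion
    cv=3,                              # assumed number of folds
)
search.fit(X, y)

best_hidden = search.best_params_["hidden_layer_sizes"]
print("candidates:", hidden_sizes(X.shape[1]), "selected:", best_hidden)
```

With two to five inputs the candidate lists contain two to five hidden-layer sizes, so an exhaustive grid search stays cheap for all four models.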

Model formation
The model formation is given in Figure 4. First, the climatic and geographical information of Jodhpur city is loaded into memory. To obtain quality results, missing values are replaced by the average value of the corresponding attribute, and the data are normalized by the z-score method. The ETo value obtained from the FAO-PM56 equation is treated as the response variable in the loaded data set, whereas the remaining attributes (lowest and highest temperature, solar radiation, wind speed, and humidity) are treated as predictors. The climatic data set (8,036 samples) of Jodhpur city, India, is divided into two groups: the first contains 80% of the data and is used for training, whereas the second contains the remaining 20% and is used for testing. Four architectures of artificial neural networks (multilayer perceptron regression) are created and named Ann1, Ann2, Ann3, and Ann4. They differ in the sizes of their input and hidden layers. The grid search method is applied to the training samples to find the optimal number of neurons at the hidden layer of each architecture. The models are then trained with the optimal number of neurons and tested on the test samples. The appropriateness of these models is measured with the statistical indicators named in the introduction (r, r², MSE, MAE, and RMSE).
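The preprocessing steps of this workflow (mean imputation, z-score normalization, and the 80/20 split) can be sketched with NumPy as below. The function names, the random seed, and the toy data are assumptions made for illustration; the order of operations follows the description in the text, with normalization applied before the split.

```python
import numpy as np

def impute_mean(X):
    """Replace missing values (NaN) by the mean of their column."""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def zscore(X):
    """Normalize each column to zero mean and unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def split_80_20(X, y, seed=0):
    """Randomly split samples into 80% training and 20% testing groups."""
    order = np.random.default_rng(seed).permutation(len(X))
    cut = int(0.8 * len(X))
    train, test = order[:cut], order[cut:]
    return X[train], X[test], y[train], y[test]

# Toy data: 100 samples, 5 predictors, with a few values knocked out.
rng = np.random.default_rng(1)
X_raw = rng.normal(loc=20.0, scale=5.0, size=(100, 5))
X_raw[[3, 17, 42], [0, 2, 4]] = np.nan
y = rng.normal(loc=6.0, scale=2.0, size=100)   # placeholder response values

X_clean = zscore(impute_mean(X_raw))
X_train, X_test, y_train, y_test = split_80_20(X_clean, y)
print(X_train.shape, X_test.shape)
```

A common variant computes the mean and standard deviation on the training group only and reuses them for the test group, which keeps test-set statistics out of the training data.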

Model evaluation
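The following well-known statistical indices are used to measure the appropriateness of Ann1, Ann2, Ann3, and Ann4: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), Pearson correlation (r), and coefficient of determination (r²).
In these indices, EVP_x denotes the predicted evapotranspiration and EVO_x denotes the observed evapotranspiration for sample x.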
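The indices named in this section can be computed with their standard textbook definitions, as in the sketch below. The function name `evaluate` is hypothetical, and the coefficient of determination is computed here as one minus the ratio of residual to total sum of squares; this is an assumption, since the study may instead report the square of the Pearson correlation.

```python
import numpy as np

def evaluate(evo, evp):
    """Standard error and correlation indices for observed (evo) and predicted (evp) ETo."""
    evo = np.asarray(evo, dtype=float)
    evp = np.asarray(evp, dtype=float)
    err = evo - evp
    mae = np.mean(np.abs(err))                 # mean absolute error
    mse = np.mean(err ** 2)                    # mean squared error
    rmse = np.sqrt(mse)                        # root mean squared error
    r = np.corrcoef(evo, evp)[0, 1]            # Pearson correlation
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((evo - evo.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination (assumed form)
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "r": r, "r2": r2}

# Small made-up example, not data from the study.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

MAE, MSE, and RMSE are in the units of ETo (or its square), so lower is better, while r and r² are dimensionless and approach 1 for a good fit.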

Results and discussion
The behavior of the proposed models (Ann1, Ann2, Ann3, and Ann4) is discussed and compared in this section. The sklearn library is used to implement the models, and a different set of input parameters is applied to each of them. The meteorological data are randomly divided into two disjoint groups: 80% of the samples are used for training and the remaining 20% are kept for model testing. Grid search is applied to the training group to find the optimal value of the hyperparameter (the number of neurons at the hidden layer) of each model. Table 4 shows the predictive performance of the models.
The architecture of the Ann1 model is formed with three layers: an input layer, a hidden layer, and an output layer. Two input parameters, namely T_low and T_high, are applied to the input layer of this model. The grid search approach is used to find the optimal number of neurons at the hidden layer. The best performance of this model is achieved with three neurons at the hidden layer, giving MAE = 0.95, MSE = 1.49, RMSE = 1.22, r = 0.91, and r² = 0.82. The model's regression analysis is depicted in Figure 5(a), where a slope of 0.82 is found. Graphically, it can be seen that many ETo values lie far from the fitted line. A high variation in the residuals, from -4.28 to 4.14, is observed, which is shown in Table 5 and in Figure 5(b).
Three input parameters (T_low, T_high, and R_sr) are applied to the input layer of the Ann2 model. The best performance of this model is achieved with four neurons at the hidden layer, giving MAE = 0.62, MSE = 0.65, RMSE = 0.80, r = 0.96, and r² = 0.92. The model's regression analysis is depicted in Figure 6(a), where a slope of 0.92 is found. Graphically, it can be seen that a few ETo values lie far from the fitted line. The residuals vary from -3.13 to 3.69, which is shown in Table 5 and in Figure 6(b). The numerical values reveal that the Ann2 model outperforms the Ann1 model.
Four input parameters (T_low, T_high, R_sr, and u_ws) are applied to the input layer of the Ann3 model. The best performance of this model is achieved with six neurons at the hidden layer, giving r² = 0.98 (see Table 4 for the remaining indices). The model's regression analysis is depicted in Figure 7(a), where a slope of 0.95 is found. Graphically, it can be seen that the ETo values are scattered around the fitted line. The residuals vary from -2.81 to 1.67, which is shown in Table 5 and in Figure 7(b). The numerical values reveal that the Ann3 model outperforms the Ann1 and Ann2 models and gives satisfactory performance.
Five input parameters (T_low, T_high, R_sr, u_ws, and R_hum) are applied to the input layer of the Ann4 model. The best performance of this model is achieved with nine neurons at the hidden layer, giving MAE = 0.08, MSE = 0.01, RMSE = 0.10, r = 0.99, and r² = 0.99. The model's regression analysis is depicted in Figure 8(a), where a slope of 0.99 is found. Graphically, it can be seen that the ETo values lie almost on the fitted line. A very small variation in the residuals, from -0.29 to 0.59, is observed, which is shown in Table 5 and in Figure 8(b). The numerical values indicate that the Ann4 model shows remarkable performance compared to the Ann1, Ann2, and Ann3 models.
The residuals of the four models are compared in Figure 9, which shows that the vertical distances (residuals) of the predicted ETo values from the fitted line are much smaller for the Ann4 model than for the Ann1, Ann2, and Ann3 models. The residuals decrease steadily from the Ann1 model to the Ann4 model. The performance of the models on the statistical indicators is shown in Figure 10.
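The slope and residual range quoted for each model can be obtained from a least-squares line through the observed and predicted values, as in the sketch below. This is an illustrative sketch with made-up numbers: the function name is hypothetical, and the residual is taken as the vertical distance of each predicted value from the fitted line, which is an assumption about how the residual plots were produced.

```python
import numpy as np

def fit_line_and_residuals(observed, predicted):
    """Least-squares line predicted ~ slope * observed + intercept, plus residuals from it."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(observed, predicted, deg=1)
    residuals = predicted - (slope * observed + intercept)
    return slope, intercept, residuals

# Made-up observed and predicted ETo values (mm per day), not results from the study.
observed = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
predicted = np.array([2.3, 3.8, 6.1, 7.6, 10.2])

slope, intercept, residuals = fit_line_and_residuals(observed, predicted)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"residual range = [{residuals.min():.2f}, {residuals.max():.2f}]")
```

A slope close to 1 together with a narrow residual range indicates that predictions track the observed values closely across the whole range of ETo, which is the pattern reported for the Ann4 model.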

Conclusion and future work
Proper management of water is essential for the economic growth of a nation. Estimation of evapotranspiration can play a valuable role in irrigation scheduling and planning, water saving, and enhancement of crop production. Multilayer perceptron regression is a well-known tool for modeling complex nonlinear problems and can be a suitable alternative for estimating evapotranspiration accurately when meteorological data are limited. The outcome of this study shows that Ann3 (with high and low temperature, solar radiation, and wind speed as input variables and six optimal neurons at the hidden layer) and Ann4 (with high and low temperature, solar radiation, wind speed, and humidity as input variables and nine optimal neurons at the hidden layer) outperform Ann1 (with only temperature as input variable) and Ann2 (with temperature and solar radiation as input variables). The finding of this study is that Ann3, with an r² of 0.98, and Ann4, with an r² of 0.99, can assist farmers and engineers in irrigation scheduling, crop yield simulation, and drainage studies. In further work, these models will be applied to the climatic data of major Indian cities located in different climatic zones (such as alpine, humid subtropical, tropical wet-dry, semiarid, and arid) to estimate the crop water requirements of wheat and paddy crops. In addition, further hyperparameter tuning of the multilayer perceptron will be explored using different optimization approaches.