A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods

Bibliographic Details
Published in: The Science of the total environment, Vol. 505, pp. 680-693
Main Authors: Torija, Antonio J.; Ruiz, Diego P.
Format: Journal Article
Language: English
Published: Netherlands, Elsevier B.V., 01.02.2015
ISSN: 0048-9697, 1879-1026
DOI: 10.1016/j.scitotenv.2014.08.060

Summary: The prediction of environmental noise in urban environments requires the solution of a complex, non-linear problem, since the many variables involved in characterizing and modelling environmental noise and environmental-noise magnitudes are related in complex ways. Moreover, accounting for the great spatial heterogeneity characteristic of urban environments seems essential for accurate environmental-noise prediction in cities. This paper addresses the problem by proposing a procedure based on feature-selection techniques and machine-learning regression methods and applying it to this environmental problem. Three machine-learning regression methods, which are considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq): (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, the high number of input variables involved in environmental-noise modelling and estimation in urban environments makes LAeq prediction models quite complex and costly, in terms of time and resources, to apply to real situations, so three different techniques are used for feature selection or data reduction. Two are feature-selection techniques, (i) correlation-based feature-subset selection (CFS) and (ii) wrapper for feature-subset selection (WFS), and the third, principal-component analysis (PCA), is a data-reduction technique. The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. Using WFS as the feature-selection technique with SMO or GPR as the regression algorithm provides the best LAeq estimation (R² = 0.94 and mean absolute error (MAE) = 1.14–1.16 dB(A)).
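The two accuracy figures quoted in the summary can be illustrated with a minimal sketch. It assumes that R² denotes the coefficient of determination (the summary does not define it) and uses invented LAeq values; the function and variable names are hypothetical and do not come from the paper.

```python
from math import fsum


def mean_absolute_error(measured, predicted):
    """Average absolute difference, in the unit of the inputs (here dB(A))."""
    return fsum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed meaning of R²)."""
    mean = fsum(measured) / len(measured)
    ss_res = fsum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = fsum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical LAeq values in dB(A), invented purely for illustration.
measured = [62.0, 70.0, 58.0, 66.0]
predicted = [63.0, 69.0, 59.0, 65.0]

mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
print(f"MAE = {mae:.2f} dB(A), R2 = {r2:.2f}")
```

MAE is expressed in the same unit as the predicted quantity, which is why the summary reports it in dB(A), while R² is dimensionless.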
• Machine-learning regression methods are implemented for LAeq prediction.
• Non-linear solvers outperform linear solvers in estimating urban environmental noise.
• SMO and GPR algorithms achieve the best estimation of LAeq.
• The CFS technique allows the greatest reduction in data-collection cost.
• Input variables chosen by the WFS technique offer the best results in estimating LAeq.
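As a rough illustration of how correlation-based feature-subset selection can cut the number of inputs, the sketch below scores a subset of k features with the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-target correlation and r_ff the mean feature-feature correlation. Several things here are assumptions rather than details from the record: the use of absolute Pearson correlation, the greedy forward search, the feature names, and the toy data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def cfs_merit(subset, features, target):
    """CFS merit: rewards correlation with the target, penalises redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_selection(features, target):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best_merit = [], 0.0
    remaining = list(features)
    while remaining:
        merit, name = max(
            (cfs_merit(selected + [f], features, target), f) for f in remaining
        )
        if merit <= best_merit:
            break
        selected.append(name)
        remaining.remove(name)
        best_merit = merit
    return selected, best_merit


# Hypothetical standardised inputs; names and values are invented for illustration.
features = {
    "traffic_flow": [1, 1, -1, -1],
    "street_width": [1, -1, 1, -1],
    "heavy_vehicle_flow": [2, 0, -2, 0],  # partly redundant with traffic_flow
}
target = [3, 1, -1, -3]  # hypothetical standardised LAeq

selected, merit = cfs_forward_selection(features, target)
print(selected, round(merit, 4))
```

In this toy data the partly redundant input is left out even though it correlates with the target more strongly than one of the selected inputs, which is the mechanism by which a filter of this kind could reduce data-collection cost.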