Funding: Swiss National Science Foundation
Duration: January 1999 - December 2000
Principal Investigator:
Michel Maignan,
Professor of Geostatistics at Lausanne University
Ph.D. student: Nicolas Gilardi
Visitor: Professor Mikhail Kanevski, head of laboratory IBRAE,
Nuclear Safety Institute in Moscow
The goal of this research is to exploit and adapt methodologies
borrowed from artificial intelligence, in particular artificial
neural networks (ANNs), for the analysis of environmental
spatial data. ANNs are universal function approximators, and we
will demonstrate that understanding their statistical properties and
using their abilities adequately can be an extremely efficient way to
tackle difficult geostatistical problems in environmental spatial
data analysis.
This work addresses a series of basic research items in spatial data
analysis that have not been addressed by geostatistics specialists,
even according to their latest announcements at Geostat’96 and
IAMG’97:
- highly non-stationary spatial processes,
- cartography of distribution functions, as opposed to cartography
of the mean value,
- user and data-driven parameterization for the discrimination
between a stochastic trend and auto-correlated residuals,
- cartography of stochastic deviations related to
advection-diffusion models.
Final solutions proposed for the resolution of geostatistical problems
will mostly be hybrids involving ANNs together with classical
approaches of geostatistics such as kriging estimations and
simulations (for estimating the residuals of the ANN
predictions).
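The hybrid idea above (a trend model followed by kriging of its
residuals) can be sketched as follows. This is only an illustration
under stated assumptions, not the project's method: a least-squares
quadratic surface stands in for the ANN trend, the residuals are
interpolated by simple kriging with an exponential covariance whose
sill and range are picked arbitrarily, the data are synthetic, and all
function names are hypothetical.

```python
import numpy as np

def quadratic_features(xy):
    """Design matrix [1, x, y, x^2, x*y, y^2] for the stand-in trend model."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def exp_covariance(a, b, sill=1.0, corr_range=0.3):
    """Exponential covariance between two sets of 2-D locations (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / corr_range)

def fit_hybrid(xy, z, sill=1.0, corr_range=0.3):
    # Step 1: fit the trend (an ANN in the project; least squares here).
    coef, *_ = np.linalg.lstsq(quadratic_features(xy), z, rcond=None)
    residuals = z - quadratic_features(xy) @ coef
    # Step 2: solve K @ alpha = residuals once (simple kriging, zero mean).
    K = exp_covariance(xy, xy, sill, corr_range)
    alpha = np.linalg.solve(K, residuals)
    return coef, alpha

def predict_hybrid(xy_new, xy, coef, alpha, sill=1.0, corr_range=0.3):
    # Final estimate = trend prediction + kriged residual.
    trend = quadratic_features(xy_new) @ coef
    k = exp_covariance(xy_new, xy, sill, corr_range)
    return trend + k @ alpha

# Synthetic observations (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
xy = rng.uniform(0, 1, size=(40, 2))
z = np.sin(3 * xy[:, 0]) + xy[:, 1] ** 2 + 0.1 * rng.standard_normal(40)

coef, alpha = fit_hybrid(xy, z)
z_hat = predict_hybrid(xy, xy, coef, alpha)
```

Because no nugget term is included, the kriged residuals interpolate
exactly, so the hybrid estimate reproduces the observations at the
sampled locations; in practice the covariance model would be fitted to
the residual variogram rather than fixed in advance.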
Some benefits of ANNs in geostatistics have already been observed in
a first phase of research carried out by the principal applicant
and Professor M. Kanevski (see [1,2,3,4]):
- An ANN performs a non-linear mapping and is thus able to capture
some non-linear trends in the data. It has been shown to be very
efficient at capturing large-scale non-linear trends [3], as well
as medium-scale variations [4,5,6].
- A neuromimetic approach is adaptive and data-driven, while the
classical approaches in geostatistics are ad hoc for each type of
problem and require a lot of expertise and hand tuning [3].
- ANNs are robust against noisy data and work on global and local
scales [6,7].
Beyond these positive points, which have already been observed and
will be investigated in depth, several other features of ANNs seem
very promising for specific aspects of geostatistics and will be
explored in this project.
- A typical problem in geostatistics is the estimation of a function
f(x,y) given some observations of f at different locations (x,y).
In practice, however, computing an estimate f’ of f is of little
use unless it is accompanied by an estimate of the variance
σ’(x,y) of the error between f and f’. The latter information is
crucial for a decision-maker. Lately, some authors have proposed new
neural networks designed to predict simultaneously the output and an
associated confidence. In this project, we plan to evaluate
this type of approach for spatial data analysis. (Personal
communication with Dr hab S. Canu for spatial estimation of
distributions).
- Ideally, instead of predicting for each location (x,y) a single
value f’(x,y), or this value plus a variance σ’(x,y), it would be
interesting to estimate the probability density of the value
f(x,y). More simply, the decision-maker will be highly interested in
knowing the probability Pr[f(x,y) ≥ t] for a fixed critical
threshold t and for any location (x,y). We will demonstrate
that, again, an adequate use of statistical learning tools can
efficiently address this problem. This is related to the classical
problem of cartography for time-repeated measurements at the same
location.
- An important characteristic of environmental data is their
multivariate nature. Some variables are cheap to measure and
available for a large number of locations (e.g. meteorological
measurements), while others are expensive and known only at a
sparse set of places (e.g. laboratory chemical analyses). The
statistical correlation and spatial cross-correlation between these
variables can be taken into account in the estimation
process [4].
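As an illustration of the threshold question raised in the list above,
the sketch below turns a predicted value and its error variance into a
probability of exceeding a critical threshold t. It assumes that the
prediction error at each location is Gaussian, which the project does
not state; the function name and the numerical values are hypothetical.

```python
import math

def exceedance_probability(mean, std, threshold):
    """Pr[f >= threshold] when f ~ Normal(mean, std**2) (Gaussian assumption)."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (threshold - mean) / std
    # 1 - Phi(z), written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical (predicted value, standard deviation) pairs at three locations;
# the standard deviation is the square root of the estimated error variance.
predictions = [(10.0, 2.0), (14.0, 2.0), (18.0, 2.0)]
threshold = 14.0

probs = [exceedance_probability(m, s, threshold) for m, s in predictions]
```

Mapping such probabilities over all locations gives a risk map for
the chosen threshold. If the Gaussian assumption is not acceptable,
the full predictive distribution would have to be estimated instead.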
The learning tools investigated to achieve these goals will include
multilayer perceptrons (MLPs) but also the latest and most efficient
developments based on or generalizing neural networks, such as:
mixtures of experts, support vector machines
and Gaussian processes.
The verification and validation of the new developments in this
research project will be carried out using unique environmental data
sets dealing with post-Chernobyl pollution, chemical analysis
of water and sediments of Lake Geneva, and heavy-metal soil
contamination. The methods developed within this project will be
extensively compared with classical geostatistical methods (such as
kriging).
[1] M. Kanevski. Neural networks and geostatistical spatial
interpolations, IBRAE internal report, 1994.
[2] M. Kanevski and M. Maignan. Neural network kriging estimations,
Japanese Journal of Geoinformatics, 1995.
[3] M. Kanevski, V. Demyanov, and M. Maignan. Mapping of soil
contamination by using ANNs and multivariate geostatistics,
Proceedings of the International Conference on Neural Networks,
ICANN'97, pp. 1125-1131, 1997.
[4] M. de Bollivier, G. Dubois, M. Maignan, and
M. Kanevski. Multilayer perceptron with local constraints as an
emerging method in spatial data analysis, New computing techniques in
physics research V, pp. 226-229, 1997.
[5] M. Chentouf, C. Jutten, M. Maignan, and M. Kanevski. Incremental
neural networks for function approximation, New computing techniques
in physics research V, pp. 268-270, 1997.
[6] S. Shibli, P. Wong, and M. Maignan. Radial basis function for
spatial estimation, New computing techniques in physics research V,
1997.
[7] M. Kanevski, M. Maignan, V. Demyanov, and
M.-F. Maignan. Environmental decision-oriented mapping with algorithms
imitating nature, IAMG 97 Int. Assoc. Mathematical Geology,
pp. 520-526, 1997.
[8] M. Kanevski, M. Maignan, V. Demyanov, and M.-F. Maignan. How
neural network 2-D interpolations can improve spatial data analysis:
neural network residual kriging, IAMG 97 Int. Assoc. Mathematical
Geology, pp. 549-554, 1997.