dc.contributor.author: Praus, Petr
dc.contributor.author: Praks, Pavel
dc.date.accessioned: 2007-05-09T13:29:18Z
dc.date.available: 2007-05-09T13:29:18Z
dc.date.issued: 2007
dc.identifier.citation: Journal of Hydroinformatics. 2007, vol. 9, no. 2, p. 135–143. [en]
dc.identifier.issn: 1464-7141
dc.identifier.uri: http://hdl.handle.net/10084/60005
dc.language.iso: en [en]
dc.publisher: IWA Publishing [en]
dc.relation.ispartofseries: Journal of Hydroinformatics [en]
dc.relation.uri: http://dx.doi.org/10.2166/hydro.2007.003 [en]
dc.relation.uri: http://www.iwaponline.com/jh/009/jh0090135.htm
dc.subject: hydrochemistry [en]
dc.subject: information retrieval [en]
dc.subject: latent semantic indexing [en]
dc.subject: principal component analysis [en]
dc.subject: similarity [en]
dc.title: Information retrieval in hydrochemical data using the latent semantic indexing approach [en]
dc.type: article [en]
dc.identifier.location: Not in the ÚK collection [en]
dc.description.abstract-en: The latent semantic indexing (LSI) method was applied to retrieve similar samples (samples with a similar composition) from a dataset of groundwater samples. The LSI procedure consisted of two steps: (i) reduction of the data dimensionality by principal component analysis (PCA) and (ii) calculation of the similarity between selected samples (queries) and the other samples. Similarity was expressed as the cosine similarity and as the Euclidean and Manhattan distances. Five queries were chosen to represent different sampling localities. The original data space of 14 variables measured in 95 groundwater samples was reduced to the three-dimensional space of the three largest principal components, which explained nearly 80% of the total variance. The five samples closest to each query were evaluated. The LSI outputs were compared with retrievals in the orthogonal system of all variables transformed by PCA and in the system of standardized original variables. Most of these retrievals did not agree with the LSI ones, most likely because both systems contained interfering data noise that had not been removed beforehand by the dimensionality reduction. The LSI approach, which is based on noise filtration, was therefore considered a promising strategy for information retrieval in real hydrochemical data. [en]
dc.identifier.doi: 10.2166/hydro.2007.003 (incorrect)
dc.identifier.wos: 000245349700005
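
The two-step procedure summarized in the abstract (PCA reduction, then similarity ranking against a query sample) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lsi_retrieve`, the standardization of variables before PCA, and the use of an SVD to obtain the principal components are all assumptions, and the input matrix is synthetic random data with the same shape (95 samples, 14 variables) as the dataset described, not the published measurements.

```python
import numpy as np


def lsi_retrieve(X, query_idx, k=3, top=5, metric="cosine"):
    """Rank samples by closeness to X[query_idx] in a k-dimensional PCA space.

    Hypothetical helper: the name, defaults, and preprocessing are assumptions.
    Returns (indices of the `top` closest samples, fraction of variance kept).
    """
    # Step (i): standardize each variable, then reduce dimensionality by PCA.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:k].T                      # sample coordinates on k components
    explained = (s[:k] ** 2).sum() / (s ** 2).sum()

    # Step (ii): compare the query with every sample in the reduced space.
    q = scores[query_idx]
    if metric == "cosine":
        sim = scores @ q / (np.linalg.norm(scores, axis=1) * np.linalg.norm(q))
        order = np.argsort(-sim)               # larger similarity = closer
    elif metric == "euclidean":
        order = np.argsort(np.linalg.norm(scores - q, axis=1))
    elif metric == "manhattan":
        order = np.argsort(np.abs(scores - q).sum(axis=1))
    else:
        raise ValueError(f"unknown metric: {metric}")

    # Drop the query itself and keep the closest `top` samples.
    order = order[order != query_idx][:top]
    return order, explained


# Synthetic placeholder data only (NOT the groundwater dataset from the paper).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(95, 14))
for m in ("cosine", "euclidean", "manhattan"):
    hits, kept = lsi_retrieve(X_demo, query_idx=0, metric=m)
    print(m, hits, round(float(kept), 3))
```

Setting `k` to the number of variables would correspond to retrieval in the full PCA-transformed system that the abstract uses as a comparison; on uncorrelated random data like the placeholder above, three components retain far less variance than the roughly 80% reported for the real samples.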