Publication:
Optimal distance estimation between compressed data series

Placeholder

Program

KU Authors

Co-Authors

Freris, Nikolaos M.
Vlachos, Michail

Advisor

Publication Date

Language

English

Journal Title

Journal ISSN

Volume Title

Abstract

Most real-world data contain repeated or periodic patterns. This suggests that they can be effectively represented and compressed using only a few coefficients of an appropriate complete orthogonal basis (e.g., Fourier, Wavelets, Karhunen-Loeve expansion or Principal Components). In the face of ever increasing data repositories and given that most mining operations are distance-based, it is vital to perform accurate distance estimation directly on the compressed data. However, distance estimation when the data are represented using different sets of coefficients is still a largely unexplored area. This work studies the optimization problems related to obtaining the tightest lower/upper bound on the distance based on the available information. In particular, we consider the problem where a distinct set of coefficients is maintained for each sequence, and the L2-norm of the compression error is recorded. We establish the properties of optimal solutions, and leverage the theoretical analysis to develop a fast algorithm to obtain an exact solution to the problem. The suggested solution provides the tightest provable estimation of the L2-norm or the correlation, and executes at least two order of magnitudes faster than a numerical solution based on convex optimization. The contributions of this work extend beyond the purview of periodic data, as our methods are applicable to any sequential or high-dimensional data as well as to any orthogonal data transformation used for the underlying data compression scheme.

Source:

Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012

Publisher:

Society for Industrial and Applied Mathematics Publications

Keywords:

Subject

Engineering, Electrical and electronics engineering

Citation

Endorsement

Review

Supplemented By

Referenced By

Copyrights Note

0

Views

0

Downloads

View PlumX Details