Semi-supervised time series classification on positive and unlabeled problems using cross-recurrence quantification analysis


Bibliographic Details
Published in: Pattern Recognition, Vol. 80, pp. 53-63
Main Authors: de Carvalho Pagliosa, Lucas; de Mello, Rodrigo Fernandes
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.08.2018
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2018.02.030

Summary:
• We show that time-domain similarity measurements lead to inconsistent classification due to noise and local differences;
• We use CRQA to compare time series recurrences in Positive and Unlabeled scenarios;
• Our approach achieved better classification performance when classifying time series from natural phenomena.

When dealing with semi-supervised scenarios, the Positive and Unlabeled (PU) problem is a special case in which a few labeled examples from a single class of interest are provided, and unseen instances are classified according to their similarity to the known class. For time series, most current studies address this problem with a self-training approach based on the 1-Nearest Neighbor algorithm. To find the most similar instance, they compare features along the time domain using the Euclidean Distance and the Dynamic Time Warping-Delta. Although time-domain measurements permit the analysis of local series shapes, they disregard temporal recurrences commonly found in natural phenomena (e.g., population growth, climate studies) and are more sensitive to local noise and fluctuations, leading to poor classification performance, as confirmed in this paper. This drawback motivated us to propose the Maximum Diagonal Line of the Cross-Recurrence Quantification Analysis (MDL-CRQA), applied to the time series phase space, as a similarity measurement. The phase space is obtained by applying Takens' embedding theorem to the series, unfolding temporal relationships and dependencies among data observations. As a consequence, by comparing phase spaces rather than the series themselves, we can assess how their trajectories evolve over time, including their periodicities and temporal cycles, while also reducing the influence of noise. Experimental results confirm that MDL-CRQA improves classification results for PU time series when compared against the most commonly used time-domain similarity measurements.
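
The idea summarized above (delay-embed both series, mark where their phase-space states fall close together, and take the longest diagonal line of the resulting cross-recurrence matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the embedding dimension `m`, the delay `tau`, and the radius `eps` are all assumptions chosen for illustration, and the summary does not state how the paper selects these parameters or how the line length is turned into a 1-Nearest Neighbor similarity.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delay embedding: row i is (x[i], x[i+tau], ..., x[i+(m-1)*tau]).
    m and tau are hypothetical, user-chosen parameters."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    if n <= 0:
        raise ValueError("series too short for the chosen m and tau")
    return np.stack([x[k * tau : k * tau + n] for k in range(m)], axis=1)

def cross_recurrence(a, b, eps):
    """Boolean matrix R[i, j], True when states a[i] and b[j] lie within eps."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d <= eps

def max_diagonal_line(r):
    """Length of the longest run of True values along any diagonal of r."""
    best = 0
    rows, cols = r.shape
    for k in range(-(rows - 1), cols):
        run = 0
        for v in np.diagonal(r, offset=k):
            run = run + 1 if v else 0
            best = max(best, run)
    return best

def mdl_crqa(x, y, m=2, tau=1, eps=0.1):
    """Longest diagonal line of the cross-recurrence matrix of x and y.
    Larger values mean the two trajectories stay close for longer."""
    ex, ey = delay_embed(x, m, tau), delay_embed(y, m, tau)
    return max_diagonal_line(cross_recurrence(ex, ey, eps))

if __name__ == "__main__":
    t = np.linspace(0, 8 * np.pi, 200)
    x = np.sin(t)
    y = np.sin(t + 0.3)  # same dynamics, shifted in phase
    print(mdl_crqa(x, x, m=2, tau=5), mdl_crqa(x, y, m=2, tau=5))
```

Comparing a series with itself yields a fully recurrent main diagonal, so the score equals the number of embedded states. In a PU self-training loop, one plausible use (again an assumption) is to treat the unlabeled series with the largest score against the labeled positives as the nearest neighbor.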