Small Sample Kernel Density and Renyi Entropy Estimation

 

Douglas Lake, PhD

Cardiovascular Division and Statistics

University of Virginia

 

Quadratic entropy rate is an important characteristic of heart rate that is effectively measured using a statistic called SampEn (short for sample entropy). Quadratic entropy is related to the Friedman-Tukey (FT) index, which is widely used for projection pursuit and other applications involving measures of Gaussianity. The FT index for a random variable X with density f is simply R(f)=E[f(X)], that is, the integral of f squared. The basic calculation of SampEn involves estimating joint densities f of time series with a rather simple implementation of kernel density estimation using a uniform kernel. Quadratic entropy is part of a family called Renyi entropy that includes the traditional Shannon definition. However, Shannon entropy is much more difficult to analyze and understand, which justifies alternative approaches even though their theoretical properties are less optimal. A particularly compelling new application of quadratic entropy rate is the non-invasive detection of atrial fibrillation (AF) and other abnormal cardiac rhythms in records with as few as n=16 samples. Asymptotic results for the bandwidth and kernel that minimize the mean integrated square error (MISE) are well known, but this problem requires results for small samples and for the specific criterion of minimizing the mean square error (MSE) in estimating the FT index. Results to be presented include exact small-sample expressions for the MISE and MSE in the special case of Gaussian white noise and Gaussian kernels, which are compared with the asymptotic results. One goal of this presentation is to expose students to new research areas (of which there are many) that they may want to learn more about.
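
As a rough illustration of the quantities named above, the following Python sketch estimates the FT index R(f)=E[f(X)] from a small sample with a Gaussian kernel and converts it to quadratic entropy, -log R(f). It is not the speaker's method: the function names, the choice to exclude self-pairs (in the spirit of SampEn, which does not count self-matches), and the bandwidth h=0.5 are all assumptions made for the example. For data drawn from N(0, sigma^2) the standard Gaussian identities give R(f) = 1/(2 sigma sqrt(pi)), and the expected value of this particular estimator is 1/sqrt(2 pi (2 sigma^2 + h^2)) for every n >= 2, so the small-sample bias can be written down exactly.

```python
import math
import random


def gaussian_pdf(u, s):
    """Density of N(0, s^2) evaluated at u."""
    return math.exp(-0.5 * (u / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def ft_index_estimate(x, h):
    """Kernel estimate of R(f) = E[f(X)] with a Gaussian kernel of bandwidth h.

    Averages the kernel over all pairs i != j (self-pairs excluded);
    this pairing convention is an assumption of the sketch.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two samples")
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += gaussian_pdf(x[i] - x[j], h)
    # Each unordered pair stands for two ordered pairs.
    return 2.0 * total / (n * (n - 1))


def ft_index_gaussian(sigma):
    """Exact R(f) for a N(0, sigma^2) density."""
    return 1.0 / (2.0 * sigma * math.sqrt(math.pi))


def expected_estimate_gaussian(sigma, h):
    """Exact mean of ft_index_estimate for i.i.d. N(0, sigma^2) data."""
    return 1.0 / math.sqrt(2.0 * math.pi * (2.0 * sigma ** 2 + h ** 2))


def quadratic_entropy(r):
    """Quadratic (order-2 Renyi) entropy from an FT index value."""
    return -math.log(r)


if __name__ == "__main__":
    rng = random.Random(0)
    n, sigma, h = 16, 1.0, 0.5  # n=16 matches the record length in the abstract
    x = [rng.gauss(0.0, sigma) for _ in range(n)]
    r_hat = ft_index_estimate(x, h)
    print("estimated FT index:", r_hat)
    print("exact FT index:", ft_index_gaussian(sigma))
    print("exact mean of estimator:", expected_estimate_gaussian(sigma, h))
    print("estimated quadratic entropy:", quadratic_entropy(r_hat))
```

The gap between `expected_estimate_gaussian(sigma, h)` and `ft_index_gaussian(sigma)` is the bias introduced by smoothing; it vanishes as h goes to 0, while the variance of the estimate grows, which is the trade-off a small-sample MSE criterion has to balance.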