Dr. Keogh is a Distinguished Professor of Computer Science at the University of California. He is the inventor of many of the most commonly used time series data mining primitives, including PAA, LBkeogh, UCR-Suite, the Matrix Profile, SAX, Time Series Motifs and Time Series Shapelets. The last six ideas have gone on to garner at least a thousand citations each.
With 32 papers, he is the most prolific author in the Data Mining and Knowledge Discovery journal, and he is among the ten most prolific authors in ACM SIGKDD, IEEE ICDM and SIAM SDM (with 32, 47 and 27 papers, respectively).
He has won numerous awards, including the Bell Labs Bronze Prize 2021, the ACM SIGKDD 2022 Test of Time Paper Award, the 2021 IEEE ICDM Research Contributions Award, two Google Faculty Awards, and best paper awards at SIGKDD (twice), SIGMOD (once), ICDM (three times) and SDM (once). He is the creator of the UCR Time Series Classification Archive, which has been used in more than 5,000 research papers.
Talk title: Finding and Exploiting Repeated Structures in Medical Time Series
Abstract:
It is well understood that the main key to understanding discrete strings such as DNA is to reason about conserved structures, i.e. DNA motifs, both within and between chromosomes.
In this talk I will argue that conserved structures in real-valued time series can be just as useful and actionable. A motif in medical telemetry must have a cause, and in many cases those causes have a semantic interpretation, such as pulsus paradoxus in ECGs, eyeblinks in EOGs, K-complexes in EEGs, etc. Once discovered, these motifs can be exploited by downstream algorithms such as classification, clustering, rule-discovery, segmentation, summarization, compression and anomaly detection.
I will further show that recent progress in time series data mining means that the discovery of time series motifs in large medical datasets is now practical with simple tools, and that time series motifs are ready to be exploited by researchers and medical professionals. I will illustrate my talk with case studies I conducted with leading cardiologists on real datasets. Finally, I will conclude my talk by providing resources, such as simple-to-use code and datasets, that will allow the audience to start searching their own datasets for time series motifs.