
DONKEY: A Flexible and Accurate Algorithm for Clustering
- Jakub Kára, Kyle Acheson, and Adam Kirrander
- Publication
- May 2, 2025
Abstract:
We propose an accurate clustering algorithm suitable for the varied and multidimensional data sets that correspond to temporal snapshots from on-the-fly nonadiabatic trajectory-based simulations of photoexcited dynamics. The algorithm approximates the underlying probability density function using variable kernel density estimation, with local maxima corresponding to cluster centers. Each data point is then assigned to one of the maxima by employing a maximization procedure. Finally, clusters artificially separated by minor fluctuations in the probability density are merged. The algorithm does not require parameter tuning, which ensures flexibility and reduces the risk of bias. It is tested on several synthetic data sets, where it consistently outperforms conventional clustering algorithms. As a final example, the algorithm is applied to the excited-state dynamics of the norbornadiene ⇌ quadricyclane (C₇H₈) molecular photoswitch, demonstrating how distinct reaction pathways can be identified.
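The three stages named in the abstract (variable kernel density estimation, assignment of each point to a density maximum, and merging of spuriously split clusters) can be illustrated with a minimal sketch. This is not the authors' DONKEY implementation. The k-nearest-neighbour bandwidth rule, the hill climb over neighbouring data points (used here in place of whatever maximization procedure the paper employs), and the names `k` and `dip` are all assumptions made for illustration. In particular, the sketch exposes two tunable parameters, whereas the published algorithm is described as requiring none.

```python
import math

def knn_bandwidths(points, k):
    """Per-point bandwidth: distance to the k-th nearest neighbour (assumed rule)."""
    # Index 0 of the sorted distances is the point itself (distance 0).
    return [max(sorted(math.dist(p, q) for q in points)[k], 1e-12) for p in points]

def density(x, points, bandwidths):
    """Variable-bandwidth Gaussian kernel density estimate evaluated at x."""
    d = len(x)
    total = 0.0
    for p, h in zip(points, bandwidths):
        norm = (2.0 * math.pi) ** (d / 2.0) * h ** d
        total += math.exp(-0.5 * (math.dist(x, p) / h) ** 2) / norm
    return total / len(points)

def climb(i, points, dens, k):
    """Step to the densest of the k nearest neighbours until none is denser."""
    while True:
        order = sorted(range(len(points)),
                       key=lambda j: math.dist(points[i], points[j]))
        best = max(order[:k + 1], key=lambda j: dens[j])
        if dens[best] <= dens[i]:
            return i  # local maximum among the sampled points
        i = best

def valley(a, b, points, bandwidths, steps=20):
    """Lowest density on the straight segment between points a and b."""
    return min(
        density(tuple(ca + t / steps * (cb - ca) for ca, cb in zip(a, b)),
                points, bandwidths)
        for t in range(steps + 1)
    )

def cluster(points, k=5, dip=0.5):
    """Return one integer label per point.

    k and dip are hypothetical parameters of this sketch only.
    """
    h = knn_bandwidths(points, k)
    dens = [density(p, points, h) for p in points]
    modes = [climb(i, points, dens, k) for i in range(len(points))]

    # Merge maxima that are separated only by a shallow dip in the density.
    parent = {m: m for m in set(modes)}

    def find(m):
        while parent[m] != m:
            m = parent[m]
        return m

    uniq = sorted(parent)
    for a in uniq:
        for b in uniq:
            if a < b and valley(points[a], points[b], points, h) >= dip * min(dens[a], dens[b]):
                parent[find(a)] = find(b)

    roots = [find(m) for m in modes]
    relabel = {r: n for n, r in enumerate(sorted(set(roots)))}
    return [relabel[r] for r in roots]

if __name__ == "__main__":
    data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
    print(cluster(data, k=1))
```

On the toy one-dimensional input above, the first three points climb to the middle point of their group, the last three do the same, and the density between the two groups is far too low for them to be merged, so the two groups receive labels 0 and 1.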
Additional Resources
DOI:
Quick Ref:
J. Chem. Theory Comput. 2025, 21, 12, 5789–5802