I. Introduction
Rhythmic Auditory Stimulation (RAS) is a widely used motor rehabilitation technique within Neurologic Music Therapy (NMT), particularly for gait training in patients with stroke- or Parkinson's disease-related gait disorders. In clinical practice, the criteria for music selection in RAS are relatively intuitive: music with a regular, stable duple-meter feel, whether in simple duple meters (e.g., 2/4, 4/4) or certain compound meters (e.g., 6/8), is generally preferred for facilitating gait entrainment.
The effectiveness of RAS has been supported by numerous empirical studies over the past two decades. In practical therapy settings, however, there is often a noticeable shortage of clinically tailored musical material. As a result, selecting appropriate pieces from existing music collections becomes a more realistic solution for therapists. This situation creates a need for efficient methods to screen large music databases according to the rhythm-related criteria of RAS.
This essay focuses on a rapid screening approach for categorizing meter feel—specifically duple versus triple—based on symbolic MIDI data. The screening method discussed here constitutes an early filtering module within a larger system designed for RAS music selection. Rather than presenting the system as a whole, the primary aim of this study is to examine this screening layer in depth, with particular attention to its conceptual foundations and algorithmic design.
The system operates in the symbolic domain rather than on audio signals for several practical reasons. First, tempo adjustment is a fundamental requirement in gait training, and MIDI allows precise and lossless tempo modification without affecting sound quality. Second, given the need to process large volumes of music data, symbolic-domain analysis offers a computationally efficient solution. Although audio-based music information retrieval methods can provide higher perceptual fidelity in certain contexts, the purpose of this layer is limited to fast and coarse screening rather than detailed rhythmic interpretation.
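The lossless character of symbolic tempo modification can be sketched as follows. In Standard MIDI Files, tempo is stored as a single value (microseconds per quarter note) while note events remain fixed in ticks, so retargeting the playback BPM rewrites only that one value. The helper names and the 10% cadence example below are illustrative, not part of the system described here.

```python
# Sketch of lossless MIDI tempo adjustment: note data (in ticks) is never
# touched; only the tempo meta-event value changes.

def midi_tempo_for_bpm(bpm: float) -> int:
    """Target BPM -> MIDI tempo value (microseconds per quarter note)."""
    return round(60_000_000 / bpm)

def retarget_tempo(current_tempo_us: int, speed_scale: float) -> int:
    """Scale playback speed; e.g., speed_scale=0.9 slows cadence by 10%."""
    return round(current_tempo_us / speed_scale)

# 120 BPM corresponds to 500,000 us per quarter note.
print(midi_tempo_for_bpm(120))        # 500000
# Slowing a 120 BPM piece by 10% lengthens the quarter note accordingly.
print(retarget_tempo(500_000, 0.9))   # 555556
```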
In essence, the goal of the proposed screening method is not to perform full meter inference or time-signature recognition, but to identify music with a high likelihood of exhibiting a duple-meter feel suitable for RAS. The following sections therefore examine the approach from conceptual assumptions to algorithmic implementation, providing a focused and critical study of this symbolic screening layer.
II. Conceptual Foundations
Before introducing the algorithmic details, it is necessary to clarify several core concepts related to musical time and rhythm perception that underlie the proposed screening approach.
In music perception, tactus, often referred to as the pulse, represents the fundamental temporal unit to which listeners naturally synchronize, such as tapping a foot or nodding the head. Tempo describes the rate at which this tactus is perceived, typically expressed as beats per minute (BPM). When successive tactus beats are organized into recurring patterns of strong and weak accents, a higher-level periodic structure emerges, commonly referred to as the metrical cycle or meter. In simplified terms, tempo specifies how fast the beat proceeds, whereas meter describes how beats are grouped over time.
The hierarchical temporal structure of music is what gives rise to the periodic nature of rhythm and enables listeners to infer meter. At the same time, this hierarchy, spanning from note-level onsets to beat- or cycle-level groupings, also makes automatic meter estimation a challenging task, as multiple periodicities may coexist and interact within a single piece of music.
In the context of RAS, meter is not treated as a purely music-theoretical construct but as a functional property related to motor entrainment. Although triple-meter music can be useful in specific scenarios (e.g., when walking patterns involve assistive devices such as canes), duple meter-dominant music is generally preferred for standard gait training. Consequently, the primary objective of the screening layer is not to identify all possible meter types, but to perform a binary distinction between duple- and triple-dominant periodic grouping in music.
As the representational basis for this analysis, MIDI data offers several benefits. It encodes music as a sequence of discrete symbolic events with explicit on/off times, allowing for highly accurate temporal analysis. At the same time, MIDI lacks timbral, spectral, and fine-grained expressive information present in audio recordings. In clinical applications such as RAS, where precise temporal control and flexible tempo adjustment are prioritized, this trade-off is acceptable and even advantageous. For these reasons, MIDI provides an appropriate representation for the screening approach.
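The temporal precision MIDI affords can be seen in how event times are stored: note onsets are tick offsets that convert exactly to seconds given the file's resolution (ticks per quarter note) and the current tempo. The default values below (500,000 us per quarter note, i.e., 120 BPM, and 480 PPQ) are common illustrative settings, not parameters of the system described here.

```python
# Sketch of exact tick-to-seconds conversion for MIDI events.

def ticks_to_seconds(ticks: int,
                     tempo_us_per_qn: int = 500_000,  # 120 BPM
                     ppq: int = 480) -> float:
    """Convert a MIDI tick offset to seconds for a given tempo and resolution."""
    return ticks * tempo_us_per_qn / (ppq * 1_000_000)

# A note-on 480 ticks into the track (one quarter note at 120 BPM)
# occurs exactly 0.5 s after the start.
print(ticks_to_seconds(480))  # 0.5
```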