ABSTRACT
This paper presents a deep-learning-based method that generates body motion for string instrument performance from raw audio. In contrast to prior methods that aim to predict joint positions directly from audio, we first estimate the information that dictates the bowing dynamics, such as the bow direction and the played string. The final body motion is then determined from this information following a conversion rule. By adopting the bowing information as the target domain, not only is the mapping more feasible to learn, but the produced results also have bowing dynamics consistent with the given audio. Extensive experiments confirm that our results are superior to those of existing methods.
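The second stage of the pipeline above, converting estimated bowing information into motion by a fixed rule, can be illustrated with a minimal sketch. The function below, its parameters (`bow_speed`, `string_tilt`), and the specific conversion rule are all hypothetical assumptions for illustration, not the paper's actual rule: per-frame bow direction is integrated into a displacement along the stroke axis, and the played string selects an arm tilt angle.

```python
import numpy as np

def bowing_to_motion(bow_dir, string_idx, frame_rate=30.0,
                     bow_speed=0.5, string_tilt=(-0.3, -0.1, 0.1, 0.3)):
    """Convert per-frame bowing information to a simple bow-hand trajectory.

    bow_dir:    array of +1 (down-bow) / -1 (up-bow) per frame.
    string_idx: array of played-string indices (0..3) per frame.
    Returns an (N, 2) array: displacement along the stroke axis, and a
    tilt angle selected by the played string.
    """
    bow_dir = np.asarray(bow_dir, dtype=float)
    string_idx = np.asarray(string_idx, dtype=int)
    # Integrate bow direction over time to get displacement along the bow.
    disp = np.cumsum(bow_dir) * bow_speed / frame_rate
    # Keep the contact point on the hair: clip to the bow's usable length.
    disp = np.clip(disp, 0.0, 0.6)
    tilt = np.asarray(string_tilt)[string_idx]
    return np.stack([disp, tilt], axis=1)

# Example: four down-bow frames on string 2, then four up-bow frames on string 1.
motion = bowing_to_motion([1, 1, 1, 1, -1, -1, -1, -1],
                          [2, 2, 2, 2, 1, 1, 1, 1])
print(motion.shape)  # (8, 2)
```

Because the rule is deterministic, any network output expressed as bow direction and played string yields motion whose stroke reversals line up with the audio by construction, which is the consistency property the abstract claims.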
Bowing-Net: Motion Generation for String Instruments Based on Bowing Information