Apply Machine-Learning Model for Clustering Rowing Players

Rowing, whether performed by a single athlete or by a crew, consists of body movements that follow a regular rhythm with slight variations. How well the rhythms of different rowers alternate or match is important to rowing research and merits study. This study therefore analyzes rowing movements in three procedures: rowing cycle segmentation, feature extraction, and rowing cycle clustering. The rowing cycle segmentation procedure splits each player's video into single-cycle videos by analyzing the joint points detected by MediaPipe. The feature extraction procedure computes features for each rowing cycle from the amplitudes, angles, and angular speeds of 4 selected joint points. Finally, the rowing cycle clustering procedure analyzes all one-cycled videos using these features with different clustering and scoring methods. Three clustering methods, K-means, BIRCH, and Gaussian Mixture, are compared to find the most efficient one. A hybrid measurement combining the Silhouette score, the Calinski-Harabasz index, and the Davies-Bouldin index is proposed for finding the optimal number of clusters. Experimental results on videos of 15 players show that the K-means clustering algorithm with the proposed hybrid measurement performs better than the other two methods at finding matching groups of rowers.


INTRODUCTION
Recently, many studies have analyzed the performance of players in sports domains [1][2][3]. While many of these studies focus on motion analysis, most require the athletes to wear sensors to acquire posture data, and such data are easily contaminated by noise due to friction during motion [1]. The video-based contactless approach has therefore made it easier and more convenient for researchers to analyze the variety of postures in sports [2,3]. Many applications of video-based motion analysis have been developed in the past few years, such as human-computer interaction systems [2], human action understanding systems [3], and human pose estimation [4], all based on deep learning recognition models. It has been noted that the OpenPose system [4] can acquire the human skeletal joints of complex postures from a simple camera, without additional special hardware for recognizing human joints such as the Kinect device [2]. However, as most previous applications recognize only simple postures, the issue of how to analyze the complex postures of sports players has been neither addressed nor effectively analyzed.
Several previous works [4][5][6][7] have utilized the OpenPose model [4] for various applications. Qiao et al. [5] used a series of coordinate trajectories of joint points to draw a curve for evaluating Tai Chi movements. Tsai et al. [6] used the OpenPose model to estimate the depth distance between a person and the lens in a single image. Nakai et al. [7] proposed a prediction method for a basketball shooting system using human body keypoints detected by the OpenPose model. In addition, several applications based on deep learning have been proposed. Xiao et al. [8] proposed a simple baseline method that uses ResNet architectures [9] as the backbone network for human pose estimation and tracking. Zhang et al. [10] proposed a golf analysis system that combines a human detection subsystem based on the OpenPose model, human tracking, and an LSTM deep learning model to detect golf players' postures. Theagarajan et al. [11] proposed a CNN- and DCGAN-based system that automatically generates visual analytics and player statistics for soccer players. However, the issue of pairing athletes to work together to enhance the performance of a team sport has not been addressed.
This research studies the player matching problem in rowing. The MediaPipe model [12,13] is adopted for extracting human joint points because of its efficiency. The proposed method includes three procedures: rowing cycle segmentation, feature extraction, and rowing cycle clustering. Using the joint point information detected by MediaPipe, the rowing cycle segmentation procedure segments the one-minute rowing videos of all 15 players into collections of one-cycled rowing videos. The feature extraction procedure extracts 12 features from each one-cycled rowing video using the Discrete Fourier Transform [14]. Finally, the rowing cycle clustering procedure compares three clustering methods, K-means, BIRCH, and Gaussian Mixture [15,16], using a hybrid cluster number measurement based on the Silhouette score, Calinski-Harabasz index, and Davies-Bouldin index [16,17] to find the optimal group match among many rowing players.
The remainder of this paper is organized as follows. Section 2 gives a brief review of related works, including the MediaPipe system, the Discrete Fourier Transform, and clustering methods. Section 3 provides the details of the proposed method. Section 4 presents the experimental results, and Section 5 offers concluding remarks.

RELATED WORKS
2.1 MediaPipe pose detection system
Many technologies have been developed to monitor body movements without attaching sensors to the body while still acquiring comparable data. In this context, the study "Human deep squat detection method based on MediaPipe combined with Yolov5 network" uses MediaPipe, an AI framework introduced by Google that provides open-source libraries and tools for video-based human pose evaluation [12]. MediaPipe captures landmarks of the hands, face, and whole body, giving researchers key data points for measuring human behavior. The data extracted from MediaPipe include 33 landmark coordinates, which researchers can use to build their own systems [13]. In that study, data from MediaPipe helped speed up the monitoring and recording of individuals performing deep squat exercises.

Clustering Method
In the study "A Gaussian mixture clustering model for characterizing football players using the EA Sports' FIFA video game system," the attributes of top football players in the game are used as data. The method integrates as many as 40 attributes per player. Because no definitive outcomes are available for direct comparison, the researchers build a clustering model that categorizes football players into distinct groups based on their attributes.

Research Gaps
To address the research gap, this study builds its framework on the preliminary concepts reviewed above and extends them in several respects. In particular, physics principles are applied to derive additional features from measurements of the body joints captured by MediaPipe, which provides a broader basis for data analysis. The resulting features are then grouped by a clustering model. An essential goal of this approach is a decision support system for rowing sports management: by combining the augmented features with the clustering results, the system is intended to give sports managers and coaches insights for informed decision-making. Ultimately, this research aims to bridge the gap between the biomechanics of rowing and its practical application in sports management.

PROPOSED METHOD
3.1 Dataset
The dataset consists of 15 side-view videos of players on a rowing machine. Each video is recorded at a resolution of 1920×1080 and 60 frames per second, with a duration of around 70 seconds. Figure 2 shows an example frame captured from a video. Moreover, all videos are recorded from the same direction and position to minimize the variance between players.

Feature extraction
MediaPipe detects the important joint points in an input image. Our study uses the angles at four joint points, defined as right_shoulder, right_elbow, right_hip, and right_knee in Figure 1, and the neighboring joint points of each are needed to calculate these angles. In total, the locations of 6 joint points are required. The relationships between the angles and their corresponding joint points are listed in Table 1.
The locations of the selected joint points defined in Table 1 are extracted with MediaPipe and applied to (2) to calculate the angles at the four selected joint points.
Here, $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$ represent the positions on the $x$ and $y$ axes of joint points 1, 2, and 3, respectively. Therefore, 4 angles are calculated from each frame. Although the joint point locations are only roughly detected, the calculated angles are quite close to those we observed. Figure 2 shows the calculated angles for some rowing examples, in which the angles are displayed at the bottom right of the frame image. Moreover, the angles of each frame, calculated for the 4 joint points, are depicted in Figure 3. Figure 3 also shows that the angles of the four joint points change periodically. This property suggests that suitable thresholds on the joint angles can segment all frames into collections of one-cycled videos.
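As an illustration of the angle computation, the following minimal sketch derives the angle at a middle joint point from three 2D joint locations. The exact form of (2) is not reproduced in this text, so the dot-product formula used here is an assumption (a common choice), and the function name `joint_angle` is hypothetical.

```python
import math


def joint_angle(p1, p2, p3):
    """Return the angle in degrees at p2 formed by the segments p2->p1 and p2->p3.

    Each point is an (x, y) pair, e.g. image coordinates of a detected joint.
    Assumption: the angle is taken from the dot product of the two segments;
    the paper's equation (2) may use a different but equivalent form.
    """
    ax, ay = p1[0] - p2[0], p1[1] - p2[1]
    bx, by = p3[0] - p2[0], p3[1] - p2[1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        raise ValueError("joint points must be distinct")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (ax * bx + ay * by) / norm))
    return math.degrees(math.acos(cos_theta))
```

For example, the elbow angle would use the shoulder, elbow, and wrist locations as `p1`, `p2`, and `p3`, with the elbow as the middle point.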
Since the rowing cycles of the same player are not always identical, the proposed method adopts heuristic rules to extract each rowing cycle. The change from moving forward to moving backward is detected when the hip angle and the knee angle are both small enough. Conversely, the change from moving backward to moving forward is detected when the elbow angle is small enough. Table 2 lists these criteria for detecting the change points of the pose. With three heuristically defined thresholds, $Th_h$, $Th_k$, and $Th_e$, for the hip, knee, and elbow angles, the proposed method detects the pose changes correctly. Furthermore, the shoulder angle does not differ significantly between forward and backward movement, so it is not used in detecting rowing cycles.
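The heuristic rules can be sketched as a small state machine over the per-frame angles. This is a minimal sketch under stated assumptions: a cycle is taken to run from one forward-to-backward change point to the next, with a backward-to-forward change point in between; the function name `segment_cycles` and the threshold parameter names are hypothetical, and no threshold values from the study are implied.

```python
def segment_cycles(hip, knee, elbow, th_hip, th_knee, th_elbow):
    """Split per-frame angle sequences into rowing cycles.

    Returns a list of (start_frame, end_frame) index pairs.
    Forward-to-backward change: hip and knee angles both below threshold.
    Backward-to-forward change: elbow angle below threshold.
    """
    cycles = []
    start = None           # frame index where the current cycle began
    seen_reversal = False  # elbow criterion met since the cycle began
    for i, (h, k, e) in enumerate(zip(hip, knee, elbow)):
        forward_to_backward = h < th_hip and k < th_knee
        backward_to_forward = e < th_elbow
        if start is None:
            if forward_to_backward:
                start = i
        elif not seen_reversal:
            if backward_to_forward:
                seen_reversal = True
        elif forward_to_backward:
            # Both change points were seen, so one full cycle is complete.
            cycles.append((start, i))
            start = i
            seen_reversal = False
    return cycles
```

Requiring the elbow criterion between two hip-and-knee detections keeps consecutive frames that all satisfy the same criterion from being counted as separate cycles.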
After the angles of the important joint points are calculated, the angular speed $\omega$, defined as the rate of change of angular displacement, is calculated from (3).
$$\omega = \frac{\theta}{t} = \frac{2\pi}{T} = 2\pi f \qquad (3)$$
Here, $\theta$ is the central angle in radians through which the object moves, $t$ is time, $T$ is the time spent in one cycle of movement, and $f$ is the number of movement cycles occurring in one second. Furthermore, the angular speed can also be calculated by (4),
$$\omega = \frac{\theta_2 - \theta_1}{t_2 - t_1} \qquad (4)$$
in which $\theta_1$ and $\theta_2$ represent the initial and final angles, and $t_1$ and $t_2$ represent the initial and final times, respectively. The proposed method adopts the angles of 4 important joint points to detect players' rowing movement. The frequency results of these angles, as shown in Figure 4, are then normalized for further processing. Since the Fourier Transform converts a time-domain signal into the frequency domain, the proposed method uses it to calculate 4 features for measuring the movement: the amplitudes of the shoulder, elbow, hip, and knee angles. The amplitudes of all four angles share the same frequency. Based on Fourier theory, the frequency value represents the number of rowing cycles performed by the player. Additionally, the amplitudes may differ among players, as shown in Figure 4, so the selected features can distinguish between players. Using the angles of one joint point, the angular speed can then be acquired from (3) or (4). Figure 5 shows the angular speeds of two different joint points, the elbow and the hip, of one player.
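The two kinds of feature described here can be sketched as follows. The sketch assumes the angular speed is a frame-to-frame finite difference in the spirit of (4), and that the amplitude feature is the magnitude of the dominant non-zero frequency bin of a discrete Fourier transform. Both function names are hypothetical. A direct O(N²) DFT is used only to keep the example dependency-free; an FFT routine such as `numpy.fft.rfft` would be the usual choice in practice.

```python
import cmath
import math


def angular_speed(angles_deg, fps):
    """Frame-to-frame angular speed in degrees per second.

    Applies (theta_2 - theta_1) / (t_2 - t_1) with t_2 - t_1 = 1 / fps.
    """
    return [(b - a) * fps for a, b in zip(angles_deg, angles_deg[1:])]


def dominant_component(angles_deg, fps):
    """Return (frequency_hz, amplitude) of the strongest periodic component.

    The mean is removed first so the constant offset of the joint angle does
    not dominate. Bins at or above the Nyquist frequency are skipped.
    """
    n = len(angles_deg)
    mean = sum(angles_deg) / n
    centered = [a - mean for a in angles_deg]
    best_k, best_amp = 0, 0.0
    for k in range(1, (n + 1) // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        amp = 2.0 * abs(coeff) / n  # one-sided amplitude of bin k
        if amp > best_amp:
            best_k, best_amp = k, amp
    return best_k * fps / n, best_amp
```

For a joint angle that swings sinusoidally by ±30 degrees once per second, `dominant_component` returns a frequency of about 1 Hz and an amplitude of about 30.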

Proposed Method
This section introduces the proposed rowing analysis method. Figure 6 shows the flowchart of the proposed method, which consists of three procedures: rowing cycle segmentation, feature extraction, and rowing cycle clustering. In the first procedure, rowing cycle segmentation, all players' videos are input and segmented into one-cycled videos. Each one-cycled video is composed of the frames of one cycle of moving forward and backward, determined using features calculated from the joint point locations extracted by MediaPipe. The calculation is based on the angle computation in (2) and the rowing cycle detection criteria in Table 2. The second procedure, feature extraction, extracts 12 features from each one-cycled rowing video using (1)-(4). Since amplitude features are calculated, the Fourier Transform is needed for the computation. The third procedure is rowing cycle clustering. After the 12 features are extracted from each one-cycled rowing video, 3 different clustering methods and 3 scoring methods are applied to find a better strategy for analyzing these one-cycled rowing videos. The analysis focuses on the multi-player matching problem of finding the optimal match of 2 players. The features acquired from each one-cycled video comprise the amplitudes, angles, and angular speeds of 4 joint points. These features are then applied to the clustering methods with different scoring measurements to find a more efficient way of matching different players.
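The clustering procedure can be sketched with scikit-learn, which provides all three clustering methods and all three scores. This is an illustrative sketch rather than the study's implementation: the function name `score_clusterings`, the feature standardization step, and the random seed are assumptions, and the study's actual preprocessing and hyperparameters are not stated.

```python
import numpy as np
from sklearn.cluster import Birch, KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def score_clusterings(features, k, seed=0):
    """Cluster cycle feature vectors with three methods and score each result.

    features: array of shape (n_cycles, n_features), e.g. (322, 12).
    Returns a dict keyed by method name with labels and the three scores.
    """
    # Assumption: features are standardized so no single feature dominates.
    X = StandardScaler().fit_transform(np.asarray(features, dtype=float))
    models = {
        "kmeans": KMeans(n_clusters=k, n_init=10, random_state=seed),
        "birch": Birch(n_clusters=k),
        "gmm": GaussianMixture(n_components=k, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        labels = model.fit_predict(X)
        results[name] = {
            "labels": labels,
            "silhouette": silhouette_score(X, labels),                # higher is better
            "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
            "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
        }
    return results
```

Calling this function for each candidate k yields, for every method, the three score curves from which an optimal cluster number can be chosen.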

Proposed hybrid measurement
Since 3 clustering methods and 3 scoring methods are evaluated, a hybrid measurement combining the 3 scoring methods is presented. Three measurements, the Silhouette score, the Calinski-Harabasz index, and the Davies-Bouldin index, are applied to find the optimal cluster number. These three measurements indicate goodness in different ways: the Silhouette score and the Calinski-Harabasz index are better when higher, whereas the Davies-Bouldin index is better when smaller. Therefore, a hybrid measurement is calculated in (5) to evaluate the three measurements together and find an optimal k value.
In (5), $S_k$, $C_k$, and $D_k$ represent the normalized Silhouette score, the normalized Calinski-Harabasz index, and the normalized Davies-Bouldin index for cluster number $k$, respectively.
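One way to combine the three normalized measurements is sketched below. The exact form of (5) is not reproduced in this text, so the combination used here, $S_k + C_k - D_k$ after min-max normalization over the candidate k values, is an assumption. It only reflects the stated property that the first two measurements are better when higher and the third is better when smaller. All function names are hypothetical.

```python
def min_max(values):
    """Scale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(ks, silhouette, calinski, davies):
    """Return {k: hybrid score} from per-k lists of the three raw scores.

    Assumed combination: S_k + C_k - D_k on normalized scores, so that a
    higher hybrid score is better.
    """
    s, c, d = min_max(silhouette), min_max(calinski), min_max(davies)
    return {k: sk + ck - dk for k, sk, ck, dk in zip(ks, s, c, d)}


def best_k(ks, silhouette, calinski, davies):
    """Return the k with the highest hybrid score."""
    scores = hybrid_scores(ks, silhouette, calinski, davies)
    return max(scores, key=scores.get)
```

Normalizing each measurement first matters because the Calinski-Harabasz index is unbounded and would otherwise dominate the other two scores.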

EXPERIMENTAL RESULTS
This section presents the experimental results of the proposed method. There are 15 players, each recorded in a one-minute rowing video. Table 3 shows the number of rowing cycles detected for each player, which ranges from 17 to 24 depending on how fast the player rowed. The total number of one-cycled rowing videos is 322, so the clustering methods operate on 322 data points. Given the close hybrid measurements for several k values, Figure 7(b) highlights that k=13 is the optimal measurement. Figure 7(c) presents the hybrid measurements for the Gaussian Mixture clustering approach, indicating that k=15 yields the optimal measurement. Figure 8(a) gives the K-means clustering results for the optimal selection of k=13. Notably, since each rowing cycle has 12 features, each data point in the clustering map of Figure 8(a) represents one one-cycled rowing video. Because dimension reduction is used to draw the map, the distances between data points in the map do not exactly reflect their relationships in the original 12-dimensional feature space. Moreover, Figure 8 indicates that there are 3 groups of two players, namely (2,3), (0,14), and (9,12).
The experimental results in Figures 8, 9, and 10 show that although the three methods produce different groupings, they lead to similar matching results. Table 5 compares the three clustering methods in terms of the optimal k value and the corresponding matching results. Table 5 demonstrates that different k values may be identified depending on the clustering method employed. Regarding the matching results, (2,3) and (0,14) are consistently clustered together across all three methods. Furthermore, (9,12) is grouped by both the K-means and BIRCH clustering techniques. Consequently, based on these experimental findings, the K-means clustering method appears to yield the most efficient matching results.

CONCLUSION
This study proposes an efficient rowing analysis method based on clustering algorithms. Three procedures, rowing cycle segmentation, feature extraction, and rowing cycle clustering, are presented for finding group matches. In the first procedure, rowing cycle segmentation, heuristically determined thresholds are used to segment the videos into one-cycled rowing videos. The second procedure, feature extraction, then extracts 12 features from each one-cycled rowing video, which are passed to the third procedure, rowing cycle clustering. After investigating three distinct clustering techniques and three scoring approaches, the best groupings are achieved with the K-means algorithm. Furthermore, the BIRCH clustering method outperforms the Gaussian Mixture method. Moreover, the proposed hybrid scoring measurement, composed of the Silhouette score, the Calinski-Harabasz index, and the Davies-Bouldin index, determines the optimal k value efficiently. Analyzing each one-cycled rowing video to measure the quality of an individual player's rowing merits future study.

Figure 6: Flowchart of the proposed rowing analysis method.

Figure 7(a) illustrates the standardized evaluations of the Silhouette score, the Calinski-Harabasz index, the Davies-Bouldin index, and the combined metric computed from equation (5). Additionally, Figure 7(a) indicates that the most favorable clustering number is k=13 within the range of k values from 2 to 15. Figure 7(b) displays the combined metrics for the BIRCH clustering technique.

Table 3: Rowing cycles detected from each player.

Table 4: Measurements of the K-means clustering method.