Optimization of shared bicycle location in Wuhan city based on multi-source geospatial big data

With urban development and a growing focus on sustainability, shared bicycles have become a popular mode of eco-friendly transportation. However, issues like chaotic parking, excessive deployment, and suboptimal distribution need solutions. This study analyzes shared bicycle usage in Wuhan, focusing on time and location patterns to inform better docking point selection. It uses GPS data from Mobike shared bicycles in October 2018, point of interest (POI) data, and population distribution data for Wuhan. The research employs mathematical and statistical analysis, spatial analysis, geographical detectors, and optimization methods. The study uncovers usage patterns, highlights travel trends, and identifies the relationship between factors like POIs, public transportation facilities, population density, and shared bicycle usage. It establishes a model called the Maximum Covering Location Problem (MCLP) for selecting docking points, considering demand intensity. In summary, this study deepens our understanding of shared bicycle usage in Wuhan, provides a model for selecting docking points, and offers valuable insights for urban transportation planning and sustainability.


INTRODUCTION
In recent years, low-carbon living has become a significant topic. Nie et al. [1] pointed out that the deployment of the carbon capture, utilization, and storage supply chain in China is showing a trend from coastal areas to inland regions. In this context, shared bicycles have been widely adopted as an environmentally friendly mode of transportation [2]. The study of location theory can be traced back to 1909, when Weber [3] explored the layout of warehouses to minimize the distance between warehouses and customers. In 1964, Hakimi [4] introduced the p-median and p-center problems on networks, marking a pivotal moment in location theory research. Church and ReVelle [5] were the first to propose the maximum coverage location problem, emphasizing the placement of service stations within network nodes. Vahidnia et al. [6] introduced a hybrid decision framework based on the Analytic Hierarchy Process (AHP) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) for location selection.

Researchers have been striving to enhance the accuracy of models for optimizing shared bicycle location selection. Wang et al. [7] employed deep learning to solve the MCLP model in urban spatial computation, offering a new perspective for addressing spatial optimization problems. Liang et al. [8] introduced a new paradigm that combines graph convolutional networks and a greedy algorithm to solve the p-center problem through direct training. In practice, Park et al. [9] optimized shared bicycle locations using both the p-median model, which has spatial fairness advantages, and the MCLP model, which improves station coverage efficiency. Çelebi et al. [10] combined set covering and queuing models to allocate user demand to stations for evaluating service levels. Conrow et al. [11] used coverage models to determine the required bike stations, considering investment levels for network and population coverage. Furthermore, Rabab et al. [12] introduced a novel approach that combines the maximum coverage location problem with the importance of bike stations to optimize bike station locations and maximize demand coverage within specific distances.
In summary, despite extensive research focusing on shared bicycle location selection, there is still a need for analysis based on multi-source geospatial big data. This paper addresses this gap by exploring the impact of multi-source geospatial big data on the distribution of shared bicycles and integrating it into location selection optimization.

Study Area
Wuhan, a prominent city in central China with over 12 million residents as of 2023, faces challenges in its bike-sharing system, including uneven bike distribution, shortages in some areas, and congestion in popular spots. These issues affect both the efficiency of the bike-sharing system and the city's overall transportation. Optimizing bike-sharing site selection in Wuhan is crucial to improve commuting experiences, extend bike-sharing coverage, and ultimately strengthen the city's transportation system. Wuhan's experiences can also serve as a valuable reference for similar cities, promoting sustainable bike-sharing development and innovations in urban transportation systems nationwide.

Data Sources
The shared bicycle data used in this study consist of GPS location data from Mobike shared bicycles in Wuhan for two days: Monday, October 1, and Sunday, October 7, 2018. This dataset includes timestamps, shared bicycle IDs, and latitude and longitude coordinates, totaling 100,342,626 data points. Additionally, Point of Interest (POI) location data were collected by web scraping from the Amap website in September 2018. These POI entries include information about their location, category, and other relevant details, encompassing 14 categories: dining services, scenic spots, public facilities, corporate businesses, shopping services, transportation facilities, financial and insurance services, educational and cultural services, business services, lifestyle services, sports and leisure services, healthcare services, government institutions and social organizations, and accommodation services. Furthermore, population data for the research area were obtained from the WorldPop website, which offers data at resolutions of 100 meters and 1 kilometer. Through a comprehensive analysis of these data, the study aims to uncover shared bicycle usage patterns, population distribution characteristics, road network conditions, and the distribution of nearby POI facilities. These findings will serve as a scientific guide for optimizing shared bicycle location selection.

Methods
The research methods employed in this study include Kernel Density Analysis, Buffer Analysis, the Geographic Detector, and the Maximum Coverage Location Problem model. We first established a grid system. Within this grid system, we recorded the number of bicycle borrowing and returning points in each grid cell and analyzed their relationship with various influencing factors. We calculated the weight of each influencing factor, thereby obtaining the demand intensity for each grid cell. The final location selection was optimized to maximize the sum of demand intensities covered by shared bicycle parking points. Specifically, we employed Kernel Density Analysis and Buffer Analysis to analyze the influencing factors for shared bicycles, including Points of Interest (POI) data, distances to subway and bus stations, and population data. We aggregated multiple POI categories into a comprehensive POI index and assigned weights to these influencing factors using Geographic Detectors [13].
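To make the weighting step more concrete, the following is a minimal sketch of the factor detector q-statistic commonly associated with Geographic Detectors, q = 1 - Σ_h N_h σ_h² / (N σ²), where the strata h come from discretizing one influencing factor. The function names and the final step of normalizing q values into weights are assumptions for illustration; the text does not state exactly how the detector output was converted into weights.

```python
from collections import defaultdict
from statistics import pvariance


def factor_detector_q(values, strata):
    """q-statistic of one factor: 1 - within-strata variance / total variance.

    values: response per grid cell (e.g. borrow/return counts).
    strata: stratum label per grid cell (the discretized factor).
    """
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    total_var = pvariance(values)
    if total_var == 0:
        raise ValueError("response has zero variance")
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)
    # Sum over strata of N_h * sigma_h^2 (population variance per stratum).
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within / (len(values) * total_var)


def normalize_weights(q_by_factor):
    """Assumed weighting rule: scale q values so that they sum to 1."""
    total = sum(q_by_factor.values())
    return {name: q / total for name, q in q_by_factor.items()}
```

A q value close to 1 means the factor's strata explain most of the spatial variation in usage, while a value close to 0 means they explain almost none.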
We then divided the research area into small square grids with a side length of 100 meters. Within these grids, we computed the numbers of shared bicycle rentals and returns, the POI index, distances to subway and bus stations, and population data. Based on the assigned weights, we calculated the demand intensity for each grid. Considering the demand intensity for each grid and the distribution of demand points, we employed a heuristic algorithm to solve the MCLP model and obtain the final location selection results.
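One plausible reading of the demand intensity calculation is a weighted sum of normalized factor values per grid cell. The sketch below assumes min-max scaling and assumes that distance factors are inverted so that cells closer to a station score higher; neither choice is stated in the text, and the factor names are hypothetical.

```python
def min_max(values, invert=False):
    """Scale values to [0, 1]; optionally invert so small inputs score high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if invert else scaled


def demand_intensity(factors, weights, inverted=()):
    """Weighted sum of scaled factors for every grid cell.

    factors:  {factor_name: [value per grid cell]}
    weights:  {factor_name: weight}, e.g. derived from the detector results
    inverted: factor names where a smaller raw value means higher demand
              (assumed here for distances to subway and bus stations)
    """
    names = list(weights)
    n_cells = len(factors[names[0]])
    scaled = {
        name: min_max(factors[name], invert=name in inverted) for name in names
    }
    return [
        sum(weights[name] * scaled[name][i] for name in names)
        for i in range(n_cells)
    ]
```

For example, `demand_intensity({"poi": [0, 10], "metro_dist": [1000, 100]}, {"poi": 0.5, "metro_dist": 0.5}, inverted=("metro_dist",))` scores the second cell highest, because it has both more POIs and a shorter distance to the subway.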

Figure 2: The process of bike-sharing site selection optimization.

Spatial Distribution of Bike-sharing
We import the bike-sharing trip records extracted from the GPS location data, namely the bike borrowing statistics table and the bike returning statistics table. We then divide the study area into a grid with a cell size of 1 kilometer and count the bike borrowings and returns within each grid cell. Finally, we extract the distribution of bike-sharing ride destinations.
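The grid aggregation described above can be sketched as follows. The sketch assumes the borrowing or returning points have already been projected to planar coordinates in meters (the raw data are latitude and longitude), and the function name and origin parameter are illustrative only.

```python
import math
from collections import Counter


def grid_counts(points, origin, cell_size):
    """Count points per square grid cell.

    points:    iterable of (x, y) in projected meters (assumed pre-projected)
    origin:    (x0, y0) of the lower-left corner of the grid
    cell_size: side length of a cell in meters (1000 for a 1 km grid)
    Returns a Counter keyed by (row, col).
    """
    x0, y0 = origin
    counts = Counter()
    for x, y in points:
        col = math.floor((x - x0) / cell_size)
        row = math.floor((y - y0) / cell_size)
        counts[(row, col)] += 1
    return counts
```

Running the function once on the borrowing table and once on the returning table gives the two per-cell counts; passing a smaller `cell_size` gives a finer grid.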

Factors Influencing the Spatial Distribution of Bike-sharing
Based on the spatial distribution of shared bicycles, we examined the characteristics of areas where shared bicycles are concentrated: a high level of economic development, convenient public transportation, and high population density. We therefore conducted an in-depth analysis of the relationship between these characteristics and the usage of shared bicycles.

Points of Interest
Points of Interest (POI) can, to some extent, represent the level of land development in a particular area and indicate its economic status. Analyzing POIs therefore helps establish a direct connection between shared bicycles and economic development. Through this analysis, we found significant correlations between the distribution of shared bicycles and POI categories such as dining, shopping, finance, business, lifestyle services, and sports and leisure. These findings are valuable for studying the usage patterns of shared bicycles. To quantify the impact of POIs on shared bicycle usage, we employed the entropy weight method to calculate the influence of each POI category and obtained a comprehensive POI index through weighted summation.
Table 1. The Results of Entropy Weighting and Correlations
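As an illustration of the entropy weight method mentioned above, the sketch below computes one weight per POI category from a matrix of non-negative counts (rows are grid cells, columns are categories) and then forms the weighted-sum index. It follows the standard textbook form of the method and skips any preliminary normalization; the exact variant used in the study is not given, so this is only an assumption-laden sketch with hypothetical function names.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for the columns of a non-negative count matrix."""
    n = len(matrix)
    if n < 2:
        raise ValueError("need at least two rows")
    k = 1.0 / math.log(n)
    divergences = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each column needs a positive sum")
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # the term 0 * ln(0) is taken as 0
                entropy -= k * p * math.log(p)
        # Low entropy (uneven distribution) means high information content.
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        raise ValueError("all columns are uniform; weights are undefined")
    return [d / total_divergence for d in divergences]


def poi_index(matrix, weights):
    """Comprehensive POI index per grid cell as a weighted sum."""
    return [sum(w * v for w, v in zip(weights, row)) for row in matrix]
```

A category spread evenly over all cells receives a weight near zero, whereas a category concentrated in a few cells receives a large weight.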

Urban Public Transportation Facilities
Shared bicycles provide a crucial solution to the "last-mile" transportation challenge, particularly for short to medium-distance journeys and for enhancing connectivity with urban public transportation networks. Existing research highlights the substantial influence of public transportation facility locations on the distribution of shared bicycles. These facilities primarily comprise subway stations and bus stops. To evaluate their correlation with shared bicycle distribution, we conducted separate analyses for these two types of facilities. In our subway station analysis in Wuhan, we employed circular buffer zones with different radii to scrutinize shared bicycle borrowing and returning activities. For bus stops, given their dense presence in Wuhan, we adopted buffer zones with radii of 100 meters, 300 meters, and 500 meters to ensure a reliable analysis without overly broad coverage. In summary, the proximity of shared bicycles to public transportation facilities significantly affects their utilization. We can gauge the accessibility of public transportation for each grid cell by measuring the distance to facility locations. Notably, subway stations and bus stops have distinct impacts on shared bicycle usage, necessitating separate assessments.
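The buffer analysis for bus stops can be sketched as a simple count of borrowing or returning points within each radius. The sketch assumes planar coordinates in meters and straight-line distance; the function name and the station identifiers are hypothetical.

```python
import math


def buffer_counts(stations, points, radii=(100.0, 300.0, 500.0)):
    """Count points inside circular buffers around each station.

    stations: {station_id: (x, y)} in projected meters (assumed)
    points:   iterable of (x, y) borrowing or returning locations
    radii:    buffer radii in meters (defaults follow the bus stop analysis)
    Returns {station_id: {radius: count}}.
    """
    points = list(points)
    result = {}
    for station_id, (sx, sy) in stations.items():
        distances = [math.hypot(px - sx, py - sy) for px, py in points]
        result[station_id] = {
            radius: sum(d <= radius for d in distances) for radius in radii
        }
    return result
```

Note that buffers of neighboring stops may overlap, so a single point can be counted for more than one stop; the counts are per station rather than a partition of the points.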

Population
We conducted a kernel density analysis of shared bicycle borrowing and returning points in Wuhan and compared it with the 2018 population data for the city. The comparison indicates a significant correlation between the frequency of shared bicycle usage and population density. The borrowing and returning frequency of shared bicycles is highest in densely populated areas such as Wuchang District, Hankou District, and Jianghan District.
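For readers unfamiliar with kernel density analysis, the following minimal sketch evaluates a two-dimensional Gaussian kernel density at a set of locations. The Gaussian kernel, the bandwidth parameter, and the function name are assumptions for illustration; GIS packages often use other kernels, and the text does not say which kernel or bandwidth was used.

```python
import math


def kernel_density(events, eval_points, bandwidth):
    """Gaussian kernel density (events per unit area) at each eval point.

    events:      (x, y) borrowing or returning points in projected meters
    eval_points: (x, y) locations where the density is evaluated,
                 e.g. grid cell centers
    bandwidth:   kernel standard deviation in meters (assumed parameter)
    """
    events = list(events)
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    densities = []
    for ex, ey in eval_points:
        total = 0.0
        for px, py in events:
            squared_distance = (px - ex) ** 2 + (py - ey) ** 2
            total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        densities.append(norm * total)
    return densities
```

Evaluating the density at grid cell centers gives a surface that can be compared cell by cell with the population raster.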

Location selection
The Maximum Coverage Location Problem (MCLP) is a classic optimization problem that aims to select the best facility locations from a set of candidates so as to maximize the coverage of demand points within a given service range. Typically, the problem involves N demand points and M candidate facility locations. Each facility has a predefined service range within which it can serve nearby demand points. In this study, we used a Genetic Algorithm (GA) to solve the MCLP model, identifying 500 deployment points for shared bicycles, each with a service radius of 500 meters. The GA parameters were a maximum of 100 iterations, 25 individuals per generation, 10 individuals selected for reproduction, and a 30% mutation probability. The final solution is the one with the highest total covered demand intensity found by the search. The source code for this study is available at https://github.com/HIGISX/hispot.
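The following is a minimal, self-contained sketch of a genetic algorithm for the MCLP under the parameters quoted above (100 iterations, 25 individuals, 10 parents, 30% mutation probability). It is not the hispot implementation: the function names, the crossover rule (sampling a child from the union of two parents), and the single-gene mutation are assumptions chosen for brevity.

```python
import math
import random


def build_coverage(demand_xy, candidate_xy, radius):
    """For each candidate site, the set of demand indices within radius."""
    return [
        {i for i, (dx, dy) in enumerate(demand_xy)
         if math.hypot(dx - cx, dy - cy) <= radius}
        for cx, cy in candidate_xy
    ]


def covered_demand(sites, coverage, intensity):
    """Total demand intensity covered by at least one chosen site."""
    covered = set()
    for j in sites:
        covered |= coverage[j]
    return sum(intensity[i] for i in covered)


def solve_mclp_ga(coverage, intensity, p, generations=100, pop_size=25,
                  n_parents=10, mutation_prob=0.3, seed=0):
    """Heuristically choose p candidate sites maximizing covered demand."""
    rng = random.Random(seed)
    m = len(coverage)
    if not 0 < p <= m:
        raise ValueError("p must be between 1 and the number of candidates")

    def fitness(sites):
        return covered_demand(sites, coverage, intensity)

    population = [tuple(rng.sample(range(m), p)) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the best n_parents survive unchanged.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.sample(parents, 2)
            # Assumed crossover: draw p distinct sites from both parents.
            child = rng.sample(sorted(set(a) | set(b)), p)
            if rng.random() < mutation_prob:
                unused = [j for j in range(m) if j not in child]
                if unused:
                    # Assumed mutation: swap one site for an unused one.
                    child[rng.randrange(p)] = rng.choice(unused)
            children.append(tuple(child))
        population = parents + children
    best = max(population, key=fitness)
    return sorted(best), fitness(best)
```

In the setting described in the text, the call would use `p=500` and a coverage list built with `radius=500`. Because the method is heuristic, the returned solution is the best one found rather than a proven optimum.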

CONCLUSION
This study analyzes multi-source geospatial big data related to the distribution of shared bicycle travel in Wuhan City. It identifies the significant factors influencing shared bicycle distribution, determines the weights of these factors, defines demand points and candidate points, and calculates the demand intensity for each demand point. On this basis, a genetic algorithm is employed to optimize the deployment of shared bicycles in Wuhan City. This research offers valuable guidance for the development of shared bicycles in Wuhan City. However, the study has some limitations. It relies exclusively on a genetic algorithm as the heuristic approach, without comparative analyses against other optimization algorithms or solvers. Additionally, the shared bicycle travel data used in this study were recorded during the National Day holiday period in Wuhan City, so the observed patterns may differ from daily travel patterns.

Figure 3: The distribution of bike-sharing in Wuhan City on weekdays (left) and weekends (right).

Figure 4: The bike-sharing distribution density around subway stations (left), and bike-sharing pick-up (top right) and drop-off (bottom right) counts around bus stops.

Figure 6: Kernel density map of the distribution of shared bicycles in Wuhan City (left) and population density map (right).

Figure 6: The bike-sharing site selection results obtained using the genetic algorithm for the MCLP model.