Research and Implementation on the Mechanism of Near-Space Exploration Data Management and Sharing Service

In view of the characteristics of near-space exploration data, such as complicated types, diverse sources, multiple disciplines and multiple parameters, a core metadata model of near-space exploration data is designed to meet the needs of data collection, data management and data sharing service. The mechanism of near-space exploration data management and sharing service is described across the lifecycle of near-space exploration data, including data collection and receiving, data classification, and data quality control. Based on this mechanism, the first domestic platform for near-space exploration data management and sharing service is developed, and a high-performance spatio-temporal data sharing service framework integrating access control, query and retrieval, sharing and distribution, and statistical analysis is designed. The platform, which has collected 567 datasets with a total data volume of over 130 TB through practical application, provides effective support for the follow-up exploration of near-space.


INTRODUCTION
Near-space exploration holds great strategic significance for studying various physical fields and their changing laws in near-space, for revealing the laws and mechanisms of near-space lightning, for understanding the coupling mechanism between the upper and lower layers of the Tibetan Plateau, and for recognizing the influence of the near-space environment on the movement laws of the lower atmosphere, as well as the mechanism of the coupling of surface atmospheric, terrestrial and oceanic environments.
In March 2018, the Class A strategic leading science and technology project of the Chinese Academy of Sciences, "Scientific Experimental System in Near-space" (hereinafter referred to as the "Honghu Project"), was officially launched. The project focuses on the near-space environment and aims to carry out the most comprehensive exploration of the whole near-space region, with the most complete range of parameters so far. Since the launch of the project, a range of comprehensive scientific experiments have been carried out on the Tibetan Plateau and in Inner Mongolia using over 50 types of space exploration instruments and ground-based observation instruments on aerostat platforms. Along with the continuous implementation of near-space environmental exploration, the amount of data is increasing rapidly. To standardize and strengthen near-space data collection, management and sharing services, the Data Service Centre of Scientific Experimental System in Near-space (hereinafter referred to as the "Honghu Data Service Centre") was established by the Aerospace Information Research Institute, Chinese Academy of Sciences.
In view of the characteristics of near-space exploration data, such as complicated types, diversified sources, multiple disciplines, and multiple parameters, this paper investigates the critical issues of data management and sharing service across the lifecycle of near-space exploration data and designs a core metadata model of near-space exploration data. The mechanism of near-space exploration data management and sharing service is described, including data collection and receiving, data classification, and data quality control. The general architecture, functional modules and implementation effects of the near-space exploration data management and sharing service platform are also described.

ANALYSIS OF NEAR-SPACE EXPLORATION DATA MANAGEMENT AND SHARING SERVICE REQUIREMENTS
Near-space usually refers to the airspace between 20 km and 100 km above the Earth's surface, spanning the stratosphere (20-50 km), mesosphere (50-85 km) and thermosphere (85-100 km), involving both the ionosphere and the non-ionosphere, with special environments such as ozone, ultraviolet and radiation, as well as special phenomena such as gravity waves, planetary waves and atmospheric discharges.
The scientific exploration of near-space has a long history. Scientific understanding of near-space has been a gradual process, progressing from sporadic exploration in the 1950s, through preliminary understanding in the 1970s and mechanism research in the 1990s, to systematic research in the past decades [1]. Near-space research has entered a stage where scientific exploration and applied research are of equal importance, and the need for near-space exploration data management and sharing service is urgent.
The sources of near-space exploration data can be divided into three main categories according to the location of the exploration: ground-based, space-based and in-situ. Ground-based exploration mainly uses radars and other active/passive instruments to explore the atmospheric wind field, temperature and ionospheric electron density in near-space, including laser radar, medium-frequency radar, meteor radar, airglow imagers and atmospheric electric field meters. Space-based exploration mainly uses satellites and other active/passive instruments to explore atmospheric density, temperature, wind field, composition, radiation and other parameters over a global or near-global range. At present the main atmospheric exploration satellites are the Upper Atmosphere Research Satellite (UARS) and the TIMED atmospheric observation satellite of the United States. China's Fengyun-3 satellites and Carbon satellites also possess the capability to explore key parameters in near-space. The Fengyun-3C satellite carries a microwave atmospheric vertical detector with channels in the oxygen absorption bands at 50-60 GHz and near 118.75 GHz, which can monitor the atmospheric temperature in the lower part of near-space [2]. In-situ exploration mainly uses sounding rockets, balloons and other platforms to explore near-space at different altitudes and different times. Sounding rockets carry the detection equipment into space and release it; atmospheric temperature, wind field, pressure and density are measured as the equipment falls. Currently, high-altitude balloons are the most mature platform that can access the middle and upper stratosphere to carry out various scientific observations and experiments.
The Honghu Project is mainly based on high-altitude balloons equipped with in-situ detection instruments such as in-situ detection sensors, dropsondes, aerosol particle sensors, thermal turbulence intensity sonde systems, ozone detectors, conductivity meters, electromagnetic field detectors, acoustic instantaneous anemometers, ground electric field meters, wide energy spectrum neutron detectors and ionospheric scattering detectors. These instruments detect atmospheric composition, wind field, density, pressure, temperature, ionospheric scattering effects and neutron spectral lines in near-space. Along with the continuous implementation of near-space environmental exploration, the amount of data is increasing rapidly. Since the launch of the Honghu Project, the data volume has increased by 60% per year. As of the end of 2022, 567 datasets had been formed, with the total amount of data exceeding 130 TB and covering over 150 types of parameters. An analysis of the various types of near-space exploration data shows four main characteristics: 1) the types are diverse, including text data, table data, graphic data, image data, vector data and multimedia data; 2) the disciplines are diverse, including atmospheric science, geophysical science, space science, navigation science, biochemistry, life science and other professional disciplines; 3) the parameters are diverse, including atmospheric temperature, wind field, pressure, density, composition, radiation, electromagnetic environment and biological adaptability; 4) the data are tied to scientific experiment tasks, so experimental mission information such as the test platform, data height and flight trajectory must be combined with the data to facilitate near-space scientific research.
However, near-space exploration data are acquired separately by professionals from various industries, and a unified data collection channel is lacking. In addition, owing to data security concerns, there is no sharing policy or management method for near-space exploration data, which brings great difficulties to data management and sharing service. How to "preserve well, manage well and use well" has become an important problem to be solved urgently.

RESEARCH ON THE MECHANISM OF NEAR-SPACE EXPLORATION DATA MANAGEMENT AND SHARING SERVICE

Design of Core Metadata Model
To solve the difficult problem of near-space exploration data management and sharing, the data must be managed in an orderly way so that people who need data can find out whether the data exist. Designing the core metadata model and providing a metadata service are the first steps toward solving this problem. Metadata is usually divided into full-set metadata and core metadata. Full-set metadata is a comprehensive description of the dataset. Core metadata describes only the important information of the dataset, which is more concise and convenient for metadata display and storage. In near-space exploration data management and sharing, it is often only necessary to know some main information to determine whether a demand can be met. Therefore, based on the characteristics of near-space exploration data, this paper proposes a core metadata model, aiming to simplify data extraction, reduce the accumulation of redundant information, and facilitate data management and sharing. The core metadata model is designed to provide information regarding data identification, classification, temporal and spatial properties, experimental tasks, instrument platforms, and management and distribution details [3,4].
The core metadata model of near-space exploration data is composed of eight parts: data identification, classification information, experiment task, instrument and platform, temporal information, spatial information, production information and sharing information. The Unified Modeling Language (UML) diagram of the core metadata model is shown in Figure 1.
Each metadata subset contains one or more metadata elements. The eight metadata subsets are described in detail below.
1) Data identification: define the basic information required to describe data resources.
The data identification information contains unique identifier, Digital Object Identifier (DOI), name, keywords, abstract, language and summary. This helps users quickly understand the content of the dataset.
2) Data classification: used to determine the classification and type of data.
The data classification information includes data type, subject category, and theme content classification. Data types include text data, table data, graphic data, image data, vector data, multimedia data, etc. Theme content classification includes in-situ detection data, ground-based data, remote sensing image data, etc.
3) Experimental task: define the basic information describing the experimental task.
The experiment task information includes experiment task ID, name, description, location, area range, height range, start time, end time, and wind, humidity, pressure and other meteorological environmental conditions. The experiment task information is important for understanding the purpose, background and experiment process behind the near-space exploration data.
4) Instrument and platform: define the information on the instrument and platform that describes the data source.
The instrument and platform information includes instrument name, instrument description, platform ID, platform name and platform description.
5) Temporal information: define the specific time range or point in time at which the data were obtained.
The temporal information includes time range and time resolution, and describes the time characteristics of the dataset. It is an important condition for users to query and retrieve data.
6) Spatial information: define the spatial characteristics of the dataset.
The spatial information includes data location, spatial scale, projection information, and boundary latitude and longitude information.
7) Production information: define the production process and quality information of the dataset.
The production information includes data processing software, data processing algorithm, data source description and data quality information, which describe the production process and quality information of the dataset, as well as the measures taken to control the data quality.
8) Sharing information: define the permission and method of accessing and obtaining data, as well as the information of the contact person for data sharing.
The sharing information includes data volume, data storage path, data online access website, data submitter, data contact person, data sharing method, and data protection period. Data sharing modes include external sharing, internal sharing, and non-sharing. The data protection period can be 1 year, 2 years, or no protection period. To promote the development and sharing of near-space exploration data, it is recommended that the data protection period not exceed 2 years in principle.
The core metadata model designed in this paper describes the main data characteristics of the near-space exploration data, and provides a more comprehensive description of the important information about the data that users are concerned about from the perspective of data management and sharing, which will subsequently serve as an important reference for data receiving, quality evaluation, retrieval and utilization in the process of data collection and distribution.
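The eight metadata subsets described above could be represented as a small set of typed records. The following Python sketch is illustrative only: the class and field names are assumptions chosen to mirror the subsets named in the text, only a few representative fields are included per subset, and the use of ISO 8601 strings for times is likewise an assumption rather than something the platform is known to do.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional, Tuple


@dataclass
class Identification:
    identifier: str                       # unique identifier of the dataset
    name: str
    doi: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    abstract: str = ""
    language: str = "en"


@dataclass
class Classification:
    data_type: str                        # e.g. "text", "table", "image"
    subject_category: str                 # e.g. "atmospheric science"
    theme: str                            # e.g. "in-situ detection data"


@dataclass
class ExperimentTask:
    task_id: str
    name: str
    location: str
    height_range_km: Tuple[float, float]
    start_time: str                       # ISO 8601 string (assumed format)
    end_time: str


@dataclass
class InstrumentPlatform:
    instrument_name: str
    platform_id: str
    platform_name: str


@dataclass
class Temporal:
    time_range: Tuple[str, str]
    time_resolution: str


@dataclass
class Spatial:
    location: str
    bounding_box: Tuple[float, float, float, float]  # west, south, east, north


@dataclass
class Production:
    software: str
    algorithm: str
    source_description: str
    quality_info: str


@dataclass
class Sharing:
    volume_bytes: int
    storage_path: str
    contact: str
    sharing_mode: str                     # "external", "internal" or "none"
    protection_years: int = 0             # 0 means no protection period

    def __post_init__(self) -> None:
        # The text allows three sharing modes and a protection period of
        # 1 year, 2 years, or none.
        if self.sharing_mode not in {"external", "internal", "none"}:
            raise ValueError(f"unknown sharing mode: {self.sharing_mode}")
        if self.protection_years not in (0, 1, 2):
            raise ValueError("protection period must be 0, 1 or 2 years")


@dataclass
class CoreMetadata:
    identification: Identification
    classification: Classification
    experiment_task: ExperimentTask
    instrument_platform: InstrumentPlatform
    temporal: Temporal
    spatial: Spatial
    production: Production
    sharing: Sharing


def subset_names() -> List[str]:
    """Return the names of the eight metadata subsets in declaration order."""
    return [f.name for f in fields(CoreMetadata)]
```

Keeping each subset as its own record means a subset can be validated or displayed independently, which matches the idea that core metadata should be concise enough for quick review.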

Data Collection and Receiving Process
As the basic stage of the data lifecycle, data collection brings together rich and reliable near-space exploration data to provide data support for near-space scientific research. It is therefore necessary to establish a practical collection system and receiving process, control the quality of near-space exploration data, and ensure the completeness, accuracy and usability of scientific data. This paper mainly relies on the Honghu Project Data Service Center to carry out the research on the data management and sharing service mechanism. The near-space exploration data submitted to the center fall into three major categories: special scientific experiment data, external system data and basic support data. Special scientific experiment data refer to the original data and analytical data obtained from various types of near-space exploration experiments carried out under the Honghu Project, as well as platform data and acoustic and visual data of the experimental process. External system data are derived from other data sources in support of near-space scientific research, including ground-based observation data and space-based exploration data. Basic support data are basic data that support near-space scientific research, including meteorological data and geographic information data [5,6]. Figure 2 shows the types of Honghu Project data.
Using the Honghu Project Data Service Center as a reference, and based on the characteristics of near-space exploration data, this paper establishes a standardized workflow linking data collection, quality control, storage management, and sharing and distribution, for the benefit of data creators, data stewards, and data users, as shown in Figure 3. The management and sharing of near-space exploration data involves multiple organizational entities, including data managers, data producers, data users, experts in the field of near-space science, and the Honghu Project Data Service Center. Before conducting near-space exploration experiments, data producers are required to prepare and submit a data collection plan describing the production, collection, processing and analysis of the data to be submitted. The data manager issues a data collection list following review and approval.
In the experiment process, the data producer performs data processing, analysis, and mining according to the data delivery list.Data managers supervise, verify, and oversee the data generation process.
After the end of the experiment process, the data producer is required to compile and submit the data delivery report, including the data delivery plan and metadata file, according to the data delivery list. The data collection scheme outlines temporal and spatial information, storage capacity, types of data resources, sharing mode, and quality control information. Based on the core metadata model of near-space exploration data, the metadata file details key processes of data processing, quality control, description, storage, analysis, and sharing, as well as the methods and tools used. The data manager is responsible for reviewing and approving data submission reports. After approval, the data producer can proceed with data submission. The data service center is responsible for receiving the data and organizing experts in the near-space science field to conduct data audits. Following that, the center issues a data collection voucher. The data manager provides an opinion on the data.
After receiving near-space exploration data, the center organizes and catalogues the data, stores and manages the data, and distributes them in accordance with the sharing methods specified in the data collection programme. Based on "The Measures for the Management of Scientific Data" issued by the General Office of the State Council (State Office of the People's Republic of China [2018] No. 17), the Data Service Center draws on the management of scientific data in Earth observation, remote sensing by unmanned aerial vehicles, marine environment monitoring and other related fields [7,8], formulates the management details and sharing norms of the scientific experiment system in near-space, builds the core metadata model, and designs the storage strategy for near-space data, which serve as references and bases for management and service work.
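The collection and receiving steps described in this section can be read as a linear workflow in which each step is performed by a specific role. The sketch below models it as a small state machine in Python. The stage names and role strings are hypothetical labels introduced here for illustration, and the strictly linear ordering is a simplifying assumption (rejection and resubmission paths are not modelled).

```python
from enum import Enum, auto


class Stage(Enum):
    PLAN_SUBMITTED = auto()     # producer has submitted the data collection plan
    LIST_ISSUED = auto()        # manager has issued the data collection list
    REPORT_SUBMITTED = auto()   # producer has submitted the data delivery report
    REPORT_APPROVED = auto()    # manager has approved the report
    DATA_SUBMITTED = auto()     # producer has submitted the data
    EXPERT_AUDITED = auto()     # center has organized the expert audit
    VOUCHER_ISSUED = auto()     # center has issued the data collection voucher


# For each stage: (next stage, role that must act to reach it).
TRANSITIONS = {
    Stage.PLAN_SUBMITTED: (Stage.LIST_ISSUED, "data_manager"),
    Stage.LIST_ISSUED: (Stage.REPORT_SUBMITTED, "data_producer"),
    Stage.REPORT_SUBMITTED: (Stage.REPORT_APPROVED, "data_manager"),
    Stage.REPORT_APPROVED: (Stage.DATA_SUBMITTED, "data_producer"),
    Stage.DATA_SUBMITTED: (Stage.EXPERT_AUDITED, "service_center"),
    Stage.EXPERT_AUDITED: (Stage.VOUCHER_ISSUED, "service_center"),
}


def advance(stage: Stage, role: str) -> Stage:
    """Move the submission to its next stage if `role` is allowed to act."""
    if stage not in TRANSITIONS:
        raise ValueError("workflow is already complete")
    next_stage, required_role = TRANSITIONS[stage]
    if role != required_role:
        raise PermissionError(
            f"{role} cannot act at {stage.name}; expected {required_role}"
        )
    return next_stage
```

Encoding the responsible role next to each transition makes the division of responsibilities between producers, managers and the service center explicit and easy to audit.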

Data Classification and Access Control
Near-space data have high scientific research value, and some contain sensitive information. Open sharing of these resources can bring significant security risks, such as leakage of data resources, an imperfect data security management mechanism, and lax management of access rights. Hierarchical classification divides data resources into disciplinary categories with clear access levels, and groups users according to their characteristics and data needs [9]. This correlation between classified data and users ensures that appropriate data resources are accessed and used by the right users, balancing data security and openness. Based on this approach, this paper establishes a data sharing security system including identity authentication, access control, and data encryption to effectively manage and share near-space exploration data.
Near-space exploration data collection and sharing management involve data producers, data managers, data warehouse managers, data reviewers, data users, and other types of users. Many users are both providers and users of scientific data resources; by sharing and exchanging data resources with data owners related to near-space research, they form a virtuous circle for the application of scientific data in near-space exploration. Data users are classified into three types: data providing users, general use users, and internal management users, as shown in Table 2. The data resources are classified into three levels: unrestricted access, authorized access, and owner access, as shown in Table 3. This approach facilitates the effective utilization of near-space exploration data [10].
1) Unrestricted access: completely open resources without any access restrictions. This type of resource has the lowest level of security protection. Any registered user can query, use, download, and disseminate it. Users may create derivative versions of the resources; the modified resources must be shared under the same authorization terms, and the data owner must be credited when sharing.
2) Authorized access: only authorized users have access to the resource. A registered user who wants to access resources at this level must apply for additional authorization and can access them only after approval.
3) Owner access: this is the most restricted type of resource in this grading system. Typically, only the owner or an authorized manager of the resource can query, use, download, or distribute it. Registered users who wish to obtain such resources must submit an additional authorization application and can access them only after approval.
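The three resource levels above can be expressed as a simple access check. The following Python sketch is one possible reading of the rules, not the platform's implementation: the type and function names are hypothetical, and it assumes that owners and authorized managers can always access their own resources and that an approved application is recorded per user and per resource.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class AccessLevel(Enum):
    UNRESTRICTED = "unrestricted"
    AUTHORIZED = "authorized"
    OWNER = "owner"


@dataclass
class Resource:
    resource_id: str
    owner: str
    level: AccessLevel
    managers: Set[str] = field(default_factory=set)        # authorized managers
    approved_users: Set[str] = field(default_factory=set)  # approved applications


def can_access(user_id: str, registered: bool, resource: Resource) -> bool:
    """Return True if the user may query and download the resource."""
    if not registered:
        # Every level in the text presupposes a registered user.
        return False
    if resource.level is AccessLevel.UNRESTRICTED:
        return True
    # Assumption: owners and authorized managers always have access.
    if user_id == resource.owner or user_id in resource.managers:
        return True
    # Authorized and owner levels both require an approved application.
    return user_id in resource.approved_users
```

In this reading the authorized and owner levels differ in who approves an application and how strictly, rather than in the shape of the check itself; a fuller model would attach an approver role to each level.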

NEAR-SPACE EXPLORATION DATA MANAGEMENT AND SHARING SERVICE PLATFORM
Based on the research results on the management and sharing service mechanism of near-space exploration data, this paper relies on the Honghu Project Data Service Center to develop the first domestic platform for near-space exploration data management and sharing service. The platform builds a high-performance spatio-temporal data sharing service framework integrating access control, query and retrieval, sharing and distribution, and statistical analysis to provide domestic and foreign users with multi-modal retrieval of near-space scientific data in multi-disciplinary fields, and online browsing and downloading of information [11,12]. The overall design architecture and functional modules of the platform are shown in Figure 4.
1) The base layer is built on the private cloud of the Honghu Project Data Service Center, which adopts a multi-level storage mechanism and a data storage management method that combines multistate storage and spatio-temporal correlation. It establishes a basic operation environment that supports distributed, high-reliability, and high-performance capabilities.
2) The data layer is primarily used for receiving and storing near-space exploration data, encompassing a range of fields such as atmospheric science, geophysical science, space science, navigation science, biochemistry, and life science. This includes raw data and analysis data obtained from various near-space exploration experiments of the Honghu Project, platform data, acoustic and visual data of the experimental process, ground-based observation data and space-based exploration data from other sources, and relevant basic data supporting near-space scientific research. It also includes the relevant metadata, tools and software.
3) The supporting layer provides support to the application layer and implements functions for near-space exploration data life cycle management, system management, and security management.The life cycle management of near-space exploration data supports functions such as unified data reception, data quality auditing, data analysis and cataloguing, data storage at different levels, data classification and release management, and data application statistics.
System and security management includes functions such as user management, access control, identity verification, and log management.
4) The application layer mainly provides multiple types of management and sharing services, such as near-space exploration data submission, data retrieval, data browsing and subscription, data download and approval, data statistics, user registration and submission guidance.
5) The access layer enables experts and scholars engaged in research on near-space science to access rich data and high-quality services through a dynamically designed and user-friendly interaction interface.
Data submission and data request are the two most central functions of the near-space exploration data management and sharing service platform. To facilitate the organization of near-space exploration data into the database, the platform adopts a unified metadata template for data collection, which contains identification information, classification information, experimental mission information, payload platform information, time information, spatial information, production information, sharing information, and other contents of the dataset. The template corresponds to the core metadata model and is associated with the near-space exploration data management database. The metadata file is used to quickly catalog and review the collected near-space exploration data, checking the completeness of the required metadata and whether it has been filled in according to the standard. After the metadata review is passed, the entity data is uploaded and checked for whether the files can be opened normally, for the file time attributes and file sizes, and for the plausibility of the values in the files. After this verification is passed, the metadata and entity data are loaded into the database and storage system to realize the orderly organization and management of data.
The near-space exploration data management and sharing service platform adopts a multidimensional feature search model that combines space-time, theme, classification, and mission features. It introduces multiple types of search methods, such as keywords, conditions, and geographic locations, allowing users to obtain a comprehensive overview of the distribution and quantity of near-space exploration experiments. Additionally, users can filter items such as experiment mission, payload platform, space-time information, and subject categories to retrieve near-space exploration datasets and quickly and efficiently view metadata information and
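The two-stage review described above (metadata first, then entity data) might be sketched as follows. This is a minimal illustration under stated assumptions: the subset keys, function names and the dictionary layout of the metadata file are hypothetical, and only the completeness, file-open and file-size checks are shown; checks on file time attributes and on the plausibility of values depend on the data format and are omitted.

```python
import os
from typing import Dict, List

# Hypothetical keys for the eight subsets of the metadata template.
REQUIRED_SUBSETS = [
    "identification", "classification", "experiment_task",
    "instrument_platform", "temporal", "spatial", "production", "sharing",
]


def check_metadata(metadata: Dict[str, dict]) -> List[str]:
    """Return a list of problems; an empty list means the review passes."""
    problems = []
    for subset in REQUIRED_SUBSETS:
        value = metadata.get(subset)
        if not value:
            problems.append(f"missing subset: {subset}")
        elif any(v in (None, "") for v in value.values()):
            problems.append(f"empty field in subset: {subset}")
    return problems


def check_entity_file(path: str, expected_size: int) -> List[str]:
    """Check that an uploaded file exists, has the declared size, and opens."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    problems = []
    if os.path.getsize(path) != expected_size:
        problems.append("size mismatch")
    try:
        with open(path, "rb") as handle:
            handle.read(1)  # reading one byte confirms the file is readable
    except OSError as exc:
        problems.append(f"cannot open file: {exc}")
    return problems
```

Returning a list of problems instead of a single pass/fail flag lets a reviewer see every issue in one pass before the data is loaded into the database.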

CONCLUSION
Based on the analysis of data management and sharing service requirements for near-space exploration data, a core metadata model is proposed that covers data identification, classification information, experimental task, instrumentation and platform, temporal information, spatial information, production information, and sharing information. The mechanism of near-space exploration data management and sharing service is described across the lifecycle of near-space exploration data, including data collection and receiving, data classification, data quality control and data intellectual property protection. Based on this, the overall architecture and functional module design of the near-space exploration data management and sharing service platform are finalized.
As of November 2023, the near-space exploration data management and sharing service platform had been operating steadily for three years and had collected 567 datasets with a total data volume of over 130 TB, providing multi-disciplinary near-space scientific data for domestic and foreign users and adding new data resources to the national scientific data resources. In the future, the platform will continue to strengthen and standardize the management of near-space exploration data, enhance the level of openness and sharing, and better support the continuous development of near-space exploration.

Figure 1: UML of metadata for near-space exploration data

Figure 2: Overview of near-space exploration data

Figure 4: General architecture of the platform of near-space exploration data management and sharing service

Figure 5: Homepage of near-space exploration data management and sharing platform

Figure 7: Data Details of near-space exploration data management and sharing platform

Figure 5, Figure 6 and Figure 7 show pages of the near-space exploration data management and sharing platform.

Table 1 shows the scientific exploration experiments carried out by the Honghu Project since 2018.

Table 1: Information on Honghu Project exploration experiments

Table 2: User Category System

Table 3: Data resource leveling system