CCA-05: Data Sharing and Infrastructure
Imagine that, during a visit to a new city, your smartwatch seamlessly taps into a shared global data network, providing you with tailored health alerts and recommendations. Drawing on vast pools of shared biometric and environmental data, the watch notifies you of areas with high pollen levels because of your known allergies, suggests restaurants catering to your specific nutritional needs, and even guides you to locations with optimal air quality for your afternoon jog. This innovation, born from the fusion of cross-organization data collaboration and advanced infrastructure, represents a leap in personalized public health management. Such advancements, fueled by global data sharing, not only enhance individual health and wellbeing but also drive economic growth and foster a more informed, health-conscious society. Here, technology becomes a personal health guardian, making proactive health management an integral part of daily life.
The Global Data Coalition estimates that there are over 3,000 disjointed digital repositories for biological data, compromising the ability to find, access, interoperate, and reuse data across these important resources. This situation points to a critical need for a strategic approach to identify and prioritize essential datasets and to ensure they can be integrated for generating new knowledge. We need a unified and scalable data infrastructure that is poised both to leverage available data from multiple sources across different sectors and to support application of AI technology for solving biological problems that will drive the bioeconomy. Addressing such challenges is crucial not only for keeping pace with global data management and innovation, but also for the U.S. to continue its leadership role.
CASA-Bio stakeholders representing the government, industry, and non-profit sectors identified areas of mutual interest where concerted effort among them may lead more quickly to the realization of the envisioned future. These are a few of their ideas. In addition to establishing standards for data and metadata, there is a need for in-depth exploration of how different sectors (for example, industry versus academia) use and process the many types of biological data. Furthermore, enhancing the interoperability and integration of biological data across domains such as agriculture, health, and environmental management is important. Collaborative efforts should focus on training new large language models for the biological sciences to gain new knowledge about biological systems and their functions. These efforts must address the distinct needs of various stakeholders to ensure the effective use of biological data for transformative advancements in the bioeconomy. We emphasize that this list is not comprehensive; we need you to help us think deeper within this sub-theme!
As a member of the R&D community, you too are a CASA-Bio stakeholder, and your insight on R&D projects that undergird this sub-theme and lead to solutions is critical. Your ideas will matter! Your individual project ideas and those developed as part of the collaborative Town Hall process will be combined to produce an aggregate view. This view will help us understand not only the interests of the R&D community, but also what its members are willing to do to advance the bioeconomy. Topics among the R&D project ideas we receive will help government, industry, and non-profit stakeholders see the potential of the U.S. R&D community to address critical future needs and help define topics for future exploration through workshops and roadmapping.