Task Area 3: Infrastructure for Data and Software Reuse
Summary
Discussion about FAIR principles typically focusses on the data. However, FAIR software and data analysis tools to make sense of that data are also a critical part of the research data ecosystem. This need is particularly acute given the increasing complexity of scientific data analysis for X-ray and neutron data. Task Area 3 is concerned with making scientific data analysis tools and software FAIR alongside the data itself. DAPHNE achieves this by engaging with selected power user groups, the scientific ‘influencers’ of the photon and neutron community, to develop findable and repeatable data analysis tools and foster best practices in open research software development. We focus on particular use cases of high benefit to the community of researchers.
Challenges and Goals of TA3
Currently, there is a general absence of curated and managed software, developed by leading research groups, for working remotely with large data sets measured at X-ray and neutron facilities. This prevents researchers other than the users who collected the data from repeating the data analysis pipelines and re-using the code in their own research.
The goals of TA3 therefore focus on:
- Strengthening and sharing user tools for data analysis
- Remotely available user software for re-use by all on facility infrastructure
- Interfacing research data to machine learning methods
We aim to create, curate and foster analysis software that can be deployed on ‘cloud-like’ services so that ‘ordinary’ users can repeat and benefit from the work of power users, and to make the analysis of ‘big data’ technically simple, reproducible and sustainable, including access to machine learning strategies. We acknowledge that professional-level software curation is essential to ensuring the longevity of DAPHNE services as a part of NFDI beyond the project duration.
The success of TA3 can be measured by two figures of merit: the acceptance in the user community of software solutions developed with the support of DAPHNE, and the number of beamlines using DAPHNE specifications and software.
Experience and Expertise
DAPHNE is backed by software development expertise from the scientific computing groups at the university partners and especially at the facilities, who will guide the progress of software development within DAPHNE.
The experience and procedures from European projects such as ExPaNDS and PaNOSC include code peer review concepts, continuous integration, deployment and testing cycles, time-based releases, and the formation of teams responsible for programming and release processes. DAPHNE will adopt these strategies.
Regarding software development methodology, we anticipate working closely in small teams with scientists from universities and research institutions, and with scientific staff at facility instruments, as end users, to develop solutions that match their workflows and needs. We envision rapid prototype turnaround followed by thorough testing before facility-wide deployment.
Focus areas
During the preparation phase of DAPHNE, the following areas were identified as potential fields of action for TA3, with a focus on particular use cases of high benefit to the community of researchers:
- Tomography techniques including full field, ptychographic and fluorescence tomography
- EXAFS and related spectroscopies
- Small- and wide-angle scattering (SAXS/WAXS/SANS), including grazing incidence geometry
- Serial crystallography
- Interfacing research data to machine learning methods
- Multi-dimensional data-treatment for neutron scattering
Community participation
DAPHNE TA3 relies largely on the work of power users, and encourages and supports these facility user groups in making their data analysis pipelines reproducible, re-usable and available to other user groups. We therefore have a key interest in identifying relevant user groups and software in order to assess potential integration and collaboration with DAPHNE. To suggest analysis software that should be considered for inclusion in the DAPHNE software ecosystem, or to express interest in collaboration with DAPHNE (not limited to the focus areas listed above), the following form is available: Link