Connecting & sharing data
Because a wealth of ’omics data live in diverse repositories around the world, moving the data to a central location for processing is problematic and inefficient. At the same time, scientists need to bring these diverse data sets together to enable the population-level or planetary-scale analyses that drive new knowledge and discovery. iMicrobe delivers a virtual framework that uses the Agave API to connect web-accessible remote microbiome ’omics data with private user data. Any data set available via a web link (FTP or HTTP) is accessible and computable. For example, iMicrobe provides access to the CAMERA (Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis) data collection, including reads, peptides, CDS, contigs, assemblies, annotations, and related projects derived from sample and environmental data for 120+ microbiome projects, representing 1 TB of data. These data are hosted in the CyVerse Data Store (/iplant/shared/imicrobe) and are integrated in iMicrobe under “Community Data”.
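As a rough illustration of the distinction drawn above between remote web links and data hosted in the CyVerse Data Store, the following minimal sketch sorts a data location into one of those groups. It is not part of iMicrobe or the Agave API: the function name `classify_data_source`, the category labels, the inclusion of `https`, and the treatment of any other `/iplant/` path as a Data Store location are all assumptions made for illustration. Only the `/iplant/shared/imicrobe` path and the ftp/http schemes come from the text.

```python
from urllib.parse import urlparse

# Schemes treated as web links. The text names ftp and http;
# including https here is an assumption.
WEB_SCHEMES = {"ftp", "http", "https"}

# Location of the iMicrobe community data in the CyVerse Data Store,
# as given in the text.
COMMUNITY_ROOT = "/iplant/shared/imicrobe"


def classify_data_source(location: str) -> str:
    """Return 'web', 'community', 'datastore', or 'unknown' (hypothetical labels)."""
    parsed = urlparse(location)
    # A web link needs both a recognized scheme and a host name.
    if parsed.scheme in WEB_SCHEMES and parsed.netloc:
        return "web"
    # Exactly the community root, or anything underneath it.
    if location == COMMUNITY_ROOT or location.startswith(COMMUNITY_ROOT + "/"):
        return "community"
    # Assumption: other paths under /iplant/ are Data Store locations.
    if location.startswith("/iplant/"):
        return "datastore"
    return "unknown"


if __name__ == "__main__":
    for loc in ("ftp://example.org/reads.fa", "/iplant/shared/imicrobe/projects"):
        print(loc, "->", classify_data_source(loc))
```

A check like this could run before a data set is registered for analysis, so that unsupported locations are caught early. The example host names and file names are placeholders.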
Users can also organize and share their own data through the integrated CyVerse Data Store. iMicrobe is a “Powered by CyVerse” project that leverages CyVerse cyberinfrastructure (CI), including the authentication system (OAuth2) for secure single sign-on between iMicrobe and all CyVerse services. These services include the CyVerse Data Store for storing, sharing, and distributing large amounts of data and analyses. Users log in to iMicrobe with their CyVerse credentials to access both public and private data sets from their personal Dashboard. From the Dashboard, users can create, share, and update their own projects, samples, and metadata. All data are synchronously backed up by the University of Arizona to provide reliable and secure storage. Initial user allocations are 100 GB but can be expanded by requesting additional storage through CyVerse. All user projects are private by default but can be shared with collaborators in the CyVerse community. Users can also store analyses, derived data sets, and a variety of file types, from text to image data, all within the same infrastructure.