Tuesday, October 9, 2012

QIIME scripts

http://qiime.org/scripts/


We’re very excited to announce the 1.5.0 release of QIIME, which is available for download here. As always, you can find the latest QIIME AMI ID here, and we’ll be releasing the new VirtualBox images in one week. This release is packed with way too many exciting new features to mention all of them here, but here are some of the ones we’re most excited about.
* The biggest change in this release of QIIME is the switch to the BIOM format for representing OTU tables on disk and the biom-format objects for representing OTU tables in memory. You can find a discussion of the motivations for the switch here, but briefly it will support interoperability of related tools (e.g., QIIME, MG-RAST, mothur, and VAMPS), it provides a more efficient representation of sparse matrix data than tab-separated text, and it allows for storage of OTU counts, OTU metadata (e.g., taxonomy), and sample metadata (e.g., environmental parameters) in a single file. A manuscript describing the BIOM format is currently in press at GigaScience. You can find information about converting between BIOM-formatted and “classic”-formatted OTU tables here.
* Our AWS AMI now supports use with StarCluster and the IPython Notebook. StarCluster provides an extremely convenient way to boot virtual clusters on the Amazon Cloud, and we think it will be key to making very large analyses (e.g., based on several Illumina runs) accessible to groups without large compute clusters. Using StarCluster you can now easily run your QIIME analyses across multiple AWS instances: for example, you can boot 20 eight-processor instances to create a virtual cluster with 160 processors. The IPython Notebook provides a web-based interface for developing API and/or command line based workflows. These are easy to share with others as .ipynb files, or to publish with your journal articles. Using the IPython Notebook with the QIIME AWS images enables truly reproducible computation. You can find information on how to use these new features here.
* We’ve added a number of new statistical approaches via the compare_distance_matrices.py and compare_categories.py scripts. These include Adonis, Anosim, BEST, Moran’s I, MRPP, PERMANOVA, PERMDISP, RDA, Partial Mantel, and Mantel Correlogram. Two new tutorials illustrate how and when to use these methods – you can find these here and here. This code was all developed for an undergraduate Computer Science capstone project at Northern Arizona University – their project website is here.
* We’ve added support for the RTAX method for performing taxonomy assignment in assign_taxonomy.py. RTAX is specifically designed for assigning taxonomy to paired-end reads, but additionally works on single-end reads. You can find a paper on RTAX here, and a tutorial describing how to use this new code here.
* Along with the switch to BIOM format for OTU tables, we’ve updated and cleaned up the interfaces, usage examples, and help text associated with many of the scripts in QIIME. Notable examples are the replacement of filter_otu_table.py with filter_otus_from_otu_table.py, and the replacement of filter_by_metadata.py with filter_samples_from_otu_table.py.
* Support for inserting sequences into trees has been added via the new insert_seqs_into_tree.py script. This wraps the pplacer, RAxML, and ParsInsert applications.
* We’ve added pick_subsampled_reference_otus_through_otu_tables.py, a more efficient open-reference OTU picking workflow script for processing very large Illumina (or other) data sets. This is being used to process the Earth Microbiome Project data, so it is designed to scale to tens of HiSeq runs. A new tutorial has been added that describes this process.
* The check_id_map.py code was completely refactored. It now creates HTML output that displays the locations of errors and warnings in the mapping file, which should provide a very convenient way to detect errors in your metadata mapping files.
* Added the start_parallel_jobs_sc.py script to support parallel jobs on SGE queueing systems, which is the default queueing system on StarCluster. This has only been tested on StarCluster at this point (hence ‘sc’ in the name), but we expect that it will work on other systems using SGE.
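To give a feel for why a sparse representation is more compact than tab-separated text for OTU tables, here is a minimal Python sketch. It is not the BIOM format itself and does not use the biom-format library: the function names and the tiny table are hypothetical, and the (row, column, count) triples are only a simplified illustration of storing nonzero cells.

```python
def dense_to_sparse(table):
    """Return (row, col, count) triples for the nonzero cells of a dense table."""
    return [
        (i, j, count)
        for i, row in enumerate(table)
        for j, count in enumerate(row)
        if count != 0
    ]


def sparse_to_dense(triples, n_rows, n_cols):
    """Rebuild the dense table; cells not listed in the triples are zero."""
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, count in triples:
        dense[i][j] = count
    return dense


# Hypothetical OTU table: 4 OTUs (rows) x 5 samples (columns), mostly zeros,
# as is typical when most OTUs are absent from most samples.
otu_table = [
    [0, 0, 12, 0, 0],
    [3, 0, 0, 0, 0],
    [0, 0, 0, 0, 7],
    [0, 1, 0, 0, 0],
]

triples = dense_to_sparse(otu_table)
n_cells = len(otu_table) * len(otu_table[0])
print(len(triples), "nonzero cells stored instead of", n_cells)
```

The savings grow with the table: real OTU tables can have thousands of OTUs and samples with only a small fraction of cells nonzero, so storing only those cells (plus the table's shape) avoids writing out long runs of zeros.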
QIIME releases are massive collaborative efforts. Thanks to all of the developers for their hard work in making this release happen, and to our users for the suggestions, support, feature requests and bug reports. A lot of the QIIME developers will be at ISME this summer, so come find us and say hello!
