Basic (--basic)
Prints summary statistics for the file:
- TotalReads - # of reads that are in the file
- MappedReads - # of reads marked mapped in the flag
- PairedReads - # of reads marked paired in the flag
- ProperPair - # of reads marked paired AND proper paired in the flag
- DuplicateReads - # of reads marked duplicate in the flag
- QCFailureReads - # of reads marked QC failure in the flag
- MappingRate(%) - # of reads marked mapped in the flag / TotalReads
- PairedReads(%) - # of reads marked paired in the flag / TotalReads
- ProperPair(%) - # of reads marked paired AND proper paired in the flag / TotalReads
- DupRate(%) - # of reads marked duplicate in the flag / TotalReads
- QCFailRate(%) - # of reads marked QC failure in the flag / TotalReads
- TotalBases - # of bases in all reads
- BasesInMappedReads - # of bases in reads marked mapped in the flag
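
The field definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the `basic_stats` function and its input shape (a list of `(flag, read_length)` tuples) are hypothetical, and it assumes the `(%)` fields are the stated ratio multiplied by 100. The flag bit values are the standard SAM ones.

```python
# Standard SAM flag bits.
PAIRED, PROPER_PAIR, UNMAPPED, QC_FAIL, DUPLICATE = 0x1, 0x2, 0x4, 0x200, 0x400

def basic_stats(reads):
    """reads: iterable of (flag, read_length) tuples (hypothetical input shape)."""
    s = dict.fromkeys(
        ["TotalReads", "MappedReads", "PairedReads", "ProperPair",
         "DuplicateReads", "QCFailureReads", "TotalBases", "BasesInMappedReads"], 0)
    for flag, length in reads:
        s["TotalReads"] += 1
        s["TotalBases"] += length
        if not flag & UNMAPPED:
            s["MappedReads"] += 1
            s["BasesInMappedReads"] += length
        if flag & PAIRED:
            s["PairedReads"] += 1
            # ProperPair requires paired AND proper paired.
            if flag & PROPER_PAIR:
                s["ProperPair"] += 1
        if flag & DUPLICATE:
            s["DuplicateReads"] += 1
        if flag & QC_FAIL:
            s["QCFailureReads"] += 1

    total = s["TotalReads"]
    def pct(n):
        # Assumption: rates are reported as percentages of TotalReads.
        return 100.0 * n / total if total else 0.0

    s["MappingRate(%)"] = pct(s["MappedReads"])
    s["PairedReads(%)"] = pct(s["PairedReads"])
    s["ProperPair(%)"] = pct(s["ProperPair"])
    s["DupRate(%)"] = pct(s["DuplicateReads"])
    s["QCFailRate(%)"] = pct(s["QCFailureReads"])
    return s

stats = basic_stats([(99, 100), (147, 100), (4, 50), (1024, 100)])
```

Note that every rate uses TotalReads as the denominator, so ProperPair(%) is a fraction of all reads, not of paired reads.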
Qual/Phred (--phred and --qual)
Prints a count of the number of times each quality value appears in the file to stderr.
- phred - Displays quality as phred integers [0-93]
- qual - Displays quality as non-phred integers (phred + 33) [33-126]
By default, these counts include all qualities in the BAM file.
To exclude unmapped reads and soft clips, use --excludeFlags 4.
To only include records that overlap a set of regions, use --regionList and specify a bed file with the regions. If a read overlaps the region, all qualities will be counted even if those bases do not fall in the region. If you only want to count qualities that fall within the region, also specify --withinRegion. Without excluding unmapped reads, it will include soft clips that overlap the region.
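
The relationship between the two displays is just an offset of 33. The following is a rough sketch of the counting, assuming qualities are given as SAM-style ASCII strings; the `count_qualities` function and its `as_phred` parameter are hypothetical names used for illustration and do not model the flag or region filtering.

```python
from collections import Counter

def count_qualities(quality_strings, as_phred=True):
    """Count how often each quality value appears.

    as_phred=True  reports phred integers (0-93), like --phred.
    as_phred=False reports phred + 33 (33-126), like --qual.
    """
    counts = Counter()
    for qual in quality_strings:
        for ch in qual:
            value = ord(ch)  # ASCII code is phred + 33
            counts[value - 33 if as_phred else value] += 1
    return counts

phred_counts = count_qualities(["II!", "I#"])
qual_counts = count_qualities(["II!", "I#"], as_phred=False)
```

Here `I` is ASCII 73 (phred 40), `!` is ASCII 33 (phred 0), and `#` is ASCII 35 (phred 2).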
BaseQC (--pBaseQC, --cBaseQC, and --baseSum)
The pBaseQC and cBaseQC options generate per-base statistics. Only one of these two options can be specified. Each writes the statistics generated for each position to the file specified after the option. They use the same logic for calculating statistics, but pBaseQC writes the statistics as percentages, and cBaseQC writes them as counts. The order of the statistics is also different.
The baseSum option can be used with either pBaseQC or cBaseQC, or on its own. baseSum generates a summary of the per-position statistics and writes it to stderr. It calculates the per-position base statistics even if they will not be written anywhere (that is, when neither pBaseQC nor cBaseQC is specified).
All three options use the same logic for calculating the statistics:
- A read spans a position if the read starts at or before the position, ends at or after the position and the position is not a clip. CIGAR operations allowed for the position are M/X/=/D/N. If the CIGAR is '*', only numbers for the specified reference position are incremented.
- Currently there is no special logic to exclude positions/reads where the reference base is 'N' or the read base is 'N'.
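
The spanning rule can be sketched as a walk over the CIGAR string. This is an illustrative approximation, not the tool's code: the `spanned_positions` function is a hypothetical name, it takes the read's reference start position and CIGAR as plain values, and it treats a CIGAR of `*` as covering only the read's start position.

```python
import re

# Operations that count toward a reference position, per the rule above.
REF_OPS = set("MX=DN")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def spanned_positions(start, cigar):
    """Return the reference positions a read spans.

    start: reference position of the first aligned base.
    cigar: CIGAR string, or '*' if unavailable.
    """
    if cigar == "*":
        # Assumption: only the specified reference position is counted.
        return [start]
    positions = []
    ref_pos = start
    for length, op in CIGAR_RE.findall(cigar):
        if op in REF_OPS:
            n = int(length)
            positions.extend(range(ref_pos, ref_pos + n))
            ref_pos += n
        # S/H (clips), I (insertions) and P (padding) do not advance
        # along the reference, so they contribute no positions.
    return positions

covered = spanned_positions(100, "2S3M1I2D2M")
```

In this example the two soft-clipped bases and the inserted base add nothing, while the deletion still counts toward positions 103 and 104.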