Package picard.util
Class IntervalListTools
- java.lang.Object
  - picard.cmdline.CommandLineProgram
    - picard.util.IntervalListTools
@DocumentedFeature public class IntervalListTools extends CommandLineProgram
Performs various IntervalList manipulations.
Summary
This tool offers multiple interval list file manipulation capabilities, including sorting, merging, subtracting, padding, and other set-theoretic operations. The default action is to merge and sort the intervals provided in the INPUTs. Other options, e.g. interval subtraction, are controlled by the arguments.
Both IntervalList and VCF files are accepted as input. An IntervalList should be denoted with the extension ".interval_list", while a VCF must have one of the extensions ".vcf", ".vcf.gz", or ".bcf". When a VCF file is used as input, each variant is translated into an interval, using its reference allele or the END INFO annotation (if present) to determine the extent of the interval.
IntervalListTools can also "scatter" the resulting interval list into many interval files. This can be useful for creating multiple interval lists for scattering an analysis over.
Details
The IntervalList file format is designed to help users avoid mixing references when supplying intervals and other genomic data to a single tool. A SAM-style header must be present at the top of the file. After the header, the file contains records, one per line in text format, with the following tab-separated values:
- Sequence name (SN)
- Start position (1-based)
- End position (1-based, end inclusive)
- Strand (either + or -)
- Interval name (ideally unique names for intervals)
For example:
@HD	VN:1.0
@SQ	SN:chr1	LN:501
@SQ	SN:chr2	LN:401
chr1	1	100	+	starts at the first base of the contig and covers 100 bases
chr2	100	100	+	interval with exactly one base
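The record layout described above can be illustrated with a small parser. This is only a sketch: `IntervalRecord` and its `parse` method are hypothetical names invented for this example and are not part of Picard or htsjdk. It assumes a record line with exactly five tab-separated fields and does not handle header lines.

```java
/** Hypothetical holder for one interval_list record; not a Picard class. */
final class IntervalRecord {
    final String sequence;
    final int start;               // 1-based
    final int end;                 // 1-based, inclusive
    final boolean negativeStrand;
    final String name;

    IntervalRecord(String sequence, int start, int end, boolean negativeStrand, String name) {
        this.sequence = sequence;
        this.start = start;
        this.end = end;
        this.negativeStrand = negativeStrand;
        this.name = name;
    }

    /** Number of bases covered; both ends are inclusive. */
    int length() {
        return end - start + 1;
    }

    /** Parses one tab-separated record line (not a header line starting with '@'). */
    static IntervalRecord parse(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException("Expected 5 tab-separated fields, got " + fields.length);
        }
        if (!fields[3].equals("+") && !fields[3].equals("-")) {
            throw new IllegalArgumentException("Strand must be + or -, got " + fields[3]);
        }
        return new IntervalRecord(
                fields[0],
                Integer.parseInt(fields[1]),
                Integer.parseInt(fields[2]),
                fields[3].equals("-"),
                fields[4]);
    }

    public static void main(String[] args) {
        IntervalRecord r = parse("chr2\t100\t100\t+\tinterval with exactly one base");
        System.out.println(r.sequence + ":" + r.start + "-" + r.end + " length=" + r.length());
    }
}
```

Because both coordinates are 1-based and inclusive, a record whose start equals its end covers exactly one base.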
Usage examples
1. Combine the intervals from two interval lists:
java -jar picard.jar IntervalListTools \
      ACTION=CONCAT \
      I=input.interval_list \
      I=input_2.interval_list \
      O=new.interval_list
2. Combine the intervals from two interval lists, sorting the resulting list and merging overlapping and abutting intervals:
java -jar picard.jar IntervalListTools \
      ACTION=CONCAT \
      SORT=true \
      UNIQUE=true \
      I=input.interval_list \
      I=input_2.interval_list \
      O=new.interval_list
3. Subtract the intervals in SECOND_INPUT from those in INPUT:
java -jar picard.jar IntervalListTools \
      ACTION=SUBTRACT \
      I=input.interval_list \
      SI=input_2.interval_list \
      O=new.interval_list
4. Find bases that are in either input1.interval_list or input2.interval_list, and also in input3.interval_list:
java -jar picard.jar IntervalListTools \
      ACTION=INTERSECT \
      I=input1.interval_list \
      I=input2.interval_list \
      SI=input3.interval_list \
      O=new.interval_list
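To make "merging overlapping and abutting intervals" from example 2 concrete, here is a minimal sketch of that operation for intervals on a single contig. `MergeSketch` is a hypothetical class written for this illustration, not Picard's implementation; intervals are plain `{start, end}` pairs with 1-based, inclusive coordinates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class MergeSketch {
    /** Sorts {start, end} pairs by start and merges those that overlap or abut. */
    static List<int[]> mergeOverlappingAndAbutting(List<int[]> intervals) {
        List<int[]> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparingInt((int[] iv) -> iv[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] iv : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            // "Abutting" means the next interval starts right after the previous one ends.
            if (last != null && iv[0] <= last[1] + 1) {
                last[1] = Math.max(last[1], iv[1]);
            } else {
                merged.add(new int[] {iv[0], iv[1]}); // copy so inputs stay untouched
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> input = Arrays.asList(
                new int[] {140, 200}, new int[] {1, 100}, new int[] {300, 400}, new int[] {101, 150});
        for (int[] iv : mergeOverlappingAndAbutting(input)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```

In this input, 1-100 and 101-150 abut and 101-150 overlaps 140-200, so the three collapse into 1-200, while 300-400 stays separate.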
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class
static class         IntervalListTools.Action
-
Field Summary
Fields
Modifier and Type                       Field
IntervalListTools.Action                ACTION
int                                     BREAK_BANDS_AT_MULTIPLES_OF
List<String>                            COMMENT
boolean                                 INCLUDE_FILTERED
List<File>                              INPUT
boolean                                 INVERT
File                                    OUTPUT
picard.util.IntervalListTools.Output    OUTPUT_VALUE
int                                     PADDING
Integer                                 SCATTER_CONTENT
int                                     SCATTER_COUNT
List<File>                              SECOND_INPUT
boolean                                 SORT
IntervalListScatterMode                 SUBDIVISION_MODE
boolean                                 UNIQUE
-
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
-
-
Constructor Summary
Constructors
Constructor
IntervalListTools()
-
Method Summary
Modifier and Type     Method                           Description
protected String[]    customCommandLineValidation()    Put any custom command-line validation in an override of this method.
protected int         doWork()                         Do the work after command line has been parsed.
Methods inherited from class picard.cmdline.CommandLineProgram
getCommandLine, getCommandLineParser, getDefaultHeaders, getFaqLink, getMetricsFile, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
-
-
-
-
Field Detail
-
INPUT
@Argument(shortName="I", doc="One or more interval lists. If multiple interval lists are provided the output is the result of merging the inputs. Supported formats are interval_list and VCF.", minElements=1) public List<File> INPUT
-
OUTPUT
@Argument(doc="The output interval list file to write (if SCATTER_COUNT == 1) or the directory into which to write the scattered interval sub-directories (if SCATTER_COUNT > 1).", shortName="O", optional=true) public File OUTPUT
-
PADDING
@Argument(doc="The amount to pad each end of the intervals by before other operations are undertaken. Negative numbers are allowed and indicate intervals should be shrunk. Resulting intervals < 0 bases long will be removed. Padding is applied to the interval lists (both INPUT and SECOND_INPUT, if provided) <b> before </b> the ACTION is performed.", optional=true) public int PADDING
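The padding behaviour described for PADDING can be sketched as follows. `PaddingSketch` is a hypothetical class for illustration only. It assumes that an interval is dropped once it no longer contains any base (start greater than end), and it does not clamp coordinates to the contig boundaries; the tool itself may treat these edge cases differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PaddingSketch {
    /**
     * Pads each {start, end} pair (1-based, inclusive) by `padding` bases on both ends.
     * A negative padding shrinks the interval.
     */
    static List<int[]> pad(List<int[]> intervals, int padding) {
        List<int[]> padded = new ArrayList<>();
        for (int[] iv : intervals) {
            int start = iv[0] - padding;
            int end = iv[1] + padding;
            // Assumption: keep only intervals that still cover at least one base.
            if (start <= end) {
                padded.add(new int[] {start, end});
            }
        }
        return padded;
    }

    public static void main(String[] args) {
        List<int[]> input = Arrays.asList(new int[] {100, 200}, new int[] {300, 305});
        for (int[] iv : pad(input, -5)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```

With a padding of -5, the interval 100-200 shrinks to 105-195, while 300-305 would become 305-300 and is therefore removed.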
-
UNIQUE
@Argument(doc="If true, merge overlapping and adjacent intervals to create a list of unique intervals. Implies SORT=true.") public boolean UNIQUE
-
SORT
@Argument(doc="If true, sort the resulting interval list by coordinate.") public boolean SORT
-
ACTION
@Argument(doc="Action to take on inputs.") public IntervalListTools.Action ACTION
-
SECOND_INPUT
@Argument(shortName="SI", doc="Second set of intervals for SUBTRACT and DIFFERENCE operations.", optional=true) public List<File> SECOND_INPUT
-
COMMENT
@Argument(doc="One or more lines of comment to add to the header of the output file (as @CO lines in the SAM header).", optional=true) public List<String> COMMENT
-
SCATTER_COUNT
@Argument(doc="The number of files into which to scatter the resulting list by locus; in some situations, fewer intervals may be emitted. ") public int SCATTER_COUNT
-
SCATTER_CONTENT
@Argument(doc="When scattering with this argument, each of the resultant files will (ideally) have this amount of \'content\', which means either base-counts or interval-counts depending on SUBDIVISION_MODE. When provided, overrides SCATTER_COUNT", optional=true) public Integer SCATTER_CONTENT
-
INCLUDE_FILTERED
@Argument(doc="Whether to include filtered variants in the vcf when generating an interval list from vcf.", optional=true) public boolean INCLUDE_FILTERED
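INCLUDE_FILTERED only applies when intervals are derived from a VCF. As a rough illustration of that translation (the Summary states that the extent comes from the reference allele or the END INFO annotation, if present), the sketch below uses a hypothetical `VariantIntervalSketch` class with plain values in place of a real VCF record. That END takes precedence over the reference allele length when both are available is an assumption of this sketch.

```java
final class VariantIntervalSketch {
    /** A filtered variant contributes an interval only when includeFiltered is true. */
    static boolean isUsed(boolean variantIsFiltered, boolean includeFiltered) {
        return !variantIsFiltered || includeFiltered;
    }

    /**
     * Returns "contig:start-end" (1-based, inclusive) for one variant.
     * endInfo is the value of the END INFO annotation, or null when the record has none.
     */
    static String toInterval(String contig, int pos, String refAllele, Integer endInfo) {
        // Without END, the interval spans the bases of the reference allele.
        int end = (endInfo != null) ? endInfo : pos + refAllele.length() - 1;
        return contig + ":" + pos + "-" + end;
    }

    public static void main(String[] args) {
        System.out.println(toInterval("chr1", 100, "A", null));     // single-base reference allele
        System.out.println(toInterval("chr1", 200, "ACGT", null));  // four-base reference allele
        System.out.println(toInterval("chr1", 1000, "N", 1500));    // extent taken from END
    }
}
```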
-
BREAK_BANDS_AT_MULTIPLES_OF
@Argument(shortName="BRK", doc="If set to a positive value will create a new interval list with the original intervals broken up at integer multiples of this value. Set to 0 to NOT break up intervals.", optional=true) public int BREAK_BANDS_AT_MULTIPLES_OF
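One way to picture breaking intervals "at integer multiples of this value" is shown below. `BreakBandsSketch` is a hypothetical class, and the exact placement of the break points is an assumption: with 1-based, inclusive coordinates, each piece is taken to end on a multiple of the band size, so a value of 100 yields bands 1-100, 101-200, and so on.

```java
import java.util.ArrayList;
import java.util.List;

final class BreakBandsSketch {
    /** Splits the 1-based, inclusive interval [start, end] at multiples of bandMultiple. */
    static List<int[]> breakAtMultiples(int start, int end, int bandMultiple) {
        List<int[]> pieces = new ArrayList<>();
        if (bandMultiple <= 0) {
            // A value of 0 means the interval is not broken up.
            pieces.add(new int[] {start, end});
            return pieces;
        }
        int pieceStart = start;
        while (pieceStart <= end) {
            // Last base of the band that contains pieceStart.
            int bandEnd = ((pieceStart - 1) / bandMultiple + 1) * bandMultiple;
            int pieceEnd = Math.min(end, bandEnd);
            pieces.add(new int[] {pieceStart, pieceEnd});
            pieceStart = pieceEnd + 1;
        }
        return pieces;
    }

    public static void main(String[] args) {
        for (int[] piece : breakAtMultiples(50, 250, 100)) {
            System.out.println(piece[0] + "-" + piece[1]);
        }
    }
}
```

Under that assumption, the interval 50-250 with a value of 100 is split into 50-100, 101-200, and 201-250.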
-
SUBDIVISION_MODE
@Argument(shortName="M", doc="The mode used to scatter the interval list.") public IntervalListScatterMode SUBDIVISION_MODE
-
INVERT
@Argument(doc="Produce the inverse list of intervals, that is, the regions in the genome that are <br>not</br> covered by any of the input intervals. Will merge abutting intervals first. Output will be sorted.", optional=true) public boolean INVERT
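The inversion described for INVERT can be sketched for a single contig as follows. `InvertSketch` is a hypothetical class for illustration; it assumes the input intervals are already sorted by start and merged, and that the contig length is known (in an interval list it would come from the LN value of the contig's @SQ header line).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InvertSketch {
    /**
     * Returns the regions of a contig of the given length that are not covered by
     * any of the sorted {start, end} pairs (1-based, inclusive).
     */
    static List<int[]> invert(List<int[]> sortedIntervals, int contigLength) {
        List<int[]> gaps = new ArrayList<>();
        int nextUncovered = 1; // first base not yet known to be covered
        for (int[] iv : sortedIntervals) {
            if (iv[0] > nextUncovered) {
                gaps.add(new int[] {nextUncovered, iv[0] - 1});
            }
            nextUncovered = Math.max(nextUncovered, iv[1] + 1);
        }
        if (nextUncovered <= contigLength) {
            gaps.add(new int[] {nextUncovered, contigLength});
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<int[]> covered = Arrays.asList(new int[] {10, 20}, new int[] {50, 60});
        for (int[] gap : invert(covered, 100)) {
            System.out.println(gap[0] + "-" + gap[1]);
        }
    }
}
```

For a contig of length 100 covered at 10-20 and 50-60, the inverse consists of 1-9, 21-49, and 61-100.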
-
OUTPUT_VALUE
@Argument(doc="What value (if anything) to output to stdout (for scripting)") public picard.util.IntervalListTools.Output OUTPUT_VALUE
-
-
Method Detail
-
doWork
protected int doWork()
Description copied from class: CommandLineProgram
Do the work after command line has been parsed. RuntimeExceptions may be thrown by this method and are reported appropriately.
- Specified by: doWork in class CommandLineProgram
- Returns: program exit status.
-
customCommandLineValidation
protected String[] customCommandLineValidation()
Description copied from class: CommandLineProgram
Put any custom command-line validation in an override of this method. clp is initialized at this point and can be used to print usage and access argv. Any options set by the command-line parser can be validated.
- Overrides: customCommandLineValidation in class CommandLineProgram
- Returns: null if the command line is valid. If the command line is invalid, returns an array of error messages to be written to the appropriate place.
-
-