union
union is most useful when the input sequences are specified in a List file. A List file (a file of sequence names) can hold any set of sequences, in files or database entries, each given as a normal EMBOSS USA (Uniform Sequence Address). A USA can include the specification of a sub-region of the sequence, e.g. 'em:hsfau[20:55]'. Specifying several such sub-regions of one or more sequences lets you enter disjoint regions to be joined.
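As a rough illustration of how such a List file could be generated, the sketch below formats start/end pairs as sub-region USAs, one per line. The helper name `make_list_lines` is hypothetical and not part of EMBOSS; the entry name and coordinates are the ones used in this page's example, and the `[start:end]` form is assumed from those entries.

```python
# Hypothetical helper (not part of EMBOSS): format sub-regions as USA strings.
def make_list_lines(usa, regions):
    """Return one 'usa[start:end]' line per (start, end) pair (1-based, inclusive)."""
    lines = []
    for start, end in regions:
        if start < 1 or end < start:
            raise ValueError(f"invalid region: {start}..{end}")
        lines.append(f"{usa}[{start}:{end}]")
    return lines


# Regions taken from the example List file on this page.
regions = [(782, 856), (951, 1095), (1557, 1612), (1787, 1912)]
lines = make_list_lines("tembl-id:X65921", regions)

# Text that could be saved as a List file such as 'cds.list'.
list_text = "\n".join(lines) + "\n"
print(list_text, end="")
```

The order of the lines matters, since the regions are joined in the order in which they are listed.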
The file 'cds.list' contains a list of the regions making up the coding sequence of 'embl:x65921':
    % union
    Reads sequence fragments and builds one sequence
    Input (gapped) sequence(s): @cds.list
    output sequence [x65921.fasta]:
    Standard (Mandatory) qualifiers:
      [-sequence]   seqall   (Gapped) sequence(s) filename and optional format, or reference (input USA)
      [-outseq]     seqout   [
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Standard (Mandatory) qualifiers | | | |
[-sequence] (Parameter 1) | (Gapped) sequence(s) filename and optional format, or reference (input USA) | Readable sequence(s) | Required |
[-outseq] (Parameter 2) | Sequence filename and optional format (output USA) | Writeable sequence | <*>.format |
Additional (Optional) qualifiers | | | |
-overlapfile | Sequence overlaps output file (optional) | Output file | <*>.union |
Advanced (Unprompted) qualifiers | | | |
-feature | Use feature information | Boolean value Yes/No | No |
-source | Create source features | Boolean value Yes/No | No |
-findoverlap | Look for overlaps when joining | Boolean value Yes/No | No |
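To show how the qualifiers in the table might be combined into a single non-interactive command line, here is a minimal sketch that only builds the argument list and does not run anything. The function `build_union_command` and the overlap file name are hypothetical; the qualifier names come from the table above, and how they behave in a particular EMBOSS release should be checked against that release's own help output.

```python
# Hypothetical helper: assemble a union command line from the qualifiers
# listed in the table. The command is only built here, never executed.
def build_union_command(list_file, outseq, findoverlap=False, overlapfile=None):
    # '@' marks the argument as a List file rather than a single sequence.
    cmd = ["union", "-sequence", f"@{list_file}", "-outseq", outseq]
    if findoverlap:
        # Advanced boolean qualifier; it defaults to No when omitted.
        cmd.append("-findoverlap")
    if overlapfile is not None:
        # Optional output file for sequence overlaps.
        cmd += ["-overlapfile", overlapfile]
    return cmd


cmd = build_union_command("cds.list", "x65921.fasta")
print(" ".join(cmd))  # prints: union -sequence @cds.list -outseq x65921.fasta

# Same run with overlap detection; 'overlaps.union' is an assumed file name.
cmd_overlap = build_union_command(
    "cds.list", "x65921.fasta", findoverlap=True, overlapfile="overlaps.union"
)
```

Keeping the command as a list of arguments makes it straightforward to hand to `subprocess.run` later without shell quoting problems.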
The input file 'cds.list' contains:

    tembl-id:X65921[782:856]
    tembl-id:X65921[951:1095]
    tembl-id:X65921[1557:1612]
    tembl-id:X65921[1787:1912]
You may find the program yank useful for creating List files.
The output file 'x65921.fasta' contains:

    >X65921 X65921.1 H.sapiens fau 1 gene
    atgcagctctttgtccgcgcccaggagctacacaccttcgaggtgaccggccaggaaacg
    gtcgcccagatcaaggctcatgtagcctcactggagggcattgccccggaagatcaagtc
    gtgctcctggcaggcgcgcccctggaggatgaggccactctgggccagtgcggggtggag
    gccctgactaccctggaagtagcaggccgcatgcttggaggtaaagtccatggttccctg
    gcccgtgctggaaaagtgagaggtcagactcctaaggtggccaaacaggagaagaagaag
    aagaagacaggtcgggctaagcggcggatgcagtacaaccggcgctttgtcaacgttgtg
    cccacctttggcaagaagaagggccccaatgccaactcttaa
The result is a normal sequence file containing a single sequence resulting from the concatenation of the input sequences.
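The concatenation itself can be pictured with a short sketch. This is not EMBOSS's implementation; it assumes 1-based, inclusive coordinates, and `join_regions` is a hypothetical name. It also checks the length arithmetic for the four regions in the example List file, which should match the 402 bases of the output sequence shown above.

```python
# Illustrative only: join 1-based, inclusive sub-regions of a sequence in order.
def join_regions(sequence, regions):
    parts = []
    for start, end in regions:
        if start < 1 or end > len(sequence) or end < start:
            raise ValueError(f"region {start}..{end} is outside the sequence")
        # Convert 1-based inclusive coordinates to a Python slice.
        parts.append(sequence[start - 1:end])
    return "".join(parts)


# Toy sequence, not real data: bases 1-3 joined to bases 7-9.
toy = "ACGTACGTAC"
joined = join_regions(toy, [(1, 3), (7, 9)])
print(joined)  # prints: ACGGTA

# Length of the joined example regions: 75 + 145 + 56 + 126.
cds_regions = [(782, 856), (951, 1095), (1557, 1612), (1787, 1912)]
cds_length = sum(end - start + 1 for start, end in cds_regions)
print(cds_length)  # prints: 402
```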
Program name | Description |
---|---|
biosed | Replace or delete sequence sections |
codcopy | Reads and writes a codon usage table |
cutseq | Removes a specified section from a sequence |
degapseq | Removes gap characters from sequences |
descseq | Alter the name or description of a sequence |
entret | Reads and writes (returns) flatfile entries |
extractalign | Extract regions from a sequence alignment |
extractfeat | Extract features from a sequence |
extractseq | Extract regions from a sequence |
listor | Write a list file of the logical OR of two sets of sequences |
makenucseq | Creates random nucleotide sequences |
makeprotseq | Creates random protein sequences |
maskfeat | Mask off features of a sequence |
maskseq | Mask off regions of a sequence |
newseq | Type in a short new sequence |
noreturn | Removes carriage return from ASCII files |
notseq | Exclude a set of sequences and write out the remaining ones |
nthseq | Writes one sequence from a multiple set of sequences |
pasteseq | Insert one sequence into another |
revseq | Reverse and complement a sequence |
seqret | Reads and writes (returns) sequences |
seqretsplit | Reads and writes (returns) sequences in individual files |
skipseq | Reads and writes (returns) sequences, skipping first few |
splitter | Split a sequence into (overlapping) smaller sequences |
trimest | Trim poly-A tails off EST sequences |
trimseq | Trim ambiguous bits off the ends of sequences |
vectorstrip | Strips out DNA between a pair of vector sequences |
yank | Reads a sequence range, appends the full USA to a list file |