dbxgcg
Having created the EMBOSS indexes for this file, a database can then be defined in the file emboss.defaults as something like:
    DB embl [
       type: N
       dbalias: embl   (see below)
       format: embl
       method: embossgcg
       directory: /data/gcg/embl
       file: *.seq
       indexdirectory: /data/gcg/embl/indexes
    ]

The index file 'basename' given to dbxgcg must match the DB name in the definition. If not, then a 'dbalias' line must be given which specifies the basename of the indexes.
    % dbxgcg
    Database b+tree indexing for GCG formatted databases
    Basename for index files: embl
    Resource name: embl
          EMBL : EMBL
         SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
       GENBANK : Genbank, DDBJ
           PIR : NBRF
    Entry format [SWISS]: embl
    Database directory [.]: embl
    Wildcard database filename [*.seq]:
            id : ID
           acc : Accession number
            sv : Sequence Version and GI
           des : Description
           key : Keywords
           org : Taxonomy
    Index fields [id,acc]:
    Processing file /homes/user/test/embl/eem_ba1.seq
    Processing file /homes/user/test/embl/eem_est.seq
    Processing file /homes/user/test/embl/eem_fun.seq
    Processing file /homes/user/test/embl/eem_htginv1.seq
    Processing file /homes/user/test/embl/eem_hum1.seq
    Processing file /homes/user/test/embl/eem_in.seq
    Processing file /homes/user/test/embl/eem_ov.seq
    Processing file /homes/user/test/embl/eem_ro.seq
    Processing file /homes/user/test/embl/eem_vi.seq
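The interactive answers in the session above can also be supplied as command-line qualifiers. The following is a minimal sketch, not part of the EMBOSS distribution, that assembles such a command line in Python. The helper name `build_dbxgcg_command` is hypothetical; the qualifier names and defaults are the ones listed in the qualifier tables on this page, and the values mirror the answers given in the session. The sketch only prints the command and does not run it.

```python
import shlex


def build_dbxgcg_command(dbname, dbresource, idformat="SWISS",
                         directory=".", filenames="*.seq",
                         fields=("id", "acc")):
    """Return an argument list for a non-interactive dbxgcg run.

    Hypothetical helper: defaults follow the documented qualifier defaults.
    """
    return [
        "dbxgcg",
        "-dbname", dbname,
        "-dbresource", dbresource,
        "-idformat", idformat,
        "-directory", directory,
        "-filenames", filenames,
        "-fields", ",".join(fields),
        "-auto",  # turn off prompts
    ]


# Same answers as the interactive session: EMBL entries in the 'embl' directory.
args = build_dbxgcg_command("embl", "embl", idformat="EMBL", directory="embl")
command_line = shlex.join(args)  # quotes the wildcard so the shell does not expand it
print(command_line)
```

Keeping the arguments as a list until the last moment avoids quoting mistakes with the `*.seq` wildcard; the list could be passed directly to `subprocess.run` on a system where dbxgcg is installed.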
    SET PAGESIZE 2048
    SET CACHESIZE 200

The above values are recommended for most systems. The PAGESIZE is a multiple of the size of disc pages the operating system buffers. The CACHESIZE is the number of disc pages dbxgcg is allowed to cache.
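As a rough worked example of what these two settings imply, the sketch below reads the `SET` lines and multiplies them together. It assumes that PAGESIZE is expressed in bytes and that the cache holds at most CACHESIZE pages of that size; neither the helper name `parse_set_lines` nor this memory estimate comes from EMBOSS itself.

```python
def parse_set_lines(text):
    """Collect 'SET NAME value' lines into a dict of integers (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "SET":
            settings[parts[1]] = int(parts[2])
    return settings


settings = parse_set_lines("SET PAGESIZE 2048\nSET CACHESIZE 200\n")

# Assumption: cache footprint is roughly pages cached times bytes per page.
cache_bytes = settings["PAGESIZE"] * settings["CACHESIZE"]
print(cache_bytes)  # 409600
```

Under these assumptions the recommended values correspond to a cache of about 400 KiB per index.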
    RES embl [
       type: Index
       idlen: 15
       acclen: 15
       svlen: 20
       keylen: 25
       deslen: 25
       orglen: 25
    ]

The length definitions are the maximum lengths of 'words' in the field being indexed. Longer words will be truncated to the value set.
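The truncation rule can be illustrated with a short sketch. The lengths are copied from the resource definition above; the function name `truncate_word` and the sample words are made up for illustration, and the sketch assumes truncation simply keeps the leading characters.

```python
# Maximum word lengths per indexed field, taken from the RES definition.
FIELD_LENGTHS = {
    "id": 15,
    "acc": 15,
    "sv": 20,
    "key": 25,
    "des": 25,
    "org": 25,
}


def truncate_word(field, word):
    """Cut a word down to the maximum length set for its field (hypothetical helper)."""
    return word[:FIELD_LENGTHS[field]]


short_word = truncate_word("acc", "X12345")       # shorter than 15: unchanged
long_word = truncate_word("key", "K" * 40)        # longer than 25: cut to 25
print(short_word, len(long_word))
```

One consequence worth keeping in mind is that two long words sharing the same leading characters would become indistinguishable in the index once truncated.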
       Standard (Mandatory) qualifiers:
      [-dbname]            string     Basename for index files (Any string from 2
                                      to 19 characters, matching regular
                                      expression /[A-z][A-z0-9_]+/)
      [-dbresource]        string     Resource name (Any string from 2 to 19
                                      characters, matching regular expression
                                      /[A-z][A-z0-9_]+/)
       -idformat           menu       [SWISS] Entry format (Values: EMBL (EMBL);
                                      SWISS (Swiss-Prot, SpTrEMBL, TrEMBLnew);
                                      GENBANK (Genbank, DDBJ); PIR (NBRF))
       -directory          directory  [.] Database directory
       -filenames          string     [*.seq] Wildcard database filename (Any
                                      string is accepted)
       -fields             menu       [id,acc] Index fields (Values: id (ID); acc
                                      (Accession number); sv (Sequence Version
                                      and GI); des (Description); key (Keywords);
                                      org (Taxonomy))

       Additional (Optional) qualifiers: (none)

       Advanced (Unprompted) qualifiers:
       -release            string     [0.0] Release number (Any string up to 9
                                      characters)
       -date               string     [00/00/00] Index date (Date string dd/mm/yy)
       -exclude            string     Wildcard filename(s) to exclude (Any string
                                      is accepted)
       -indexoutdir        outdir     [.] Index file output directory

       Associated qualifiers: (none)

       General qualifiers:
       -auto               boolean    Turn off prompts
       -stdout             boolean    Write standard output
       -filter             boolean    Read standard input, write standard output
       -options            boolean    Prompt for standard and additional values
       -debug              boolean    Write debug output to program.dbg
       -verbose            boolean    Report some/full command line options
       -help               boolean    Report command line options. More
                                      information on associated and general
                                      qualifiers can be found with -help -verbose
       -warning            boolean    Report warnings
       -error              boolean    Report errors
       -fatal              boolean    Report fatal errors
       -die                boolean    Report dying program messages
Standard (Mandatory) qualifiers | Description | Allowed values | Default |
---|---|---|---|
[-dbname] (Parameter 1) | Basename for index files | Any string from 2 to 19 characters, matching regular expression /[A-z][A-z0-9_]+/ | Required |
[-dbresource] (Parameter 2) | Resource name | Any string from 2 to 19 characters, matching regular expression /[A-z][A-z0-9_]+/ | Required |
-idformat | Entry format | EMBL (EMBL); SWISS (Swiss-Prot, SpTrEMBL, TrEMBLnew); GENBANK (Genbank, DDBJ); PIR (NBRF) | SWISS |
-directory | Database directory | Directory | . |
-filenames | Wildcard database filename | Any string is accepted | *.seq |
-fields | Index fields | id (ID); acc (Accession number); sv (Sequence Version and GI); des (Description); key (Keywords); org (Taxonomy) | id,acc |

Additional (Optional) qualifiers | Description | Allowed values | Default |
---|---|---|---|
(none) | | | |

Advanced (Unprompted) qualifiers | Description | Allowed values | Default |
---|---|---|---|
-release | Release number | Any string up to 9 characters | 0.0 |
-date | Index date | Date string dd/mm/yy | 00/00/00 |
-exclude | Wildcard filename(s) to exclude | Any string is accepted | An empty string is accepted |
-indexoutdir | Index file output directory | Output directory | . |
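The constraint on `-dbname` and `-dbresource` can be checked before running the program. The sketch below is one reading of that rule, assuming the regular expression must match the whole string; how EMBOSS anchors the pattern internally is not stated here. The function name `is_valid_basename` is hypothetical. Note that the range `A-z` in ASCII covers not only the letters but also the six characters between `Z` and `a`, including the underscore.

```python
import re

# Pattern as documented for -dbname and -dbresource.
BASENAME_PATTERN = re.compile(r"[A-z][A-z0-9_]+")


def is_valid_basename(name):
    """Check length (2 to 19) and pattern; full-string match is an assumption."""
    if not 2 <= len(name) <= 19:
        return False
    return BASENAME_PATTERN.fullmatch(name) is not None


for candidate in ("embl", "e", "1embl", "embl_rel_0_0", "x" * 20):
    print(candidate, is_valid_basename(candidate))
```

Names that start with a digit, are a single character, or exceed 19 characters are rejected by this check.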
    # Number of files: 9
    # Release: 0.0
    # Date:    00/00/00
    Dual filename database
    eem_ba1.seq eem_ba1.ref
    eem_est.seq eem_est.ref
    eem_fun.seq eem_fun.ref
    eem_htginv1.seq eem_htginv1.ref
    eem_hum1.seq eem_hum1.ref
    eem_in.seq eem_in.ref
    eem_ov.seq eem_ov.ref
    eem_ro.seq eem_ro.ref
    eem_vi.seq eem_vi.ref
    Order     71
    Fill      47
    Pagesize  2048
    Level     0
    Cachesize 200
    Order2    82
    Fill2     99
    Count     47
    Kwlimit   15

    Order     71
    Fill      47
    Pagesize  2048
    Level     0
    Cachesize 200
    Order2    82
    Fill2     99
    Count     39
    Kwlimit   15
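The two parameter listings above are plain name/value pairs, so they are easy to inspect programmatically. The sketch below assumes only that names and integer values alternate, separated by whitespace; the function name `parse_index_params` is hypothetical, and no meaning is assigned to the individual parameters beyond what the listings show.

```python
def parse_index_params(text):
    """Turn alternating 'Name value' tokens into a dict of integers (hypothetical helper)."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected name/value pairs")
    return {name: int(value) for name, value in zip(tokens[::2], tokens[1::2])}


first = parse_index_params(
    "Order 71 Fill 47 Pagesize 2048 Level 0 Cachesize 200 "
    "Order2 82 Fill2 99 Count 47 Kwlimit 15"
)
second = parse_index_params(
    "Order 71 Fill 47 Pagesize 2048 Level 0 Cachesize 200 "
    "Order2 82 Fill2 99 Count 39 Kwlimit 15"
)

# Report which parameters differ between the two listings.
differences = {k: (first[k], second[k]) for k in first if first[k] != second[k]}
print(differences)  # {'Count': (47, 39)}
```

Comparing the two listings this way shows that they differ only in `Count`, while `Pagesize` and `Cachesize` match the `SET` values given earlier.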
This file contains non-printing characters and so cannot be displayed here.
This file contains non-printing characters and so cannot be displayed here.
Program name | Description |
---|---|
dbiblast | Index a BLAST database |
dbifasta | Database indexing for fasta file databases |
dbiflat | Index a flat file database |
dbigcg | Index a GCG formatted database |
dbxfasta | Database b+tree indexing for fasta file databases |
dbxflat | Database b+tree indexing for flat file databases |