2010-03-11 11:31:24

Accession numbers are identifiers for a sequence, for example P123456. They can carry a version number, written as a "." followed by a number, for example P123456.2. Versioning helps distinguish older from newer versions of a sequence and track which actual sequence was used in an analysis.
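As a minimal sketch of the versioning rule above, the following Python splits an identifier into its accession and optional version. The function name `split_accession` is hypothetical, and the assumption that the accession part contains no "." or whitespace is mine, not something the text states.

```python
import re

# An accession, optionally followed by "." and a numeric version, e.g. "P123456.2".
# Assumption: the accession part itself contains no "." and no whitespace.
VERSIONED = re.compile(r"(?P<accession>[^.\s]+)(?:\.(?P<version>\d+))?")


def split_accession(identifier):
    """Return (accession, version); version is an int, or None if absent."""
    match = VERSIONED.fullmatch(identifier.strip())
    if match is None:
        raise ValueError(f"not a recognisable accession: {identifier!r}")
    version = match.group("version")
    if version is None:
        return match.group("accession"), None
    return match.group("accession"), int(version)
```

For instance, `split_accession("P123456.2")` yields the accession `"P123456"` with version `2`, while an unversioned identifier yields `None` for the version.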

NCBI Reference Sequences (RefSeq) have their own syntax, listed in the RefSeq row of Table 1.

Accessions are allocated in batches to the different sequence repositories DDBJ, EMBL Database, and NCBI. Table 1 shows the format of some unversioned accession numbers.

Table 1: Some Accession Number Formats
 Database           POSIX Regular Expression                                              Perl Regular Expression
 RefSeq             [[:alpha:]]{2}_[[:digit:]]{6,9} or NZ_[[:alpha:]]{4}[[:digit:]]{6,9}  [A-Z]{2}_\d{6,9} or NZ_[A-Z]{4}\d{6,9}
 Swissprot          [OPQ][[:digit:]][[:alnum:]]{3}[[:digit:]]                             [OPQ]\d[A-Z0-9]{3}\d
 GenBank/EMBL/DDBJ  [[:alpha:]][[:digit:]]{5} or [[:alpha:]]{2}[[:digit:]]{6}             [A-Z]\d{5} or [A-Z]{2}\d{6}
 PRF                [[:digit:]]{6,7}[[:alpha:]]                                           \d{6,7}[A-Z]
 PDB                [[:digit:]][[:alpha:]]{3}                                             \d[A-Z]{3}
 MMDB               [[:digit:]]{4}                                                        \d{4}
 GenBank GI         [[:digit:]]{5,}                                                       \d{5,}
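The patterns in Table 1 can be applied mechanically. The sketch below, in Python rather than Perl, lists every database whose pattern matches a whole unversioned accession. The names `PATTERNS` and `candidate_databases` are hypothetical, and the Swissprot pattern is my translation of that row's first-column expression into Perl-style syntax. Because several formats overlap, a match only suggests a candidate database and does not prove where the sequence came from.

```python
import re

# Patterns transcribed from the Perl column of Table 1.
# "or" alternatives in the table are written here with "|".
PATTERNS = {
    "RefSeq": r"[A-Z]{2}_\d{6,9}|NZ_[A-Z]{4}\d{6,9}",
    # Assumption: translated from the POSIX-style column of the Swissprot row.
    "Swissprot": r"[OPQ]\d[A-Z0-9]{3}\d",
    "GenBank/EMBL/DDBJ": r"[A-Z]\d{5}|[A-Z]{2}\d{6}",
    "PRF": r"\d{6,7}[A-Z]",
    "PDB": r"\d[A-Z]{3}",
    "MMDB": r"\d{4}",
    "GenBank GI": r"\d{5,}",
}

# re.ASCII keeps \d restricted to 0-9, as in the table's [:digit] notation.
COMPILED = {name: re.compile(p, re.ASCII) for name, p in PATTERNS.items()}


def candidate_databases(accession):
    """Return the names of all databases whose pattern matches the whole string."""
    return [
        name
        for name, pattern in COMPILED.items()
        if pattern.fullmatch(accession)
    ]
```

A six-character identifier such as `P12345` fits both the Swissprot pattern and the one-letter-plus-five-digit GenBank/EMBL/DDBJ pattern, which is why the function returns a list instead of a single name.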

