Perl One Liners - longbow0 - ChinaUnix Blog

Perl Command Line Options

Usage: perl [switches] [--] [programfile] [arguments]

  -0[octal]         specify record separator (\0, if no argument)
  -a                autosplit mode with -n or -p (splits $_ into @F)
  -C[number/list]   enables the listed Unicode features
  -c                check syntax only (runs BEGIN and CHECK blocks)
  -d[:debugger]     run program under debugger
  -D[number/list]   set debugging flags (argument is a bit mask or alphabets)
  -e program        one line of program (several -e's allowed, omit programfile)
  -E program        like -e, but enables all optional features
  -f                don't do $sitelib/sitecustomize.pl at startup
  -F/pattern/       split() pattern for -a switch (//'s are optional)
  -i[extension]     edit <> files in place (makes backup if extension supplied)
  -Idirectory       specify @INC/#include directory (several -I's allowed)
  -l[octal]         enable line ending processing, specifies line terminator
  -[mM][-]module    execute "use/no module..." before executing program
  -n                assume "while (<>) { ... }" loop around program
  -p                assume loop like -n but print line also, like sed
  -P                run program through C preprocessor before compilation
  -s                enable rudimentary parsing for switches after programfile
  -S                look for programfile using PATH environment variable
  -t                enable tainting warnings
  -T                enable tainting checks
  -u                dump core after parsing program
  -U                allow unsafe operations
  -v                print version, subversion (includes VERY IMPORTANT perl info)
  -V[:variable]     print configuration summary (or a single Config.pm variable)
  -w                enable many useful warnings (RECOMMENDED)
  -W                enable all warnings
  -x[directory]     strip off text before #!perl line and perhaps cd to directory
  -X                disable all warnings

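Notes on Selected Options

-0<digits>
    Sets the input record separator (the $/ variable), given as an octal number. Without -0, the separator is a newline.
-00
    Paragraph mode: records are separated by one or more blank lines.
-0777
    Slurp mode: no separator is used, so the whole file is read as a single record.
-a
    Autosplit mode: splits $_ on whitespace and stores the fields in @F, equivalent to @F = split ' '. A different separator can be given with -F.
-F
    Sets the separator used by -a; a regular expression is allowed.
-e
    Runs the given script.
-i<extension>
    Edits files in place and backs up the original file with the given extension. If no extension is given, no backup is made.
-l
    Automatically chomps each input line (with -n or -p) and appends a newline to each print.
-n
    Implicit loop, equivalent to while (<>) { script; }
-p
    Implicit loop with automatic output, equivalent to while (<>) { script; print; }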

Use in Bioinformatics

Fetch Accession Numbers from Web Pages

For example, save the target web page in HTML format, or simply copy and paste its content into a text editor and save it as a TXT file.

If the page was saved as an HTML file, swineflu.html, run html2text (Linux only) to convert it; the output TXT file is 'swineflu.txt':

html2text -o swineflu.txt swineflu.html

Next, extract the accession numbers:

perl -lane 'for (@F) {print $1 if /([A-Z]{1,2}\d{5,6})/}' swineflu.txt > swineflu.lst

The file 'swineflu.lst' now contains all the accession numbers found on the web page, one per line.

Compare Two Accession Number List Files

This might NOT be a good way to do it.

Find the accession numbers that exist only in file2 and write the results to file3.

perl -lne 'print unless (`grep $_ file1`)' file2 > file3
