Category: HADOOP

2013-05-24 19:36:30

Like many an unfortunate programmer soul before me, I am currently dealing with an archaic file format that refuses to die. I'm talking 1970 format specification archaic. If it were solely up to me, we would throw out both the file format and any tool that ever knew how to handle it, and start from scratch. I can dream, but unfortunately that won't resolve my issue.

The format: pretty loosely defined, as years of nonsensical revisions have destroyed almost all the backward compatibility it once had. Basically, the only constant is that there are section headings, with few rules about what comes before or after these lines. Thankfully, all possible heading permutations are known. Here's a fake example:

    elsif ($l =~ /^SUNGLASSES/i) { $r = \$sung; name($r); }
    ...
    $$r .= $l;
    print STDERR "Finished processing $ARGV\n" if eof;

As you can see, with the Perl script I basically just change where a reference points when I hit a certain pattern match, and concatenate each line of the file to its respective string until I reach the next pattern match. These strings are then printed out later as one big concatenated file.

I would and could stick with Perl, but my needs are becoming more complex every day, and I would really like to see how this problem can be solved elegantly with Python (can it?). As of right now, my method in Python is basically to load the entire file as a string, search for the heading locations, then split up the string based on the heading indices and concatenate the pieces. This requires a lot of regex, if statements and variables for something that seems so simple in another language.

It seems that this really boils down to a fundamental language issue. I found a very nice SO discussion, "Python: How do I pass a variable by reference?", about Python's "call-by-object" style as compared with that of other languages that are call-by-reference.

Yet, I still can't think of an elegant way to do this in Python. If anyone can help kick my brain in the right direction, it would be greatly appreciated.
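One possible direction, offered only as a minimal sketch: keep a dictionary of lists keyed by heading, and rebind a single name to the list for the current section whenever a heading matches. Because lists are mutable, appending through that name plays roughly the role of `$$r .= $l` in the Perl fragment. The heading names in `HEADINGS` (apart from SUNGLASSES), the `split_sections` function, and the `PREAMBLE` key are all hypothetical and would need to be adapted to the real format.

```python
import re
from collections import defaultdict

# Hypothetical heading list; the real, known headings would go here.
HEADINGS = ("SUNGLASSES", "HATS")
HEADING_RE = re.compile(
    r"^(%s)" % "|".join(re.escape(h) for h in HEADINGS), re.IGNORECASE
)


def split_sections(lines, default="PREAMBLE"):
    """Group lines under the most recently seen heading.

    Lines that appear before any heading go under `default`
    (an assumed name, not part of the original format).
    """
    sections = defaultdict(list)
    # `current` is just a name bound to a mutable list, much like Perl's $r.
    current = sections[default]
    for line in lines:
        match = HEADING_RE.match(line)
        if match:
            # Rebind to the list for this heading (created on first use).
            current = sections[match.group(1).upper()]
        # As in the Perl fragment, the heading line itself is kept too.
        current.append(line)
    # Join each section once at the end instead of concatenating per line.
    return {name: "".join(chunk) for name, chunk in sections.items()}
```

Passing `fileinput.input()` as `lines` would roughly mirror Perl's `while (<>)` loop over the files named on the command line, without loading an entire file into one string first.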
