Category: Java

2009-01-09 14:52:15

Well, you cannot. However, you have two ways to proceed:

  • 1) Recover the pages already fetched and then restart the fetcher.

    • You'll need to create a file named fetcher.done in the segment directory and then run updatedb, generate, and fetch. Assuming your index is at /index:

      % touch /index/segments/2005somesegment/fetcher.done 
      
      % bin/nutch updatedb /index/db/ /index/segments/2005somesegment/
      
      % bin/nutch generate /index/db/ /index/segments/2005somesegment/
      
      % bin/nutch fetch /index/segments/2005somesegment

      All the pages that were not crawled will be regenerated for fetching. If you have already fetched many pages and don't want to fetch them again, this is the best way.

  • 2) Discard the aborted output.

    • Delete all folders from the segment folder except the fetchlist folder and restart the fetcher.
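The cleanup in option 2 can be sketched as a small shell function. This is only a minimal sketch under assumptions: the function name discard_aborted_output is hypothetical (it is not part of Nutch), and it assumes the segment's subdirectories sit directly under the segment path, with the fetch list in a folder named fetchlist as described above. Plain files in the segment directory are left untouched.

```shell
# Hypothetical helper (not a Nutch command): remove every subdirectory of a
# segment except "fetchlist", so the fetcher can be restarted from scratch.
discard_aborted_output() {
    segment="$1"

    # Refuse to run if there is no fetchlist, since nothing could be re-fetched.
    if [ ! -d "$segment/fetchlist" ]; then
        echo "no fetchlist folder in $segment" >&2
        return 1
    fi

    for dir in "$segment"/*/; do
        # Skip the unexpanded pattern when the segment has no subdirectories.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        if [ "$name" != "fetchlist" ]; then
            # Strip the trailing slash before removing the directory.
            rm -rf "${dir%/}"
        fi
    done
}
```

With the example paths used above, this would be called as discard_aborted_output /index/segments/2005somesegment, followed by bin/nutch fetch /index/segments/2005somesegment to restart the fetcher. Try it on a copy of the segment first, because the deletion cannot be undone.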

