Category: Java
2009-01-09 14:52:15
Well, you cannot. However, you have two options:
1) Recover the pages already fetched and then restart the fetcher.
You'll need to create a file named fetcher.done in the segment directory and then run updatedb, generate, and fetch. Assuming your index is at /index:
% touch /index/segments/2005somesegment/fetcher.done
% bin/nutch updatedb /index/db/ /index/segments/2005somesegment/
% bin/nutch generate /index/db/ /index/segments/2005somesegment/
% bin/nutch fetch /index/segments/2005somesegment
All the pages that were not crawled will be re-generated for fetch. If you have already fetched a lot of pages and don't want to fetch them again, this is the best way.
2) Discard the aborted output.
Delete all folders from the segment folder except the fetchlist folder and restart the fetcher.
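The cleanup in option 2 can be sketched as a small shell function. This is only an illustration: the function name discard_aborted_output is made up for this example, and it assumes the segment directory holds a fetchlist subdirectory next to the directories written by the aborted run. It refuses to delete anything if fetchlist is missing.

```shell
# Hypothetical helper (not part of Nutch): remove every subdirectory of a
# segment except "fetchlist", so the fetcher can be started again.
discard_aborted_output() {
    segment="$1"

    # Safety check: without a fetchlist there is nothing to restart from.
    if [ ! -d "$segment/fetchlist" ]; then
        echo "no fetchlist directory in $segment; refusing to delete" >&2
        return 1
    fi

    for dir in "$segment"/*/; do
        # Skip the unexpanded pattern when the glob matches nothing.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        if [ "$name" != "fetchlist" ]; then
            rm -rf "$dir"
        fi
    done
}

# Example use, with the same segment path as above:
#   discard_aborted_output /index/segments/2005somesegment
#   bin/nutch fetch /index/segments/2005somesegment
```

Only directories are removed, which matches the instruction above; any plain files left in the segment directory are not touched.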