Category: LINUX

2015-03-02 11:42:08

Original article: http://blog.chinaunix.net/uid-25885064-id-3487145.html

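Web page fetching and FTP access are very common application requirements today. Search engine crawlers, analysis programs, resource fetchers, web services and the like all need them. Writing your own fetching library would of course be ideal, but development takes time, so using an existing open source program is the better choice: first, others have already written it well; second, you can get started very quickly; third, you can learn from the strengths of other people's code.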


libwww
Official site:
More information:
Platforms: Unix/Linux, Windows

Libwww is a highly modular client-side web access API written in C.


libcurl

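Official site: http://curl.haxx.se/libcurl
More features: http://curl.haxx.se/docs/features.html
Platforms: Unix/Linux, Windows


libcurl is a free, open source client-side URL transfer library. It supports FTP, FTPS, TFTP, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP, is cross-platform (Windows, Unix, Linux and others), thread-safe, supports IPv6, and is easy to use.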


libfetch
Official site:
More information:
Platforms: BSD


HTTP/FTP client libraries
Source: http://curl.haxx.se/libcurl/competitors.html

Free Software and Open Source projects have a long tradition of forks and duplicate efforts. We enjoy "doing it ourselves", no matter if someone else has done something very similar already. Free/open libraries that cover parts of libcurl's features:

libcurl (MIT)

a highly portable and easy-to-use client-side URL transfer library, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TELNET, DICT, FILE, TFTP and LDAP. libcurl also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, kerberos, HTTP form based upload, proxies, cookies, user+password authentication, file transfer resume, http proxy tunnelling and more!

libghttp (LGPL)

A glance at libghttp (a GNOME HTTP library) suggests that it works rather similarly to libcurl (for HTTP). There's no web page for it, and the person whose email is mentioned in the README of the latest release I found claims he has passed the leadership of the project to "eazel". A popular choice among GNOME projects.

libwww (comparison with libcurl)

More complex and harder to use than libcurl. Includes everything from multi-threading to HTML parsing. The most notable transfer-related feature that libwww offers but libcurl does not is caching.

 (GPL)

C++ library "for transferring files via http, ftp, gopher, proxy server". Based on 'snarf' 2.0.9-code (formerly known as libsnarf). Quote from freshmeat: "As the author of snarf, I have to say this frightens me. Snarf's networking system is far from robust and complete. It's probably full of bugs, and although it works for maybe 85% of all current situations, I wouldn't base a library on it."

 (LGPL)

An HTTP and WebDAV client library, with a C interface. I've mainly heard and seen people use this with WebDAV as their main interest.

(LGPL) comparison with libcurl

Part of glib (GNOME). Supports: HTTP 1.1, Persistent connections, Asynchronous DNS and transfers, Connection cache, Redirects, Basic, Digest, NTLM authentication, SSL with OpenSSL or Mozilla NSS, Proxy support including SSL, SOCKS support, POST data. Probably not very portable. Lacks: cookie support, NTLM for proxies, GSS, gzip encoding, trailers in chunked responses and more.

 (MPL)

Handles URLs, protocols, transports for the Mozilla browser.

 (MPL)

Minimal download library targeted to be much smaller than the above mentioned netlib. HTTP and FTP support.

 (GPL)

While not a library at all, I've been told that people sometimes extract the network code from it and base their own hacks from there.

libfetch (BSD)

Does HTTP and FTP transfers (both ways), supports file: URLs, and an API for URL parsing. The utility  fetch  that is built on libfetch is an integral part of the    operating system.
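
To give a feel for what "an API for URL parsing" involves, here is a minimal sketch in C. It is not libfetch's actual interface: the `parsed_url` struct and the `parse_url` function are hypothetical names made up for illustration, and the sketch deliberately ignores user:password@ parts, IPv6 literals, queries and fragments.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch (not a libfetch type). */
struct parsed_url {
    char scheme[16];
    char host[256];
    int  port;          /* 0 when the URL gives no port */
    char path[1024];    /* "/" when the URL gives no path */
};

/* Split "scheme://host[:port][/path]" into its parts.
 * Returns 0 on success, -1 on malformed or oversized input. */
static int parse_url(const char *url, struct parsed_url *out)
{
    assert(url != NULL && out != NULL);

    const char *sep = strstr(url, "://");
    if (sep == NULL || sep == url)
        return -1;
    size_t scheme_len = (size_t)(sep - url);
    if (scheme_len >= sizeof out->scheme)
        return -1;
    memcpy(out->scheme, url, scheme_len);
    out->scheme[scheme_len] = '\0';

    /* The authority part runs up to the first '/' or the end of the string. */
    const char *host = sep + 3;
    const char *path = strchr(host, '/');
    const char *host_end = path ? path : host + strlen(host);

    /* An optional ":port" follows the host name. */
    const char *colon = memchr(host, ':', (size_t)(host_end - host));
    const char *name_end = colon ? colon : host_end;
    out->port = 0;
    if (colon != NULL) {
        if (!isdigit((unsigned char)colon[1]))
            return -1;
        char *end;
        long port = strtol(colon + 1, &end, 10);
        if (end != host_end || port < 1 || port > 65535)
            return -1;
        out->port = (int)port;
    }

    size_t host_len = (size_t)(name_end - host);
    if (host_len >= sizeof out->host)
        return -1;
    /* Only file: URLs may omit the host in this sketch. */
    if (host_len == 0 && strcmp(out->scheme, "file") != 0)
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    const char *p = path ? path : "/";
    if (strlen(p) >= sizeof out->path)
        return -1;
    strcpy(out->path, p);
    return 0;
}
```

Calling `parse_url("ftp://ftp.example.org:2121/pub/file.txt", &u)` fills `u` with the scheme `ftp`, the host `ftp.example.org`, port 2121 and the path `/pub/file.txt`. A real library would then choose the protocol handler from the scheme.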

 (LGPL)

"a small, robust, flexible library for downloading files via HTTP using the GET method."

 (Artistic License)

"a very small C library to make http queries (GET, HEAD, PUT, DELETE, etc.) easily portable and embeddable"

 also known as IXMLHTTPRequest (part of MSXML 3.0)

(Windows) Provides client-side protocol support for communication with HTTP servers. A client computer can use the XMLHTTP object to send an arbitrary HTTP request, receive the response, and have the Microsoft® XML Document Object Model (DOM) parse that response.
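
As a rough illustration of what "sending an arbitrary HTTP request and receiving the response" means on the wire, the following C sketch formats an HTTP/1.1 request and extracts the status code from a response's first line. It does not use any of the libraries listed here; `build_http_request` and `parse_status_code` are hypothetical helper names, and opening the socket and sending the bytes is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a body-less HTTP/1.1 request (e.g. GET, HEAD, DELETE) into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
static int build_http_request(char *buf, size_t cap, const char *method,
                              const char *host, const char *path)
{
    assert(buf != NULL && method != NULL && host != NULL && path != NULL);

    int n = snprintf(buf, cap,
                     "%s %s HTTP/1.1\r\n"
                     "Host: %s\r\n"          /* mandatory in HTTP/1.1 */
                     "Connection: close\r\n" /* one request per connection */
                     "\r\n",                 /* blank line ends the headers */
                     method, path, host);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}

/* Extract the status code from a status line such as
 * "HTTP/1.1 200 OK". Returns the code, or -1 if the line is malformed. */
static int parse_status_code(const char *status_line)
{
    assert(status_line != NULL);

    int major, minor, code;
    if (sscanf(status_line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    if (code < 100 || code > 599)
        return -1;
    return code;
}
```

A client library wraps these two steps with connection handling, redirects, authentication and so on, which is where most of the differences between the libraries in this list lie.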

QHttp (GPL)

QHttp is a class in the Qt library from Trolltech. It seems to be restricted to plain HTTP. Supports GET, POST and proxies. Asynchronous.

 (GPL)

"a set of routines that implement the FTP protocol. They allow applications to create and access remote files through function calls instead of needing to fork and exec an interactive ftp client program."

 (GPL)

A C++ library for "easy FTP client functionality. It features resuming of up- and downloads, FXP support, SSL/TLS encryption, and logging functionality."
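
One small piece of the work an FTP client library does behind its function calls is handling passive mode: after the client sends `PASV`, the server answers with a 227 reply that encodes the data connection's address as six decimal numbers (RFC 959). The C sketch below decodes such a reply. It is not taken from any of the libraries listed here, and `parse_pasv_reply` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Decode a reply such as
 *   "227 Entering Passive Mode (192,168,1,2,19,137)"
 * into a dotted IPv4 string and a port number.
 * Returns 0 on success, -1 if the reply is not a well-formed 227 reply. */
static int parse_pasv_reply(const char *reply, char *ip, size_t ip_cap,
                            int *port)
{
    assert(reply != NULL && ip != NULL && port != NULL);

    if (strncmp(reply, "227", 3) != 0)
        return -1;

    /* The address follows the first '(' as h1,h2,h3,h4,p1,p2. */
    const char *open = strchr(reply, '(');
    if (open == NULL)
        return -1;

    unsigned h1, h2, h3, h4, p1, p2;
    if (sscanf(open, "(%u,%u,%u,%u,%u,%u", &h1, &h2, &h3, &h4, &p1, &p2) != 6)
        return -1;
    if (h1 > 255 || h2 > 255 || h3 > 255 || h4 > 255 || p1 > 255 || p2 > 255)
        return -1;

    int n = snprintf(ip, ip_cap, "%u.%u.%u.%u", h1, h2, h3, h4);
    if (n < 0 || (size_t)n >= ip_cap)
        return -1;

    /* The port is sent as two bytes, high byte first. */
    *port = (int)(p1 * 256 + p2);
    return 0;
}
```

The client then opens a second TCP connection to that address and port for the actual file data, while commands continue on the control connection.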

Has a URLStream class. This C++ class allows you to download a file using HTTP. See demo/urlfetch.cpp in commoncpp2-1.3.19.tar.gz

 (LGPL)

Java HTTP client library.

 (Apache License)

A Java HTTP client library written by the Jakarta project.


