Category: Java

2013-09-12 20:52:23

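    Call getHtmlContent(String url, String encode) to fetch the HTML source of a web page. The encode argument names the charset used to decode the response body (for example "UTF-8"), and the method returns null when the server answers with an error status.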

// Imports required by the two methods below
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

    // Accepts an address with or without a scheme; "http://" is added only when none is present.
    public static String getHtmlContent(String url, String encode) {
        String lower = url.toLowerCase();
        if (!lower.startsWith("http://") && !lower.startsWith("https://")) {
            url = "http://" + url;
        }
        try {
            URL rUrl = new URL(url);
            return getHtmlContent(rUrl, encode);
        } catch (MalformedURLException e) {
            e.printStackTrace();
            return null;
        }
    }

    // Returns the response body decoded with the given charset, or null on any failure.
    public static String getHtmlContent(URL url, String encode) {
        StringBuilder contentBuffer = new StringBuilder();
        HttpURLConnection con = null;
        try {
            con = (HttpURLConnection) url.openConnection();
            // Send an IE User-Agent header with the request
            con.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
            con.setConnectTimeout(60000);
            con.setReadTimeout(60000);
            // Get the HTTP response code
            int responseCode = con.getResponseCode();
            if (responseCode == -1) {
                System.out.println(url + " : connection failed...");
                return null;
            }
            if (responseCode >= 400) { // request failed
                System.out.println("Request failed, response code: " + responseCode);
                return null;
            }
            // try-with-resources closes the reader (and the underlying stream) even on error
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(con.getInputStream(), encode))) {
                String str;
                while ((str = reader.readLine()) != null) {
                    // readLine() strips the line terminator, so put one back
                    contentBuffer.append(str).append('\n');
                }
            }
            return contentBuffer.toString();
        } catch (IOException e) {
            e.printStackTrace();
            System.out.println("error: " + url);
            return null; // the original fell through to toString() on a null buffer
        } finally {
            if (con != null) { // openConnection() may have failed before con was assigned
                con.disconnect();
            }
        }
    }
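
The encode argument matters because it decides how the response bytes are turned into characters. The sketch below runs the same read loop over an in-memory byte array instead of a live connection, so it works offline. The class name EncodeDemo and the helper readAll are hypothetical names used only for this illustration, and the ByteArrayInputStream stands in for con.getInputStream().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class EncodeDemo {

    // Hypothetical helper: the read loop from getHtmlContent, applied to any InputStream.
    public static String readAll(InputStream in, String encode) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, encode))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body: a <p> element holding two Chinese
        // characters (written as Unicode escapes), encoded as UTF-8.
        byte[] body = "<p>\u4e2d\u6587</p>".getBytes(StandardCharsets.UTF_8);

        // Matching charset: the two characters come back intact.
        String right = readAll(new ByteArrayInputStream(body), "UTF-8");
        // Mismatched charset: each of the 6 UTF-8 bytes becomes its own character.
        String wrong = readAll(new ByteArrayInputStream(body), "ISO-8859-1");

        System.out.println(right.length()); // prints 10
        System.out.println(wrong.length()); // prints 14
    }
}
```

If the page comes back garbled, the encode value passed to getHtmlContent most likely does not match the charset the server actually used.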
