xpath特例-laoliulaoliu-ChinaUnix博客

miraclemiracle.blog.chinaunix.net

首页　| 　博文目录　| 　关于我

laoliulaoliu

博客访问： 4668163
博文数量： 1214
博客积分： 13195
博客等级：上将
技术积分： 9105
用户组：普通用户
注册时间： 2007-01-19 14:41

个人简介

C++,python,热爱算法和机器学习

文章分类

全部博文（1214）

cloud（3）
operation（9）
tornado（4）
mac_os（1）
golang（4）
架构（13）
git（4）
security（29）
shell（1）
macbook（1）
ruby（13）
javascript（15）
design（3）
testing（1）
mac（1）
bigdata（69）
nosql（46）
R（9）
gcj/acm（6）
NLP（10）
小说（3）
matlab（4）
web（44）
java（66）
product（7）
c#（1）
language（4）
machine learning（76）
science（4）
opencourse（2）
windows（3）
search（33）
algorithm（65）
database（51）
compiler（11）
ACE（5）
poem（1）
programming（29）
python（140）
assembly（1）
linux（49）
C++（16）
book（2）
cate（1）
phliosophy（3）
mental（30）
Science fiction（1）
Software（5）
c（23）
network（65）
CS（15）
thinking（10）
BSD（13）
solaris10（2）
life（57）
Debian（16）
economy（7）
Mathematics（57）
OS（8）
ibm（2）
gentoo（32）
未分配的博文（8）

文章存档

2021年（13）

2020年（49）

2019年（14）

2018年（27）

2017年（69）

2016年（100）

2015年（106）

2014年（240）

2013年（5）

2012年（193）

2011年（155）

2010年（93）

2009年（62）

2008年（51）

2007年（37）

我的朋友

相关博文

xpath特例

分类：网络与安全

2012-08-24 19:52:45

    import lxml.etree
    def parse():
        parser = lxml.etree.XMLParser(recover=True)
        tree = lxml.etree.fromstring(content, parser)
        items = tree.xpath('//div[@id="leftNavContainer"]//ul[@data-typeid="n"]//span[@class="refinementLink"]')

        for item in items:
            url = item.xpath('parent::*/@href')[0]

最后一行表达式含义： item 的父节点所有元素中，取出href属性的内容。
parent::[@href] 则只能取出父节点中含有href属性的元素。

阅读(884) | 评论(0) | 转发(0) |

上一篇：git - the simple guide

下一篇：Python, MySQLdb and UTF-8

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6