html - Scrapy returns the response as a str object; how do I convert it into a response and extract the target values?
阿神
阿神 2017-04-17 17:44:42
[Python discussion group]

After crawling, the object I get back has the content below. It turns out to be a plain string, so how should I extract values from it?

{"r":0,
 "msg": ["

\n\n\"Android\nAndroid \u6e38\u620f<\/strong>\n<\/a>\n

<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"Unity\uff08\u6e38\u620f\u5f15\u64ce\uff09\"\nUnity\uff08\u6e38\u620f\u5f15\u64ce\uff09<\/strong>\n<\/a>\n

Unity \u662f\u4e00\u79cd\u96c6\u6210\u7684\u521b\u4f5c\u5de5\u5177\uff0c\u9488\u5bf93D\u6e38\u620f\u548c\u5176\u4ed6\u4ea4\u4e92\u5185\u5bb9\uff08\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u76db\u5927\u7f51\u7edc\"\n\u76db\u5927\u7f51\u7edc<\/strong>\n<\/a>\n

\u4e0a\u6d77\u76db\u5927\u7f51\u7edc\u53d1\u5c55\u6709\u9650\u516c\u53f8\uff08\u7b80\u79f0\u201c\u76db\u5927\u7f51\u7edc\u201d\uff09\u662f\u4e2d\u56fd\u7684\u7f51\u7edc\u6e38\u620f\u8fd0\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u72ec\u7acb\u6e38\u620f\"\n\u72ec\u7acb\u6e38\u620f<\/strong>\n<\/a>\n

\u72ec\u7acb\u6e38\u620f\u6307\u6e38\u620f\u5f00\u53d1\u8005\u6ca1\u6709\u6e38\u620f\u516c\u53f8\u6216\u6e38\u620f\u53d1\u884c\u5546\u63d0\u4f9b\u7684\u85aa\u8d44\uff0c\u5fc5\u987b\u72ec\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u6865\u724c\"\n\u6865\u724c<\/strong>\n<\/a>\n

\u7231\u597d\u8005\u904d\u53ca\u5168\u4e16\u754c\u7684\u4e00\u79cd\u6251\u514b\u6e38\u620f\uff0c\u4e16\u754c\u8303\u56f4\u3001\u6d32\u9645\u8303\u56f4\u90fd\u8bbe\u6709\u6865\u724c\u534f\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u6e38\u620f\u4ea7\u4e1a\"\n\u6e38\u620f\u4ea7\u4e1a<\/strong>\n<\/a>\n

<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u827a\u7535\n\u827a\u7535 (EA)<\/strong>\n<\/a>\n

\u7f8e\u56fd\u827a\u7535\u662f\u5168\u7403\u8457\u540d\u7684\u4e92\u52a8\u5a31\u4e50\u8f6f\u4ef6\u5236\u4f5c\u4e0e\u53d1\u884c\u516c\u53f8\uff0c\u603b\u90e8\u4f4d\u4e8e\u7f8e\u56fd\u52a0\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u523a\u5ba2\u4fe1\u6761\uff08Assassin's\n\u523a\u5ba2\u4fe1\u6761\uff08Assassin's Creed\uff09<\/strong>\n<\/a>\n

\u2014\u2014\u613f\u6d1e\u5bdf\u4e4b\u7236\u6307\u5f15\u6211\u7b49\u3002 \u4e07\u7269\u7686\u865a\uff0c\u4e07\u4e8b\u7686\u5141\u3002 \u8fd9\u662f\u6211\u4eec\u7684\u7956\u5148\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u6587\u660e\uff08\u7cfb\u5217\u6e38\u620f\uff09\"\n\u6587\u660e\uff08\u7cfb\u5217\u6e38\u620f\uff09<\/strong>\n<\/a>\n

\u300a\u6587\u660e\u300b\u6700\u65e9\u7531\u72ec\u7acb\u5f00\u53d1\u8005\u5f00\u53d1\uff0c\u540e\u7ecfMicroprose\uff0c\u518d\u5230F\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"Xbox\"\nXbox<\/strong>\n<\/a>\n

Xbox \u662f\u5fae\u8f6f\u6240\u5f00\u53d1\u3001\u9500\u552e\u7684\u5bb6\u7528\u6e38\u620f\u4e3b\u673a\u3002<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u661f\u9645\u4e89\u9738\"\n\u661f\u9645\u4e89\u9738<\/strong>\n<\/a>\n

\u300a\u661f\u9645\u4e89\u9738\u300b\u662f\u7531\u66b4\u96ea\u5a31\u4e50\u5236\u4f5c\u53d1\u884c\u7684\u4e00\u6b3e\u8457\u540d\u5373\u65f6\u6218\u7565\u6e38\u620f\u3002\u8fd9\u662f\u661f\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"Cocos2d-x\"\nCocos2d-x<\/strong>\n<\/a>\n

Cocos2d-x\u662f\u4e00\u4e2a\u5f00\u6e90\u7684\u79fb\u52a82D\uff08\u76ee\u524d\u5df2\u7ecf\u67093D\u7248\u672c\uff09\u6e38\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u7cbe\u7075\u5b9d\u53ef\u68a6\uff08Pok\u00e9mon\uff09\"\n\u7cbe\u7075\u5b9d\u53ef\u68a6\uff08Pok\u00e9mon\uff09<\/strong>\n<\/a>\n

\u7cbe\u7075\u5b9d\u53ef\u68a6\u7cfb\u5217\uff08Pok\u00e9mon\uff0c\u30dd\u30b1\u30c3\u30c8\u30e2\u30f3\u30b9\u30bf\u30fc\uff09\uff0c\u53c8\u79f0\u53e3\u888b\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"Ingress\uff08\u589e\u5f3a\u73b0\u5b9e\u6e38\u620f\uff09\"\nIngress\uff08\u589e\u5f3a\u73b0\u5b9e\u6e38\u620f\uff09<\/strong>\n<\/a>\n

Ingress \u662f\u4e00\u6b3e\u4fb5\u5165\u5f0f\u865a\u62df\u73b0\u5b9e\u6e38\u620f\u3001\u5927\u578b\u591a\u4eba\u7535\u5b50\u6e38\u620f\uff0c\u4e2d\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u6881\u6b22\"\n\u6881\u6b22<\/strong>\n<\/a>\n

\u6881\u6b22 <\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u7b2c\u4e00\u4eba\u79f0\u89c6\u89d2\u5c04\u51fb\u6e38\u620f\uff08FPS\uff09\"\n\u7b2c\u4e00\u4eba\u79f0\u89c6\u89d2\u5c04\u51fb\u6e38\u620f\uff08FPS\uff09<\/strong>\n<\/a>\n

\u7b2c\u4e00\u4eba\u79f0\u5c04\u51fb\uff08First-person shooter\uff09\u6e38\u620f\u662f\u2026<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u76db\u5927\u521b\u65b0\u9662\"\n\u76db\u5927\u521b\u65b0\u9662<\/strong>\n<\/a>\n

<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u6e38\u620f\u5f15\u64ce\"\n\u6e38\u620f\u5f15\u64ce<\/strong>\n<\/a>\n

<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u7535\u5b50\u6e38\u620f\"\n\u7535\u5b50\u6e38\u620f<\/strong>\n<\/a>\n

<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>","

\n\n\"\u6e38\u620f\u754c\u9762\u8bbe\u8ba1\"\n\u6e38\u620f\u754c\u9762\u8bbe\u8ba1<\/strong>\n<\/a>\n

<\/p>\n\n<\/i>\u5173\u6ce8<\/a>\n\n<\/p><\/p>"] }

How do I convert it into a response object and then extract the target values inside it?


All replies (2)
黄舟

This is JSON, so decode it first:

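import json

# Placeholder: substitute the JSON string you scraped.
content = '...that blob of yours'
result = json.loads(content)

# result['msg'] is a list of HTML fragments (one string per entry).
print(result['msg'])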

This turns the JSON string you fetched into the dict result, and from there you can process it as usual.
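As a rough sketch of what "process it as usual" could look like: once the JSON is decoded, each entry of msg is an HTML fragment that can be parsed on its own. The example below uses only the standard library. The sample payload is a hypothetical stand-in, since the original markup in the question was lost; the tags and href values are assumptions, as are the names StrongTextParser and extract_titles.

```python
import json
from html.parser import HTMLParser

# Hypothetical stand-in for the scraped body. The tags and href values are
# assumptions; only the "r"/"msg" layout and the titles come from the question.
content = r'''{"r": 0, "msg": [
  "<a href=\"/topic/1\"><strong>Android \u6e38\u620f<\/strong><\/a><p><\/p>",
  "<a href=\"/topic/2\"><strong>Xbox<\/strong><\/a><p>console<\/p>"
]}'''


class StrongTextParser(HTMLParser):
    """Collects the text found inside <strong> elements."""

    def __init__(self):
        super().__init__()
        self._in_strong = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self._in_strong = True
            self.titles.append("")  # start a new title

    def handle_endtag(self, tag):
        if tag == "strong":
            self._in_strong = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current title.
        if self._in_strong:
            self.titles[-1] += data


def extract_titles(json_text):
    """Decode the JSON string and pull the <strong> text out of each fragment."""
    payload = json.loads(json_text)  # str -> dict
    parser = StrongTextParser()
    for fragment in payload["msg"]:  # each entry is an HTML string
        parser.feed(fragment)
    parser.close()
    return parser.titles


titles = extract_titles(content)
print(titles)
```

Inside a Scrapy project, an alternative would be to wrap each fragment with scrapy.Selector(text=fragment) and use the familiar .xpath() or .css() calls on it; the standard-library parser is used here only to keep the sketch dependency-free.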

迷茫

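What came back is JSON data, so parse it directly with Python's json library.
I find JSON the most pleasant kind of data to handle when scraping.
A lot of what I have crawled recently is JSON produced by follow-up requests.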
