
News: Scrapy 1.1.0 released (web crawling framework)

Posted by 漂亮的石头 on 2016-05-12. Forum: Software News

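    Scrapy 1.1.0 has been released. Scrapy is a crawling framework written in pure Python on top of Twisted's asynchronous networking. You only need to customize a few modules to build a crawler that fetches web page content and images.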

    The changes are as follows:


    • Scrapy 1.1 has beta Python 3 support (requires Twisted >= 15.5). See :ref:`news_betapy3` for more details and some limitations.


    • Hot new features:


    • These bug fixes may require your attention:


      • Don't retry bad requests (HTTP 400) by default (:issue:`1289`). If you need the old behavior, add 400 to :setting:`RETRY_HTTP_CODES`.


      • Fix shell files argument handling (:issue:`1710`, :issue:`1550`). If you try scrapy shell index.html, it will try to load the URL http://index.html; use scrapy shell ./index.html to load a local file.


      • Robots.txt compliance is now enabled by default for newly created projects (:issue:`1724`). Scrapy will also wait for robots.txt to be downloaded before proceeding with the crawl (:issue:`1735`). If you want to disable this behavior, update :setting:`ROBOTSTXT_OBEY` in the settings.py file after creating a new project.


      • Exporters now work on unicode, instead of bytes by default (:issue:`1080`). If you use PythonItemExporter, you may want to update your code to disable binary mode which is now deprecated.


      • Accept XML node names containing dots as valid (:issue:`1533`).


      • When uploading files or images to S3 (with FilesPipeline or ImagesPipeline), the default ACL policy is now "private" instead of "public". Warning: backwards incompatible! You can use :setting:`FILES_STORE_S3_ACL` to change it.


      • We've reimplemented canonicalize_url() for more correct output, especially for URLs with non-ASCII characters (:issue:`1947`). This could change link extractor output compared to previous Scrapy versions. It may also invalidate some cache entries you could still have from pre-1.1 runs. Warning: backwards incompatible!
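
Several of the items above come down to a line in a project's settings.py. The following is only a rough sketch of how the old behavior might be restored, not an official migration recipe. The setting names come from the notes above; the retry codes other than 400 are an assumption about the 1.1 defaults, and "public-read" is the S3 canned ACL name for publicly readable objects, so check both against the version you have installed.

```python
# settings.py (sketch): values to review when moving a project to Scrapy 1.1.

# Retry HTTP 400 responses again, as before 1.1. The codes other than 400
# are an assumption about the 1.1 defaults; compare with your installed version.
RETRY_HTTP_CODES = [500, 502, 503, 504, 408, 400]

# Newly created projects obey robots.txt; set this to False to opt out.
ROBOTSTXT_OBEY = False

# S3 uploads now default to a private ACL. "public-read" is the S3 canned
# ACL that makes uploaded objects publicly readable.
FILES_STORE_S3_ACL = "public-read"
```

Only set the values you actually need. Leaving a setting out keeps the new 1.1 default.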

    Download:

    Scrapy 1.1.0 released, web crawling framework download link
     