Requests

To scrape data, you first need to know how to request it. In Node, the request libraries we commonly use are request and superagent.

request

Installation:

yarn add request

A simple example:

var request = require('request');
request('http://www.baidu.com', function (error, response, body) {
  console.log('error:', error); // Print the error if one occurred
  console.log('statusCode:', response && response.statusCode); // Print the response status code if a response was received
  console.log('body:', body); // Print the HTML for the Baidu homepage.
});

Output:

error: null
statusCode: 200
body: <!DOCTYPE html><!--STATUS OK--> ...

Pipes

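Piping (pipe) the response into a stream from the fs module lets you process the requested data as it arrives, for example to save it to disk.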

Saving an image

var request = require('request');
var fs = require('fs');
request('https://www.xiaoyulive.top/logo.png').pipe(fs.createWriteStream('logo.png'));

Saving text

var request = require('request');
var fs = require('fs');
request('https://www.xiaoyulive.top').pipe(fs.createWriteStream('index.html'));
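The piping idea itself is plain Node streams, so it can be sketched without any HTTP library. The following is a minimal sketch, not request's own implementation: `saveStream` is a hypothetical helper, and an in-memory stream stands in for a response body so the code runs without network access. It uses `pipeline` from `stream/promises` (assumed Node 15 or later), which, unlike a bare `.pipe()`, reports errors from either side of the pipe.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical helper: write any readable stream (for example an HTTP
// response body) to the file at `dest`, resolving once the file is closed.
async function saveStream(source, dest) {
  await pipeline(source, fs.createWriteStream(dest));
  return dest;
}

// Stand-in for a response body, so the sketch needs no network access.
const fakeBody = Readable.from(['<!DOCTYPE html>', '<title>demo</title>']);
const target = path.join(os.tmpdir(), 'pipe-demo.html');

saveStream(fakeBody, target)
  .then((file) => console.log('saved:', fs.readFileSync(file, 'utf8')))
  .catch((err) => console.error('pipe failed:', err));
```

With a real response stream in place of `fakeBody`, the same helper would store a page or an image in the same way as the examples above.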

Chained syntax

request supports chained calls:

request
  .get('http://www.test.com')
  .on('response', function(response) {
    console.log(response.statusCode) // 200
    console.log(response.headers['content-type']) // 'text/html'
  })
  • Each HTTP method is available as a method call: get, post, put
  • Use on to listen for events such as response
  • The response callback receives the HTTP response data, including statusCode and headers
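Chaining works because `on` returns the object it was called on, which is standard behavior for Node's `EventEmitter`. Here is a minimal sketch of that mechanism, assuming a hypothetical `fakeGet` function in place of `request.get` and a hard-coded response object:

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the object returned by request.get(url).
function fakeGet(url) {
  const req = new EventEmitter();
  // Emit asynchronously so listeners attached by the caller are in place.
  setImmediate(() => {
    req.emit('response', {
      statusCode: 200,
      headers: { 'content-type': 'text/html' },
    });
  });
  return req;
}

fakeGet('http://www.test.com')
  .on('response', (response) => {
    console.log(response.statusCode);
    console.log(response.headers['content-type']);
  })
  .on('error', (err) => console.error(err)); // on() returns the emitter, so calls chain
```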

Using Promises

To use request in Promise style, install request-promise:

var rp = require('request-promise');
rp('http://www.xiaoyulive.top')
  .then(function (res) {
    console.log('body: ', res);
  })
  .catch(function (err) {
    console.log('err: ', err);
  });
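For intuition, a Promise interface over a callback-style API can be sketched by hand. This is an illustration of the general pattern only, not how request-promise is implemented: `fakeRequest` and `requestAsPromise` are hypothetical names, and `fakeRequest` imitates the `(error, response, body)` callback shape from the first example without touching the network.

```javascript
// Hypothetical callback-style function standing in for request(url, callback).
function fakeRequest(url, callback) {
  setImmediate(() => {
    if (!url.startsWith('http')) {
      callback(new Error('Invalid URI: ' + url));
    } else {
      callback(null, { statusCode: 200 }, 'body of ' + url);
    }
  });
}

// Wrap the callback API: resolve with the body, reject with the error.
function requestAsPromise(url) {
  return new Promise((resolve, reject) => {
    fakeRequest(url, (error, response, body) => {
      if (error) reject(error);
      else resolve(body);
    });
  });
}

requestAsPromise('http://www.xiaoyulive.top')
  .then((body) => console.log('body:', body))
  .catch((err) => console.log('err:', err.message));
```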

Using options:

var options = {
    uri: 'https://api.github.com/user/repos',
    qs: {
        access_token: 'xxxxx xxxxx' // -> uri + '?access_token=xxxxx%20xxxxx'
    },
    headers: {
        'User-Agent': 'Request-Promise'
    },
    json: true // Automatically parses the JSON string in the response
};
rp(options)
    .then(function (repos) {
        console.log('User has %d repos', repos.length);
    })
    .catch(function (err) {
        // API call failed...
    });
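The comment on the `qs` option above says the object ends up percent-encoded in the query string. A minimal sketch of that serialization follows, assuming a hypothetical `withQuery` helper built on `encodeURIComponent`. The library's own serializer may differ in details such as nested objects or arrays.

```javascript
// Hypothetical helper: append `qs` key/value pairs to `uri` as a query string.
function withQuery(uri, qs) {
  const pairs = Object.entries(qs).map(
    ([key, value]) => encodeURIComponent(key) + '=' + encodeURIComponent(value)
  );
  return pairs.length ? uri + '?' + pairs.join('&') : uri;
}

console.log(
  withQuery('https://api.github.com/user/repos', { access_token: 'xxxxx xxxxx' })
);
// -> https://api.github.com/user/repos?access_token=xxxxx%20xxxxx
```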

Other related Promise libraries:

For more ways to use request, see:

superagent

Installation:

yarn add superagent

Usage:

const superagent = require('superagent');
// callback
superagent
  .post('https://test.com/api/pet')
  .send({ name: 'Manny', species: 'cat' }) // sends a JSON post body
  .set('X-API-Key', 'foobar')
  .set('accept', 'json')
  .end((err, res) => {
    // Calling the end function will send the request
  });
// promise with then/catch
superagent.get('http://www.xiaoyulive.top').then(console.log).catch(console.log)
// promise with async/await
;(async () => {
  try {
    const res = await superagent.get('http://www.xiaoyulive.top');
    console.log(res.text);
  } catch (err) {
    console.error(err);
  }
})();

Supported plugins:
