Callback Functions

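The callbacks of an API application are very similar to those of a crawler. Because an API application has no link-discovery module, it has correspondingly fewer callbacks. Only the callbacks available to API applications are listed below.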

initCrawl

function initCrawl(site)

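Largely the same as the crawler's initCrawl. It is recommended to add only one scanUrl; ordinary links added with site.addUrl have no effect.
See the crawler's initCrawl for details.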
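
The recommendation above can be sketched as follows. This is a minimal illustration, not the platform's reference code: it assumes the `site` object offers an `addScanUrl(url)` method (only `site.addUrl` is named in the text), and `makeStubSite` and the example URL are hypothetical stand-ins so the snippet can run outside the platform.

```javascript
// Hypothetical stand-in for the platform's `site` object.
// Assumption: the real object exposes addScanUrl(url); only addUrl is named in the text.
function makeStubSite() {
  const scanUrls = [];
  return {
    scanUrls,
    // Records the entry URL for the API application.
    addScanUrl(url) {
      scanUrls.push(url);
    },
    // Per the text above, ordinary links added this way have no effect
    // in an API application, so the stub simply ignores them.
    addUrl(url) {},
  };
}

// The callback itself: register exactly one scanUrl and nothing else.
function initCrawl(site) {
  site.addScanUrl("https://example.com/api/items?id=1"); // placeholder URL
}

// Usage: run the callback against the stub and inspect what was registered.
const site = makeStubSite();
initCrawl(site);
console.log(site.scanUrls);
```

On the real platform the callback would be defined alone and invoked by the framework; the stub exists only to show that a single scanUrl is registered and that addUrl contributes nothing.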

beforeCrawl

function beforeCrawl(site)

Same as the crawler's beforeCrawl; see the crawler's beforeCrawl.

beforeDownloadPage

function beforeDownloadPage(page, site)

Same as the crawler's beforeDownloadPage; see the crawler's beforeDownloadPage.

onChangeProxy

function onChangeProxy(site, page)

Same as the crawler's onChangeProxy; see the crawler's onChangeProxy.

isAntiSpider

function isAntiSpider(url, content, page)

Same as the crawler's isAntiSpider; see the crawler's isAntiSpider.

afterDownloadPage

function afterDownloadPage(page, site)

Same as the crawler's afterDownloadPage; see the crawler's afterDownloadPage.

afterDownloadAttachedPage

function afterDownloadAttachedPage(page, site)

Same as the crawler's afterDownloadAttachedPage; see the crawler's afterDownloadAttachedPage.

afterExtractField

function afterExtractField(fieldName, data, page, site, index)

Same as the crawler's afterExtractField; see the crawler's afterExtractField.

beforeHandleImg

function beforeHandleImg(fieldName, img)

Same as the crawler's beforeHandleImg; see the crawler's beforeHandleImg.

beforeHostFile

function beforeHostFile(fieldName, url)

Same as the crawler's beforeHostFile; see the crawler's beforeHostFile.

afterHostFile

function afterHostFile(fieldName, hostedUrl)

Same as the crawler's afterHostFile; see the crawler's afterHostFile.

afterExtractPage

function afterExtractPage(page, data, site)

Same as the crawler's afterExtractPage; see the crawler's afterExtractPage.