22 Commits (25b6bce2ec374fdd8208ccae75a57ea07d4e1be9)

Author SHA1 Message Date
ghost 25b6bce2ec add crawler/cleaner logs 2 years ago
ghost b6605b9132 implement timed ban for unreachable resources to prevent extra HTTP requests 2 years ago
ghost 702a14b634 add mime content type crawling #1 2 years ago
ghost 5999fb3a73 add distributed hosts crawling using yggo nodes manifest 2 years ago
ghost d4f66c83e7 fix image crawling errors 2 years ago
ghost 68581960a3 add image.data field 2 years ago
ghost 0741a3e9ef implement image crawler 2 years ago
ghost 78931ebc74 normalize host image description storage 2 years ago
ghost db617f9939 refactor image storage model 2 years ago
ghost 6d8f4f4882 create manifests registry 2 years ago
ghost ec20435790 remove presets registry (because provided in the node API) 2 years ago
ghost 11aa404807 add metaYggo field index 2 years ago
ghost 352466ad03 update host.robotsPostfix registry 2 years ago
ghost 6550eb310f update host.robotsPostfix rules 2 years ago
ghost 6cee58214e update host.robotsPostfix rules 2 years ago
ghost 6f4daf7a25 update host.robotsPostfix rule 2 years ago
ghost f4db66d53f add new host.robotsPostfix rules 2 years ago
ghost be7eae501b add host.status registry #1, #5 2 years ago
ghost 3c9bc1adaa add required user-agent construction #5 2 years ago
ghost b819fda025 init yggdrasil robots.txt registry #5 2 years ago
ghost df6f2a1869 implement CRAWL_ROBOTS_POSTFIX_RULES configuration #5 2 years ago
ghost 2495a2bbc7 implement MySQL/Sphinx data model #3, add basic robots.txt support #2 2 years ago