99 Commits (d912caeb0c15833580d06298e07afbe7815814f7)

Author SHA1 Message Date
ghost d912caeb0c fix variable name 2 years ago
ghost 5346b13602 implement custom hostPageDom elements index 2 years ago
ghost 5df598a1d4 fix variable name 2 years ago
ghost e16a7b8171 fix HY000/1366 error processing 2 years ago
ghost dc2d971ba0 clean up banned pages extra data 2 years ago
ghost d96abb8ea8 ban host page on encoding not detected 2 years ago
ghost d2469e9adc fix meta variables overwrite 2 years ago
ghost 1d5d5ead5d fix DomDocument initialization without encoding provided 2 years ago
ghost 8a747de341 fix HTML/multimedia content detection 2 years ago
ghost 93c6067fd9 fix host page mime detection 2 years ago
ghost 80d3912bc7 allow x-raw-image links 2 years ago
ghost b23f550a1b skip magnet links 2 years ago
ghost acba2816e2 remove transaction from tables optimization case 2 years ago
ghost b2cf9fc6a5 do table optimization in a separate transaction 2 years ago
ghost ab78e17ca8 add hostPage.size collection 2 years ago
ghost 0af5d165d3 remove logCrawler column not in use 2 years ago
ghost 4b16b41440 make transaction for each item in crawl queue 2 years ago
ghost b585b16d31 fix datatype error detection 2 years ago
ghost c5e25d17fb prevent page ban when its MIME is in the whitelist, only skip the steps below (make multimedia/streaming resources visible in search results) 2 years ago
ghost 4fa33afe40 prevent infinite connection when streaming resources are detected 2 years ago
ghost 345c59b5f4 collect target location links when a page redirect is available 2 years ago
ghost 5d7f2bf68c fix snap foreign keys deletion 2 years ago
ghost 242e0abd86 ban pages on data type error codes only 2 years ago
ghost 62a4f33b53 load missing dependency 2 years ago
ghost 512bd56056 ban page that throws the error and stalls the crawl queue 2 years ago
ghost 45c4f7b7b0 add database optimization settings 2 years ago
ghost 81f7ea1e1e implement multi-storage snap downloads 2 years ago
ghost 1969707eeb integrate optional MEGA/cmd snap storage 2 years ago
ghost bd99dcb023 add leading zero to mkdir access code 2 years ago
ghost 48664f0caf fix zip close, loop break condition 2 years ago
ghost 50c9066f62 add tables optimization to the cron/cleaner task 2 years ago
ghost 0d19004e86 make local snap storage optimization 2 years ago
ghost efc66d5dab update local snap storage paths 2 years ago
ghost 2f7d99079d implement local snaps 2 years ago
ghost 9477d87b2e change strpos to stripos 2 years ago
ghost 28e8bcf8d7 add audio/video media crawl support 2 years ago
ghost 307ebcf0b1 add page description on title | description | keywords not empty, remove deprecated constructions 2 years ago
ghost 7c5ba050b2 fix media crawling 2 years ago
ghost 0fed16621a fix mime content type update 2 years ago
ghost db0e66c846 refactor to mime-based content index #1 2 years ago
ghost 0ffcee1efb fix image description updates timing 2 years ago
ghost 2c5ca1b630 fix image description duplicate 2 years ago
ghost 28bf526d53 add host nsfw settings 2 years ago
ghost 8ce0324e94 convert page data to string 2 years ago
ghost d186fff48f skip curl download on response data size reached 2 years ago
ghost d7a5f7ef84 remove content filter, snap the raw data 2 years ago
ghost 23ead4e12c update page / image description models, implement history snap crawling 2 years ago
ghost 0e9d29675f implement host page description history crawling 2 years ago
ghost 6371def666 fix attributes passing 2 years ago
ghost 32d0f390d3 update http code and mime type on page/image ban event 2 years ago