512 Commits (6e03a76ed83e40ecf8f88131a3b9eb4b20acb04f)
 

Author SHA1 Message Date
ghost 6e03a76ed8 add CURLOPT_SSL_VERIFYHOST/CURLOPT_SSL_VERIFYPEER options 1 year ago
ghost 004a5336de remove HTML page ban when title tag is not available 1 year ago
ghost f9774f2431 add innodb_buffer_pool_size default value 1 year ago
ghost de28d85a71 add connection exceptions 1 year ago
ghost 142d496108 fix SQL syntax error 1 year ago
ghost d46c4921c5 add page break 1 year ago
ghost 80b33f619c fix PAGES_LIMIT condition 1 year ago
ghost d024ffd770 implement unlimited settings customization for each host 1 year ago
ghost ab6c0379c8 implement hosts crawl queue, move robots, sitemaps, manifests to this task 1 year ago
ghost 6ee5e53ef4 show sitemaps processed debug 1 year ago
ghost 71724ae33f refactor manifest crawling 1 year ago
ghost cb37c57bc4 rename example files 1 year ago
ghost 68d5820f30 reserve one hour for huge load operations 1 year ago
ghost efbbf19601 fix multimedia snaps 1 year ago
ghost 6862fb35cd update readme 1 year ago
ghost 282a6d609d update manifest API 1 year ago
ghost b24d31f360 refactor cleaner, delegate tasks to crawler, init hostSetting table 1 year ago
ghost fd90e2d517 keep banned pages data 1 year ago
ghost ab8b6f6315 rename variables 1 year ago
ghost 02612d098b delete getFoundHostPage method, update API version 1 year ago
ghost 11e02da66d memory usage optimization, rename methods, remove memcached dependency from the model 1 year ago
ghost cbabea595b rename method name 1 year ago
ghost 7e3248ca2c rename method name 1 year ago
ghost 772975059c add mysql conf example 1 year ago
ghost 7c407e0d1f update crontab example 1 year ago
ghost 1249e8d29c fix CRAWL_PAGE_RANK_UPDATE condition 1 year ago
ghost 5df59661d8 add optional page rank update to the crawl queue 1 year ago
ghost a5a2ec233e unify mime-based search results template 1 year ago
ghost 6d5901c101 display shortened page URL instead of host address, change column name 1 year ago
ghost 1d7deffc4c update PR generation, delegate PR value from redirecting pages, update method names 1 year ago
ghost bba718c901 remove host pages total column 1 year ago
ghost b7a48b905e update method names 1 year ago
ghost e65c24f6f3 update roadmap 1 year ago
ghost 1655ec63b2 skip xmpp links 1 year ago
ghost 06c136f05c fix meta/nofollow attribute processing 1 year ago
ghost 39ba77fce5 fix page info conditions 1 year ago
ghost ef170f62f3 update cli 1 year ago
ghost 43776b5ff4 fix semaphores 1 year ago
ghost 48e0482dbd update Filter::searchQuery method 1 year ago
ghost cc0cca346b allow empty search queries 1 year ago
ghost d119756a41 fix index size 1 year ago
ghost 662351cc46 make meta fields index separated, set search priority by document title 1 year ago
ghost 5791877a4e update Filter::searchQuery method, fix search by URL 1 year ago
ghost 0bda87fbe6 fix priority calculation on zero value in PR 1 year ago
ghost bf69d894ca change search results priority, add PR to the page weight 1 year ago
ghost d3c628b477 update Filter::searchQuery method 1 year ago
ghost 61a0652f51 update Filter::searchQuery method 1 year ago
ghost 3235133cd0 extract keywords from URI 1 year ago
ghost 3d6bc54b66 update Filter::searchQuery method 1 year ago
ghost 2ef9948342 change default CRAWL_PAGE_HOME_SECONDS_OFFSET value to 1 month 1 year ago