97 Commits (dc60f0376fd039985f597376c92282be7517aba4)

Author    SHA1        Message  Date
yggverse  3884f375d4  save document body text to index  9 months ago
ghost     1c2e8dafb2  collect keywords from document headers  11 months ago
ghost     cfbc84cbaf  sort queue by rank asc  11 months ago
ghost     db9dc8d4ba  force results to string  11 months ago
ghost     50dc9d315a  add rank field  11 months ago
ghost     6f4abe4729  set crc32url as document id  11 months ago
ghost     93baed4b90  delete deprecated documents with HTTP code not 200 on second scan  12 months ago
ghost     33cc778999  crawl newest pages by rand in queue  1 year ago
ghost     35ad144a9e  add stripos url rules for crawl snaps  1 year ago
ghost     0e06ff3c0f  fix debug message  1 year ago
ghost     51d52dea7d  fix destination name  1 year ago
ghost     87ca594860  add debug levels  1 year ago
ghost     33d657cb72  apply sleep on timeout value provided only  1 year ago
ghost     bc00f0c851  make tmp subfolders storage optimization  1 year ago
ghost     f613b44d3f  disable sort by RAND() in crawler queue  1 year ago
ghost     d3f8d1c0e3  fix result output  1 year ago
ghost     86b20cbc51  add debug output on skip condition  1 year ago
ghost     3306dc1961  add skip url filter by stripos condition  1 year ago
ghost     ee074b684a  add semaphore namespace prefix  1 year ago
ghost     27946ff27c  define missed crc32url field value  1 year ago
ghost     38fbc32151  fix document fields update  1 year ago
ghost     08995e6199  randomize new pages queue  1 year ago
ghost     6a9117757b  reset http code to 404 on page index initiation  1 year ago
ghost     015221eafb  fix semaphore condition #5  1 year ago
ghost     a499c363f6  prevent multi-thread execution #5  1 year ago
ghost     2961045c76  implement index cleaner tool #5  1 year ago
ghost     02dd3649a7  add CURL options that prevent crawl queue stuck  1 year ago
ghost     349f26f5ea  update option name  1 year ago
ghost     133548a98c  fix url check conditions  1 year ago
ghost     6f21cb8bf2  add missed crc32url value  1 year ago
ghost     01437065e3  fix duplicates validation  1 year ago
ghost     dfb2c06738  add crc32url filter  1 year ago
ghost     8a827bfcdf  update settings definition  1 year ago
ghost     a50ef908e2  draft alter index tool  1 year ago
ghost     192e45103d  add index settings support  1 year ago
ghost     4c3038e733  fix processed offset  1 year ago
ghost     10b08215d0  fix data types  1 year ago
ghost     b7444b8f12  add queue offset / limit attributes  1 year ago
ghost     da365c1ab1  fix total condition  1 year ago
ghost     3448eb85f7  implement yggo db migration cli tool  1 year ago
ghost     875382c56e  implement FTP snaps  1 year ago
ghost     72f2fdaeca  change config location  1 year ago
ghost     c6e9ba9d09  implement local storage feature with tar.gz compression  1 year ago
ghost     dc807fe4d5  add url trim  1 year ago
ghost     01753b0557  add crawl queue delay support  1 year ago
ghost     13cf61b42c  fix debug output  1 year ago
ghost     7dfc800a67  initial commit  1 year ago