49 Commits (fdd18de37334d7441af93e8fdb41c7e18012e99c)

Author SHA1 Message Date
ghost fdd18de373 remove abstraction 2 years ago
ghost 4801360a51 update api version 2 years ago
ghost b6605b9132 implement unreachable resources ban feature with timeout to prevent extra HTTP requests 2 years ago
ghost f88d2ee9ff implement MIME content-type crawler filter 2 years ago
ghost 5999fb3a73 add distributed hosts crawling using yggo nodes manifest 2 years ago
ghost 297563d4a5 display related pages in priority to the unique host by rank, rand() order 2 years ago
ghost 834ac68cce create separate pagination settings for page/image search types 2 years ago
ghost 79878d17fe add crawler / proxy user agent settings 2 years ago
ghost 9ed8411d2f add image queue crawler 2 years ago
ghost d905e33b4f update host images info on search requests 2 years ago
ghost 63b51f71c6 fix space offset 2 years ago
ghost f980b6318c add page meta to the image index 2 years ago
ghost baf78e2bf5 add hostImage examples to sphinx configuration 2 years ago
ghost 0741a3e9ef implement image crawler 2 years ago
ghost 56c79d8f3a update config documentation 2 years ago
ghost 6d8f4f4882 create manifests registry 2 years ago
ghost 219a56d6cd update manifest API 2 years ago
ghost d7bbf1d96a update default settings preset 2 years ago
ghost 0a199fce72 add project description and support links 2 years ago
ghost a16a13b395 add application mode settings 2 years ago
ghost a2fc14c8cf implement manifest API 2 years ago
ghost 74dd15e544 add page rank sort order attribute 2 years ago
ghost d20487acfd add stem_enru, stem_cz, stem_ar morphology support 2 years ago
ghost 2a79671cf1 add missing option example 2 years ago
ghost afd4375e4d add hostPagesTotal info to the hosts API 2 years ago
ghost 1d7031e4f7 make protocol settings adaptive 2 years ago
ghost 3917ca8d4f move crontab configuration example to the config directory 2 years ago
ghost 8dbb4a06af add disk quota validation 2 years ago
ghost 13431008c4 add options documentation 2 years ago
ghost 9916fb701f implement basic api 2 years ago
ghost 81cb970248 add options documentation 2 years ago
ghost 8da150b295 add options documentation 2 years ago
ghost 8f09db5045 add options documentation 2 years ago
ghost c4dfb58fe3 add options documentation 2 years ago
ghost 8e8d89db0e implement database cleaner 2 years ago
ghost df6f2a1869 implement CRAWL_ROBOTS_POSTFIX_RULES configuration #5 2 years ago
ghost 8d102ecdf7 index hosts with enabled status only 2 years ago
ghost 0b12e872a3 add host name to the search index 2 years ago
ghost e98146b78b index only 200 http code pages 2 years ago
ghost 0f2b772fa8 remove not indexed pages from the search index 2 years ago
ghost 2495a2bbc7 implement MySQL/Sphinx data model #3, add basic robots.txt support #2 2 years ago
ghost a07ca1dce1 add ipv6 example 2 years ago
ghost 79663c84db add CRAWL_META_ONLY option 2 years ago
ghost ff95df72c1 implement hostname identicons 2 years ago
ghost 4ea01bf8b4 implement search results pagination 2 years ago
ghost c770a912f0 fix crawl request warnings 2 years ago
ghost aadfe7f551 add env-less option 2 years ago
ghost 1f9b1503b9 add server environment configuration to keep the multi-addressing support 2 years ago
ghost 72985eaf9e initial commit 2 years ago