Commit Graph

552 Commits

Author SHA1 Message Date
ghost 79878d17fe add crawler / proxy user agent settings 2023-05-04 07:38:22 +03:00
ghost 73f212e3d7 set crawler queue order priority to item rank, rand() 2023-05-04 06:55:05 +03:00
ghost 9ed8411d2f add image queue crawler 2023-05-04 06:45:04 +03:00
ghost d905e33b4f update host images info on search requests 2023-05-04 06:12:51 +03:00
ghost 68581960a3 add image.data field 2023-05-04 05:19:29 +03:00
ghost 9c24eda833 switch to native curl library 2023-05-04 04:56:25 +03:00
ghost 100d12c6ab update curl library constructor 2023-05-04 04:55:26 +03:00
ghost bb4e97eea3 use curl for image connections to prevent queue timeout 2023-05-04 04:42:07 +03:00
ghost 63b51f71c6 fix space offset 2023-05-04 04:20:54 +03:00
ghost f980b6318c add page meta to the image index 2023-05-04 04:20:20 +03:00
ghost 250e20bbcd remove separator 2023-05-04 04:19:38 +03:00
ghost 6b18202588 implement proxied image search #1 2023-05-04 03:48:57 +03:00
ghost baf78e2bf5 add hostImage examples to sphinx configuration 2023-05-04 01:34:12 +03:00
ghost 0741a3e9ef implement image crawler 2023-05-04 01:04:39 +03:00
ghost 78931ebc74 normalize host image description storage 2023-05-03 21:52:00 +03:00
ghost 1122cb9798 update DB prototype 2023-05-03 21:51:26 +03:00
ghost db617f9939 refactor image storage model 2023-05-03 21:27:15 +03:00
ghost 74fb0d50be add DB prototype scheme 2023-05-03 21:26:32 +03:00
ghost 1ee2ac4f0b add yggo:manifest namespace 2023-05-03 09:38:58 +03:00
ghost 56c79d8f3a update config documentation 2023-05-03 09:31:40 +03:00
ghost f8e0a50db6 add manifest url filter 2023-05-03 09:26:48 +03:00
ghost 6d8f4f4882 create manifests registry 2023-05-03 09:22:14 +03:00
ghost 219a56d6cd update manifest API 2023-05-03 05:47:02 +03:00
ghost eb3e70a7b7 fix robots.txt conditions 2023-05-03 04:17:58 +03:00
ghost d7bbf1d96a update default settings preset 2023-05-03 04:17:13 +03:00
ghost ec20435790 remove presets registry (because provided in the node API) 2023-05-03 04:13:32 +03:00
ghost 0bd765064b implement extended search mode support #9 2023-05-01 20:09:28 +03:00
ghost fb5cfe4f50 update readme 2023-05-01 19:23:51 +03:00
ghost c3ff4de3bb update readme 2023-05-01 19:10:42 +03:00
ghost b2d3cf1c13 update readme 2023-05-01 19:09:28 +03:00
ghost 84fd82f294 fix replacement typo #9 2023-05-01 19:03:14 +03:00
ghost d40b914983 add new chars quoting #9 2023-05-01 18:58:03 +03:00
ghost f7807cf43e add extended syntax filter to prevent sphinxql query error #9 2023-05-01 18:39:46 +03:00
ghost a5f5541395 skip robots:noindex page without extra actions 2023-04-29 08:58:48 +03:00
ghost 00140e30a8 update readme 2023-04-29 07:49:09 +03:00
ghost c592edbd82 update readme 2023-04-29 07:47:55 +03:00
ghost 9ae91ee187 remove phrase search mask, allow sphinx macroses 2023-04-29 07:41:59 +03:00
ghost e418ddcd32 fix data type 2023-04-25 21:20:35 +03:00
ghost 11aa404807 add metaYggo field index 2023-04-25 21:10:59 +03:00
ghost 0a199fce72 add project description and support links 2023-04-25 20:33:06 +03:00
ghost a16a13b395 add application mode settings 2023-04-25 20:25:12 +03:00
ghost 9396c52313 change manifest API key names 2023-04-25 19:53:52 +03:00
ghost e396a3a848 update readme 2023-04-25 19:44:25 +03:00
ghost 6ade7e9fcd update readme 2023-04-25 19:43:10 +03:00
ghost 957f15188b add CRAWL_PAGE_SECONDS_OFFSET info 2023-04-25 19:38:17 +03:00
ghost a2fc14c8cf implement manifest API 2023-04-25 19:35:52 +03:00
ghost 5875dd58c9 fix PR update condition 2023-04-25 18:19:22 +03:00
ghost 74dd15e544 add page rank sort order attribute 2023-04-25 17:07:57 +03:00
ghost 8671fc4bde implement page ranking 2023-04-25 16:54:01 +03:00
ghost 57f64f6b90 add hostPage weight and rank info 2023-04-25 16:53:13 +03:00