Commit Graph

50 Commits

Author SHA1 Message Date
ghost 2853db6207 fix mimes separator 2023-05-15 17:18:33 +03:00
ghost 81f7ea1e1e implement multi-storage snap downloads 2023-05-15 09:18:18 +03:00
ghost 1969707eeb integrate optional MEGA/cmd snap storage 2023-05-14 19:41:20 +03:00
ghost 0d19004e86 make local snap storage optimization 2023-05-14 01:45:55 +03:00
ghost 2f7d99079d implement local snaps 2023-05-13 10:15:07 +03:00
ghost 28e8bcf8d7 add audio/video media crawl support 2023-05-13 01:23:09 +03:00
ghost db0e66c846 refactor to mime-based content index #1 2023-05-10 12:47:36 +03:00
ghost 28bf526d53 add host nsfw settings 2023-05-09 13:26:19 +03:00
ghost d186fff48f skip curl download on response data size reached 2023-05-09 10:21:37 +03:00
ghost 23ead4e12c update page / image description models, implement history snap crawling 2023-05-09 08:19:49 +03:00
ghost e9d5137dfe allow svg images mime content type 2023-05-08 13:00:37 +03:00
ghost 25b6bce2ec add crawler/cleaner logs 2023-05-08 11:04:59 +03:00
ghost fdd18de373 remove abstraction 2023-05-06 14:03:43 +03:00
ghost 4801360a51 update api version 2023-05-06 13:55:05 +03:00
ghost b6605b9132 implement not reachable resources ban feature with timeout to prevent extra http requests 2023-05-06 08:45:37 +03:00
ghost f88d2ee9ff implement MIME content-type crawler filter 2023-05-05 21:25:57 +03:00
ghost 5999fb3a73 add distributed hosts crawling using yggo nodes manifest 2023-05-05 05:26:53 +03:00
ghost 297563d4a5 display related pages in priority to the unique host by rank, rand() order 2023-05-04 10:53:37 +03:00
ghost 834ac68cce create separated pagination settings for page/image search types 2023-05-04 09:20:34 +03:00
ghost 79878d17fe add crawler / proxy user agent settings 2023-05-04 07:38:22 +03:00
ghost 9ed8411d2f add image queue crawler 2023-05-04 06:45:04 +03:00
ghost d905e33b4f update host images info on search requests 2023-05-04 06:12:51 +03:00
ghost 0741a3e9ef implement image crawler 2023-05-04 01:04:39 +03:00
ghost 56c79d8f3a update config documentation 2023-05-03 09:31:40 +03:00
ghost 6d8f4f4882 create manifests registry 2023-05-03 09:22:14 +03:00
ghost 219a56d6cd update manifest API 2023-05-03 05:47:02 +03:00
ghost d7bbf1d96a update default settings preset 2023-05-03 04:17:13 +03:00
ghost 0a199fce72 add project description and support links 2023-04-25 20:33:06 +03:00
ghost a16a13b395 add application mode settings 2023-04-25 20:25:12 +03:00
ghost a2fc14c8cf implement manifest API 2023-04-25 19:35:52 +03:00
ghost afd4375e4d add hostPagesTotal info to the hosts API 2023-04-25 16:01:55 +03:00
ghost 1d7031e4f7 make protocol settings adaptive 2023-04-24 02:32:03 +03:00
ghost 8dbb4a06af add disk quota validation 2023-04-23 04:05:00 +03:00
ghost 13431008c4 add options documentation 2023-04-23 03:16:54 +03:00
ghost 9916fb701f implement basic api 2023-04-23 03:01:51 +03:00
ghost 81cb970248 add options documentation 2023-04-23 01:54:10 +03:00
ghost 8da150b295 add options documentation 2023-04-23 01:46:34 +03:00
ghost 8f09db5045 add options documentation 2023-04-23 01:32:34 +03:00
ghost c4dfb58fe3 add options documentation 2023-04-23 01:14:31 +03:00
ghost 8e8d89db0e implement database cleaner 2023-04-09 00:06:28 +03:00
ghost df6f2a1869 implement CRAWL_ROBOTS_POSTFIX_RULES configuration #5 2023-04-08 22:28:31 +03:00
ghost 2495a2bbc7 implement MySQL/Sphinx data model #3, add basical robots.txt support #2 2023-04-07 04:04:24 +03:00
ghost a07ca1dce1 add ipv6 example 2023-04-04 01:39:48 +03:00
ghost 79663c84db add CRAWL_META_ONLY option 2023-04-03 03:07:54 +03:00
ghost ff95df72c1 implement hostname identicons 2023-04-03 01:30:09 +03:00
ghost 4ea01bf8b4 implement search results pagination 2023-04-02 23:36:35 +03:00
ghost c770a912f0 fix crawl request warnings 2023-04-02 18:14:42 +03:00
ghost aadfe7f551 add env-less option 2023-04-02 18:08:03 +03:00
ghost 1f9b1503b9 add server environment configuration to keep the multi-adresing support 2023-04-02 01:51:54 +03:00
ghost 72985eaf9e initial commit 2023-04-01 19:29:39 +03:00