Commit Graph

489 Commits

Author SHA1 Message Date
ghost 89d1b2230b update readme 2023-05-11 08:46:44 +03:00
ghost ced7d7c9d6 remove unused css construction 2023-05-11 08:00:46 +03:00
ghost 9ad03c8153 add meta data description 2023-05-11 07:40:09 +03:00
ghost b83ad6cc3a fix default mime 2023-05-11 01:45:36 +03:00
ghost acafdfcf3a add method filters 2023-05-11 01:34:09 +03:00
ghost 566d3b442e make mime details grouped 2023-05-10 23:37:24 +03:00
ghost 4486bdc215 show mime type options that match search results only 2023-05-10 20:37:05 +03:00
ghost 307ebcf0b1 add page description on title | description | keywords not empty, remove deprecated constructions 2023-05-10 19:35:01 +03:00
ghost 7c5ba050b2 fix media crawling 2023-05-10 18:35:18 +03:00
ghost 746cc228a9 update page rank query 2023-05-10 15:42:48 +03:00
ghost 0fed16621a fix mime content type update 2023-05-10 14:47:33 +03:00
ghost 34e25a1d94 update readme 2023-05-10 14:32:36 +03:00
ghost db0e66c846 refactor to mime-based content index #1 2023-05-10 12:47:36 +03:00
ghost 272a885039 add line separator 2023-05-09 16:37:56 +03:00
ghost 12c33d8ed6 add line separator 2023-05-09 16:34:33 +03:00
ghost c13842b6c0 remove extra query 2023-05-09 16:30:36 +03:00
ghost e7c5e2ca9d GROUP_CONCAT host image descriptions 2023-05-09 16:27:31 +03:00
ghost 0ffcee1efb fix image description updates timing 2023-05-09 15:53:21 +03:00
ghost 2c5ca1b630 fix image description duplicate 2023-05-09 15:23:32 +03:00
ghost 1c7cca1446 fix UNIQUE index relation 2023-05-09 14:10:08 +03:00
ghost 28bf526d53 add host nsfw settings 2023-05-09 13:26:19 +03:00
ghost 8ce0324e94 convert page data to string 2023-05-09 12:52:07 +03:00
ghost dfca5570c6 remove unused construction 2023-05-09 12:10:42 +03:00
ghost 7dc7c89d9e update readme 2023-05-09 10:22:41 +03:00
ghost d186fff48f skip curl download on response data size reached 2023-05-09 10:21:37 +03:00
ghost d7a5f7ef84 remove content filter, snap raw the data 2023-05-09 09:02:17 +03:00
ghost ef4de6b245 fix image search page errors 2023-05-09 08:53:33 +03:00
ghost 377d4935ad update readme 2023-05-09 08:28:09 +03:00
ghost 23ead4e12c update page / image description models, implement history snap crawling 2023-05-09 08:19:49 +03:00
ghost 77bd25f587 add line separators 2023-05-09 01:39:56 +03:00
ghost 0e9d29675f implement host page description history crawling 2023-05-09 01:29:32 +03:00
ghost 6371def666 fix attributes passing 2023-05-08 17:52:17 +03:00
ghost 32d0f390d3 update http code and mime type on page/image ban event 2023-05-08 14:13:53 +03:00
ghost 84dcecf50b add svg images support, fix mime validation 2023-05-08 13:12:16 +03:00
ghost e9d5137dfe allow svg images mime content type 2023-05-08 13:00:37 +03:00
ghost e6da2e729a fix images ban update 2023-05-08 13:00:02 +03:00
ghost 8fbd7f3516 count totals using sphinx index instead of database 2023-05-08 12:28:49 +03:00
ghost bf1eeb332c fix page/image mime content type detection 2023-05-08 12:10:57 +03:00
ghost 25b6bce2ec add crawler/cleaner logs 2023-05-08 11:04:59 +03:00
ghost dcdc2c50ad update debug string names 2023-05-08 08:31:34 +03:00
ghost ea04220de3 add curl requests debug 2023-05-08 08:27:21 +03:00
ghost 1aba060d34 fix variable name 2023-05-08 07:23:50 +03:00
ghost fdd18de373 remove abstraction 2023-05-06 14:03:43 +03:00
ghost 4801360a51 update api version 2023-05-06 13:55:05 +03:00
ghost 6c41dd5831 fix ban time update / count affected rows only 2023-05-06 10:11:25 +03:00
ghost 20514c455f add banned items counters 2023-05-06 08:50:41 +03:00
ghost b6605b9132 implement not reachable resources ban feature with timeout to prevent extra http requests 2023-05-06 08:45:37 +03:00
ghost cfa5d01db1 update readme 2023-05-06 07:33:34 +03:00
ghost 702a14b634 add mime content type crawling #1 2023-05-06 07:25:54 +03:00
ghost 0bd95d7f4d fix comments 2023-05-05 21:39:48 +03:00