ghost
|
3e3b7ee2ef
|
optimize snaps, delete unused constructions
|
2023-07-30 19:09:41 +03:00 |
|
ghost
|
712d67f6bf
|
implement unlimited snap storage mirrors, delete megaCMD integration
|
2023-07-29 14:37:01 +03:00 |
|
ghost
|
1dd0a8ee2c
|
make page rank procedural, optimize performance
|
2023-07-28 12:49:43 +03:00 |
|
ghost
|
5346b13602
|
implement custom hostPageDom elements index
|
2023-06-25 22:10:47 +03:00 |
|
ghost
|
0949d7f871
|
set default encoding
|
2023-06-14 02:20:09 +03:00 |
|
ghost
|
ab78e17ca8
|
add hostPage.size collection
|
2023-06-13 12:45:12 +03:00 |
|
ghost
|
7892784f5c
|
add httpCode column to hostPageSnapDownload table
|
2023-06-12 13:34:25 +03:00 |
|
ghost
|
0af5d165d3
|
remove unused logCrawler column
|
2023-06-05 22:06:55 +03:00 |
|
ghost
|
81f7ea1e1e
|
implement multi-storage snap downloads
|
2023-05-15 09:18:18 +03:00 |
|
ghost
|
1969707eeb
|
integrate optional MEGA/cmd snap storage
|
2023-05-14 19:41:20 +03:00 |
|
ghost
|
0d19004e86
|
optimize local snap storage
|
2023-05-14 01:45:55 +03:00 |
|
ghost
|
2f7d99079d
|
implement local snaps
|
2023-05-13 10:15:07 +03:00 |
|
ghost
|
d98b8f5c94
|
remove hostPageToHostPage.quantity field because it miscounts duplicates on reindex
|
2023-05-13 06:30:40 +03:00 |
|
ghost
|
db0e66c846
|
refactor to mime-based content index #1
|
2023-05-10 12:47:36 +03:00 |
|
ghost
|
2c5ca1b630
|
fix image description duplicate
|
2023-05-09 15:23:32 +03:00 |
|
ghost
|
1c7cca1446
|
fix UNIQUE index relation
|
2023-05-09 14:10:08 +03:00 |
|
ghost
|
28bf526d53
|
add host nsfw settings
|
2023-05-09 13:26:19 +03:00 |
|
ghost
|
23ead4e12c
|
update page / image description models, implement history snap crawling
|
2023-05-09 08:19:49 +03:00 |
|
ghost
|
0e9d29675f
|
implement host page description history crawling
|
2023-05-09 01:29:32 +03:00 |
|
ghost
|
25b6bce2ec
|
add crawler/cleaner logs
|
2023-05-08 11:04:59 +03:00 |
|
ghost
|
b6605b9132
|
implement timed ban of unreachable resources to prevent extra HTTP requests
|
2023-05-06 08:45:37 +03:00 |
|
ghost
|
702a14b634
|
add mime content type crawling #1
|
2023-05-06 07:25:54 +03:00 |
|
ghost
|
5999fb3a73
|
add distributed hosts crawling using yggo nodes manifest
|
2023-05-05 05:26:53 +03:00 |
|
ghost
|
d4f66c83e7
|
fix image crawling errors
|
2023-05-04 08:51:45 +03:00 |
|
ghost
|
68581960a3
|
add image.data field
|
2023-05-04 05:19:29 +03:00 |
|
ghost
|
0741a3e9ef
|
implement image crawler
|
2023-05-04 01:04:39 +03:00 |
|
ghost
|
78931ebc74
|
normalize host image description storage
|
2023-05-03 21:52:00 +03:00 |
|
ghost
|
db617f9939
|
refactor image storage model
|
2023-05-03 21:27:15 +03:00 |
|
ghost
|
6d8f4f4882
|
create manifests registry
|
2023-05-03 09:22:14 +03:00 |
|
ghost
|
ec20435790
|
remove presets registry (already provided by the node API)
|
2023-05-03 04:13:32 +03:00 |
|
ghost
|
11aa404807
|
add metaYggo field index
|
2023-04-25 21:10:59 +03:00 |
|
ghost
|
352466ad03
|
update host.robotsPostfix registry
|
2023-04-10 03:19:08 +03:00 |
|
ghost
|
6550eb310f
|
update host.robotsPostfix rules
|
2023-04-09 03:10:42 +03:00 |
|
ghost
|
6cee58214e
|
update host.robotsPostfix rules
|
2023-04-09 03:05:43 +03:00 |
|
ghost
|
6f4daf7a25
|
update host.robotsPostfix rule
|
2023-04-09 02:19:07 +03:00 |
|
ghost
|
f4db66d53f
|
add new host.robotsPostfix rules
|
2023-04-09 02:14:13 +03:00 |
|
ghost
|
be7eae501b
|
add host.status registry #1, #5
|
2023-04-09 00:28:51 +03:00 |
|
ghost
|
3c9bc1adaa
|
add required user-agent construction #5
|
2023-04-09 00:02:31 +03:00 |
|
ghost
|
b819fda025
|
init yggdrasil robots.txt registry #5
|
2023-04-08 22:29:33 +03:00 |
|
ghost
|
df6f2a1869
|
implement CRAWL_ROBOTS_POSTFIX_RULES configuration #5
|
2023-04-08 22:28:31 +03:00 |
|
ghost
|
2495a2bbc7
|
implement MySQL/Sphinx data model #3, add basic robots.txt support #2
|
2023-04-07 04:04:24 +03:00 |
|