ghost
|
662351cc46
|
make meta fields index separated, set search priority by document title
|
2023-08-01 14:15:14 +03:00 |
|
ghost
|
5791877a4e
|
update Filter::searchQuery method, fix search by URL
|
2023-08-01 13:50:07 +03:00 |
|
ghost
|
3235133cd0
|
extract keywords from URI
|
2023-07-31 22:42:49 +03:00 |
|
ghost
|
2ef9948342
|
change default CRAWL_PAGE_HOME_SECONDS_OFFSET value to 1 month
|
2023-07-31 22:04:27 +03:00 |
|
ghost
|
9c0f361601
|
refactor snap storage
|
2023-07-31 13:33:30 +03:00 |
|
ghost
|
000b9ad8dd
|
add FS cleaning features, lock execution on active crontab tasks, disable hostPageSnap/localhost untested constructions
|
2023-07-30 21:53:30 +03:00 |
|
ghost
|
3e3b7ee2ef
|
optimize snaps, delete unused constructions
|
2023-07-30 19:09:41 +03:00 |
|
ghost
|
b13293988a
|
add search index by host and host page URL
|
2023-07-30 12:39:41 +03:00 |
|
ghost
|
712d67f6bf
|
implement unlimited snap storage mirrors, delete megaCMD integration
|
2023-07-29 14:37:01 +03:00 |
|
ghost
|
1dd0a8ee2c
|
make page rank procedural, optimize performance
|
2023-07-28 12:49:43 +03:00 |
|
ghost
|
4a4394fb27
|
add memcached support
|
2023-07-27 17:53:36 +03:00 |
|
ghost
|
2e2501b437
|
implement sitemap support
|
2023-07-27 11:44:42 +03:00 |
|
ghost
|
3218add372
|
add custom home page reindex settings
|
2023-06-30 13:28:22 +03:00 |
|
ghost
|
5346b13602
|
implement custom hostPageDom elements index
|
2023-06-25 22:10:47 +03:00 |
|
ghost
|
c07d6af52f
|
add new mime preset
|
2023-06-13 21:57:01 +03:00 |
|
ghost
|
830e96b03d
|
increase minimum requirements
|
2023-06-13 03:16:29 +03:00 |
|
ghost
|
dd736c7923
|
crontab schedule optimization
|
2023-06-10 00:19:27 +03:00 |
|
ghost
|
8726512cf0
|
change morphology from stem_enru to lemmatize_ru_all/lemmatize_en_all
|
2023-06-05 18:20:49 +03:00 |
|
ghost
|
8bc8a943e7
|
add lemmatize_de_all
|
2023-06-05 18:13:31 +03:00 |
|
ghost
|
17f69b9661
|
add min_word_len, min_prefix_len, html_strip, index_exact_words presets example
|
2023-06-05 13:36:15 +03:00 |
|
ghost
|
1e2736d67b
|
skip empty mime type index
|
2023-06-04 18:10:59 +03:00 |
|
ghost
|
4fa33afe40
|
prevent infinite connection when streaming resources are detected
|
2023-06-04 17:02:32 +03:00 |
|
ghost
|
982be2a949
|
add the description text source
|
2023-05-30 21:46:52 +03:00 |
|
ghost
|
cb60d52a0b
|
update documentation
|
2023-05-29 22:36:13 +03:00 |
|
ghost
|
45c4f7b7b0
|
add database optimization settings
|
2023-05-29 22:13:41 +03:00 |
|
ghost
|
2853db6207
|
fix mimes separator
|
2023-05-15 17:18:33 +03:00 |
|
ghost
|
f827c37691
|
add MEGAcmd/FTP launch examples
|
2023-05-15 11:51:27 +03:00 |
|
ghost
|
81f7ea1e1e
|
implement multi-storage snap downloads
|
2023-05-15 09:18:18 +03:00 |
|
ghost
|
1969707eeb
|
integrate optional MEGA/cmd snap storage
|
2023-05-14 19:41:20 +03:00 |
|
ghost
|
0d19004e86
|
optimize local snap storage
|
2023-05-14 01:45:55 +03:00 |
|
ghost
|
2f7d99079d
|
implement local snaps
|
2023-05-13 10:15:07 +03:00 |
|
ghost
|
d98b8f5c94
|
remove hostPageToHostPage.quantity field because it counts duplicates incorrectly on reindex
|
2023-05-13 06:30:40 +03:00 |
|
ghost
|
28e8bcf8d7
|
add audio/video media crawl support
|
2023-05-13 01:23:09 +03:00 |
|
ghost
|
566d3b442e
|
make mime details grouped
|
2023-05-10 23:37:24 +03:00 |
|
ghost
|
746cc228a9
|
update page rank query
|
2023-05-10 15:42:48 +03:00 |
|
ghost
|
db0e66c846
|
refactor to mime-based content index #1
|
2023-05-10 12:47:36 +03:00 |
|
ghost
|
e7c5e2ca9d
|
GROUP_CONCAT host image descriptions
|
2023-05-09 16:27:31 +03:00 |
|
ghost
|
28bf526d53
|
add host nsfw settings
|
2023-05-09 13:26:19 +03:00 |
|
ghost
|
d186fff48f
|
skip curl download when response data size limit is reached
|
2023-05-09 10:21:37 +03:00 |
|
ghost
|
23ead4e12c
|
update page / image description models, implement history snap crawling
|
2023-05-09 08:19:49 +03:00 |
|
ghost
|
77bd25f587
|
add line separators
|
2023-05-09 01:39:56 +03:00 |
|
ghost
|
0e9d29675f
|
implement host page description history crawling
|
2023-05-09 01:29:32 +03:00 |
|
ghost
|
e9d5137dfe
|
allow svg images mime content type
|
2023-05-08 13:00:37 +03:00 |
|
ghost
|
25b6bce2ec
|
add crawler/cleaner logs
|
2023-05-08 11:04:59 +03:00 |
|
ghost
|
fdd18de373
|
remove abstraction
|
2023-05-06 14:03:43 +03:00 |
|
ghost
|
4801360a51
|
update api version
|
2023-05-06 13:55:05 +03:00 |
|
ghost
|
b6605b9132
|
implement timed ban for unreachable resources to prevent extra HTTP requests
|
2023-05-06 08:45:37 +03:00 |
|
ghost
|
f88d2ee9ff
|
implement MIME content-type crawler filter
|
2023-05-05 21:25:57 +03:00 |
|
ghost
|
5999fb3a73
|
add distributed hosts crawling using yggo nodes manifest
|
2023-05-05 05:26:53 +03:00 |
|
ghost
|
297563d4a5
|
display related pages in priority to the unique host by rank, rand() order
|
2023-05-04 10:53:37 +03:00 |
|