ghost
|
034a683df7
|
add YGGstate DB crawl integration
|
2023-08-07 00:13:04 +03:00 |
|
ghost
|
3d9db381e8
|
fix CRAWL_MANIFEST_API_VERSION
|
2023-08-06 21:27:56 +03:00 |
|
ghost
|
3c3443b3fd
|
freeze crawl when the remote storage connection is lost; retry every 60 seconds until storage is connected again
|
2023-08-06 17:57:42 +03:00 |
|
ghost
|
872ea25d00
|
remove deprecated condition
|
2023-08-05 22:00:26 +03:00 |
|
ghost
|
fff75d4d86
|
update debug message
|
2023-08-05 21:58:18 +03:00 |
|
ghost
|
6eefd9b762
|
fix undefined variable
|
2023-08-05 21:57:11 +03:00 |
|
ghost
|
e953c01eaa
|
update debug message
|
2023-08-05 21:55:37 +03:00 |
|
ghost
|
bd212edb97
|
update debug message
|
2023-08-05 21:52:26 +03:00 |
|
ghost
|
1b287c8d28
|
update debug message
|
2023-08-05 21:40:59 +03:00 |
|
ghost
|
562b97ba8f
|
update debug message
|
2023-08-05 21:39:44 +03:00 |
|
ghost
|
c5ae6974bd
|
fix PDO calls
|
2023-08-05 21:36:28 +03:00 |
|
ghost
|
b3ec1d42a7
|
fix empty URI processing
|
2023-08-05 21:31:33 +03:00 |
|
ghost
|
7ddb47619a
|
update debug message
|
2023-08-05 21:17:05 +03:00 |
|
ghost
|
9fe33a3b2c
|
update CLI roadmap
|
2023-08-05 21:16:09 +03:00 |
|
ghost
|
6e069a86e5
|
update readme
|
2023-08-05 21:11:40 +03:00 |
|
ghost
|
513addc7af
|
add query totals counting, update crawler debug
|
2023-08-05 21:03:45 +03:00 |
|
ghost
|
6e03a76ed8
|
add CURLOPT_SSL_VERIFYHOST/CURLOPT_SSL_VERIFYPEER options
|
2023-08-05 20:24:47 +03:00 |
|
ghost
|
004a5336de
|
remove HTML page ban when title tag is not available
|
2023-08-05 20:01:31 +03:00 |
|
ghost
|
f9774f2431
|
add innodb_buffer_pool_size default value
|
2023-08-05 19:51:30 +03:00 |
|
ghost
|
de28d85a71
|
add connection exceptions
|
2023-08-05 19:39:49 +03:00 |
|
ghost
|
142d496108
|
fix SQL syntax error
|
2023-08-05 19:31:29 +03:00 |
|
ghost
|
d46c4921c5
|
add page break
|
2023-08-05 19:24:32 +03:00 |
|
ghost
|
80b33f619c
|
fix PAGES_LIMIT condition
|
2023-08-05 19:24:21 +03:00 |
|
ghost
|
d024ffd770
|
implement unlimited settings customization for each host
|
2023-08-05 19:06:39 +03:00 |
|
ghost
|
ab6c0379c8
|
implement hosts crawl queue, move robots, sitemaps, manifests to this task
|
2023-08-04 09:32:12 +03:00 |
|
ghost
|
6ee5e53ef4
|
show sitemaps processed debug
|
2023-08-04 09:07:46 +03:00 |
|
ghost
|
71724ae33f
|
refactor manifest crawling
|
2023-08-04 09:00:03 +03:00 |
|
ghost
|
cb37c57bc4
|
rename example files
|
2023-08-03 18:49:29 +03:00 |
|
ghost
|
68d5820f30
|
reserve one hour for huge load operations
|
2023-08-03 18:47:39 +03:00 |
|
ghost
|
efbbf19601
|
fix multimedia snaps
|
2023-08-03 17:41:55 +03:00 |
|
ghost
|
6862fb35cd
|
update readme
|
2023-08-03 15:33:34 +03:00 |
|
ghost
|
282a6d609d
|
update manifest API
|
2023-08-03 15:31:57 +03:00 |
|
ghost
|
b24d31f360
|
refactor cleaner, delegate tasks to crawler, init hostSetting table
|
2023-08-03 15:25:38 +03:00 |
|
ghost
|
fd90e2d517
|
keep banned pages data
|
2023-08-03 14:31:06 +03:00 |
|
ghost
|
ab8b6f6315
|
rename variables
|
2023-08-03 14:24:37 +03:00 |
|
ghost
|
02612d098b
|
delete getFoundHostPage method, update API version
|
2023-08-03 14:08:45 +03:00 |
|
ghost
|
11e02da66d
|
memory usage optimization, rename methods, remove memcached dependency from the model
|
2023-08-03 10:48:27 +03:00 |
|
ghost
|
cbabea595b
|
rename method name
|
2023-08-03 10:26:37 +03:00 |
|
ghost
|
7e3248ca2c
|
rename method name
|
2023-08-03 10:26:14 +03:00 |
|
ghost
|
772975059c
|
add mysql conf example
|
2023-08-03 09:25:43 +03:00 |
|
ghost
|
7c407e0d1f
|
update crontab example
|
2023-08-03 08:34:51 +03:00 |
|
ghost
|
1249e8d29c
|
fix CRAWL_PAGE_RANK_UPDATE condition
|
2023-08-02 21:22:31 +03:00 |
|
ghost
|
5df59661d8
|
add optional page rank update to the crawl queue
|
2023-08-02 21:21:23 +03:00 |
|
ghost
|
a5a2ec233e
|
unify mime-based search results template
|
2023-08-02 17:29:02 +03:00 |
|
ghost
|
6d5901c101
|
display shortened page URL instead of host address, change column name
|
2023-08-02 15:47:44 +03:00 |
|
ghost
|
1d7deffc4c
|
update PR generation, delegate PR value from redirecting pages, update method names
|
2023-08-02 15:43:44 +03:00 |
|
ghost
|
bba718c901
|
remove host pages total column
|
2023-08-02 15:36:26 +03:00 |
|
ghost
|
b7a48b905e
|
update method names
|
2023-08-02 14:25:48 +03:00 |
|
ghost
|
e65c24f6f3
|
update roadmap
|
2023-08-02 12:44:10 +03:00 |
|
ghost
|
1655ec63b2
|
skip xmpp links
|
2023-08-02 11:57:54 +03:00 |
|