ghost | 8671fc4bde | implement page ranking | 2023-04-25 16:54:01 +03:00
ghost | 5936fa9a30 | fix quota check condition | 2023-04-23 04:31:32 +03:00
ghost | 8dbb4a06af | add disk quota validation | 2023-04-23 04:05:00 +03:00
ghost | dfbc6132c9 | fix robots:noindex condition, add robots:nofollow attribute support | 2023-04-09 15:25:15 +03:00
ghost | 5c8d299a4a | add meta:robots tag support #2 | 2023-04-09 03:28:31 +03:00
ghost | 8e8d89db0e | implement database cleaner | 2023-04-09 00:06:28 +03:00
ghost | 0484d43482 | fix trim path levels in the relative links | 2023-04-08 23:52:46 +03:00
ghost | df6f2a1869 | implement CRAWL_ROBOTS_POSTFIX_RULES configuration #5 | 2023-04-08 22:28:31 +03:00
ghost | b3c668706b | trim path levels in the relative links | 2023-04-08 19:14:04 +03:00
ghost | 71a3e7dd0e | skip x-raw-image links crawl | 2023-04-08 19:11:12 +03:00
ghost | 9b9d40a97c | skip javascript/mailto links index | 2023-04-07 05:19:32 +03:00
ghost | 2a843449e0 | add process locked notice to the debug output | 2023-04-07 04:58:56 +03:00
ghost | ce509ec0a8 | remove debug row | 2023-04-07 04:39:25 +03:00
ghost | 2495a2bbc7 | implement MySQL/Sphinx data model #3, add basic robots.txt support #2 | 2023-04-07 04:04:24 +03:00
ghost | 79663c84db | add CRAWL_META_ONLY option | 2023-04-03 03:07:54 +03:00
ghost | 04dbbc3adf | make url/src column ukeys digital by using crc32 | 2023-04-02 18:56:56 +03:00
ghost | b218b8bbc3 | make url/src columns unique keys, add insert/ignore construction | 2023-04-02 18:09:44 +03:00
ghost | 1485983b3a | lock multi-thread execution | 2023-04-02 00:27:33 +03:00
ghost | 72985eaf9e | initial commit | 2023-04-01 19:29:39 +03:00