Commit Graph

555 Commits

Author SHA1 Message Date
ghost
3c9bc1adaa add required user-agent construction #5 2023-04-09 00:02:31 +03:00
ghost
0484d43482 fix trim path levels in the relative links 2023-04-08 23:52:46 +03:00
ghost
b819fda025 init yggdrasil robots.txt registry #5 2023-04-08 22:29:33 +03:00
ghost
df6f2a1869 implement CRAWL_ROBOTS_POSTFIX_RULES configuration #5 2023-04-08 22:28:31 +03:00
ghost
505544c8c9 add affiliate link 2023-04-08 20:13:13 +03:00
ghost
b3c668706b trim path levels in the relative links 2023-04-08 19:14:04 +03:00
ghost
71a3e7dd0e skip x-raw-image links crawl 2023-04-08 19:11:12 +03:00
ghost
50b6e90380 Merge branch 'main' of https://github.com/YGGverse/YGGo into main 2023-04-08 18:23:51 +03:00
ghost
8d102ecdf7 index hosts with enabled status only 2023-04-08 18:23:48 +03:00
ghost
0b12e872a3 add host name to the search index 2023-04-08 18:22:53 +03:00
d47081
a29d6d5d0a Update README.md 2023-04-07 18:24:50 +03:00
ghost
ab71b3823a update readme 2023-04-07 15:03:00 +03:00
ghost
e98146b78b index only 200 http code pages 2023-04-07 05:34:45 +03:00
ghost
9b9d40a97c skip javascript/mailto links index 2023-04-07 05:19:32 +03:00
ghost
2a843449e0 add process locked notice to the debug output 2023-04-07 04:58:56 +03:00
ghost
0f2b772fa8 remove not indexed pages from the search index 2023-04-07 04:50:01 +03:00
ghost
ce509ec0a8 remove debug row 2023-04-07 04:39:25 +03:00
ghost
2495a2bbc7 implement MySQL/Sphinx data model #3, add basical robots.txt support #2 2023-04-07 04:04:24 +03:00
d47081
a14d18fedb Update README.md 2023-04-05 19:28:58 +03:00
d47081
4bb3e26c7b Update README.md 2023-04-05 19:22:58 +03:00
d47081
9b8bd6d277 Update README.md 2023-04-05 19:20:51 +03:00
d47081
f25e95cb79 Update README.md 2023-04-05 19:19:39 +03:00
d47081
ceed482bd4 Update README.md 2023-04-05 19:18:51 +03:00
d47081
006460381b Update README.md 2023-04-05 17:54:46 +03:00
d47081
e8059d94ec Update README.md 2023-04-05 16:11:03 +03:00
d47081
2c08604125 Update README.md 2023-04-04 01:46:06 +03:00
d47081
9377a8d0aa Update README.md 2023-04-04 01:44:37 +03:00
d47081
9d01f9ab72 Update README.md 2023-04-04 01:43:29 +03:00
d47081
2f99dcb0d7 Update README.md 2023-04-04 01:43:15 +03:00
ghost
a07ca1dce1 add ipv6 example 2023-04-04 01:39:48 +03:00
ghost
c9cd38f6ac update variable names #2 2023-04-04 01:38:32 +03:00
ghost
ed2d4047b4 implement robots.txt library #2 2023-04-04 00:27:32 +03:00
ghost
183ad99ccd change repository address 2023-04-03 17:56:51 +03:00
ghost
e7e4bb686c fix curl exec double call 2023-04-03 04:47:31 +03:00
ghost
79663c84db add CRAWL_META_ONLY option 2023-04-03 03:07:54 +03:00
ghost
dc55dcb9b5 Merge branch 'main' of https://github.com/d47081/YGGo into main 2023-04-03 02:04:29 +03:00
ghost
f0516126e2 add image storage cache folder 2023-04-03 02:04:25 +03:00
d47081
014b56ab03 Update README.md 2023-04-03 02:00:49 +03:00
d47081
60947dbf6e Update README.md 2023-04-03 01:59:34 +03:00
ghost
bac2ffa635 add image dimentions for low connection UI optimization 2023-04-03 01:57:52 +03:00
ghost
abe927b4bc Merge branch 'main' of https://github.com/d47081/YGGo into main 2023-04-03 01:55:28 +03:00
ghost
5c55ee0e3f update search page style 2023-04-03 01:55:26 +03:00
d47081
5d6e50941c Update README.md 2023-04-03 01:41:13 +03:00
ghost
74578b7aad fix extension to webp 2023-04-03 01:38:01 +03:00
ghost
18edb66ab0 urlencode identicon requests 2023-04-03 01:33:23 +03:00
ghost
ff95df72c1 implement hostname identicons 2023-04-03 01:30:09 +03:00
ghost
616fe37d2e update home page screenshot 2023-04-02 23:57:41 +03:00
ghost
c00e4c7e70 fix empty request title 2023-04-02 23:49:04 +03:00
ghost
b1e695d328 Merge branch 'main' of https://github.com/d47081/YGGo into main 2023-04-02 23:39:14 +03:00
ghost
a3bdccddd6 fix pagination link condition 2023-04-02 23:39:13 +03:00