Commit Graph

  • bbaeb50f60 update readme main ghost 2023-11-25 19:37:56 +0200
  • ca92db8826 add transliteration support in search requests #13 ghost 2023-10-28 01:00:00 +0300
  • 57332b1449 fix word-break ghost 2023-10-28 00:53:21 +0300
  • 91dcf2d2e9 fix status code container #10 ghost 2023-10-25 18:40:13 +0300
  • 400233bfe6 allow non 200 codes in search results (page could be available in explore mode / snap cache) #10 ghost 2023-10-25 18:33:04 +0300
  • 92a93e611e lowercase 'search' button ghost 2023-10-18 20:33:19 +0300
  • b23c663023 update paddings ghost 2023-10-18 20:31:23 +0300
  • 77e9a0294f cache getTopHostPages method results for 24 hours ghost 2023-10-18 20:27:12 +0300
  • 9e71a4538b add number_format for pages total ghost 2023-10-18 20:17:50 +0300
  • 3563580fa9 update font size ghost 2023-10-18 18:11:56 +0300
  • ea7d1a41ed update paddings ghost 2023-10-18 18:00:32 +0300
  • de712181a4 quote ipv6 url ghost 2023-10-18 17:48:07 +0300
  • 1ac3c18a26 update readme ghost 2023-10-17 18:50:14 +0300
  • 377fd5a941 update readme ghost 2023-10-17 18:49:19 +0300
  • 7c97b32dd5 update readme ghost 2023-10-17 18:45:11 +0300
  • 71c16c9c19 add database snaps link ghost 2023-10-16 22:05:39 +0300
  • 669743592e update readme ghost 2023-10-16 22:03:17 +0300
  • 16a99347db update readme ghost 2023-10-16 22:01:41 +0300
  • e218021ccc update readme ghost 2023-09-07 21:19:45 +0300
  • ece0f03385 relate exception processing with #11 ghost 2023-09-06 14:32:21 +0300
  • a1e2721849 skip links collect with rel=nofollow attribute ghost 2023-09-06 00:34:59 +0300
  • e576cb69db update readme ghost 2023-09-02 16:32:17 +0300
  • 0186d8705b change identicon library to jidenticon ghost 2023-09-02 16:31:00 +0300
  • ebe42dfe18 update readme ghost 2023-08-30 12:06:55 +0300
  • f26edf5af6 update readme ghost 2023-08-30 12:06:15 +0300
  • f9cf414901 reduce quantity of http requests for each page in queue by CRAWL_HOST_PAGE_SECONDS_DELAY setting ghost 2023-08-17 18:56:29 +0300
  • 468ef50ee3 delete deprecated constructions ghost 2023-08-17 18:43:19 +0300
  • eccb7ea241 refactor hostPageDom tables, add multiple selectors and children values support ghost 2023-08-17 18:32:48 +0300
  • 42b34d0783 fix settings processing, remove unused variables ghost 2023-08-17 15:09:56 +0300
  • b1bfd79b80 change DEFAULT_HOST_URL_REGEXP check from host to page URL ghost 2023-08-17 14:59:00 +0300
  • a8ffe14349 implement 'hostPage add' CLI method ghost 2023-08-17 14:58:06 +0300
  • 56c376474f fix foreach continue level ghost 2023-08-17 14:10:17 +0300
  • 1012759c65 update config example ghost 2023-08-17 14:02:11 +0300
  • 88d2b16699 implement hostPageDom delete action ghost 2023-08-17 13:43:36 +0300
  • 3a9d78b7c4 implement hostPageDom delete action ghost 2023-08-17 13:43:27 +0300
  • 0f127ddb91 upgrade hostPageDom crawler to Symfony\Component\DomCrawler ghost 2023-08-17 13:28:50 +0300
  • 055b15333e fix variable name ghost 2023-08-17 13:16:00 +0300
  • ec3fc1e15d remove debug constructions ghost 2023-08-17 13:13:20 +0300
  • 37d01013db add semaphores namespace ghost 2023-08-17 12:59:13 +0300
  • 8fd422b5c2 generate hostPageDom target value based on source selector ghost 2023-08-17 12:58:38 +0300
  • d1b115d11c add semaphores namespace ghost 2023-08-17 12:55:42 +0300
  • 0638bc6742 update DEFAULT_HOST_PAGES_DOM_SELECTORS format ghost 2023-08-17 12:55:15 +0300
  • 175209813f add findLastHostPageDomBySelector method ghost 2023-08-17 11:04:28 +0300
  • 0b4abd2b50 update DEFAULT_HOST_PAGES_DOM_SELECTORS syntax ghost 2023-08-17 11:04:09 +0300
  • e3138faeac delete deprecated method ghost 2023-08-17 10:09:31 +0300
  • 2b49ff5f6a move hostPageDescription.data field data to hostPageDom.value ghost 2023-08-16 23:25:45 +0300
  • 665563e0b8 update setting options ghost 2023-08-16 22:30:55 +0300
  • 70db9620ec replace simple_html_dom library with Symfony\Component\DomCrawler ghost 2023-08-16 22:01:10 +0300
  • caa0df67ee update readme ghost 2023-08-16 12:47:09 +0300
  • 644270ee11 update readme ghost 2023-08-15 11:50:18 +0300
  • c081f27766 update readme ghost 2023-08-15 11:39:04 +0300
  • a27cb61f69 replace memcached to Yggverse\Cache\Memory API ghost 2023-08-15 11:16:11 +0300
  • 30520f6047 search page speed optimization, yggverse/cache library integration begin ghost 2023-08-15 10:24:37 +0300
  • dc1b3a169c add peak memory usage debug ghost 2023-08-15 09:35:31 +0300
  • e7201c33de add memory usage debug ghost 2023-08-15 09:21:43 +0300
  • c9a354e4ba implement hostSetting set/get methods ghost 2023-08-14 12:22:54 +0300
  • b2d7fb2fef fix line break return ghost 2023-08-14 12:08:05 +0300
  • 6085677e67 upgrade yggstate db query ghost 2023-08-11 00:35:03 +0300
  • ab0391e29e fix url parser path ghost 2023-08-07 14:14:12 +0300
  • f8845c620f update installation/setup guide ghost 2023-08-07 14:04:32 +0300
  • 183ae91ccc add composer support, refactor FS tree to psr-4 ghost 2023-08-07 14:00:13 +0300
  • 7bb1eb5b61 add class deprecation notice ghost 2023-08-07 13:22:24 +0300
  • 034a683df7 add YGGstate DB crawl integration ghost 2023-08-07 00:13:04 +0300
  • 3d9db381e8 fix CRAWL_MANIFEST_API_VERSION ghost 2023-08-06 21:27:56 +0300
  • 3c3443b3fd freeze crawl on remote storage connection lost, infinitely repeat new attempt after 60 seconds until storage connected again ghost 2023-08-06 17:57:42 +0300
  • 872ea25d00 remove deprecated condition ghost 2023-08-05 22:00:26 +0300
  • fff75d4d86 update debug message ghost 2023-08-05 21:58:18 +0300
  • 6eefd9b762 fix undefined variable ghost 2023-08-05 21:57:11 +0300
  • e953c01eaa update debug message ghost 2023-08-05 21:55:37 +0300
  • bd212edb97 update debug message ghost 2023-08-05 21:52:26 +0300
  • 1b287c8d28 update debug message ghost 2023-08-05 21:40:59 +0300
  • 562b97ba8f update debug message ghost 2023-08-05 21:39:44 +0300
  • c5ae6974bd fix PDO calls ghost 2023-08-05 21:36:28 +0300
  • b3ec1d42a7 fix empty URI processing ghost 2023-08-05 21:31:33 +0300
  • 7ddb47619a update debug message ghost 2023-08-05 21:17:05 +0300
  • 9fe33a3b2c update CLI roadmap ghost 2023-08-05 21:16:09 +0300
  • 6e069a86e5 update readme ghost 2023-08-05 21:11:40 +0300
  • 513addc7af add query totals counting, update crawler debug ghost 2023-08-05 21:03:45 +0300
  • 6e03a76ed8 add CURLOPT_SSL_VERIFYHOST/CURLOPT_SSL_VERIFYPEER options ghost 2023-08-05 20:24:47 +0300
  • 004a5336de remove htmls pages ban on title tag not available ghost 2023-08-05 20:01:31 +0300
  • f9774f2431 add innodb_buffer_pool_size default value ghost 2023-08-05 19:51:30 +0300
  • de28d85a71 add connection exceptions ghost 2023-08-05 19:39:49 +0300
  • 142d496108 fix SQL syntax error ghost 2023-08-05 19:31:29 +0300
  • d46c4921c5 add page break ghost 2023-08-05 19:24:32 +0300
  • 80b33f619c fix PAGES_LIMIT condition ghost 2023-08-05 19:24:21 +0300
  • d024ffd770 implement unlimited settings customization for each host ghost 2023-08-05 19:06:39 +0300
  • ab6c0379c8 implement hosts crawl queue, move robots, sitemaps, manifests to this task ghost 2023-08-04 09:32:12 +0300
  • 6ee5e53ef4 show sitemaps processed debug ghost 2023-08-04 09:07:46 +0300
  • 71724ae33f refactor manifest crawling ghost 2023-08-04 09:00:03 +0300
  • cb37c57bc4 rename example files ghost 2023-08-03 18:49:29 +0300
  • 68d5820f30 reserve one hour for huge load operations ghost 2023-08-03 18:47:39 +0300
  • efbbf19601 fix multimedia snaps ghost 2023-08-03 17:41:55 +0300
  • 6862fb35cd update readme ghost 2023-08-03 15:33:34 +0300
  • 282a6d609d update manifest API ghost 2023-08-03 15:31:57 +0300
  • b24d31f360 refactor cleaner, delegate tasks to crawler, init hostSetting table ghost 2023-08-03 15:25:38 +0300
  • fd90e2d517 keep banned pages data ghost 2023-08-03 14:31:06 +0300
  • ab8b6f6315 rename variables ghost 2023-08-03 14:24:37 +0300
  • 02612d098b delete getFoundHostPage method, update API version ghost 2023-08-03 14:08:45 +0300
  • 11e02da66d memory usage optimization, rename methods, remove memcached dependency from the model ghost 2023-08-03 10:48:27 +0300
  • cbabea595b rename method name ghost 2023-08-03 10:26:37 +0300