mirror of https://github.com/YGGverse/Yo.git
Yo! Micro Web Crawler in PHP & Manticore
The next generation of the YGGo! project, with the goal of reducing server requirements and simplifying deployment.
The index model has changed to a distributed cluster model, oriented toward aggregating search results from different instances through the API.
The codebase is kept as minimal as possible.
Implementation
The engine is written in PHP and uses Manticore as the backend.
The default build is inspired by and adapted for the Yggdrasil ecosystem, but it can also be used to build your own search project.
Components
- CLI tools for index operations
- JS-less frontend for building a search web portal
- API tools for distributing the search index
Features
- MIME-based crawler with flexible filter settings
- Page snap history with support for local and remote mirrors
Documentation
CLI
Index
Init
Create initial index
php src/cli/index/init.php [reset]
reset
- optional, reset existing index
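Because reset discards the existing index, it may be worth guarding it in scripts. The following is a minimal sketch of such a wrapper, not part of Yo! itself: the function name init_index and the variables PHP_BIN and YO_CONFIRM_RESET are hypothetical names introduced here for illustration. Only the init.php path and its reset argument come from this README.

```shell
#!/bin/sh
# Sketch: require explicit confirmation before resetting the index.
# PHP_BIN and YO_CONFIRM_RESET are illustrative assumptions, not Yo! settings.
PHP_BIN="${PHP_BIN:-php}"

init_index() {
    # $1: optional "reset" argument, passed through to init.php
    if [ "${1:-}" = "reset" ]; then
        if [ "${YO_CONFIRM_RESET:-no}" != "yes" ]; then
            echo "refusing to reset: set YO_CONFIRM_RESET=yes to proceed" >&2
            return 1
        fi
        "$PHP_BIN" src/cli/index/init.php reset
    else
        "$PHP_BIN" src/cli/index/init.php
    fi
}
```

Calling init_index creates the initial index. Calling init_index reset only reaches init.php when YO_CONFIRM_RESET=yes is set.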
Document
Add
php src/cli/document/add.php URL
URL
- add new URL to the crawl queue
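Since add.php takes a single URL, seeding the queue with several addresses can be scripted. The sketch below assumes a plain-text seed file with one URL per line. The function name add_urls, the PHP_BIN variable and the seed file format are assumptions made for this example, not features of Yo!.

```shell
#!/bin/sh
# Sketch: queue every URL listed in a text file, one per line.
# PHP_BIN and the seed file format are illustrative assumptions.
PHP_BIN="${PHP_BIN:-php}"

add_urls() {
    # $1: path to the seed file; blank lines and lines starting with '#' are skipped
    while IFS= read -r url || [ -n "$url" ]; do
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the command cannot consume the rest of the seed file
        "$PHP_BIN" src/cli/document/add.php "$url" < /dev/null
    done < "$1"
}
```

For example, add_urls seeds.txt calls add.php once for each URL in seeds.txt.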
Crawl
php src/cli/document/crawl.php
Search
php src/cli/document/search.php '@title "*"' [limit]
query
- required
limit
- optional search results limit
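The commands above can be chained into a single run: create the index, queue a URL, crawl, then search. The sketch below uses only the scripts documented in this README. The function name yo_pipeline and the PHP_BIN variable are hypothetical, and it assumes each script exits with a non-zero status on failure, which this README does not confirm.

```shell
#!/bin/sh
# Sketch: init -> add -> crawl -> search, using the documented CLI scripts.
# PHP_BIN and yo_pipeline are illustrative names; stopping on a non-zero
# exit status is an assumption about how the scripts report errors.
PHP_BIN="${PHP_BIN:-php}"

yo_pipeline() {
    # $1: seed URL, $2: search query, $3: optional search results limit
    "$PHP_BIN" src/cli/index/init.php || return 1
    "$PHP_BIN" src/cli/document/add.php "$1" || return 1
    "$PHP_BIN" src/cli/document/crawl.php || return 1
    if [ -n "${3:-}" ]; then
        "$PHP_BIN" src/cli/document/search.php "$2" "$3"
    else
        "$PHP_BIN" src/cli/document/search.php "$2"
    fi
}
```

For example, yo_pipeline 'http://example.com/' '@title example' 10 would index one page and list up to 10 matches. In Manticore's full-text query syntax, the @title operator restricts the match to the title field.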