Posted November 4, 2022

OnionSearch: A URL search script for onion domains

OnionSearch

OnionSearch is a script that looks up URLs on onion domains. It is written in Python 3 and helps researchers scrape URLs from a range of .onion search engines.

Requirements

Python 3

Currently supported search engines

ahmia
darksearchio
onionland
notevil
darksearchenginer
phobos
onionsearchserver
torgle
onionsearchengine
tordex
tor66
tormax
haystack
multivac
evosearch
deeplink

Installation

Install from PyPI:

    pip3 install onionsearch

Install from GitHub:

    git clone https://github.com/megadose/OnionSearch.git
    cd OnionSearch/
    python3 setup.py install

Usage

    usage: onionsearch [-h] [--proxy PROXY] [--output OUTPUT]
                       [--continuous_write CONTINUOUS_WRITE] [--limit LIMIT]
                       [--engines [ENGINES [ENGINES ...]]]
                       [--exclude [EXCLUDE [EXCLUDE ...]]]
                       [--fields [FIELDS [FIELDS ...]]]
                       [--field_delimiter FIELD_DELIMITER] [--mp_units MP_UNITS]
                       search

    positional arguments:
      search                The search string or phrase

    optional arguments:
      -h, --help            show this help message and exit
      --proxy PROXY         Set Tor proxy (default: 127.0.0.1:9050)
      --output OUTPUT       Output File (default: output_$SEARCH_$DATE.txt), where
                            $SEARCH is replaced by the first chars of the search
                            string and $DATE is replaced by the datetime
      --continuous_write CONTINUOUS_WRITE
                            Write progressively to output file (default: False)
      --limit LIMIT         Set a max number of pages per engine to load
      --engines [ENGINES [ENGINES ...]]
                            Engines to request (default: full list)
      --exclude [EXCLUDE [EXCLUDE ...]]
                            Engines to exclude (default: none)
      --fields [FIELDS [FIELDS ...]]
                            Fields to output to csv file (default: engine name
                            link), available fields are shown below
      --field_delimiter FIELD_DELIMITER
                            Delimiter for the CSV fields
      --mp_units MP_UNITS   Number of processing units (default: core number minus
                            1)

    [...]

Multiprocessing behaviour

By default, the script runs with "mp_units = cpu_count() - 1". On a machine with a four-core CPU, this means three scrapers run at the same time. You can set "mp_units" to any value, but the default is recommended.

Usage examples

Query every search engine for "computer":

    onionsearch "computer"

Query every search engine for "computer", excluding "Ahmia" and "Candle":

    onionsearch "computer" --exclude ahmia candle

Query only "Tor66", "DeepLink" and "Phobos" for "computer":

    onionsearch "computer" --engines tor66 deeplink phobos

Same query as above, limited to three pages per search engine:

    onionsearch "computer" --engines tor66 deeplink phobos --limit 3

Output

Default output

By default, the search results are stored in CSV format with the following columns:

    "engine","name of the link","url"

Custom output columns

The "--fields" and "--field_delimiter" parameters control which data items are written to the output file. "--fields" lets you add, remove and reorder the output columns:

    "engine","name of the link","url","domain"

or:

    "engine","domain"

Demo

[image: OnionSearch usage demo]

License

This project is developed and released under the GNU General Public License v3.0.

Project URL

OnionSearch: GitHub (https://github.com/megadose/OnionSearch)
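
The default worker count and the default CSV layout described above can be illustrated with a short Python sketch. This is not code from OnionSearch itself: the function names default_mp_units and links_by_engine are hypothetical, the sample rows are made up, and clamping the worker count to at least 1 on a single-core machine is an assumption of this sketch. It also assumes the default column order ("engine","name of the link","url") and the default comma delimiter.

```python
import csv
import io
from collections import defaultdict
from os import cpu_count


def default_mp_units(cores):
    """Worker count following the rule quoted above: core count minus one.

    Clamping to a minimum of 1 is an assumption of this sketch,
    not something the post states.
    """
    return max(cores - 1, 1)


def links_by_engine(csv_text, delimiter=","):
    """Group unique URLs by engine, given rows of engine, name, url."""
    grouped = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        engine, url = row[0], row[2]
        if url not in grouped[engine]:
            grouped[engine].append(url)
    return dict(grouped)


# Hypothetical sample rows in the default layout, for illustration only.
SAMPLE = (
    '"ahmia","Example A","http://examplea.onion/"\n'
    '"ahmia","Example A again","http://examplea.onion/"\n'
    '"tor66","Example B","http://exampleb.onion/"\n'
)

if __name__ == "__main__":
    # cpu_count() may return None on some platforms, hence the fallback.
    print(default_mp_units(cpu_count() or 1))
    print(links_by_engine(SAMPLE))
```

To process a real results file, the file's text could be read and passed to links_by_engine in place of SAMPLE. If a custom "--field_delimiter" or "--fields" selection was used, the delimiter argument and the column indexes would need to be adjusted to match.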