Sparkler: Evolving Apache Nutch to run on Spark
A web crawler is a bot that fetches resources from the web to build applications such as search engines and knowledge bases. Sparkler (a contraction of Spark-Crawler) is a new web crawler that draws on recent advances in distributed computing and information retrieval by combining several Apache projects, such as Spark, Kafka, Lucene/Solr, Tika, and Felix. It is an extensible, highly scalable, high-performance crawler that evolves Apache Nutch and runs on an Apache Spark cluster.
Submitted by elementlist on Mar 25, 2017