Is it feasible to write a web crawler in Java? I know some web crawlers are written in languages such as PHP, but I am not sure whether one can be written in Java. So my question is: can you write a web crawler in Java and deploy it on the web to search for information? And if so, how efficient would such a program be?
Ideas for Coding a Web Crawler in Java
At first, I thought it was not possible, because most web spiders are not written in Java. But after a little digging, it turns out there are even tutorials online that will teach you how to create your own Java web crawler. First, of course, you need a solid knowledge of Java, because that’s the foundation.
A normal spider works in the following way: first, parse the root web page, for example mit.edu, and gather all the links on that page; second, take the URLs collected in the first step and parse those pages in turn; third, keep track of every page you have seen so that each web page gets processed only once.
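The three steps above could be sketched in Java roughly like this. This is a minimal illustration, not a production crawler: the class and method names (`SimpleCrawler`, `fetch`, `extractLinks`) are my own, it uses a simple regex instead of a real HTML parser, and it keeps the visited set in memory rather than in a database.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleCrawler {
    // Matches href="http..." attributes; a real crawler would use an HTML parser.
    private static final Pattern LINK = Pattern.compile("href=\"(http[^\"]+)\"");

    private final Set<String> visited = new HashSet<>();     // step 3: each page processed once
    private final Deque<String> frontier = new ArrayDeque<>();
    private final HttpClient client = HttpClient.newHttpClient();

    public void crawl(String rootUrl, int maxPages) throws Exception {
        frontier.add(rootUrl);                               // step 1: start at the root page
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            if (!visited.add(url)) continue;                 // already seen, skip it
            String html = fetch(url);
            for (String link : extractLinks(html)) {         // step 2: parse collected URLs
                if (!visited.contains(link)) frontier.add(link);
            }
        }
    }

    String fetch(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) links.add(m.group(1));
        return links;
    }
}
```

The `maxPages` limit is there because, without some cap, following every link would let the crawl run indefinitely.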
The third step requires a database. But if you don’t want to use a database, you can use a file instead to track the history of the crawl. If you want to know how it is done, visit Web Crawler Out Of Java.
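A file-based history could look something like the sketch below: one URL per line, loaded at startup so a restarted crawler skips pages it has already processed. The class name `FileVisitedTracker` is illustrative, not from the tutorial mentioned above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

// Keeps the crawl history in a plain text file, one URL per line,
// so each page is processed only once even across restarts.
public class FileVisitedTracker {
    private final Path historyFile;
    private final Set<String> visited = new HashSet<>();

    public FileVisitedTracker(Path historyFile) throws IOException {
        this.historyFile = historyFile;
        if (Files.exists(historyFile)) {
            // Reload previously crawled URLs into the in-memory set.
            visited.addAll(Files.readAllLines(historyFile, StandardCharsets.UTF_8));
        }
    }

    /** Returns true if the URL is new (and records it), false if already crawled. */
    public boolean markVisited(String url) throws IOException {
        if (!visited.add(url)) return false;
        Files.writeString(historyFile, url + System.lineSeparator(),
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }
}
```

The in-memory `HashSet` answers the "have I seen this?" question quickly, while the file provides persistence; a database becomes worthwhile once the crawl grows beyond what fits comfortably in memory.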