Parse HTML in Java using the best library
Hello everyone
I want to parse HTML in Java using a library. Can anyone tell me which library is best for this?
Thanks
JTidy could be the one you are looking for. Here's what you get with that library:
You can use it to clean up faulty HTML, and it also gives you a DOM interface to the document being processed. Overall, it's a good choice for parsing HTML in Java.
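To give an idea of what working through the DOM interface looks like, here is a minimal sketch that lists the links in a document. As far as I know, JTidy's Tidy.parseDOM(InputStream, OutputStream) returns a standard org.w3c.dom.Document, so the traversal code would be the same. To keep the sketch runnable without extra jars, it gets the Document from the JDK's built-in XML parser instead, which only works because the sample markup is already well-formed. The class and method names (DomLinkLister, extractLinks, parseWellFormed) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomLinkLister {

    // Collects the href attribute of every <a> element that has one.
    // Only standard org.w3c.dom calls are used here, so this part does not
    // care which parser produced the Document.
    public static List<String> extractLinks(Document doc) {
        List<String> links = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            if (anchor.hasAttribute("href")) {
                links.add(anchor.getAttribute("href"));
            }
        }
        return links;
    }

    // Stand-in for JTidy (assumption): the JDK XML parser only accepts
    // well-formed markup. With JTidy on the classpath you would obtain the
    // Document from something like new Tidy().parseDOM(inputStream, null),
    // which is meant to cope with faulty HTML as well.
    public static Document parseWellFormed(String markup) throws Exception {
        byte[] bytes = markup.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        String markup = "<html><body><p>See <a href=\"https://example.com/a\">A</a> and "
                + "<a href=\"https://example.com/b\">B</a>.</p></body></html>";
        System.out.println(extractLinks(parseWellFormed(markup)));
    }
}
```

The point of the split is that extractLinks depends only on the DOM types, so swapping the parser for JTidy should not require touching it.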
Jsoup is also a good choice for HTML processing. It does much the same job as JTidy, except that you find elements with a tag query, i.e. a CSS-style selector syntax, rather than by walking the DOM.
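For comparison, here is a rough sketch of the selector style with Jsoup. It assumes the jsoup jar is on the classpath (a reasonably recent version, since Elements.eachAttr is not in very old releases). The class name JsoupSelectorSketch and the sample markup are only for illustration.

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class JsoupSelectorSketch {

    // Parses the HTML string and returns the href value of every <a>
    // element that carries an href attribute.
    public static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        // "a[href]" is a CSS-style selector: <a> elements with an href attribute.
        return doc.select("a[href]").eachAttr("href");
    }

    public static void main(String[] args) {
        String html = "<div><a href='https://example.com/a'>A</a>"
                + "<a>no link</a>"
                + "<a href='https://example.com/b'>B</a></div>";
        System.out.println(extractLinks(html));
    }
}
```

The selector does the filtering that a DOM-based version would do with a loop and an attribute check.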
In my view, I'd go for JTidy, as I find it less technical and more flexible to work with.