How to get a specific value from HTML in Java?


Solution 1

You can use a library such as Jsoup.

You can get it from here --> Download Jsoup

Here is its API reference --> Jsoup API Reference

It's really very easy to parse HTML content using Jsoup.

Below is some sample code which might be helpful to you.

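import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

import javax.swing.JOptionPane;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GetPTags {

    public static void main(String[] args) {
        // Download the page and parse it into a Jsoup document.
        Document doc = Jsoup.parse(readURL("http://www.todaysgoldrate.co.in/todays-gold-rate-in-pune/"));
        // Select every <p> element and print its text content.
        Elements pTags = doc.select("p");
        for (Element p : pTags) {
            System.out.println("P tag is " + p.text());
        }
    }

    // Reads the whole response body of the given URL into a string.
    public static String readURL(String url) {
        StringBuilder fileContents = new StringBuilder();

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new URL(url).openStream()))) {
            String currentLine;
            // Stop at end of stream so that no "null" line is appended.
            while ((currentLine = reader.readLine()) != null) {
                fileContents.append(currentLine).append("\n");
            }
        } catch (Exception e) {
            JOptionPane.showMessageDialog(null, e.getMessage(), "Error Message", JOptionPane.ERROR_MESSAGE);
            e.printStackTrace();
        }

        return fileContents.toString();
    }
}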
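
Once the text of a <p> tag is available (for example from p.text() above), the numeric rate still has to be pulled out of it. The following is only a minimal sketch using the JDK's regular expressions. The class name GoldRateExtractor and the method extractRate are hypothetical, and the pattern assumes the text looks like the sample in the question ("... = Rs.31150.00", no thousands separators).

```java
import java.math.BigDecimal;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: extracts the rupee amount from a line of text.
public class GoldRateExtractor {

    // Assumed format: "Rs." followed by digits and an optional decimal part, e.g. "Rs.31150.00".
    private static final Pattern RATE = Pattern.compile("Rs\\.?\\s*(\\d+(?:\\.\\d+)?)");

    // Returns the first amount found in the text, or an empty Optional if there is none.
    public static Optional<BigDecimal> extractRate(String text) {
        Matcher m = RATE.matcher(text);
        return m.find() ? Optional.of(new BigDecimal(m.group(1))) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample text taken from the <p><em>...</em></p> snippet in the question.
        String sample = "10 gram gold Rate in pune = Rs.31150.00";
        System.out.println(extractRate(sample)
                .map(BigDecimal::toPlainString)
                .orElse("not found")); // prints 31150.00
    }
}
```

In the loop above, each p.text() could be passed to extractRate and the first non-empty result kept. If the site changes its wording or number format, the pattern would need adjusting.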

Solution 2

http://java-source.net/open-source/crawlers

You can use any of those APIs, but don't parse the HTML with the plain JDK, because it's too painful.

Author: Sandip Armal Patil


Updated on June 30, 2022

Comments

  • Sandip Armal Patil almost 2 years

    I am developing an application which shows the gold rate and draws a graph of it.
    I found a website which publishes this gold rate regularly. My question is how to extract this specific value from the HTML page.
    Here is the link I need to extract from: http://www.todaysgoldrate.co.in/todays-gold-rate-in-pune/ and this HTML page has the following tag and content.

    <p><em>10 gram gold Rate in pune = Rs.31150.00</em></p>     
    

    Here is the code I use for extracting, but I couldn't find a way to extract specific content.

    public class URLExtractor {
    
    private static class HTMLPaserCallBack extends HTMLEditorKit.ParserCallback {
    
        private Set<String> urls;
    
        public HTMLPaserCallBack() {
            urls = new LinkedHashSet<String>();
        }
    
        public Set<String> getUrls() {
            return urls;
        }
    
        @Override
        public void handleSimpleTag(Tag t, MutableAttributeSet a, int pos) {
            handleTag(t, a, pos);
        }
    
        @Override
        public void handleStartTag(Tag t, MutableAttributeSet a, int pos) {
            handleTag(t, a, pos);
        }
    
        private void handleTag(Tag t, MutableAttributeSet a, int pos) {
            if (t == Tag.A) {
                Object href = a.getAttribute(HTML.Attribute.HREF);
                if (href != null) {
                    String url = href.toString();
                    if (!urls.contains(url)) {
                        urls.add(url);
                    }
                }
            }
        }
    }
    
    public static void main(String[] args) throws IOException {
        InputStream is = null;
        try {
            String u = "http://www.todaysgoldrate.co.in/todays-gold-rate-in-pune/";   
            // Here I need to extract this content by tag or by content...
    

    Thanks in advance.