How to take a webpage screenshot?

Solution 1

I think there are 3 problems and one fragility in that code:

Problems

  1. JEditorPane was never intended to be a browser.
  2. setPage(URL) loads asynchronously. It is necessary to add a listener to determine when the page has loaded.
  3. You might find some sites automatically refuse connections to Java clients.
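
To illustrate point 2, here is a minimal sketch of blocking until the pane reports that loading has finished. It relies on the bound "page" property that JEditorPane fires once setPage(URL) completes. PageLoadWaiter, loadAndWait, and the temporary sample file are hypothetical names introduced for this example; they are not part of Swing or of the code in the question.

```java
import java.beans.PropertyChangeListener;
import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.swing.JEditorPane;

// Hypothetical helper (not part of Swing): waits for the "page" property change.
final class PageLoadWaiter {

    // Returns true if the pane signalled completion within the timeout.
    // Assumption: this is called from a thread other than the event dispatch
    // thread, because the notification is normally delivered on that thread.
    static boolean loadAndWait(JEditorPane pane, URL url, long timeoutMillis)
            throws IOException, InterruptedException {
        CountDownLatch loaded = new CountDownLatch(1);
        PropertyChangeListener listener = event -> loaded.countDown();
        pane.addPropertyChangeListener("page", listener);
        try {
            pane.setPage(url); // may return before the content is loaded
            return loaded.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            pane.removePropertyChangeListener("page", listener);
        }
    }

    public static void main(String[] args) throws Exception {
        // A local sample file stands in for a real site, so no network is needed.
        Path html = Files.createTempFile("sample", ".html");
        Files.write(html, "<html><body><h1>Hello</h1></body></html>"
                .getBytes(StandardCharsets.UTF_8));

        JEditorPane pane = new JEditorPane();
        pane.setEditable(false);
        boolean ok = loadAndWait(pane, html.toUri().toURL(), 10_000);
        System.out.println(ok ? "page loaded" : "timed out");
    }
}
```

Only after loadAndWait returns true is it reasonable to size the pane and paint it into a BufferedImage. Waiting does not address point 1: the rendering itself is still limited to what JEditorPane supports.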

Fragility

The fragility lies in the call to setBounds(). Use layouts instead.

[Image: screenshot of Google rendered at 400x600]

Looking at this image, though, problem 3 does not apply here and problem 2 is not the cause, so it comes down to problem 1: JEditorPane was never intended as a browsing component. The random characters at the bottom are JavaScript that the JEditorPane not only fails to execute but also improperly displays in the page.

Solution 2

You can capture the entire screen using the Java Robot class (java.awt.Robot).

import java.awt.AWTException;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

public class RobotExp {

    public static void main(String[] args) {

        try {

            Robot robot = new Robot();
            // Capture the area of the screen defined by the rectangle (here, the full screen)
            Rectangle screen = new Rectangle(Toolkit.getDefaultToolkit().getScreenSize());
            BufferedImage bi = robot.createScreenCapture(screen);
            ImageIO.write(bi, "jpg", new File("C:/imageTest.jpg"));

        } catch (AWTException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This example is adapted from an existing one, with some modifications by me.

Solution 3

Have a look at Flying Saucer (flying-saucer). It is great for generating images and PDFs from HTML pages.

Solution 4

Your problem is that you're using Java's JEditorPane to render the webpage, and it has a very limited HTML rendering engine. It simply cannot display complex webpages the way a modern browser does.

If you need to produce screenshots of correctly rendered complex webpages using Java, the best way is probably to use Selenium to control a real browser like Firefox.

Solution 5

The javadoc states

HTML text. The kit used in this case is the class javax.swing.text.html.HTMLEditorKit which provides HTML 3.2 support.

That probably explains why the page looks a bit broken, as pages nowadays mostly use HTML 4, HTML5, or XHTML.

There's an article here on SO regarding Java browser components: Best Java/Swing browser component?

Author by

Felipe Dias

Updated on August 08, 2022

Comments

  • Felipe Dias
    Felipe Dias over 1 year

    I am using the code below, but the generated image is broken. I think it is probably because of the rendering options. Does anybody know what is happening?

    package webpageprinter;

    import java.net.URL;
    import java.awt.image.BufferedImage;
    import javax.imageio.ImageIO;
    import java.beans.PropertyChangeListener;
    import java.beans.PropertyChangeEvent;
    import javax.swing.text.html.*;
    import java.awt.*;
    import javax.swing.*;
    import java.io.*;

    public class WebPagePrinter {
        private BufferedImage image = null;

        public BufferedImage Download(String webpageurl) {
            try {
                URL url = new URL(webpageurl);
                final JEditorPane jep = new JEditorPane();
                jep.setContentType("text/html");
                ((HTMLDocument) jep.getDocument()).setBase(url);
                jep.setEditable(false);
                jep.setBounds(0, 0, 1024, 768);
                jep.addPropertyChangeListener("page", new PropertyChangeListener() {
                    @Override
                    public void propertyChange(PropertyChangeEvent e) {
                        try {
                            image = new BufferedImage(1024, 768, BufferedImage.TYPE_INT_RGB);
                            Graphics g = image.getGraphics();
                            Graphics2D graphics = (Graphics2D) g;
                            graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                                    RenderingHints.VALUE_ANTIALIAS_ON);
                            jep.paint(graphics);
                            ImageIO.write(image, "png", new File("C:/webpage.png"));
                        } catch (Exception re) {
                            re.printStackTrace();
                        }
                    }
                });
                jep.setPage(url);
            } catch (Exception e) {
                e.printStackTrace();
            }
            return image;
        }

        public static void main(String[] args) {
            new WebPagePrinter().Download("http://www.google.com");
        }
    }
    
  • fvu
    fvu over 12 years
    Their page specifies XHTML, did you try it with HTML4 or 5?
  • joostschouten
    joostschouten over 12 years
    I use it in combination with JSoup (jsoup.org) when the HTML is not valid XHTML - Jsoup.parse(loadedHTML)
  • Felipe Dias
    Felipe Dias over 12 years
    Thanks for uploading the image.
  • Felipe Dias
    Felipe Dias over 12 years
    Yeah, I've already tried this, but my goal is to capture a webpage without having it open. I want to improve the code later to take screenshots automatically at regular intervals. But thanks!
  • Felipe Dias
    Felipe Dias over 12 years
    My intention with this code is to capture a webpage periodically without having it open. I want to display the image on a large screen here at my office, and I already have HTML code that does that. I still have to implement the timer in the Java code, but do you have any suggestions about what I should use instead of JEditorPane?
  • Andrew Thompson
    Andrew Thompson over 12 years
    Sorry, it is not something I've looked much into.
  • Chexpir
    Chexpir almost 9 years
    Warning: It doesn't include JS support at all.
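
The timer mentioned in the comments (taking a screenshot from time to time) could be sketched with a ScheduledExecutorService from the standard library. This is only one possible approach. PeriodicCapture and its methods are hypothetical names introduced here, and the capture task is a placeholder for whichever screenshot technique is eventually chosen.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: runs a capture task at a fixed interval on one background thread.
final class PeriodicCapture implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger runs = new AtomicInteger();

    // Starts running captureTask immediately and then once per period.
    void start(Runnable captureTask, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                captureTask.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would suppress all later runs,
                // so it is logged here and the schedule continues.
                e.printStackTrace();
            } finally {
                runs.incrementAndGet();
            }
        }, 0, period, unit);
    }

    // Number of task executions that have finished so far.
    int completedRuns() {
        return runs.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        try (PeriodicCapture capture = new PeriodicCapture()) {
            // Placeholder task: a real one would take and save the screenshot.
            capture.start(
                    () -> System.out.println("capture at " + System.currentTimeMillis()),
                    1, TimeUnit.SECONDS);
            Thread.sleep(3500); // let a few captures happen before shutting down
        }
    }
}
```

A single-threaded scheduler keeps captures from overlapping: if one capture takes longer than the period, the next one starts late rather than concurrently.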