How to set up HtmlUnit in an Eclipse project?

Solution 1

This is how to set up HtmlUnit and export your project as a runnable JAR file in Eclipse:

  1. Create a new Java project (all default settings).
  2. Right-click the project (in the Package Explorer view), go to New->Folder and name the folder "lib".
  3. Download the HtmlUnit library (file htmlunit-2.9-bin.zip).
  4. Uncompress it and copy the contents of its "/htmlunit-2.9/lib/" folder into our "lib" folder (you can drag and drop all the files from the Windows/Linux desktop into Eclipse's Package Explorer and choose to copy the files).
  5. Right-click the project again and go to Build Path->Configure Build Path...
  6. In the Libraries tab, click Add JARs...
  7. Look for our new "lib" folder (if you don't see it, close the window, go back to the Package Explorer, select the project folder, press F5 and carry on from step 5).
  8. Select all the files inside that folder (17 files in HtmlUnit 2.9) and close all windows.
  9. Check that everything is OK by creating a very simple application (I happened to have written some simple code in this question that might help you).
  10. Everything should be fine (if it isn't, recheck the steps), so let's export the application: right-click the project and select Export...
  11. Look for Java/Runnable JAR file and click Next.
  12. Select the appropriate launch configuration and destination, select "Package required libraries into generated JAR" if you want just one big file that contains both your application and HtmlUnit, and click Finish.
  13. Open a console in the folder where your JAR file resides, execute "java -jar yourJARfile.jar" and enjoy your application.
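
For the check in step 9, here is a minimal sketch of a sanity test, assuming the HtmlUnit 2.x package name com.gargoylesoftware.htmlunit. It only asks the class loader whether WebClient is visible, so it compiles even when the build path is misconfigured. The class name HtmlUnitClasspathCheck is made up for illustration.

```java
// Sanity check for step 9: is HtmlUnit visible on the classpath?
final class HtmlUnitClasspathCheck {

    // Main HtmlUnit entry point in the 2.x releases (assumed package name).
    static final String WEB_CLIENT = "com.gargoylesoftware.htmlunit.WebClient";

    // Returns true if the named class can be located by this class's loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, HtmlUnitClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isOnClasspath(WEB_CLIENT)
                ? "HtmlUnit found on the classpath"
                : "HtmlUnit NOT found: recheck steps 5 to 8");
    }
}
```

Running it from inside Eclipse (Run As > Java Application) exercises the same build path the real application will use.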

If this works for a new project, update your own project to follow the same steps. Hope this helps.
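
Once the JAR is exported, one way to see what Eclipse actually wrote into it is to print the manifest's main attributes (Main-Class among them). This is only a sketch using the standard java.util.jar API. JarManifestCheck and mainClassOf are hypothetical names, and yourJARfile.jar stands for whatever file you exported.

```java
import java.io.IOException;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

final class JarManifestCheck {

    // Returns the Main-Class entry of a manifest, or null if there is none.
    static String mainClassOf(Manifest manifest) {
        if (manifest == null) {
            return null;
        }
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarManifestCheck yourJARfile.jar");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null when the JAR has no manifest
            System.out.println("Main-Class: " + mainClassOf(manifest));
            if (manifest != null) {
                // Dump every main attribute so nothing the exporter added is hidden
                for (Map.Entry<Object, Object> entry : manifest.getMainAttributes().entrySet()) {
                    System.out.println(entry.getKey() + ": " + entry.getValue());
                }
            }
        }
    }
}
```

If Main-Class is missing, "java -jar" has nothing to launch, which is worth ruling out before looking anywhere else.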

Solution 2

Create a new Java project with default settings and download the latest version of the HtmlUnit library. Open the new project's Properties -> Java Build Path, go to the Libraries tab and add all the extracted JAR files. Then create a new class with a main method in the project, add the following method to the class, and call it from main to run a simple application.

`@Test
public void getElements() throws Exception {
    final WebClient webClient = new WebClient();
    // Load the page, then look elements up by id and by anchor name
    final HtmlPage page = webClient.getPage("http://some_url");
    final HtmlDivision div = page.getHtmlElementById("some_div_id");
    final HtmlAnchor anchor = page.getAnchorByName("anchor_name");

    // Release the resources held by the simulated browser
    webClient.closeAllWindows();
}`
Author by

Jan Lycka

Updated on June 08, 2022

Comments

  • Jan Lycka
    Jan Lycka almost 2 years

    My project includes the HtmlUnit JARs and downloads the content of some pages. However, the executable JAR built from it (with the libraries included, using Eclipse's export function) works only on the machine on which I created it; on a different machine it doesn't execute.

    EDIT: It doesn't execute in the sense that it doesn't show the "Starting Headless Browser" message box on startup. I used Eclipse Indigo: File > Export > Runnable JAR file > Package required libraries into generated JAR

    Help, gods:

    import java.io.*;
    import com.gargoylesoftware.htmlunit.BrowserVersion;
    import com.gargoylesoftware.htmlunit.Page;
    import com.gargoylesoftware.htmlunit.RefreshHandler;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import javax.swing.*;
    import javax.swing.filechooser.FileSystemView;
    

    EDIT: further code, as requested

    public class MyTest
    {
        public static void main(String[] arguments) {
            try {
                JOptionPane.showMessageDialog(null, "Starting Headless Browser");
                // Locate the user's default (documents) directory
                JFileChooser fr = new JFileChooser();
                FileSystemView fw = fr.getFileSystemView();
                String MyDocuments = fw.getDefaultDirectory().toString();

                // Input file: pairs of lines, an ID followed by a URL
                FileInputStream fstream = new FileInputStream(MyDocuments+"\\Links.txt");
                DataInputStream in = new DataInputStream(fstream);
                BufferedReader br = new BufferedReader(new InputStreamReader(in));
                String strLine;
                String strLineID;

                FileWriter xfstream = new FileWriter(MyDocuments+"\\NewPageContentList.txt");
                BufferedWriter out = new BufferedWriter(xfstream);
                while ((strLineID = br.readLine()) != null) {
                    strLine = br.readLine();
                    out.write(strLineID);
                    out.write("\r\n");
                    out.write(DownloadPage(strLine));
                    out.write("\r\n");
                }

                out.close();
                in.close();
                JOptionPane.showMessageDialog(null, "HeadLess Browser Process Has Finished");
            }
            catch (Exception e) {
                JOptionPane.showMessageDialog(null, "error");
            }
        }

        public static String DownloadPage(String str) {
            final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
            webClient.setThrowExceptionOnScriptError(false);

            try {
                final HtmlPage page = webClient.getPage(str);
                // Strip line breaks so each page ends up on a single output line
                final String pageAsText = str_replace("\n","",str_replace("\r","",page.asText()));

                return pageAsText;
            }
            catch (IOException e) {
                JOptionPane.showMessageDialog(null, "error");
                return "";
            }
            finally {
                // Runs on both paths, so the browser windows are always closed
                webClient.closeAllWindows();
            }
        }

        // Replaces every occurrence of search in subject with replace
        public static String str_replace (String search, String replace, String subject)
        {
            StringBuffer result = new StringBuffer (subject);
            int pos = 0;
            while (true)
            {
                pos = result.indexOf (search, pos);
                if (pos != -1) {
                    result.replace (pos, pos + search.length (), replace);
                    // Skip past the inserted text so it is never matched again
                    pos += replace.length ();
                }
                else
                    break;
            }

            return result.toString ();
        }
    }
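
A side note on the code above, offered as a sketch and not as a diagnosis of the export problem: the standard String.replace already replaces every occurrence of a literal, and new File(directory, name) joins paths without hard-coding "\\". PageTextHelpers and its method names are hypothetical, and "docs" is just a placeholder directory.

```java
import java.io.File;

final class PageTextHelpers {

    // Removes every CR and LF; String.replace substitutes all occurrences of a literal.
    static String stripLineBreaks(String text) {
        return text.replace("\r", "").replace("\n", "");
    }

    // Joins a directory and a file name using the platform's own separator.
    static File inDirectory(File directory, String fileName) {
        return new File(directory, fileName);
    }

    public static void main(String[] args) {
        System.out.println(stripLineBreaks("line one\r\nline two\n")); // prints "line oneline two"
        System.out.println(inDirectory(new File("docs"), "Links.txt").getName()); // prints "Links.txt"
    }
}
```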
    
  • Renjith K N
    Renjith K N almost 12 years
    Hi Mosty, will you please upload a picture of the project structure using HtmlUnit?