How can I execute Sqoop in Java?

14,602

Solution 1

There is a trick which worked out for me pretty easy. Via ssh you can execute Sqoop command directly. Just you have to use is a SSH Java Library

You have to follow this step.

Download sshxcute java library : https://code.google.com/p/sshxcute/ and Add it to the build path of your java project which contains the following java code

import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;

public class TestSSH {

   public static void main(String args[]) throws Exception{

    // Initialize a ConnBean object, parameter list is ip, username, password

    ConnBean cb = new ConnBean("192.168.56.102", "root","hadoop");

    // Put the ConnBean instance as parameter for SSHExec static method getInstance(ConnBean) to retrieve a singleton SSHExec instance
    SSHExec ssh = SSHExec.getInstance(cb);          
    // Connect to server
    ssh.connect();
    CustomTask sampleTask1 = new ExecCommand("echo $SSH_CLIENT"); // Print Your Client IP By which you connected to ssh server on Horton Sandbox
    System.out.println(ssh.exec(sampleTask1));
    CustomTask sampleTask2 = new ExecCommand("sqoop import --connect jdbc:mysql://192.168.56.101:3316/mysql_db_name --username=mysql_user --password=mysql_pwd --table mysql_table_name --hive-import -m 1 -- --schema default");
    ssh.exec(sampleTask2);
    ssh.disconnect();   
   }
}

Solution 2

You can use "SqoopOptions" for executing sqoop in your Java Program.

This is a sample code for importing a table in MySql to HDFS.

public static void importSQLToHDFS() throws Exception {
    String driver = "com.mysql.jdbc.Driver";
    Class.forName(driver).newInstance();

    Configuration config = new Configuration(); 
    config.addResource(new Path("/.../conf/core-site.xml"));
    config.addResource(new Path("/.../conf/hdfs-site.xml"));

    properties.load(new FileInputStream("/.../sqoopimport.properties"));

    SqoopOptions options = new SqoopOptions();
    options.setDriverClassName(driver);
    options.setHadoopHome("/.../hadoop-0.20.2-cdh3u2");
    options.setConnectString(properties.getProperty("db_connection_string"));
    options.setTableName(properties.getProperty("db_mysql_table_name"));
    options.setUsername(properties.getProperty("db_usr_id"));
    options.setPassword(properties.getProperty("db_passwd"));
    options.setNumMappers(1);
    options.setTargetDir(properties.getProperty("path_export_file"));
    options.setFileLayout(FileLayout.TextFile);

    new ImportTool().run(options);
}

For Exporting see sample code below. Note: Here property file is not used. Make sure you have created the table to which the data is to be imported.

public static boolean exportHDFSToSQL() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
    try {
         SqoopOptions options=new SqoopOptions();
         options.setConnectString("jdbc:mysql://localhost:3306/dbName");
         options.setUsername("user_name");
         options.setPassword("pwd");
         options.setExportDir("path of file to be exported from hdfs");
         options.setTableName("table_name");
         options.setInputFieldsTerminatedBy(',');
         options.setNumMappers(1);
         new ExportTool().run(options);
    } catch (Exception e) {
        return false;
    }
    return true;
}

Solution 3

firstly let me mention that Sqoop 1 do not have official client API. Even though calling Sqoop in a way that you do is quite common and working.

Based on the log I would guess that your java application that is executing Sqoop do not have hadoop configuration on the classpath. Thus Sqoop won't get information about your cluster and will work in "local" mode. You need to put the hadoop configuration into your classpath in order to run Sqoop against remote cluster. Please checkout following entry on stackoverflow for more details.

Share:
14,602

Related videos on Youtube

yoonhok
Author by

yoonhok

Updated on June 04, 2022

Comments

  • yoonhok
    yoonhok almost 2 years

    I created a new Java project, then I added Library Sqoop and Hadoop. (The Libraries are "hadoop-core-1.1.1.jar, sqoop-1.4.2.jar, etc.... ".)

    Then I tried below code:

    public class MySqoopDriver {
        public static void main(String[] args) {
            String[] str = { "export", "--connect", "jdbc:mysql://localhost/mytestdb", "--hadoop-home",
                    "/home/yoonhok/development/hadoop-1.1.1", "--table", "tbl_1", "--export-dir", "hdfs://localhost:9000/user/hive/warehouse/tbl_1",
                    "--username", "yoonhok", "--password", "1234"};
    
            Sqoop.runTool(str);
        }
    }
    

    The parameters are right, since when I try in terminal, it works well.

    But it didn't work. The error message is:

    13/02/17 16:23:07 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    13/02/17 16:23:07 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    13/02/17 16:23:07 INFO tool.CodeGenTool: Beginning code generation
    13/02/17 16:23:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_1` AS t LIMIT 1
    13/02/17 16:23:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_1` AS t LIMIT 1
    13/02/17 16:23:07 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1
    Note: /tmp/sqoop-yoonhok/compile/86a3cab62184ad50a3ae11e7cb0e4f4d/tbl_1.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    13/02/17 16:23:08 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/86a3cab62184ad50a3ae11e7cb0e4f4d/tbl_1.jar
    13/02/17 16:23:08 INFO mapreduce.ExportJobBase: Beginning export of tbl_1
    13/02/17 16:23:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    13/02/17 16:23:09 INFO input.FileInputFormat: Total input paths to process : 1
    13/02/17 16:23:09 INFO input.FileInputFormat: Total input paths to process : 1
    13/02/17 16:23:09 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok1526809600/.staging/job_local_0001
    13/02/17 16:23:09 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:java.io.FileNotFoundException: File /user/hive/warehouse/tbl_1/000000_0 does not exist.
    13/02/17 16:23:09 ERROR tool.ExportTool: Encountered IOException running export job: java.io.FileNotFoundException: File /user/hive/warehouse/tbl_1/000000_0 does not exist.
    

    When I check the HDFS, the file exists:

    hadoop fs -ls /user/hive/warehouse/tbl_1
    Found 1 items
    -rw-r--r--   1 yoonhok supergroup        240 2013-02-16 18:56 /user/hive/warehouse/tbl_1/000000_0
    

    How can I execute Sqoop in my Java program?

    I tried Processbuilder and Process, but I don't want to use them.

    I really want to use the Sqoop API, but I heard that it doesn't exist yet.

    I read this question but it doesn't work for me.

  • yoonhok
    yoonhok about 11 years
    Thanks for your help. But What is 'properties'? Do i make it? and I don't know about "/.../sqoopimport.properties" um...
  • Jickson T George
    Jickson T George about 11 years
    Ya for the above code u need to make a property file. The code above is for importing a table from MySQL to HDFS. For exporting use "new ExportTool().run(options);"