How to use the Solr Data Import Handler to index a MySQL table?

Solution 1

Make sure your solrconfig.xml file contains these lines:

<lib dir="../../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-dataimporthandler-\d.*\.jar" />

Make sure those paths are correct and that the JAR files are physically present at them. If they are missing, add them, then restart the Tomcat server; hopefully that resolves the problem.
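
The dir values above are relative paths, which Solr resolves against the core's instance directory, so the right number of ../ segments depends on where the core sits relative to the Solr installation. As a minimal sketch of an alternative, assuming a hypothetical install root of /opt/solr (not taken from the original post), the same two directives can be written with absolute paths:

```xml
<!-- Hypothetical install root /opt/solr; replace with your actual Solr directory. -->
<!-- Loads every JAR that the Data Import Handler depends on. -->
<lib dir="/opt/solr/contrib/dataimporthandler/lib/" regex=".*\.jar" />
<!-- Loads the Data Import Handler JAR itself, whatever its version suffix. -->
<lib dir="/opt/solr/dist/" regex="solr-dataimporthandler-\d.*\.jar" />
```

If a directory does not exist or no file matches the regex, the handler class cannot be loaded, so checking the Solr log at startup is a reasonable first step.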

Solution 2

I know this question is old, but I recently had occasion to set this up and ran into similar problems using Bitnami (Windows).

  1. In \dist, make sure you have the Data Import Handler and MySQL connector JARs:

    solr-dataimporthandler-4.9.0.jar

    mysql-connector-java-5.1.32-bin.jar

  2. In \contrib\dataimporthandler\lib, make sure you have:

    activation-1.1.1.jar

    mail-1.4.3.jar

  3. Your collection's solrconfig.xml should have:

    <lib dir="../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
    <lib dir="../../dist/" regex="solr-dataimporthandler-.*\.jar" />  
    <lib dir="../../dist/" regex="mysql-connector-java-.*\.jar"/>  
    

and:

  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>
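
The request handler above points at db-data-config.xml, which is not shown in this answer. A minimal sketch of what that file might contain, assuming the connection details, table, and column names from the question (all of them need adjusting for a real database):

```xml
<dataConfig>
  <!-- driver is the fully qualified class name inside the MySQL connector JAR,
       which must be loaded through one of the <lib> directives in solrconfig.xml. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/site"
              user="root"
              password="123"/>
  <document>
    <entity name="profiles"
            query="select user_id,about,music,movies,occupation from profiles">
      <!-- Explicit mappings; these can be omitted when the column name
           already matches the field name in schema.xml. -->
      <field column="user_id" name="user_id"/>
      <field column="about" name="about"/>
      <field column="music" name="music"/>
      <field column="movies" name="movies"/>
      <field column="occupation" name="occupation"/>
    </entity>
  </document>
</dataConfig>
```

The file is expected to sit in the same conf directory as solrconfig.xml, since the config value in the handler is given as a bare file name.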
Author by

Admin

Updated on June 05, 2022

Comments

  • Admin
    Admin almost 2 years

    When I try to import a MySQL table by loading this URL in the browser:

    http://192.168.136.129:8983/solr/dataimport?command=full-import
    

    I get this error:

    HTTP ERROR 404
    
    Problem accessing /solr/dataimport. Reason:
    
        NOT_FOUND
    
    Powered by Jetty://
    

    I'm following this tutorial from the official Solr wiki to get started with the DIH:

    http://wiki.apache.org/solr/DIHQuickStart

    As per the tutorial I added this to my solrconfig.xml:

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">data-config.xml</str>
      </lst>
    </requestHandler>  
    

    In data-config.xml I have the following:

    <dataConfig>
      <dataSource type="JdbcDataSource" 
                  driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://localhost/site" 
                  user="root" 
                  password="123"/>
      <document>
        <entity name="profiles" 
                query="select user_id,about,music,movies,occupation from profiles">
        </entity>
      </document>
    </dataConfig>
    

    And these are the fields defined in my schema.xml:

      <fields>
        <field name="user_id" type="string" indexed="true" stored="true" required="true" />
        <field name="about" type="string" indexed="true" stored="true" />
        <field name="music" type="string" indexed="true" stored="true" />
        <field name="movies" type="string" indexed="true" stored="true" />
        <field name="occupation" type="string" indexed="true" stored="true" />  
        <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
      </fields>
    
      <uniqueKey>user_id</uniqueKey>
    

    So what am I doing wrong? I imagine it may have something to do with the data-config.xml file. In it I don't know if a certain path to the MySQL driver is being assumed. I downloaded the MySQL JDBC driver from here:

    http://dev.mysql.com/downloads/connector/j/3.1.html

    and put it in my /solr/lib directory.

    When I downloaded the driver and extracted it there was a bunch of folders inside one folder called "mysql-connector-java-3.0.17-ga".

    I do notice that inside that folder there is a directory called com, inside that one called mysql, inside that one called jdbc, and inside that a file called Driver.class.

    Is this what is being referenced from data-config.xml? If so, why isn't the initial directory mentioned?

    Basically, I have no idea what the issue is. Can someone help, please?