Importing a CSV file into Hadoop


Solution 1

Importing the CSV file takes two steps:

  1. Copy the CSV file to the Hadoop sandbox (/home/username) using WinSCP or Cyberduck.
  2. Use the -put command to copy the file from the local file system into HDFS.

        hdfs dfs -put /home/username/file.csv /user/data/file.csv
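
The question also asks how to check the result afterward. A minimal sketch, assuming the same example paths as above: the helper name `upload_and_check` and the `HDFS_BIN` variable are hypothetical and exist only so the commands can be dry-run without a cluster. It uploads with -put, confirms the path exists with -test -e, and lists it with -ls.

```shell
# Assumption: HDFS_BIN names the hdfs client; override it to dry-run the commands.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: upload a local file, then confirm it arrived in HDFS.
upload_and_check() {
    src="$1"    # local path, e.g. /home/username/file.csv
    dest="$2"   # HDFS path, e.g. /user/data/file.csv

    # Copy the local file into HDFS; stop if the upload fails.
    "$HDFS_BIN" dfs -put "$src" "$dest" || return 1

    # -test -e exits with status 0 only when the HDFS path exists.
    "$HDFS_BIN" dfs -test -e "$dest" || return 1

    # Show the file's size, owner and modification time.
    "$HDFS_BIN" dfs -ls "$dest"
}

# Example call (needs a running cluster):
# upload_and_check /home/username/file.csv /user/data/file.csv
```

To look at the contents as well, something like `hdfs dfs -cat /user/data/file.csv | head -n 5` should print the first few rows.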
    

Solution 2

Three flags can be used to load data from the local machine into HDFS:

-copyFromLocal

We use this flag to copy data from the local file system to the Hadoop directory.

hdfs dfs -copyFromLocal /home/username/file.csv /user/data/file.csv

If the target folder has not been created yet, we can create it as the HDFS or root user:

hdfs dfs -mkdir /user/data
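
If it is unclear whether the folder already exists, the check can be scripted. This is a rough sketch rather than a required step: `ensure_hdfs_dir` and `HDFS_BIN` are hypothetical names, and it assumes the current user has permission to create the directory.

```shell
# Assumption: HDFS_BIN names the hdfs client; override it to dry-run the commands.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: create an HDFS directory only when it is missing.
ensure_hdfs_dir() {
    dir="$1"    # HDFS directory, e.g. /user/data

    # -test -d succeeds only if the path exists and is a directory.
    if "$HDFS_BIN" dfs -test -d "$dir"; then
        echo "exists: $dir"
    else
        # -p also creates any missing parent directories.
        "$HDFS_BIN" dfs -mkdir -p "$dir"
    fi
}

# Example call (needs a running cluster):
# ensure_hdfs_dir /user/data
```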

-put

As @Sam mentioned in the answer above, the -put flag also copies data from the local file system to the Hadoop directory.

hdfs dfs -put /home/username/file.csv /user/data/file.csv

-moveFromLocal

The -moveFromLocal flag also copies data from the local file system to the Hadoop directory, but it removes the file from the local directory afterward.

hdfs dfs -moveFromLocal /home/username/file.csv /user/data/file.csv
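
The three flags take the same source and destination arguments, so the choice between them can be wrapped in one place. The sketch below is only illustrative: `load_into_hdfs`, its mode names (copy, put, move) and `HDFS_BIN` are hypothetical and not part of Hadoop.

```shell
# Assumption: HDFS_BIN names the hdfs client; override it to dry-run the commands.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: pick one of the three flags by a short mode name.
load_into_hdfs() {
    mode="$1"   # copy | put | move
    src="$2"    # local path
    dest="$3"   # HDFS path

    case "$mode" in
        copy) flag="-copyFromLocal" ;;   # keeps the local file
        put)  flag="-put" ;;             # keeps the local file
        move) flag="-moveFromLocal" ;;   # deletes the local file afterward
        *)    echo "unknown mode: $mode" >&2; return 2 ;;
    esac

    "$HDFS_BIN" dfs "$flag" "$src" "$dest"
}

# Example call (needs a running cluster):
# load_into_hdfs move /home/username/file.csv /user/data/file.csv
```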
Author by

akaliza

Updated on December 22, 2020

Comments

  • akaliza, over 3 years ago

    I am new to Hadoop. I have a file to import into Hadoop via the command line (I access the machine through SSH).

    How can I import the file into Hadoop? How can I check afterward (with which command)?