  • Most popular hadoop commands

    Here is a list of the most popular HDFS (Hadoop) commands for managing your HDFS files.

    List the files in an HDFS directory

    hadoop fs -ls hdfs_path
    hdfs dfs -ls hdfs_path

    Create a directory

    hadoop fs -mkdir hdfs://user/new_folder/

    Create a directory tree

    hadoop fs -mkdir -p hdfs://user/new_folder/new_subfolder/

    Copy HDFS files

    hadoop fs -cp source_hdfs_path target_hdfs_path

    Usage of the cp command in detail:

    hadoop fs -cp [-f] [-p | -p[topax]] URI [URI …] <dest>

    This command also accepts multiple sources, in which case the destination must be a directory.

  • Run hadoop command in Python

    Hadoop is a widely used platform for big data analysis. It is easy to run Hadoop commands in a shell or from a shell script. However, there is often a need to manipulate HDFS files directly from Python. The examples here show how to run Hadoop commands from Python to list, retrieve, and save HDFS files.

    We already know how to call an external shell command from Python, so we can simply call Hadoop commands through the run_cmd helper.
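
    The run_cmd helper itself is not shown in this excerpt. A minimal sketch of what such a helper could look like, assuming Python 3.7+ and the standard subprocess module (the name follows the text, but the signature and the returned tuple are assumptions):

```python
import subprocess


def run_cmd(args_list):
    """Run an external command and return (return_code, stdout, stderr).

    Assumed signature: the command is given as a list of arguments,
    for example ["hadoop", "fs", "-ls", "/some/path"].
    """
    # A list of arguments (rather than one shell string) avoids quoting issues.
    proc = subprocess.run(args_list, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr
```

    With this sketch, run_cmd(["hadoop", "fs", "-ls", hdfs_path]) would return the listing as the second element of the tuple, provided the hadoop executable is on the PATH. If the executable is missing, subprocess.run raises FileNotFoundError.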

    Run Hadoop ls command in Python
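
    One possible way to wrap the ls command is sketched here. The run_cmd helper is a hypothetical stand-in for the one mentioned in the text, hdfs_ls and parse_ls_paths are illustrative names, and the parser assumes the usual eight-column listing (permissions, replicas, owner, group, size, date, time, path), which may differ between Hadoop versions:

```python
import subprocess


def run_cmd(args_list):
    """Assumed helper: run a command and return (return_code, stdout, stderr)."""
    proc = subprocess.run(args_list, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def parse_ls_paths(ls_output):
    """Extract the path column from the text printed by `hadoop fs -ls`."""
    paths = []
    for line in ls_output.splitlines():
        # Splitting at most 7 times keeps paths that contain spaces intact.
        fields = line.strip().split(None, 7)
        if len(fields) == 8:  # skips the "Found N items" header and blank lines
            paths.append(fields[7])
    return paths


def hdfs_ls(hdfs_path, runner=run_cmd):
    """List an HDFS directory and return the paths it contains."""
    rc, out, err = runner(["hadoop", "fs", "-ls", hdfs_path])
    if rc != 0:
        raise RuntimeError("hadoop fs -ls failed: " + err.strip())
    return parse_ls_paths(out)
```

    The runner argument exists only so the function can be exercised without a cluster; in normal use hdfs_ls(hdfs_path) calls the hadoop executable directly.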

     

    Run Hadoop get command in Python
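
    A sketch of how the get command might be wrapped, copying a file from HDFS to the local file system. As before, run_cmd is an assumed helper, and hdfs_get with its runner parameter is a hypothetical name chosen for illustration:

```python
import subprocess


def run_cmd(args_list):
    """Assumed helper: run a command and return (return_code, stdout, stderr)."""
    proc = subprocess.run(args_list, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def hdfs_get(hdfs_path, local_path, runner=run_cmd):
    """Copy hdfs_path to local_path using `hadoop fs -get`.

    Returns the command that was run; raises RuntimeError on a non-zero exit.
    """
    cmd = ["hadoop", "fs", "-get", hdfs_path, local_path]
    rc, out, err = runner(cmd)
    if rc != 0:
        raise RuntimeError("hadoop fs -get failed: " + err.strip())
    return cmd
```

    Raising on a non-zero return code is a design choice of this sketch; returning the code and letting the caller decide would work equally well.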

    Run Hadoop put command in Python
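
    The put direction can be sketched the same way, uploading a local file to HDFS. The run_cmd helper is again an assumption, hdfs_put and its overwrite parameter are hypothetical names, and the -f flag is assumed to be available in the installed Hadoop version:

```python
import subprocess


def run_cmd(args_list):
    """Assumed helper: run a command and return (return_code, stdout, stderr)."""
    proc = subprocess.run(args_list, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def hdfs_put(local_path, hdfs_path, overwrite=False, runner=run_cmd):
    """Copy local_path to hdfs_path using `hadoop fs -put`.

    Returns the command that was run; raises RuntimeError on a non-zero exit.
    """
    cmd = ["hadoop", "fs", "-put"]
    if overwrite:
        cmd.append("-f")  # -f overwrites the destination if it already exists
    cmd += [local_path, hdfs_path]
    rc, out, err = runner(cmd)
    if rc != 0:
        raise RuntimeError("hadoop fs -put failed: " + err.strip())
    return cmd
```

    Without overwrite=True the command is expected to fail when the destination already exists, which surfaces here as a RuntimeError.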

     
