Sunday, June 26, 2011

Splitting file based on line numbers

http://www.expertsheaven.com/split-file-based-on-line-numbers-in-unix/

This script is useful when you need to split a huge file by line or record numbers. Many common file splitters cut by size (bytes, KB, MB), which does not let you choose the exact lines or records where each piece starts and ends.
Steps to use the script:
  1. Save the script below as lsplit.ksh

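    #!/bin/ksh
    # Usage: lsplit.ksh <input-file>
    # Reads "start,end" line ranges from the properties file and writes
    # each range to <input-file>.split.<n>.
    propDIR=.
    propFile=$propDIR/SSNRange.txt.prop
    inpFile=$1
    date
    count=1
    while read line
    do
        # Strip blanks so that an entry such as "1505, 7000" still works
        startLineNo=`echo "$line" | cut -f1 -d, | tr -d ' '`
        endLineNo=`echo "$line" | cut -f2 -d, | tr -d ' '`
        if [ "$endLineNo" != "" -a "$startLineNo" != "" ]; then
            echo "Cut here from $startLineNo to $endLineNo"
            sed -n "${startLineNo},${endLineNo}p" "$inpFile" > "$inpFile.split.$count"
            count=`expr $count + 1`
        fi
    done < "$propFile"
    date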
  2. Create a properties file named SSNRange.txt.prop in the directory you run the script from. Each line holds one range of lines (records) in the form start,end. For example:
    1,400
    401,1504
    1505,7000
  3. Run the script, passing the file to split as the only argument
    $ ksh lsplit.ksh infile.txt
  4. Three output files will be created, one per range:
    • infile.txt.split.1: lines 1 to 400
    • infile.txt.split.2: lines 401 to 1504
    • infile.txt.split.3: lines 1505 to 7000
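To try the steps above on something small before pointing the script at a huge file, here is a minimal self-contained sketch. The directory name lsplit_demo, the 20-line sample input, and the three ranges are made up for illustration; the loop is a compact variant of the same sed approach (it lets `read` split on the comma and adds `q` so sed stops after the last requested line), not the original script itself.

```shell
#!/bin/sh
# Hypothetical demo: the directory, file contents, and ranges are illustrative only.
mkdir -p lsplit_demo

# Build a 20-line sample input: "record 1" .. "record 20".
awk 'BEGIN { for (i = 1; i <= 20; i++) print "record " i }' > lsplit_demo/infile.txt

# Ranges file in the same start,end format as SSNRange.txt.prop.
printf '%s\n' '1,5' '6,12' '13,20' > lsplit_demo/SSNRange.txt.prop

count=1
while IFS=, read -r startLineNo endLineNo
do
    if [ -n "$startLineNo" ] && [ -n "$endLineNo" ]; then
        # Print only the requested range, then quit at its last line.
        sed -n "${startLineNo},${endLineNo}p;${endLineNo}q" \
            lsplit_demo/infile.txt > "lsplit_demo/infile.txt.split.$count"
        count=$((count + 1))
    fi
done < lsplit_demo/SSNRange.txt.prop

# Report how many lines landed in each piece.
wc -l lsplit_demo/infile.txt.split.*
```

The three pieces should hold 5, 7, and 8 lines, which add up to the 20 lines of the input. Checking the counts with wc -l like this is a quick way to confirm that the ranges cover the file with no gaps or overlaps.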
Advantages of this script:
  • The file is split by line (record) numbers rather than by size.
  • No manual editing is needed to correct the first and last records of each piece.
  • It is easy to run in batch jobs.
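As one possible way to use this in batch, the sketch below wraps the same sed loop in a function and applies a single ranges file to several inputs. The function name split_by_ranges, the directory lsplit_batch_demo, and the sample files are all hypothetical and are not part of the original script.

```shell
#!/bin/sh
# Hypothetical batch demo: directory, file names, and ranges are illustrative only.
mkdir -p lsplit_batch_demo

# Two small sample inputs of 10 lines each.
for name in a.txt b.txt; do
    awk -v n="$name" 'BEGIN { for (i = 1; i <= 10; i++) print n " line " i }' \
        > "lsplit_batch_demo/$name"
done

# One shared ranges file, one start,end pair per line.
printf '%s\n' '1,4' '5,10' > lsplit_batch_demo/ranges.prop

# split_by_ranges FILE RANGES: write FILE.split.N for each range in RANGES.
split_by_ranges() {
    count=1
    while IFS=, read -r start end
    do
        if [ -n "$start" ] && [ -n "$end" ]; then
            sed -n "${start},${end}p;${end}q" "$1" > "$1.split.$count"
            count=$((count + 1))
        fi
    done < "$2"
}

# Apply the same ranges to every input file in the batch.
for f in lsplit_batch_demo/*.txt; do
    split_by_ranges "$f" lsplit_batch_demo/ranges.prop
done

ls lsplit_batch_demo
```

Each input ends up with its own .split.1 and .split.2 files next to it, so the pieces of different inputs never overwrite one another.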
