How to split a 6 or 7 GB file into several sub-2 GB files without splitting an entry?

Solution 1

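If no single line is longer than 2 GB, GNU split can do this directly (bigfile stands in for your input file):

split --line-bytes=2GB bigfile

GNU split reads 2GB as 2 × 10^9 bytes; 2G would mean 2 × 1024^3 bytes.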

From the info manual:

‘--line-bytes=SIZE’
 Put into each output file as many complete lines of INPUT as
 possible without exceeding SIZE bytes.  Individual lines or records
 longer than SIZE bytes are broken into multiple files.
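
A scaled-down sketch of the same call may make the behaviour easier to check before running it on a multi-gigabyte file. It assumes GNU coreutils split; sample.txt, the part_ prefix and the 100-byte limit are made-up values for illustration only.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 500-byte sample: 50 lines of 10 bytes each (9 characters plus newline).
for i in $(seq 1 50); do printf 'line %04d\n' "$i"; done > sample.txt

# Same flag as for the real file, with a 100-byte limit instead of 2GB.
# For the real input this would be: split --line-bytes=2GB bigfile part_
split --line-bytes=100 sample.txt part_

# Show the size of each piece; none should exceed the limit.
wc -c part_*

# Concatenating the pieces in name order should reproduce the input.
cat part_* | cmp - sample.txt && echo "pieces reassemble to the original"
```

The pieces are named part_aa, part_ab, and so on, so a plain shell glob lists them in the order needed to reassemble the input.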

Solution 2

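I believe this will provide almost what you need: split -n with the l/N form, which splits the input into N files without splitting lines. It targets a number of files rather than a size, so N has to be large enough that every piece stays under 2 GB.

split -n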

-n, --number=CHUNKS
              generate CHUNKS output files.


CHUNKS may be: 
N       split into N files based on size of input
K/N     output Kth of N to stdout
l/N     split into N files without splitting lines
l/K/N   output Kth of N to stdout without splitting lines
r/N     like 'l' but use round robin distribution
r/K/N   likewise but only output Kth of N to stdout
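
One way to pick N for the l/N form is to derive it from the file size and the size limit. The sketch below is scaled down and rests on assumptions: it needs GNU split (for -n), the names sample.txt and chunk_ and the 120-byte limit are invented for illustration, and the one extra chunk of headroom is a heuristic, not a guarantee, because l/N keeps lines whole and a piece can overshoot its even share by up to about one line.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 500-byte sample: 50 lines of 10 bytes each.
for i in $(seq 1 50); do printf 'line %04d\n' "$i"; done > sample.txt

limit=120                      # stand-in for the real 2 GB limit, in bytes
size=$(wc -c < sample.txt)     # input size in bytes

# Ceiling of size/limit, plus one extra chunk as headroom (heuristic).
n=$(( (size + limit - 1) / limit + 1 ))

# Split into n pieces on line boundaries.
split -n "l/$n" sample.txt chunk_

# Check the result: every piece should be at or below the limit.
wc -c chunk_*
```

If any piece still comes out above the limit, for example because of very long lines, increasing n and splitting again is the simplest fix; --line-bytes avoids the guesswork when it is available.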

Author by

Admin

Updated on September 18, 2022

Comments

  • Admin
    Admin over 1 year

    My level has files of 6 to 10 GB as input. These files contain several lines of data. The next level's maximum input size is 2 GB, so I have to split these 6-10 GB files into several sub-2 GB files without breaking lines. Basically, I have to split a file by size without breaking lines.

  • Kip
    Kip about 2 years
    split on OS X does not support this parameter :(