Randomly copy a certain number of files of a certain type from one directory into another
Solution 1
You could use shuf:
shuf -zn8 -e *.jpg | xargs -0 cp -vt target/
shuf shuffles the list of *.jpg files in the current directory.
-z zero-terminates each line, so that files with special characters are treated correctly.
-n8 makes shuf exit after 8 files.
xargs -0 reads the input delimited by a null character (from shuf -z) and runs cp.
-v prints every copy verbosely.
-t specifies the target directory.
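For comparison, the same shuffle-then-copy idea can be sketched in Python. This is only an illustration, not part of the answer above: the function name `copy_random_files` and its parameters are hypothetical, and it assumes the files sit directly in one directory.

```python
import random
import shutil
from pathlib import Path


def copy_random_files(source_dir, target_dir, count=8, pattern="*.jpg"):
    """Copy up to `count` randomly chosen files matching `pattern`.

    Hypothetical helper for illustration; returns the chosen source paths.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Path.glob does the matching itself, so file names with spaces,
    # newlines or leading dashes need no special quoting.
    candidates = [p for p in source.glob(pattern) if p.is_file()]

    # random.sample picks without replacement; cap the count so that a
    # folder with fewer matches than requested does not raise ValueError.
    chosen = random.sample(candidates, min(count, len(candidates)))

    for path in chosen:
        shutil.copy2(path, target / path.name)
    return chosen
```

Calling `copy_random_files(".", "target", count=8)` would roughly correspond to the one-liner, with the difference that the file list never passes through a command line.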
Solution 2
The top answer didn't work for me at all, because -e *.jpg doesn't actually look into the working directory. It's just an expression, so shuf doesn't shuffle anything...
I found the following improvement based on what I learned in that post.
find /some/dir/ -type f -name "*.jpg" -print0 | xargs -0 shuf -e -n 8 -z | xargs -0 cp -vt /target/dir/
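A rough Python counterpart of this find-based pipeline is sketched below. It is an assumption-laden illustration rather than a translation of the command: the name `copy_random_files_recursive` is hypothetical, and skipping name collisions is a design choice made here, not something the pipeline does. Because the candidate list is held in memory, it is not subject to any command-line length limit.

```python
import random
import shutil
from pathlib import Path


def copy_random_files_recursive(source_dir, target_dir, count=8, pattern="*.jpg"):
    """Copy up to `count` random files found anywhere below `source_dir`.

    Hypothetical helper for illustration; returns the destination paths.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # rglob descends into subdirectories, similar to
    # `find -type f -name "*.jpg"`.
    candidates = [p for p in source.rglob(pattern) if p.is_file()]
    chosen = random.sample(candidates, min(count, len(candidates)))

    copied = []
    for path in chosen:
        destination = target / path.name
        # Files from different subdirectories can share a name; this
        # sketch skips such collisions instead of overwriting.
        if destination.exists():
            continue
        shutil.copy2(path, destination)
        copied.append(destination)
    return copied
```

The target directory should live outside the source tree, otherwise previously copied files become candidates on the next run.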
Solution 3
You can also do this with Python.
Here is a Python script I use to move a random percentage of images, along with the associated label files typically required for CV image datasets. Note that this moves the files, because I do not want my test images left in my training dataset.
I use the script below for YOLO training sets, where labels and images are in the same directory and the labels are .txt files.
import os
import random

# Set directories
directory = '/MauiData/maui_complete_sf_train'
target_directory = '/MauiData/maui_complete_sf_test'
data_set_percent_size = 0.07

# List all files in the directory that are images
files = [f for f in os.listdir(directory) if f.endswith('.jpg')]

# Select a percentage of the files at random, without replacement
random_files = random.sample(files, int(len(files) * data_set_percent_size))

# Move each selected image by renaming it into the target directory
for random_file_name in random_files:
    os.rename(os.path.join(directory, random_file_name),
              os.path.join(target_directory, random_file_name))

# Move the matching label file for each selected image
for image_name in random_files:
    # Strip the extension and add .txt to find the corresponding label file
    label_name = os.path.splitext(image_name)[0] + '.txt'
    os.rename(os.path.join(directory, label_name),
              os.path.join(target_directory, label_name))
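The script above could also be wrapped in a function, sketched here under a few assumptions that are not in the original: the name `move_random_split` is hypothetical, a `seed` parameter is added for repeatable splits, an image without a label file is moved on its own, and `shutil.move` replaces `os.rename` because `os.rename` fails when the two directories are on different filesystems.

```python
import random
import shutil
from pathlib import Path


def move_random_split(source_dir, target_dir, fraction, seed=None):
    """Move a random fraction of .jpg images together with their .txt labels.

    Hypothetical helper for illustration; returns the moved image names.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Sort first so that a given seed always yields the same selection.
    images = sorted(p for p in source.iterdir()
                    if p.is_file() and p.suffix == ".jpg")
    rng = random.Random(seed)
    chosen = rng.sample(images, int(len(images) * fraction))

    moved = []
    for image in chosen:
        label = image.with_suffix(".txt")
        # shutil.move also works across filesystems, unlike os.rename.
        shutil.move(str(image), str(target / image.name))
        # Assumption: an image without a label file is still moved.
        if label.exists():
            shutil.move(str(label), str(target / label.name))
        moved.append(image.name)
    return moved
```

Moving each image and its label in the same loop iteration keeps the pair together even if the run is interrupted partway through.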
Solution 4
You could retrieve files in this way:
files=(/tmp/*.jpg)
n=${#files[@]}
file_to_retrieve="${files[RANDOM % n]}"
cp -- "$file_to_retrieve" <destination>
Run this in a loop 8 times. Note that each pick is independent, so the same file can be chosen more than once.
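Picking a random index on every loop iteration is sampling with replacement, so repeats are possible. The difference from a shuffle-based pick can be sketched in Python as follows; both function names are hypothetical and exist only for this comparison.

```python
import random


def pick_with_replacement(items, count, rng=random):
    """Mimic `files[RANDOM % n]` run `count` times: repeats are possible."""
    return [items[rng.randrange(len(items))] for _ in range(count)]


def pick_without_replacement(items, count, rng=random):
    """Pick distinct items, as a shuffle followed by taking the first few does."""
    return rng.sample(items, min(count, len(items)))
```

With only a handful of files in the folder, the first variant will often return fewer than 8 distinct names, while the second returns as many distinct names as the folder allows.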
Admin
Updated on September 18, 2022
Comments
-
Admin almost 2 years
Sometimes I have a folder full of JPGs and I need to randomly choose 8 or so of them. How could I automate this so my account randomly chooses 8 JPGs from the folder and copies them to another destination?
My question is simple really: instead of using cp and giving it a file name and then a destination file name, I want to build a script that randomly chooses 8 of the .jpg files in the folder and copies those to another folder.
-
roaima over 6 years
The -e *.jpg expects a set of matching files in the current directory. If there are no matches it will (usually) return the single literal *.jpg to shuf, which then has only one element to consider.
-
gented over 5 years
So essentially rather than an answer you provide a list of variable names.
-
havakok over 4 years
What if some of the file names start with -? I tried shuf -zn8 -e *.jpg | xargs -0 cp -vt -- {} target/ to no avail.
-
Asad Aizaz over 4 years
Thank you for this solution; it works with a large number of files, as opposed to the accepted solution.
-
Jake Ireland over 3 years
For anyone who has found this answer and has seen @havakok's question: they also asked the question here and obtained an answer: unix.stackexchange.com/a/544902/372726
-
Manu CJ over 3 years
This solution does not work with a large number of files. Halavus' answer solves the problem in that case.
-
Phlogi over 3 years
If you are on macOS, first install coreutils for the shuf command (brew install coreutils), then use:
find /some/dir/ -type f -name "*.jpg" -print0 | xargs -0 shuf -e -n 8 -z | xargs -0 -J % cp -v % /your/target/dir
-
Admin about 2 years
For me it is the best answer, because for a huge directory with thousands of images the @chaos answer fails.