Fast image normalisation in Python

Your timings seem very slow to me. Perhaps something is wrong with your install?

I tried this test program:

#!/usr/bin/python3

import sys
import numpy as np
import cv2
from PIL import Image
from profilehooks import profile

@profile
def try_numpy(img):
    # convert to float32 once, then min/max-normalise 1000 times
    ar = np.array(img).astype(np.float32)
    for i in range(1000):
        mn = np.min(ar)
        mx = np.max(ar)
        norm = (ar - mn) * (1.0 / (mx - mn))

@profile
def try_cv2(img):
    # let OpenCV do the min/max search and rescaling in one call
    for i in range(1000):
        norm = cv2.normalize(img, None, alpha=0, beta=1,
                             norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)

img = Image.open(sys.argv[1])
try_numpy(img)

img = cv2.imread(sys.argv[1])
try_cv2(img)

And on this modest 2015 i5 laptop running Ubuntu 19.04 I see:

$ ./try291.py ~/pics/150x150.png 
*** PROFILER RESULTS ***
try_cv2 (./try291.py:17)
function called 1 times

         1002 function calls in 0.119 seconds

   Ordered by: cumulative time, internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.119    0.119 try291.py:17(try_cv2)
     1000    0.118    0.000    0.118    0.000 {normalize}

*** PROFILER RESULTS ***
try_numpy (./try291.py:9)
function called 1 times

         10067 function calls in 0.113 seconds

   Ordered by: cumulative time, internal time, call count
   List reduced from 52 to 40 due to restriction <40>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.064    0.064    0.113    0.113 try291.py:9(try_numpy)
     2000    0.004    0.000    0.045    0.000 fromnumeric.py:69(_wrapreduction)

So they both take about 0.1ms per call, ~50x faster than the numbers you see.

To speed it up further:

  • Do you have any a priori knowledge about the range of pixel values? If so, you could skip the search for the max and min entirely.
  • Depending on your sampling density, it could be faster to normalize the whole input image, then cut out your 150x150 patches afterwards. Both ideas are sketched below.
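For instance, here is a minimal sketch of both ideas. The image shape, the 0-255 assumption (valid for 8-bit input) and the patch coordinates are illustrative stand-ins, not taken from your pipeline:

import numpy as np

# stand-in for a full-size 8-bit RGB input image
img = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)

# 1. known range: 8-bit input can only hold 0-255, so skip the
#    min/max search and scale by a constant
norm = img.astype(np.float32) * (1.0 / 255.0)

# 2. normalise the whole image once, then slice out 150x150 patches;
#    numpy slices are views, so each patch costs almost nothing
coords = [(0, 0), (0, 150), (150, 300)]  # hypothetical sample positions
patches = [norm[y:y + 150, x:x + 150] for y, x in coords]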

Comments

  • Pradip Gupta almost 2 years

    I am looking for a faster approach to normalising images in Python. I want to convert all pixels to values between 0 and 1.

    INPUT: 150x150 RGB images in JPEG format.

    OS/HARDWARE: LINUX/P40 GPU with 8GB RAM

    USE-CASE: Image Preprocessing for a real-time classification task.

    Current time per image is ~5-10 milliseconds. I am looking for a method that can reduce this time.

    I tried two approaches, with numpy and opencv.

    Using numpy (Approx time: 8ms):

    norm = (img - np.min(img)) / (np.max(img) - np.min(img))
    

    Using opencv (Approx time: 3ms):

    norm = cv2.normalize(img, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
    

    Both of these methods are too slow for my use-case. Can anyone point me to a faster method for image normalisation?

    • BlackBear about 5 years
      What is your use-case exactly? How big are your images? How fast do you want it to be? (If numpy is too slow, either Python is the wrong language or your expectations are unrealistic.)
    • Marcin Zablocki about 5 years
      First of all, you're invoking np.min twice, which is a waste of time if you're that concerned about performance.
    • Marcin Zablocki about 5 years
      Also, please explain what type your img is.
    • Mark Setchell about 5 years
      Please indicate your image dimensions, how long it currently takes, how long you need it to take and give an idea of your OS and whether you are running on a Raspberry Pi or a supercomputer.
    • Pradip Gupta about 5 years
      Hi, I have updated the question.
    • Pradip Gupta about 5 years
      @MarcinZablocki: I tried that but it doesn't make much difference. I think the divide operation takes a big chunk of time there.
    • Mark Setchell about 5 years
      What is your use case? And your OS and hardware please?
  • Simon Caby about 5 years
    Sorry, but you're comparing try_numpy, which uses a division, with cv2, which (thank god) doesn't. Actually, cv2 is a bit slower than the best solution, norm = (img - mn) * (1.0 / (mx - mn)), which is about 3% faster than cv2 (for some unknown reason...). Also, you don't have a clue what data type is used in try_numpy; if you force float32, it will be much better. (The one-shot form is sketched after these comments.)
  • jcupitt about 5 years
    Oh, thanks! I've updated it.
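
For reference, a minimal sketch of the one-shot form the comments converge on (and which the updated try_numpy above uses), assuming img is an 8-bit array already loaded with cv2.imread or similar:

import numpy as np

# stand-in for a loaded 150x150 RGB image; replace with your own input
img = np.random.randint(0, 256, (150, 150, 3), dtype=np.uint8)

ar = img.astype(np.float32)           # force float32 up front
mn, mx = ar.min(), ar.max()           # compute each extremum exactly once
norm = (ar - mn) * (1.0 / (mx - mn))  # multiply by the reciprocal instead of dividing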