How do I get the coordinates of an image shown in OpenCV?


Solution 1

I was working on your other, related question when you deleted it and see you are having performance issues in locating the ball. As your ball appears to be on a nice, simple white background (apart from the score and the close button at top right), there are easier/faster ways of finding the ball.

First, work in greyscale so that you only have 1 channel, instead of 3 channels of RGB to process - that is generally faster.

Then, overwrite the score and menu at top-right with white pixels so that the only thing left in the image is the ball. Now invert the image so that all the whites become black, then you can use findNonZero() to find anything that is not the background, i.e. the ball.

Now find the lowest and highest coordinates in the y-direction and average them to get the vertical centre of the ball, then do the same in the x-direction for the horizontal centre.

#!/usr/bin/env python3

import cv2

# Load image - work in greyscale so there is 1 channel to process instead of 3
im = cv2.imread('ball.png',cv2.IMREAD_GRAYSCALE)

# Overwrite "Current Best" with white - these numbers will vary depending on what you capture
im[134:400,447:714] = 255

# Overwrite menu and "Close" button at top-right with white - these numbers will vary depending on what you capture
im[3:107,1494:1726] = 255

# Negate image so whites become black
im=255-im

# Find anything not black, i.e. the ball
nz = cv2.findNonZero(im)

# Find left, right, top and bottom edges of ball - findNonZero() returns points as (x,y)
a = nz[:,0,0].min()   # left
b = nz[:,0,0].max()   # right
c = nz[:,0,1].min()   # top
d = nz[:,0,1].max()   # bottom
print('a:{}, b:{}, c:{}, d:{}'.format(a,b,c,d))

# Average left and right edges, top and bottom edges, to give centre as x,y
c0 = (a+b)/2
c1 = (c+d)/2
print('Ball centre: {},{}'.format(c0,c1))

That gives:

a:442, b:688, c:1063, d:1304
Ball centre: 565.0,1183.5

Drawing a red box at those coordinates shows:

[image: the ball outlined by a red box]

The processing takes 845 microseconds on my Mac, i.e. less than a millisecond, which corresponds to 1,183 frames per second. The time to grab the screen comes on top of that, of course, but I can't control that.

Note that you could also resize the image down by a factor of say 4 (or maybe 8 or 16) in each direction and still be sure of finding the ball and that may make it even faster.
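
Here is a rough sketch of that downscaling idea. Note that it swaps in plain NumPy strided slicing (keep every 4th row and column) rather than `cv2.resize()`, which would also work. The function name `find_centre_downscaled` and the synthetic image are assumptions for illustration, and the input is expected to be the already-inverted greyscale image, so the background is 0.

```python
import numpy as np

def find_centre_downscaled(inverted, factor=4):
    """Estimate the centre of the non-zero region using every `factor`-th pixel."""
    # Keep every factor-th row and column, so there are factor**2 times fewer pixels
    small = inverted[::factor, ::factor]
    # Row (y) and column (x) indices of everything that is not background
    ys, xs = np.nonzero(small)
    if xs.size == 0:
        return None  # nothing found
    # Centre in the small image, scaled back up to full-resolution coordinates
    cx = (xs.min() + xs.max()) / 2 * factor
    cy = (ys.min() + ys.max()) / 2 * factor
    return cx, cy

# Synthetic inverted image: black background with a 240x240 bright "ball"
demo = np.zeros((800, 1200), np.uint8)
demo[400:640, 200:440] = 255
centre = find_centre_downscaled(demo, factor=4)
```

The result is only accurate to within about `factor` pixels, which should be plenty for clicking on a ball a couple of hundred pixels across.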

Keywords: Ball, track, tracking, locating, finding, position of, image, image processing, python, OpenCV, numpy, bounding box, bbox.

Solution 2

You can do it like this:

1. Crop an image of the ball from a screenshot, something like:

import cv2

img = cv2.imread("screenshot.jpg")
# x, y, w, h describe the rectangle around the ball - you will have to find them by trial and error
crop_img = img[y:y+h, x:x+w]

2. Use template matching to find where the ball is in your image

3. Get the point in the middle of the resulting rectangle and move your mouse there

I hope this helps. If you need more help with any of these steps, feel free to ask.

Author by

Lomore

hobby coder

Updated on July 19, 2022

Comments

  • Lomore
    Lomore almost 2 years

    Sorry, but the title doesn't really make sense.

    I am trying to make an AI that clicks on the ball to make it bounce. For context, here's a picture of the application: [image: screenshot of the application]

    In the game, when you click the ball it goes up and then comes back down, and the aim of the game is to keep it up.

    I have written some code that turns the image into a mask with OpenCV. Here's a picture of the result:

    [image: the resulting mask]

    What I now need to do is find the location of the ball in pixels/coordinates so I can make the mouse move to it and click it. By the way, the ball has a margin on the left and right of it, so it doesn't just go straight up and down but left and right too. Also, the ball isn't animated, it's just a moving image.

    How would I get the ball location in pixels/coordinates so I can move the mouse to it?

    Here's a copy of my code:

    import numpy as np
    from PIL import ImageGrab
    import cv2
    import time
    import pyautogui
    
    
    def draw_lines(img,lines):
        # HoughLinesP returns None when it finds no lines
        if lines is None:
            return
        for line in lines:
            coords = line[0]
            cv2.line(img, (coords[0], coords[1]), (coords[2], coords[3]), [255,255,255], 3)
    
    def process_img(original_image):
        processed_img = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
        processed_img = cv2.Canny(processed_img, threshold1=200, threshold2=300)
        vertices = np.array([[0,0],[0,800],[850,800],[850,0]
                             ], np.int32)
        processed_img = roi(processed_img, [vertices])
    
        # more info: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
        # min length and max gap are passed by keyword: positionally, the 5th argument is the output array `lines`
        #                          edges       rho   theta   thresh
        lines = cv2.HoughLinesP(processed_img, 1, np.pi/180, 180, minLineLength=20, maxLineGap=15)
        draw_lines(processed_img,lines)
        return processed_img
    
    def roi(img, vertices):
        #blank mask:
        mask = np.zeros_like(img)
        # fill the mask
        cv2.fillPoly(mask, vertices, 255)
        # now only show the area that is the mask
        masked = cv2.bitwise_and(img, mask)
        return masked
    def main():
        last_time = time.time()
        while(True):
            screen =  np.array(ImageGrab.grab(bbox=(0,40, 800, 850)))
            new_screen = process_img(screen)
            print('Loop took {} seconds'.format(time.time()-last_time))
            last_time = time.time()
            cv2.imshow('window', new_screen)
            #cv2.imshow('window2', cv2.cvtColor(screen, cv2.COLOR_BGR2RGB))
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break
    
    def mouse_movement():
        ##Set to move relative to where ball is
        pyautogui.moveTo(300,400)
        pyautogui.click()

    main()
    

    Sorry if this is confusing but brain.exe has stopped working :( Thanks