I want to know the size of bounding box in object-detection api

Solution 1

Just to extend Beta's answer:

You can get the predicted bounding boxes from the detection graph. An example is given in the Tutorial IPython notebook on GitHub, which is where Beta's code snippet comes from. Access the detection_graph and extract the coordinates of the predicted bounding boxes from the tensor.

By calling np.squeeze(boxes) you reshape them to (m, 4), where m denotes the number of predicted boxes. You can now access the boxes and compute the length, area, or whatever you want.
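
To make the reshaping concrete, here is a minimal sketch using a hand-written array in place of real model output. The values and the name raw_boxes are made up for illustration; only np.squeeze itself comes from the answer.

```python
import numpy as np

# Hypothetical raw output for one image with three predicted boxes.
# Shape (1, 3, 4): batch of 1, 3 boxes, [ymin, xmin, ymax, xmax] each.
raw_boxes = np.array([[[0.10, 0.20, 0.50, 0.60],
                       [0.00, 0.00, 1.00, 1.00],
                       [0.25, 0.25, 0.75, 0.50]]], dtype=np.float32)

# np.squeeze drops every axis of size 1, here the batch axis.
boxes_2d = np.squeeze(raw_boxes)

# Each row is now one box, so a single box is a plain index away.
first_box = boxes_2d[0]
print(boxes_2d.shape, first_box)
```

One caveat: if there happens to be exactly one box, np.squeeze would also drop the box axis and return shape (4,). Passing axis=0 removes only the batch axis and avoids that surprise.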

But remember that the predicted box coordinates are normalized! They are in the following order:

[ymin, xmin, ymax, xmax]

So computing the length in pixels would be something like:

def length_of_bounding_box(bbox):
    # bbox is [ymin, xmin, ymax, xmax], normalized; IMG_WIDTH is the image width in pixels
    return (bbox[3] - bbox[1]) * IMG_WIDTH
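
The same idea extends to the vertical extent and the area. The following is a sketch rather than anything from the API: the function name box_size_in_pixels and its parameters are hypothetical, and it assumes the [ymin, xmin, ymax, xmax] order described above.

```python
def box_size_in_pixels(bbox, img_width, img_height):
    """Return (width, height, area) in pixels for a normalized box.

    bbox is assumed to be [ymin, xmin, ymax, xmax] with values in [0, 1].
    """
    ymin, xmin, ymax, xmax = bbox
    width = (xmax - xmin) * img_width      # horizontal extent uses the x values
    height = (ymax - ymin) * img_height    # vertical extent uses the y values
    return width, height, width * height


# Example with made-up values: a box covering the right half of the width
# and the middle half of the height of a 640x480 image.
w, h, area = box_size_in_pixels([0.25, 0.5, 0.75, 1.0], img_width=640, img_height=480)
print(w, h, area)
```

Passing the image size as arguments instead of relying on a global such as IMG_WIDTH keeps the function usable for images of different sizes.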

Solution 2

I wrote a full answer on how to find the bounding box coordinates here and thought it might be useful to someone on this thread too.

Google Object Detection API returns bounding boxes in the format [ymin, xmin, ymax, xmax] and in normalised form (full explanation here). To find the (x, y) pixel coordinates, multiply the x values by the width of the image and the y values by its height. First get the width and height of your image:

width, height = image.size

Then, extract ymin,xmin,ymax,xmax from the boxes object and multiply to get the (x,y) coordinates:

ymin = boxes[0][i][0]*height
xmin = boxes[0][i][1]*width
ymax = boxes[0][i][2]*height
xmax = boxes[0][i][3]*width

Finally print the coordinates of the box corners:

print('Top left')
print((xmin, ymin))
print('Bottom right')
print((xmax, ymax))
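
The snippets above use an index i without showing where it comes from. A minimal sketch of the full loop, assuming boxes has shape (1, m, 4) as described; the sample values stand in for real detections and for image.size:

```python
# Hypothetical stand-ins for the values used above.
width, height = 640, 480                  # as would be returned by image.size
boxes = [[[0.25, 0.5, 0.75, 1.0],         # shape (1, m, 4), normalized
          [0.0, 0.0, 0.5, 0.25]]]

corners = []
for i in range(len(boxes[0])):
    ymin, xmin, ymax, xmax = boxes[0][i]
    # Scale x by the width and y by the height, then round to whole pixels.
    top_left = (int(round(xmin * width)), int(round(ymin * height)))
    bottom_right = (int(round(xmax * width)), int(round(ymax * height)))
    corners.append((top_left, bottom_right))

for top_left, bottom_right in corners:
    print('Top left', top_left, 'Bottom right', bottom_right)
```

Rounding to integers is a choice made here because drawing and cropping functions usually expect whole pixel coordinates; keep the floats if you need sub-pixel precision.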

Solution 3

You can fetch the boxes tensor like this:

boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

and similarly for scores and classes.

Then evaluate them in a session run:

(boxes, scores, classes) = sess.run(
              [boxes, scores, classes],
              feed_dict={image_tensor: imageFile})
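
The run returns boxes for every detection slot, not only for confident detections (a comment further down reports 100 boxes per image), so it usually makes sense to filter by score before measuring anything. A sketch with made-up arrays shaped like the outputs above; the 0.5 cutoff and the name MIN_SCORE are assumptions, not part of the API:

```python
import numpy as np

# Hypothetical outputs shaped like those returned by sess.run for one image.
boxes = np.array([[[0.1, 0.2, 0.5, 0.6],
                   [0.3, 0.3, 0.9, 0.8],
                   [0.0, 0.0, 0.0, 0.0]]])
scores = np.array([[0.92, 0.35, 0.0]])

MIN_SCORE = 0.5  # assumed cutoff; tune it for your model

# Boolean mask over the detections of the first (and only) image.
keep = scores[0] >= MIN_SCORE
confident_boxes = boxes[0][keep]

print(confident_boxes)
```

Each row of confident_boxes is still normalized [ymin, xmin, ymax, xmax], so it has to be scaled by the image size to get pixel lengths.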

Author: SUN JIAWEI
Updated on August 24, 2022

Comments

  • SUN JIAWEI over 1 year

    I have used the API

    (https://github.com/tensorflow/models/tree/master/object_detection)

    How can I find the length of a bounding box?

    I have run the Tutorial IPython notebook on GitHub in real time,
    but I don't know which command to use to calculate the length of the boxes.

  • SUN JIAWEI over 6 years
    Thanks for the detailed answer! But how can I "Access the detection_graph and extract the coordinates of the predicted bounding boxes from the tensor"? I don't clearly understand the code.
  • SUN JIAWEI over 6 years
    Thanks for the detailed answer! However, after fetching boxes, how can I get the length in the session?
  • Beta over 6 years
    @SUNJIAWEI: If you check the boxes values, it will give you the coordinates of the objects. Suppose you are checking if the image has a person or not. The boxes give you the coordinates (or location) where the person exists in the image. If you want just length of the person, you can extract the length of the person.
  • Gal_M over 6 years
    @SUNJIAWEI you access the coordinates of the ith box with boxes[0][i]
  • KolaB about 6 years
    @iTiger I'm not sure if something has changed in tensorflow between the dates of the answers here (and below), but I was not getting the expected results when following the [xmin, ymin, xmax, ymax] convention. I had a look at the source code of the draw_bounding_box_on_image_array function on github on 08/02/2018 using this link ( github.com/tensorflow/models/blob/master/research/… ) and the order is [ymin, xmin, ymax, xmax]
  • ITiger about 6 years
    @KolaB yes it looks like they changed the order of the coordinates. Thanks for your comment! I'll edit my answer.
  • Aniket Bote over 5 years
    @ITiger the link you specified is broken; I guess it was for an older version. I think the code in object_detection.ipynb has also changed since then. What changes should I make to the new file to get the coordinates of the bounding box? When I print output_dict['detection_boxes'] I get 100 tuples for every test image, even for those on which the model failed to predict anything. Help is appreciated.