Algorithm to find the total number of connected sets in a matrix

19,476

Solution 1

I don't think you will need to think of it as a general graph problem and apply any algorithm such as BFS or DFS.

You will need to do three scans of the matrix.

scan 1:

start from the top

  1. give every number each 1 with 1..n, in you example the first row would after that step would look like

    1 0 0 2

  2. go to the next line and for every 1 in the row check if the neighbor to your left is non-0
    • if non-0 take on the value to the left
    • if 0 check for non-0 neighbors in the previous line and take on the value of the left most one
    • if all of those are 0 that simply add 1 to the maximum number given so far
  3. repeat 2 until last line has been processed

and your example should look like follows

1 0 0 2
0 0 2 0
0 0 2 0
3 0 0 2

scan 2:

start from the bottom check if each neighbor has the same number as the left most neighbor as well as the same number as the neighbor in the row below it

basically if you have a matrix like this

1 0 2

1 0 2

0 1 0

to check ensure that a set has really the same number

scan 3:

count the number of unique non-0 entries in the matrix

Solution 2

Connected-component labeling algorithm is intended to mark out connected groups of elements (both for 4-connectivity and for 8-connectivity)

Solution 3

Pythonic Implementation, More understandable code:

# sea is 2 D array of 0 and 1s we have to find 1's group surrounded by 0's
def dfs(sea, i, j, b, h, visited):
    surround = ((-1, -1), (0, -1), (1, -1),
                (-1, 0), (1, 0),
                (-1, 1), (0, 1), (1, 1)
                )
    if can_visit(sea, i, j, b, h, visited):
        for s in surround:
            visited[(i, j)] = 1
            dfs(sea, i + s[0], j + s[1], b, h, visited)


def can_visit(sea, i, j, b, h, visited):
    if i >= 0 and j >= 0 and i < b and j < h:
        if (i, j) not in visited and sea[i][j] == 1:
            return True


def find_island(sea):
    visited = {}
    h = len(sea)
    count = 0
    for i, row in enumerate(sea):
        b = len(row)
        for j, item in enumerate(row):
            if can_visit(sea, i, j, b, h, visited):
                count += 1
                dfs(sea, i, j, b, h, visited)
    return count


sea = [[1, 1, 0, 0, 0],
       [0, 1, 0, 0, 1],
       [1, 0, 0, 1, 1],
       [0, 0, 0, 0, 0],
       [1, 0, 1, 0, 1]
       ]

print find_island(sea)

Solution 4

There are 3 connected sets. All 1 which are neighbors of each other are considered as one single set. All 1 at a[1,4], a[2,3], a[3,3] and a[4,4] form one set and one at a[1,1] form one set and one at a[4,1] forms one set.

Solution 5

You want to use a disjoint set datastructure and algorithm. This will pick a unique representative for each connected component, which you can count at the end.

To efficiently evaluate which elements are neighbors, you can scan the matrix line by line, maintaining a list of segments (of consecutive 1's) from the previous line, while determining which segments on the current line are adjacent to them.

Share:
19,476
user1484638
Author by

user1484638

Updated on June 04, 2022

Comments

  • user1484638
    user1484638 almost 2 years

    i wanted to know which algorithm should i apply here. Would a DFS do?

    Given a 2–d matrix. Find the total number of connected sets in that matrix.

    Connected set can be defined as group of cell(s) which has 1 mentioned on it and have at least one other cell in that set with which they share the neighbor relationship. A cell with 1 in it and no surrounding neighbor having 1 in it can be considered as a set with one cell in it. Neighbors can be defined as all the cells adjacent to the given cell in 8 possible directions (i.e. N, W, E, S, NE, NW, SE, SW direction). A cell is not a neighbor of itself.

    For example:

    1 0 0 1
    
    0 0 1 0
    
    0 0 1 0
    
    1 0 0 1
    

    number of connected sets is 3

    0 0 1 0 0 1 0 0
    
    1 0 0 0 0 0 0 1
    
    0 0 1 0 0 1 0 1
    
    0 1 0 0 0 1 0 0
    
    1 0 0 0 0 0 0 0
    
    0 0 1 1 0 1 1 0
    
    1 0 1 1 0 1 1 0
    
    0 0 0 0 0 0 0 0
    

    number of connected set is 9.

  • Saeed Amiri
    Saeed Amiri almost 12 years
    I can't see how you will assign representative element to specific set, such that all other elements in the set will be descendants of that element.
  • comingstorm
    comingstorm almost 12 years
    A nutshell description of classic disjoint set operations: each disjoint set is represented as an inverted tree, with parent links only. To find the representative element of a set, follow the parent links to find the root of the tree. To join two trees, find the root of each, and make one the parent of the other.
  • Saeed Amiri
    Saeed Amiri almost 12 years
    I know what is disjoint sets, one of a most important methods in disjoint sets is Find method (see wiki), In fact I asked you how you will implement Find method? (By this problem statement.).
  • comingstorm
    comingstorm almost 12 years
    The root of the tree is the representative element. The wikipedia link describes this, as well as the optimizations necessary to achieve amortized O(inverseAckermann(N)) performance.
  • user2979190
    user2979190 over 10 years
    Can anyone please explain the second step of Scan 1
  • PolyMesh
    PolyMesh almost 9 years
    I found this easy to read and implement compared to the other answers. Thank you!
  • aerin
    aerin almost 7 years
    I think this is the best answer here!