How to transform numpy.matrix or array to scipy sparse matrix

140,633

Solution 1

You can pass a numpy array or matrix as an argument when initializing a sparse matrix. For a CSR matrix, for example, you can do the following.

>>> import numpy as np
>>> from scipy import sparse
>>> A = np.array([[1,2,0],[0,0,3],[1,0,4]])
>>> B = np.matrix([[1,2,0],[0,0,3],[1,0,4]])

>>> A
array([[1, 2, 0],
       [0, 0, 3],
       [1, 0, 4]])

>>> sA = sparse.csr_matrix(A)   # Here's the initialization of the sparse matrix.
>>> sB = sparse.csr_matrix(B)

>>> sA
<3x3 sparse matrix of type '<type 'numpy.int32'>'
        with 5 stored elements in Compressed Sparse Row format>

>>> print sA
  (0, 0)        1
  (0, 1)        2
  (1, 2)        3
  (2, 0)        1
  (2, 2)        4

Solution 2

There are several sparse matrix classes in scipy.

bsr_matrix(arg1[, shape, dtype, copy, blocksize]) Block Sparse Row matrix
coo_matrix(arg1[, shape, dtype, copy]) A sparse matrix in COOrdinate format.
csc_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Column matrix
csr_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Row matrix
dia_matrix(arg1[, shape, dtype, copy]) Sparse matrix with DIAgonal storage
dok_matrix(arg1[, shape, dtype, copy]) Dictionary Of Keys based sparse matrix.
lil_matrix(arg1[, shape, dtype, copy]) Row-based linked list sparse matrix

Any of them can do the conversion.

import numpy as np
from scipy import sparse
a=np.array([[1,0,1],[0,0,1]])
b=sparse.csr_matrix(a)
print(b)

(0, 0)  1
(0, 2)  1
(1, 2)  1

See http://docs.scipy.org/doc/scipy/reference/sparse.html#usage-information .

Solution 3

In Python, the Scipy library can be used to convert the 2-D NumPy matrix into a Sparse matrix. SciPy 2-D sparse matrix package for numeric data is scipy.sparse

The scipy.sparse package provides different Classes to create the following types of Sparse matrices from the 2-dimensional matrix:

  1. Block Sparse Row matrix
  2. A sparse matrix in COOrdinate format.
  3. Compressed Sparse Column matrix
  4. Compressed Sparse Row matrix
  5. Sparse matrix with DIAgonal storage
  6. Dictionary Of Keys based sparse matrix.
  7. Row-based list of lists sparse matrix
  8. This class provides a base class for all sparse matrices.

CSR (Compressed Sparse Row) or CSC (Compressed Sparse Column) formats support efficient access and matrix operations.

Example code to Convert Numpy matrix into Compressed Sparse Column(CSC) matrix & Compressed Sparse Row (CSR) matrix using Scipy classes:

import sys                 # Return the size of an object in bytes
import numpy as np         # To create 2 dimentional matrix
from scipy.sparse import csr_matrix, csc_matrix 
# csr_matrix: used to create compressed sparse row matrix from Matrix
# csc_matrix: used to create compressed sparse column matrix from Matrix

create a 2-D Numpy matrix

A = np.array([[1, 0, 0, 0, 0, 0],\
              [0, 0, 2, 0, 0, 1],\
              [0, 0, 0, 2, 0, 0]])
print("Dense matrix representation: \n", A)
print("Memory utilised (bytes): ", sys.getsizeof(A))
print("Type of the object", type(A))

Print the matrix & other details:

Dense matrix representation: 
 [[1 0 0 0 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
Memory utilised (bytes):  184
Type of the object <class 'numpy.ndarray'>

Converting Matrix A to the Compressed sparse row matrix representation using csr_matrix Class:

S = csr_matrix(A)
print("Sparse 'row' matrix: \n",S)
print("Memory utilised (bytes): ", sys.getsizeof(S))
print("Type of the object", type(S))

The output of print statements:

Sparse 'row' matrix:
(0, 0) 1
(1, 2) 2
(1, 5) 1
(2, 3) 2
Memory utilised (bytes): 56
Type of the object: <class 'scipy.sparse.csr.csc_matrix'>

Converting Matrix A to Compressed Sparse Column matrix representation using csc_matrix Class:

S = csc_matrix(A)
print("Sparse 'column' matrix: \n",S)
print("Memory utilised (bytes): ", sys.getsizeof(S))
print("Type of the object", type(S))

The output of print statements:

Sparse 'column' matrix:
(0, 0) 1
(1, 2) 2
(2, 3) 2
(1, 5) 1
Memory utilised (bytes): 56
Type of the object: <class 'scipy.sparse.csc.csc_matrix'>

As it can be seen the size of the compressed matrices is 56 bytes and the original matrix size is 184 bytes.

For a more detailed explanation and code examples please refer to this article: https://limitlessdatascience.wordpress.com/2020/11/26/sparse-matrix-in-machine-learning/

Share:
140,633

Related videos on Youtube

Flake
Author by

Flake

A programmer.

Updated on July 08, 2022

Comments

  • Flake
    Flake almost 2 years

    For SciPy sparse matrix, one can use todense() or toarray() to transform to NumPy matrix or array. What are the functions to do the inverse?

    I searched, but got no idea what keywords should be the right hit.

  • Virgil Ming
    Virgil Ming almost 6 years
    The question asks how to generate scipy sparse matrix using numpy matrix/array, not inverse as matrix operation.
  • Nirmal
    Nirmal over 5 years
    What about higher dimensional arrays?
  • Martin Thoma
    Martin Thoma about 5 years
    I get a memory error for my matrix (~25,000x25,000). Also, the memory consumption jumps like crazy when I apply sparse.csr_matrix