How do I remove rows/columns from this matrix using Python?

12,873

Solution 1

The Basic Idea

Here's what I came up with:

>>> import numpy as np
>>> l = [['hotel','good','bad'],['hilton',1,2],['ramada',3,4]]
>>> a = np.array(l) # convert to a numpy array to make multi-dimensional slicing possible
>>> a
array([['hotel', 'good', 'bad'],
       ['hilton', '1', '2'],
       ['ramada', '3', '4']], 
      dtype='|S4')
>>> a[1:,1:] # exclude the first row and the first column
array([['1', '2'],
       ['3', '4']], 
      dtype='|S4')
>>> a[1:,1:].astype(np.float32) # convert to float
array([[ 1.,  2.],
       [ 3.,  4.]], dtype=float32)

You can pass your 2D list to the numpy.array constructor, slice the resulting array to drop the first row and the first column, and then call astype to convert every entry to a float.

All on one line, that'd be:

>>> l = [['hotel','good','bad'],['hilton',1,2],['ramada',3,4]]
>>> np.array(l)[1:,1:].astype(np.float32)
array([[ 1.,  2.],
       [ 3.,  4.]], dtype=float32)
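
Slicing works here because the row and column being dropped are the first ones. As a rough sketch for dropping a row or column from the middle instead, np.delete returns a copy of the array without the given index along an axis. The array name grid and the indices chosen below are made up for illustration.

```python
import numpy as np

# Hypothetical 3x4 numeric array, used only for illustration.
grid = np.arange(12).reshape(3, 4)

# Drop row 1, then column 2; np.delete returns a new array each time
# and leaves the original untouched.
without_row = np.delete(grid, 1, axis=0)
trimmed = np.delete(without_row, 2, axis=1)

print(trimmed)
```

For the first row and column specifically, plain slicing as shown above is simpler and avoids the intermediate copy.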

The ValueError

You're getting a ValueError because you actually have a jagged array. Using the variable new_list from the code in your question you can prove this to yourself:

>>> [len(x) for x in new_list]
[9, 9, 9, 9, 9, 9, 9, 9, 9, 8]

The last row has only 8 entries instead of 9 like all the others. Given a jagged 2D list, the numpy.array constructor creates a 1D array with a dtype of object, and the entries in that array are Python lists. The astype call then tries to convert those Python lists to float32, which fails. I'm guessing this was just a case of human error. If you fill in the missing entry, you should be good to go.
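
If the data is too large to check by eye, a small helper can point at the offending rows before the list ever reaches NumPy. This is only a sketch: find_ragged_rows and the sample data are hypothetical names, and it assumes the most common row length is the correct one. As far as I know, recent NumPy releases refuse to build a ragged array at all unless dtype=object is given, so there the error may surface at np.array rather than at astype.

```python
from collections import Counter


def find_ragged_rows(rows):
    """Return (index, length) pairs for rows whose length differs
    from the most common row length."""
    if not rows:
        return []
    lengths = [len(row) for row in rows]
    # Assumption: the most frequent length is the intended one.
    expected = Counter(lengths).most_common(1)[0][0]
    return [(i, n) for i, n in enumerate(lengths) if n != expected]


# Hypothetical data: the last row is one entry short.
sample = [['1', '2', '3'], ['4', '5', '6'], ['7', '8']]
print(find_ragged_rows(sample))
```

Calling it on new_list from the question should flag the last row, the one with 8 entries.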

Solution 2

If you have a list of lists, then:

new_list = [row[1:] for row in current_list[1:]]

This builds a new matrix that skips the first row and, for each remaining row, skips the first column.

If it happened to be a numpy.array, then you could use:

your_array[1:,1:]
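
The list-of-lists approach can also do the string-to-float conversion in the same pass, without NumPy. A minimal sketch, assuming every remaining field parses as a number; the sample below is a cut-down, hypothetical version of the data in the question.

```python
# Hypothetical sample shaped like the data in the question: a header row,
# a label column, and numeric fields stored as strings with stray spaces.
current_list = [
    ['Hotel', ' "excellent"', ' "very good"'],
    ['westin', ' 390', ' 291'],
    ['ramada', ' 136', ' 67 '],
]

# Skip the header row and the label column, converting each field to float.
# float() ignores leading and trailing whitespace in the string.
new_list = [[float(value) for value in row[1:]] for row in current_list[1:]]

print(new_list)
```

A field that is not numeric makes float() raise a ValueError, which at least names the offending string.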
Author by

banditKing

Updated on June 05, 2022

Comments

  • banditKing
    banditKing almost 2 years

    My matrix looks like this.

     ['Hotel', ' "excellent"', ' "very good"', ' "average"', ' "poor"', ' "terrible"', ' "cheapest"', ' "rank"', ' "total reviews"']
     ['westin', ' 390', ' 291', ' 70', ' 43', ' 19', ' 215', ' 27', ' 813']
     ['ramada', ' 136', ' 67', ' 53', ' 30', ' 24', ' 149', ' 49', ' 310 ']
     ['sutton place', '489', ' 293', ' 106', ' 39', ' 20', ' 299', ' 24', ' 947']
     ['loden', ' 681', ' 134', ' 17', ' 5', ' 0', ' 199', ' 4', ' 837']
     ['hampton inn downtown', ' 241', ' 166', ' 26', ' 5', ' 1', ' 159', ' 21', ' 439']
     ['shangri la', ' 332', ' 45', ' 20', ' 8', ' 2', ' 325', ' 8', ' 407']
     ['residence inn marriott', ' 22', ' 15', ' 5', ' 0', ' 0', ' 179', ' 35', ' 42']
     ['pan pacific', ' 475', ' 262', ' 86', ' 29', ' 16', ' 249', ' 15', ' 868']
     ['sheraton wall center', ' 277', ' 346', ' 150', ' 80', ' 26', ' 249', ' 45', ' 879']
     ['westin bayshore', ' 390', ' 291', ' 70', ' 43', ' 19', ' 199', ' 813']
    

    I want to remove the top row and the 0th column from this and create a new matrix.

    How do I do this?

    Normally, in Java or similar, I'd use the following code:

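     for (int y = 0; y < matrix.length; y++)
         for (int x = 0; x < matrix[y].length; x++)
          {
            if (x == 0 || y == 0)
             {
               continue;   // skip the first row and the first column
             }
             else
              {
                // shift both indices by one so the copy starts at [0][0]
                new_matrix[y - 1][x - 1] = matrix[y][x];
              }
          }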
    

    Is there a way such as this in python to iterate and selectively copy elements?

    Thanks


    EDIT

    I'm also trying to convert each matrix element from a string to a float as I iterate over the matrix.

    This is my updated code, based on the answer below.

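    import csv
    import numpy as np

    # read every row of the CSV file into a list of lists
    A = []
    f = open("csv_test.csv", 'rt')
    try:
        reader = csv.reader(f)
        for row in reader:
            A.append(row)
    finally:
        f.close()

    # drop the header row and the first column, then try to convert to float
    new_list = [row[1:] for row in A[1:]]
    l = np.array(new_list)
    l.astype(np.float32)
    print(l)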
    

    However, I'm getting an error:

      --> l.astype(np.float32)
           print l
    
    
          ValueError: setting an array element with a sequence.
    
  • banditKing
    banditKing over 11 years
    Thanks for your suggestions. I forgot one thing; if you could help, that would be awesome. Please see the EDIT above.
  • banditKing
    banditKing over 11 years
    Thanks, I tried this, but I'm getting an error. I updated the "EDIT" section above and would appreciate your suggestions.