Getting No loop matching the specified signature and casting error

Solution 1

Try specifying

dtype = 'float'

when the matrix is created. Example:

a=np.matrix([[1,2],[3,4]], dtype='float')

Hope this works!
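
To see why the dtype matters, here is a minimal sketch (not taken from the question's data) that feeds an object-dtype array to np.linalg.svd, the call that fails in the traceback, and then retries after a float cast. The variable names are illustrative, and the exact error text depends on the NumPy version, so only the exception type is checked.

```python
import numpy as np

# Numeric values stored as generic Python objects (dtype=object).
a_obj = np.array([[1, 2], [3, 4]], dtype=object)

try:
    np.linalg.svd(a_obj)
    svd_failed = False
except TypeError as exc:
    # The message differs between NumPy versions; the type is a TypeError.
    svd_failed = True
    print("svd failed:", type(exc).__name__)

# Casting to float gives the linear algebra routine a dtype it supports.
a_float = a_obj.astype(float)
singular_values = np.linalg.svd(a_float, compute_uv=False)
print(a_float.dtype, singular_values.shape)
```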

Solution 2

I faced a similar problem and solved it by specifying the dtype and flattening the array.

numpy version: 1.17.3

a = np.array(a, dtype=float)  # np.float was removed in NumPy 1.24; the builtin float is equivalent
a = a.flatten()
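
For reference, a small sketch of what the two steps do separately. The column-shaped input is an assumption made for illustration. flatten only returns a 1-D copy and leaves the dtype alone, so whether it is enough on its own presumably depends on how the input array was built.

```python
import numpy as np

# Hypothetical input: a column vector of numbers stored as Python objects.
col = np.array([[1], [2], [3]], dtype=object)

as_float = np.array(col, dtype=float)   # changes the dtype, keeps the shape
flat = as_float.flatten()               # 1-D copy, dtype unchanged
flat_only = col.flatten()               # still object dtype

print(as_float.shape, flat.shape, flat.dtype, flat_only.dtype)
```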

Solution 3

As suggested previously, you need to ensure X_opt is a float type. For example in your code, it would look like this:

X_opt = X[:, [0,1,2]]
X_opt = X_opt.astype(float)
regressor_OLS = sm.OLS(endog=y, exog=X_opt).fit()
regressor_OLS.summary()
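
If the cast itself might fail (for example because a placeholder string is still in the data), it can help to convert up front and fail with a clear message. The sketch below is one way to do that; as_float_array is a hypothetical helper, the data is made up, and np.linalg.lstsq stands in for the OLS fit so the example only needs NumPy.

```python
import numpy as np

def as_float_array(values, name="array"):
    """Hypothetical helper: cast to float64, failing loudly on non-numeric entries."""
    try:
        return np.asarray(values).astype(float)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"{name} contains non-numeric values: {exc}") from exc

# Made-up design matrix (with a constant column) and target, stored as objects.
X_raw = np.array([[1, 2.0, 3.5],
                  [1, 4.0, 1.5],
                  [1, 6.0, 2.5],
                  [1, 8.0, 0.5]], dtype=object)
y_raw = np.array([1.0, 2.0, 3.0, 4.0], dtype=object)

X_opt = as_float_array(X_raw, "X_opt")
y = as_float_array(y_raw, "y")

# Least-squares fit on the float arrays (a stand-in for sm.OLS(...).fit()).
coef, residuals, rank, sv = np.linalg.lstsq(X_opt, y, rcond=None)
print(X_opt.dtype, coef.shape)
```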

Solution 4

I was facing a similar problem when I used df.values[]:

y = df.values[:, 4]

I fixed the issue by using df.iloc[].values instead:

y = dataset.iloc[:, 4].values

When the DataFrame mixes column types, df.values[] returns an array with the object dtype:

array([192261.83, 191792.06, 191050.39, 182901.99, 166187.94, 156991.12,
   156122.51, 155752.6, 152211.77, 149759.96, 146121.95, 144259.4,
   141585.52, 134307.35, 132602.65, 129917.04, 126992.93, 125370.37,
   124266.9, 122776.86, 118474.03, 111313.02, 110352.25, 108733.99,
   108552.04, 107404.34, 105733.54, 105008.31, 103282.38, 101004.64,
   99937.59, 97483.56, 97427.84, 96778.92, 96712.8, 96479.51,
   90708.19, 89949.14, 81229.06, 81005.76, 78239.91, 77798.83,
   71498.49, 69758.98, 65200.33, 64926.08, 49490.75, 42559.73,
   35673.41, 14681.4], dtype=object)

but

df.iloc[:, 4].values returns a float array

which is what OLS() accepts:

regressor_OLS = sm.OLS(endog=y, exog=X_opt).fit()

OR

you can simply change the dtype of y before passing it to OLS():

y = np.array(y, dtype = float)
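
The difference can be reproduced with a tiny DataFrame. This is a sketch with hypothetical column names ("State", "Profit"); the profit values are the first three from the array above.

```python
import pandas as pd

# A frame with one text column and one float column.
df = pd.DataFrame({
    "State": ["New York", "California", "Florida"],   # hypothetical labels
    "Profit": [192261.83, 191792.06, 191050.39],
})

# df.values must hold strings and floats together, so it uses object dtype.
via_values = df.values[:, 1]

# Selecting the column first keeps its own float64 dtype.
via_iloc = df.iloc[:, 1].values

# The object array can also be cast explicitly.
converted = via_values.astype(float)

print(via_values.dtype, via_iloc.dtype, converted.dtype)
```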
Author: Shehan Ekanayake

Updated on July 10, 2022

Comments

  • Shehan Ekanayake almost 2 years

    I'm a beginner to Python and machine learning. I get the error below when I try to fit data with statsmodels.formula.api OLS.fit():

    Traceback (most recent call last):

    File "", line 47, in regressor_OLS = sm.OLS(y , X_opt).fit()

    File "E:\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 190, in fit self.pinv_wexog, singular_values = pinv_extended(self.wexog)

    File "E:\Anaconda\lib\site-packages\statsmodels\tools\tools.py", line 342, in pinv_extended u, s, vt = np.linalg.svd(X, 0)

    File "E:\Anaconda\lib\site-packages\numpy\linalg\linalg.py", line 1404, in svd u, s, vt = gufunc(a, signature=signature, extobj=extobj)

    TypeError: No loop matching the specified signature and casting was found for ufunc svd_n_s

    code

    #Importing Libraries
    import numpy as np # linear algebra
    import pandas as pd # data processing
    import matplotlib.pyplot as plt #Visualization
    
    
    #Importing the dataset
    dataset = pd.read_csv('Video_Games_Sales_as_at_22_Dec_2016.csv')
    #dataset.head(10) 
    
    #Encoding categorical data using the pandas get_dummies function. Easier and more straightforward than OneHotEncoder in sklearn
    #dataset = pd.get_dummies(data = dataset , columns=['Platform' , 'Genre' , 'Rating' ] , drop_first = True ) #drop_first is used to avoid the dummy variable trap
    
    
    dataset=dataset.replace('tbd',np.nan)
    
    #Separating Independent & Dependent Variables
    #X = pd.concat([dataset.iloc[:,[11,13]], dataset.iloc[:,13: ]] , axis=1).values  #Getting important  variables
    X = dataset.iloc[:,[10,12]].values
    y = dataset.iloc[:,9].values #Dependent Variable (Global sales)
    
    
    #Taking care of missing data
    from sklearn.preprocessing import Imputer
    imputer =  Imputer(missing_values = 'NaN' , strategy = 'mean' , axis = 0)
    imputer = imputer.fit(X[:,0:2])
    X[:,0:2] = imputer.transform(X[:,0:2])
    
    
    #Splitting the dataset into the Training set and Test set
    from sklearn.cross_validation import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 0.2 , random_state = 0)
    
    #Fitting Multiple Linear Regression to the Training Set
    from sklearn.linear_model import LinearRegression
    regressor = LinearRegression()
    regressor.fit(X_train,y_train)
    
    #Predicting the Test set Result
    y_pred = regressor.predict(X_test)
    
    
    #Building the optimal model using Backward Elimination (p=0.050)
    import statsmodels.formula.api as sm
    X = np.append(arr = np.ones((16719,1)).astype(float) , values = X , axis = 1)
    
    X_opt = X[:, [0,1,2]]
    regressor_OLS = sm.OLS(y , X_opt).fit()
    regressor_OLS.summary() 
    

    Dataset

    dataset link

    I couldn't find anything helpful to solve this issue on Stack Overflow or Google.

  • Dwa about 3 years
    I love how I can quickly come to Stackoverflow and most of the time get some quick solutions to problems that bewildered me for a long time...
  • pauljohn32 over 2 years
    flatten worked for me, without the dtype cast. Thanks. Can you please explain why you suggested flatten and what it is doing?