Oracle Data Provider for .NET: Connection request timed out

34,849

Solution 1

We had a similar issue, and it took a while to debug this and fix this. Our code on getting stressed with many input files, and many threads processing, each thread using Entity framework and opening Oracle db connection, and doing a bunch of db queries and inserts, used to file occasionally. But works most of the time.

I modified out DbContext constructor to explicitly open the OracleConnection. I added some code like this

for (i = 0; i < 5; i++)
   try {
       oracleConnection.Open();
   } catch (OracleException) {
     Sleep for 15 ms and retry. 
     On last attempt I also do OracleConnection.ClearAllPools()
   }

It improved, but still didn't solve it completely. I broke in the catch from debugger, and saw that many threads are trying to open and few threads are processing away. On Open in Oracle stack, Oracle for its internal purpose does ThreadPool.QueueUserWorkItem and waits for its completion. I can see on top of stack its wait. Here plenty of pooled connections are available (default is 100), I am hardly using 10. So it is not out of resource.

But the issue is in our code also we used ThreadPool.QueueUserWorkItem with no additional throttling. I thought that was cool to just queue all jobs we need to do, how much ever we need to this, and let .NET take care of this. But this has subtle issue. All our jobs have consumed the full queue count. When OracleConnection wants to get a pooled connection from the pool, it also queues to the thread pool. But it is never going to complete. Our jobs are all waiting for OracleConnection.Open, and its Queued Thread proc will still be in queue. So finally the wait will exit by timeout. It is a pity that even though there is plenty of pooled connection is available, we have consumed all ThreadPool proc, and Oracle's threadpool didn't even get a chance. Here setting ThreadPool.SetMaxThreads also isn't going to help. The issue is still the same. We hog all thread pool resource, and Orcale isn't going to find one and will still be in queue.

The fix is not to rely on only ThreadPool, but we add our own throttling as well. I used BlockingCollection and sempahores and add only some limit number of concurrent jobs in ThreadPool, say 5. In this way OracleConnection will always find a ThreadPool thread available, and wont fail.

Solution 2

Even I used to get this issue more Frequently, even after the use of Connection.Close()

After a long Analysis I have learnt few thing as mentioned below

  1. Connection.Close() doesnt dispose the db connection
  2. Connection Time out doesnt mean that issue is only with db query
  3. Connection Time out could also be due to exhaustive connections in the connection pool (which was the culprit for me as it reached the maximum sessions of the db connection)

Fix :- Analysis took long time but fix is of just 2 mins

using(DbConnection instance)
{

}

Eg :-

using (DbConnection  objDbConnection = new DbConnection())
{
   ojDbConnection.PersistData();
}

Under PersistData(); All the db Operations like Open, close e.tc. will be performed

As we all knows "Using" is short form of

try
{

}
catch()
{

}
Finally
{
  Dispose objDbConnection;
}

Hope it helps, as it helped me

Solution 3

try adding connection.close() at the end. I don't see releasing connections in your code and returning them to connection pool explicitly. So connection is being returned to connection pool only when GC is started.

Share:
34,849
DjD
Author by

DjD

Updated on July 18, 2022

Comments

  • DjD
    DjD almost 2 years

    We have a C# WCF web service hosted on Windows 2008 SP2/IIS 7 accessing an Oracle database. Usually data access works fine but during load testing, it often times out and logs and exception saying:

    Error occurred when processing XXXXXXXX Web Service
    Oracle.DataAccess.Client.OracleException Connection request timed out at Oracle.DataAccess.Client.OracleException.HandleErrorHelper(Int32 errCode, OracleConnection conn, IntPtr opsErrCtx, OpoSqlValCtx* pOpoSqlValCtx, Object src, String procedure, Boolean bCheck)
       at Oracle.DataAccess.Client.OracleException.HandleError(Int32 errCode, OracleConnection conn, IntPtr opsErrCtx, Object src)
       at Oracle.DataAccess.Client.OracleConnection.Open()
       at MyWorkspace.WorkForceDataAccess.CheckStaffIdInRSW()
       at MyWorkspace.MyClass.MyFunction(MyDataType MyData)
    

    To query the database, we use something like this:

    OracleConnection orConn = new OracleConnection();
    orConn.ConnectionString = "user id=xxx; password=xxx; Connection Timeout=600; Max Pool Size=150; data source= (DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = MYHOST.MYDOMAIN.com)(PORT = 1771)) (CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = MYSERVICE.MYDOMAIN.com)))";
    orConn.Open();
    
    using (var cmd = new OracleCommand("MY_UTIL.check_StaffIdInRSW", orConn) { CommandType = CommandType.StoredProcedure })
    {
        cmd.Parameters.Add("P_Staff_Id", OracleDbType.Int32);
        cmd.Parameters["P_Staff_Id"].Direction = ParameterDirection.Input;
        cmd.Parameters["P_Staff_Id"].Value = Convert.ToInt32(MyDataObject.StaffId);
    
        cmd.Parameters.Add("P_retvalue", OracleDbType.Int32);
        cmd.Parameters["P_retvalue"].Direction = ParameterDirection.Output;
    
        cmd.ExecuteNonQuery(); // Execute the function
    
        //obtain result
        returnVal = int.Parse(cmd.Parameters["P_retvalue"].Value.ToString());
    }
    

    I am pretty confident that the stored procedure that is being invoked is not taking all the time. It is a pretty simple procedure that quickly checks if the P_Staff_Id exists in the table and returns the result.

    Additionally, this occurs only during load testing. During normal operations things are fine but during heavy loads with 1 message per second, this occurs after running smooth for some time.

    As a workaround, I have added "Connection Timeout=600; Max Pool Size=150“ to the connection string, but that did not fix the issue.

    We have the same application running on a development server and it works fine. We never encountered this problem there.

    Any suggestions as to what to try would be appreciated. It looks like I am running out of options.

    • Husqvik
      Husqvik over 8 years
      The stacktrace suggest that the procedure is not the issue. The exception is raised before its execution within Connection.Open, so it looks like the database machine is overloaded so it can't respond to client within timeout period. It shouldn't be related to pool size or processes limitations in Oracle, these throw different exceptions. Also I would be suspicious about that pool size because it wouldn't make sense to have pool significantly larger than number of cores database can use. Or you have connection leak somewhere.
    • DjD
      DjD over 8 years
      I added the Connection Timeout and Max Pool Size to the connection string after this issue came up - but it didn't help. The web service was working fine in the DEV environment without these. By connection leak, would you suggest closing and disposing the OracleConnection object explicitly after it is used?
    • Husqvik
      Husqvik over 8 years
      Ad connection leak - if the connection object has just short live within single function then using (var connection = ...) {...} is definitely safer. But I don't expect this is the issue. You would get different exception when pool is fully used. Ad load testing - I expect you run many instances of the application or function in parallel. Also expect that you use dedicated connections, not shared servers as Oracle setting. Can you check how the sessions look in the database during the test to see how many sessions and how many active sessions there actually are.
  • DjD
    DjD about 7 years
    We implemented something like this connection. close() to close connections explicitly and that seemed to work. Thank you all for your help.
  • Nicolas
    Nicolas about 6 years
    Many thanks to this answer! Excellent insight into the inner mechanics creating a connection. I am facing the same issue and a long time it wasn't clear to me that the cause of the problem was not the database code, but a different part of the application!