Fastest way to retrieve data from database

47,993

Solution 1

Here are 2 suggestions that will help. The first suggestion is what will improve your performance significantly. The second suggestion will help also, though probably not make your app faster in your case.

Suggestion 1

You call the method GetDataFromDB(string query) very often. This is bad because you create a new SqlConnection and SqlCommand each time. This takes time and resources. Also, if there is any network delay, that is multiplied by the number of calls you are making. So it's just a bad idea.

I suggest that you call that method once and have it populate a collection like a Dictionary<int, string> so that you can quickly look up your Username value from the user id key.

Like this:

// In the DataField class, have this code.
// This method will query the database for all usernames and user ids and
// return a Dictionary<int, string> where the key is the Id and the value is the 
// username. Make this a global variable within the DataField class.
Dictionary<int, string> usernameDict = GetDataFromDB("select id, username from Users");

// Then in the GetValue(int userId) method, do this:
public string GetValue(int userId)
{
    // Add some error handling and whatnot. 
    // And a better name for this method is GetUsername(int userId)
    return this.usernameDict[userId];
}

Suggestion 2

Here is another way that you can improve things, though slightly in this case—use the StringBuilder class. There are significant performance gains (here is an overview: http://support.microsoft.com/kb/306822).

SringBuilder sb = new StringBuilder();
sb.Append("<table><tr><th>Username</th>");
foreach (DataField f in fields)
{
    sb.Append("<th>" + f.Name + "</th>");
}

// Then, when you need the string
string html = sb.ToString();

Let me know if you need some more clarification, but what you are asking for is very do-able. We can work this out!

If you make these 2 simple changes, you will have great performance. I guarantee it.

Solution 2

The database design you choose is named Entity-Attribute-Value, a design that is well known for its performance problems. SQL Server team has release a whitepaper for guidance around EAV design, see Best Practices for Semantic Data Modeling for Performance and Scalability.

Alas, you already have the design in place and what can you do about it now? The important thing is to reduce the miriad calls to the dB to one single call, and execute one single set oriented statement to retrieve the data. The name of the game is Table Valued Parameters:

declare @users as UsersType;

insert into @users (UserId) values (7), (42), (89);

select ut.Id, 
  ut.Username, 
  df.Name as DataFieldName, 
  dfv.Value
from Users ut
join @users up on ut.Id = up.UserId
join DataFieldValues dfv on ut.Id = dfv.UserId
join DataFields df on dfv.DataFieldId = df.Id
order by ut.Id;

For a full example, see this SqlFiddle.

While, strictly speaking, it is possible to retrieve a result on the shape you desire (data field names transposed as column names) using the PIVOT operator, I would very strongly advise against doing so. PIVOT on its own is a performance quagmire, but when you add the dynamic nature of the desired result set is basically impossible to pull it off. The traditional result set consisting of one-row-per attribute is trivial to parse into a table, because the required order by user Id guarantees a clean break between sets of correlated attributes.

Solution 3

This is slow because under the hood you are making 20 x 10 = 200 queries to the database. Correct way would be to load everything in one turn.

You should post some details about the way you load data. If you are using Entity Framework, you should use something called Eager Loading using Include command.

// Load all blogs and related posts
var blogs1 = context.Blogs
                      .Include(b => b.Posts)
                      .ToList();

Some samples can be found here: http://msdn.microsoft.com/en-us/data/jj574232.aspx

EDIT:

It seems that you are not using the tools .NET Framework gives you. These days you don't have to do your own database access for simple scenarious like yours. Also, you should avoid concatenating string HTML like you do.

I would suggest you to redesign your application using existing ASP.NET controls and Entity Framework.

Here is a sample with step by step instructions for you: http://www.codeproject.com/Articles/363040/An-Introduction-to-Entity-Framework-for-Absolute-B

Solution 4

As Remus Rusanu said, you can get the data you want in the format you require by using the PIVOT relational operator, as far as performance of PIVOT is concerned, I've found that it will depend on the indexing of your tables and the variability and size of the data set. I would be greatly interested in hearing more from him about his opinion of PIVOTs as we are all here to learn. There is a great discussion on PIVOT vs JOINS here.

If the DataFields table is a static set then you may not need to worry about generating the SQL dynamically and you can build yourself a stored procedure; if it does vary you may need to take the performance hit of dynamic SQL(here is an excellent article on this) or use a different approach.

Unless you have further need for the data try to keep the returned set to the minimum you need for display it's a good way to reduce overhead as everything will need to go over the network unless your db is on the same physical server as the web server.

Make sure that you perform as few separate data calls as possible will reduce that time you spend raising and dropping connections.

You should always double-check of data calls within a loop when the control for the loop is based on a (probably related?) data set as this screams JOIN.

When you are experimenting with your SQL try to become familiar with execution plans these will help you figure out why you have slow running queries check out these resources for more info.

Whatever you approach you decide you need to figure out where the bottlenecks are in your code, something as basic as stepping through the execution can help with this as it will allow you to see for yourself where problems lie, this will also allow you to identify for yourself possible problems with your approach and build good design choice habits.

Marc Gravel has some interesting points to make about c# data reading here the article is a bit old but worth a read.

PIVOTing your data.(Sorry Remus ;-) ) Bases on the data example you have provided, the following code will get what you need with no in-query recursion:

--Test Data
DECLARE @Users AS TABLE ( Id int
                        , Username VARCHAR(50)
                        , Name VARCHAR(50)
                        , Email VARCHAR(50)
                        , [Role] INT --Avoid reserved words for column names.
                        , Active INT --If this is only ever going to be 0 or 1 it should be a bit.
                        );

DECLARE @DataFields AS TABLE ( Id int
                        , Name VARCHAR(50)
                        , [Type] INT --Avoid reserved words for column names.
                        );

DECLARE @DataFieldsValues AS TABLE ( Id int
                        , UserId int
                        , DataFieldId int
                        , Value VARCHAR(50)
                        );

INSERT INTO @users  ( Id
                    , Username
                    , Name
                    , Email
                    , [Role]
                    , Active) 
VALUES (1,'enb081','enb081','[email protected]',2,1),
       (2,'Mack','Mack','[email protected]',1,1),
       (3,'Bob','Bobby','[email protected]',1,0)


INSERT INTO @DataFields  
                    ( Id
                    , Name
                    , [Type]) 
VALUES (1,'DataField1',3),
       (2,'DataField2',1),
       (3,'DataField3',2),
       (4,'DataField4',0)

INSERT INTO @DataFieldsValues  
                    ( Id
                    , UserId
                    , DataFieldId
                    , Value) 
VALUES (1,1,1,'value11'),
       (2,1,2,'value12'),
       (3,1,3,'value13'),
       (4,1,4,'value14'),
       (5,2,1,'value21'),
       (6,2,2,'value22'),
       (7,2,3,'value23'),
       (8,2,4,'value24')

--Query
SELECT *
FROM
(   SELECT  ut.Username, 
            df.Name as DataFieldName, 
            dfv.Value
    FROM @Users ut
    INNER JOIN @DataFieldsValues dfv 
        ON ut.Id = dfv.UserId
    INNER JOIN @DataFields df 
        ON dfv.DataFieldId = df.Id) src
PIVOT
(   MIN(Value) FOR DataFieldName IN (DataField1, DataField2, DataField3, DataField4)) pvt

--Results
Username    DataField1  DataField2  DataField3  DataField4
enb081      value11     value12     value13     value14
Mack        value21     value22     value23     value24

The most important thing to remember is to try things out for yourself as whatever we suggest might be altered by factors at your site that we aren't aware of.

Solution 5

Make sure that you are not making a connection to the database for each loop.

As I can see, the f.GetValue(u) part is a method that returns a string value that was fetched from the database.

Put the data in an object once and for all and do the same thing as f.GetValue(u) is doing here.

Share:
47,993
enb081
Author by

enb081

//

Updated on December 25, 2020

Comments

  • enb081
    enb081 over 3 years

    I am working on a ASP.NET project with C# and Sql Server 2008.

    I have three tables:

    Users DataFields DataField Values

    Each user has a specific value for each data field, and this value is stored in the DataFieldsValues.

    Now I want to display a report that looks like this:

    enter image description here

    I have created the objects User, and DataField. In the DataField object, there is the Method string GetValue(User user), in which I get the value of a field for a certain user.

    Then I have the list of Users List<User> users and the list of DataFields List<DataField> fields and I do the following:

    string html = string.Empty;
    html += "<table>";
    html += "<tr><th>Username</th>";
    foreach (DataField f in fields)
    {
       html += "<th>" + f.Name + "</th>";
    }
    html += "</tr>"
    
    foreach (User u in users)
    {
       html += "<tr><td>" + u.Username + "</td>"
       foreach (DataField f in fields)
       {
          html += "<td>" + f.GetValue(u) + "</td>";
       }
       html += "</tr>"
    }
    Response.Write(html);
    

    This works fine, but it is extremely slow, and I am talking about 20 users and 10 data fields. Is there any better way in terms of performance to achieve this?

    EDIT: For each parameter inside the classes, I retrieve the value using the following method:

    public static string GetDataFromDB(string query)
    {
        string return_value = string.Empty;
        SqlConnection sql_conn;
        sql_conn = new SqlConnection(ConfigurationManager.ConnectionStrings["XXXX"].ToString());
        sql_conn.Open();
        SqlCommand com = new SqlCommand(query, sql_conn);
        //if (com.ExecuteScalar() != null)
        try
        {
            return_value = com.ExecuteScalar().ToString();
        }
        catch (Exception x)
        {
        }
        sql_conn.Close();
        return return_value;
    } 
    

    For instance:

    public User(int _Id)
    {
    this.Id = _Id
    this.Username = DBAccess.GetDataFromDB("select Username from Users where Id=" + this.Id)
     //...
    }
    
  • enb081
    enb081 over 10 years
    I don't want to use the existing ASP.NET controls because I want to be flexible with the html and shape it the way I need It.
  • enb081
    enb081 over 10 years
    Also, please focus on my case. I have not one simple query to put it in a SQLDataSource control.
  • Kaspars Ozols
    Kaspars Ozols over 10 years
    I am saying that you would not have this problem at all if you were using .NET the way it is meant to be used. Best way to optimize performance in your case is to reduce DB calls. One way is to start using Entity Framework which would simplify data access a lot.
  • evilom
    evilom over 10 years
    @enb081 for example you have 20 loops that uses the f.GetValue(u)...each loop will connect to the database to fetch some data. The steps done by your app to fetch data from the database will take a lot more time rather than getting the data from a locally stored list of objects.
  • Kaspars Ozols
    Kaspars Ozols over 10 years
    Your statement about Entity Framework is wrong. You don't need to read everything in memory just to access one column values. EF is smart enough to load just bare minimum of data, just use correct LINQ statement which limits returned data: var relatedPostIds = context.Blogs.SelectMany(b => b.Posts).Select(p => p.ID).ToList();
  • Valentin Petkov
    Valentin Petkov over 10 years
    I simulate requests with EF over my company DB five years ago with my team we try our biggest table was ~970M rows 17s to extract 10 rows old school 0s I have 10 years experience in analytic company. A year ago come Senior .NET developer who was so ambitious to teach as the NEW EF and He will rebuild all the code with the NEW EF ask him but when they waste 3 programmers time for 4 months they did answer why they FAIL....They couldn't run even one of the queries which I do....well all my live I use EF approach. but for small thinks
  • Valentin Petkov
    Valentin Petkov over 10 years
    The only what Microsoft care is how to teach people to use EF and consume more Compute instances in Azure. And when you analyze Azure price structure you will see which part is most expensive. Then you start design your applications to consume less resources if you are paying. Eh if someone else paying EF is easy WIN!!!!
  • Kaspars Ozols
    Kaspars Ozols over 10 years
    Last two comments where more like a rant than anything argumented and useful. I don't say EF is magic bullet, but it makes software developement a lot easier in most scenarious. You could run into performance problems if you work with really huge DB but for most cases EF works just fine.
  • DRapp
    DRapp over 10 years
    Down-vote, because although it will run all 3 at one call, it does NOT take advantage of the query engine preparing the join of data up front between user and components, thus making the user do extra work once the data IS retrieved.
  • Valentin Petkov
    Valentin Petkov over 10 years
    do you have comment for this? stackoverflow.com/questions/20479792/…
  • Remus Rusanu
    Remus Rusanu over 10 years
    +1 from me. If OP does end up using PIVOT, despite the dynamic SQL required and variable (='unknown at compile time') shape of result, then this is the way :)
  • Keith
    Keith over 10 years
    EF has changed a lot over the last 5 years. While it IS slower than old school, with the right settings, and writing your queries correctly, its not noticeably so. The biggest performance issue with EF is with bulk insert/update operations. Otherwise you are looking at a difference of less than 1s. Obviously incredibly complex queries are still best handled via SP/View, which can STILL be executed from EF.
  • Valentin Petkov
    Valentin Petkov over 10 years
    Help out the guy with EF preformance the link before you comment
  • Jon Hanna
    Jon Hanna over 10 years
    Unless you're going to make use of the contents of blogs1 more than once, that call to .ToList() is just a waste of time and memory.
  • Kaspars Ozols
    Kaspars Ozols over 10 years
    Well, it depends on what you do with the result set. The post was about performance problems because of multiple DB queries being executed, so in my sample I included call to .ToList() just to ensure that all the data is read in one turn using single query.
  • Chris Leyva
    Chris Leyva over 10 years
    I agree completely with DRapp. In addition, you have now combined the logic into one query, making it harder to understand and maintain. Executing these 3 queries on the database does not cause the performance issue. You can create 3 query strings (one for Users, one for DataFields, and one for DataFieldsValues) and execute those queries separately within the same SqlConnection.
  • Chris Leyva
    Chris Leyva over 10 years
    @enb081 This improves performance because creating new SqlConnections adds more time and memory to your app. Especially if there is some network delay, that will make things worse. If you take a look at my answer, I put the result of the query in a Dictionary which will let you look up the Usernames very quickly. Hope it helps.
  • eglasius
    eglasius over 10 years
    I agree with your answer, but honestly, I completely disagree on the comment about not having this problem at all if you are using Entity Framework. You can make exactly the same mistake when using Entity Framework, specially when you don't know what you are doing.
  • enb081
    enb081 over 10 years
    I opened only one SqlConnection with database and saved in session. This dramatically improved performance. StringBuilder also improved it. This answer has both of the suggestions although they are all mentioned in the other answers too. Thank you very much for all your valuable advice!
  • Aaron Carlson
    Aaron Carlson over 10 years
    @enb081 - Putting an open database connection in Session is a bad idea. The SQL Server database driver in .net has built in connection pooling. By storing open connections in session your reimplementing this feature in a much less efficient and effective way. I think what the other posters were trying to say is that you need to reduce the number of queries you perform against the database and not the number of open connections.
  • enb081
    enb081 over 10 years
    @AaronCarlson Why is putting an open database connection in Session a bad idea?
  • Aaron Carlson
    Aaron Carlson over 10 years
    @enb081 - First, session is per user. Putting an open connection in session per user means that those connections will never be returned to the connection pool. Database connections are not serializable and thus you will only be able to support inproc session state which will limit your applications ability to run in a server farm. I would first recomend you read about connection pooling.
  • Chris Leyva
    Chris Leyva over 10 years
    For the record, my answer never suggested saving the open database connection. I suggested that you just call the database once to get all the data you would need and to then store that in a dictionary that you could use to look up your usernames and such. As opposed to opening a database connection and querying it every time you needed the username.
  • Aaron Carlson
    Aaron Carlson over 10 years
    @ChrisLava - It would be helpful if you updated your answer to not include opening the connection as part of the performance problem. I believe the way you are lumping opening the database connection and querying together as the performance problem is a reason why the OP went down the route of storing an open connection in session state.
  • Chris Leyva
    Chris Leyva over 10 years
    @AaronCarlson, Opening the connection many times is part of the problem though. Which is why the OP saw improvements when they stored the one connection and kept using that. I think the 2nd paragraph in "Suggestion 1" is very clear and does not imply storing an open connection.
  • Aaron Carlson
    Aaron Carlson over 10 years
    @ChrisLava - Reading this may help you understand where I am comming from. It was clear to me that you were not suggesting storing an open connection. However, I can see how the OP came to that conclusion based on how you worded your answer. I am only trying to clear up a common misconception about the performance overhead of opening a Sql Database Connection when connection pooling is enabled. I feel that it would be a good improvement to your correct answer.