Why SQL query takes too long time to execute when fetching bulk data (without space)?

77,527

Solution 1

This sounds like it might be a caching issue. In brief:

  • Data is stored on the hard drive
  • When a query comes in, SQL reads the data from the hard drive (disk) into memory, and then passes it back from memory to the user who requested it
  • Reading data from disk is costly in terms of time
  • To increase performance, any data read from disk to memory is left in memory “for a while”
  • Done this way, subsequent queries that access the same data will find it in memory, and not have to read from disk again
  • In SQL Server, this part of memory is referred to as the “buffer cache”
  • There are extensive articles and discussions on how all this works both in Books Online (SQL Server documentation) and elsewhere.

So, my theory is:

  • When you ran Select * from TblJobs, SQL loaded the relevant data from disk into memory
  • When you updated the data, it first got updated in memory and then written back to disk… with the changed data remaining in memory
  • When you ran Select * from TblJobs again, it read the data straight from memory.
  • That first read was still extremely long. quite possibly, as @Insac said, the table is fragmented on the hard drive and required “extra” time to read.

To test this, use the command DBCC DropCleanBuffers. This will clear out the buffer cache, requiring all subsequent queries to read from disk. So:

  • Run DBCC DropCleanBuffers to clear the buffers
  • Run Select * from TblJobs to read the data straight from disk. Time how long it takes.
  • Run Select * from TblJobs again, several times, timing each. These will be read from memory
  • Run DBCC DropCleanBuffers to clear the buffers again
  • Run Select * from TblJobs to read the data from disk again.
  • Run Select * from TblJobs again, several times, timing each.

Much depends on how much data is being read and how much memory you have on the machine--but hey, memory gets pretty big these days, and I suspect this will not be an issue.

You can mix in runs of Select JobId, JobTitle, JobDescription from TblJobs. That returns the exact same set of data, and it should make no difference to your execution times.

Solution 2

Well selecting * from a table is always going to be fairly slow as you are asking the DBMS to scan the entire table for every row and return the results. If you want specific results and to optimize, I'd recommend you modify the queries to select the specific columns you want. Eg.

select * from TblJobs.JobTitle, TblJobs.JobDescription

As for the space between your text in your job description resulting in a faster retrieval time, this is more of a guess but could it be due to the indexing nature specified by you or your DBMS when you set up the table? The extremely long continuous string might take longer to read or determine where to proceed to retrieve your data. That or the DBMS has caching mechanisms in place that may have made the second search considerably quicker after the first execution (See here)

Hope this helps.

Share:
77,527
Ishan Jain
Author by

Ishan Jain

I'm a web developer working on HTML, CSS, Jquery/Javascript, Asp.net (C#). LinkedIn StackOverflow Career

Updated on September 01, 2020

Comments

  • Ishan Jain
    Ishan Jain over 3 years

    I have a problem, when getting bulk amount of data from a table.

    I have a database table TblJobs, in this table some columns contain bulk amount of data (approx 60,000 characters in this column).

    My table:

    TblJobs

    JobId   JobTitle     JobDescription 
    ----------------------------------------------------------------
     1       Job1         TextTextTextTextTextTextTextTextTextTextTextText... (approx 40,000 characters without any space in job description)  
     2       Job2         HelloHelloHelloHelloHelloHelloHelloHelloHelloHell..(approx 60,000 characters without any space  in job description)  
     3       Job3         DemoDemoDemoDemoDemoDemoDemoDemoDemoDemoDemoDemo...(approx 60,000 characters without any space  in job description)  
     4       Job4         TestingTestingTestingTestingTestingTestingTesti....(approx 50,000 characters without any space  in job description)  
    

    Structure of table is:

    JobId          -  Int
    JobTitle       -  VarChar(500)
    JobDescription -  VarChar(MAX)
    

    Now my problem is, when I execute query to select all columns from TblJobs it takes too long to execute (approx 30 sec). using this -

    Select * from TblJobs
    

    or

    Select JobId, JobTitle, JobDescription from TblJobs  
    

    I was surprised when modify some data into table's column JobDescription, this query execute in 3-5 sec only.

    In modification - I provide some spaces between data of JobDescription column.

    For example you can see following table, in this I only include some space between jobDescription column (I didn't change datatype or amount of data):

    JobId   JobTitle     JobDescription 
    ------------------------------------------------------------------------     
     1       Job1         Text TextTextText**<space>**TextTextTextText**<space>**TextTextTextText... (approx 40,000 characters with some space in job description)  
     2       Job2         HelloHello**<space>**HelloHelloHelloHello**<space>**HelloHelloHelloHell..(approx 60,000 characters with some space  in job description)  
     3       Job3         DemoDemoDemoDemo**<space>**DemoDemoDemoDemoDemo**<space>**DemoDemoDemo...(approx 60,000 characters with some space  in job description)  
     4       Job4         TestingTesting**<space>**TestingTestingTesting**<space>**TestingTesti....(approx 50,000 characters with some space  in job description)  
    

    So my question is, why select query takes too long time to execute when jobdescription don't have any space? I think, time issue not related to amount of data in my case.

  • Ishan Jain
    Ishan Jain almost 11 years
    select * from TblJobs.JobTitle, TblJobs.JobDescription - syntax is not correct, I already try this - select TblJobs.JobTitle, TblJobs.JobDescription from TblJobs. but it's take almost same time to execute.
  • Admin
    Admin almost 11 years
    Oh wow, my apologies silly mistake on my part. That's what I get for writing it on a phone, It would have only been a small optimization as it would have been one less column but you still scanned all the rows. I'll do a bit more research and get back to you, it's an interesting question.