Saturday, May 10, 2008

How To Increase Performance of Loading Large Data from Database (Part-1)

Issue: While loading Large amount of Data from Database:
  • Connection Timeout occurs
  • Application gets Hang/Stuck
  • Application takes too much time to load data

Level: Advanced

Knowledge Required:
SQL Server 2005
Stored Procedures

First of all this is NOT a good practice to load ALL THE DATA from a Table into memory. A Human being cannot process/analyse all this huge data in a glance, instead we usually interested in a short information. For example we have a Table that contains all the Contact Numbers living in the City along with there Addresses and Names. Now we usually want to extract out one particular number or one particular name, NOT all the names and numbers.

But still we (developers) are forced to create such applications which give the user to access all the data by SCROLLING up or down.

So in this Article I will show you how to increase this performance by the following ways:

  1. Implement Paging in SQL Server 2005 (Database Level)
  2. Implement Paging using DataAdapter (Application Level)
  3. Use the Virtual Mode of DataGridView Control
In this part I will discuss the 1st one.
1) Paging in SQL Server 2005
Paging means we divide our huge data into number of small chunks. We will display 1 page at a time and will provide next previous buttons so user can navigate forward and backword. I think this is the fastest way to load data. We will send the Page Number and Number of Records Per Page to Stored Procedure, which will return only that part of data. For example: I have created a Table tbl_LargeData in which there are 3 Fields:

  • Row_ID [primary key, int, Identity Column]
  • SomeData [varchar(255)]
  • InsertDateTime [DateTime, Default = GetDate()]
In this table I have put some fake data with Total 1,000,000 Rows.

Now this is a bit Large Data (NOT that much Large). Now to test, I created a project in VB and simply loaded the Data (in a DataTable) by executing the Query

SELECT * FROM tbl_LargeData

I have 1 GB RAM and AMD Athlon 64 Processor 3500+.

Unbound DataTable: It took 28 Seconds to fill the DataTable.
Bound DataTable: I bind that DataTable to BindingSource, it took 51 seconds, almost double.

BindingSource also increases time to Fill a DataTable, becuase BindingSource itself keeps another Cache of Data for Sorting and Filtering Purpose.

The Loading of Data can significantly increase Time, if:

  • System has low RAM and Processing Speed
  • Other Applications are also running on Client PC
  • Database Server is NOT on same machine, it is somewhere on the LAN
  • Client PC is connected to server using low Band Width
Therefore to Implement Paging I have created 2 Stored Procedures in Database:

  1. GetLargeDataPageInfo
  2. GetLargeDataWithPaging
The GetLargeDataPageInfo stored procedure has a Parameter @PageSize int. User has to provide the Page Size (i.e. Number of Records Per Page) then this stored procedure will return

Total RecordsTotal Pages

GetLargeDataWithPaging stored procedure is the main stored procedure which Returns the Particular Page of Data. The script is:

CREATE PROCEDURE [dbo].[GetLargeDataWithPaging]
@PageNumber int,
@PageSize int

-- For Paging we have used the ROW_NUMBER() function
-- which operates on Ordering of Column

DECLARE @RowStart int;
DECLARE @RowEnd int;

-- Calculate the first row's Index
-- and Last Row's Index
SET @RowStart = ((@PageNumber - 1) * @PageSize) + 1;
SET @RowEnd = @RowStart + @PageSize - 1;

SELECT Row_ID, SomeData, InsertDateTime
) AS Row_Num,
FROM tbl_LargeData
) AS DerivedTable
WHERE Row_Num Between @RowStart AND @RowEnd;

Source Code:
Note: Execute the ScriptToInsertBulkData.sql file (ziped inside the TemporaryDB.rar) to add the Fake Data

Update (12-May-2008): Point #2 changed to "Implement Paging using DataAdapter" which was "Implement Paging using DataReader"

No comments: