Using ExcelDataReader to read Excel data starting from a particular cell

153,292

Solution 1

If you are using ExcelDataReader 3+ you will find that there isn't any method for AsDataSet() for your reader object, You need to also install another package for ExcelDataReader.DataSet, then you can use the AsDataSet() method.
Also there is not a property for IsFirstRowAsColumnNames instead you need to set it inside of ExcelDataSetConfiguration.

Example:

using (var stream = File.Open(originalFileName, FileMode.Open, FileAccess.Read))
{
    IExcelDataReader reader;

    // Create Reader - old until 3.4+
    ////var file = new FileInfo(originalFileName);
    ////if (file.Extension.Equals(".xls"))
    ////    reader = ExcelDataReader.ExcelReaderFactory.CreateBinaryReader(stream);
    ////else if (file.Extension.Equals(".xlsx"))
    ////    reader = ExcelDataReader.ExcelReaderFactory.CreateOpenXmlReader(stream);
    ////else
    ////    throw new Exception("Invalid FileName");
    // Or in 3.4+ you can only call this:
    reader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream)

    //// reader.IsFirstRowAsColumnNames
    var conf = new ExcelDataSetConfiguration
    {
        ConfigureDataTable = _ => new ExcelDataTableConfiguration
        {
            UseHeaderRow = true 
        }
    };

    var dataSet = reader.AsDataSet(conf);

    // Now you can get data from each sheet by its index or its "name"
    var dataTable = dataSet.Tables[0];

    //...
}

You can find row number and column number of a cell reference like this:

var cellStr = "AB2"; // var cellStr = "A1";
var match = Regex.Match(cellStr, @"(?<col>[A-Z]+)(?<row>\d+)");
var colStr = match.Groups["col"].ToString();
var col = colStr.Select((t, i) => (colStr[i] - 64) * Math.Pow(26, colStr.Length - i - 1)).Sum();
var row = int.Parse(match.Groups["row"].ToString());

Now you can use some loops to read data from that cell like this:

for (var i = row; i < dataTable.Rows.Count; i++)
{
    for (var j = col; j < dataTable.Columns.Count; j++)
    {
        var data = dataTable.Rows[i][j];
    }
}

Update:

You can filter rows and columns of your Excel sheet at read time with this config:

var i = 0;
var conf = new ExcelDataSetConfiguration
{
    UseColumnDataType = true,
    ConfigureDataTable = _ => new ExcelDataTableConfiguration
    {
        FilterRow = rowReader => fromRow <= ++i - 1,
        FilterColumn = (rowReader, colIndex) => fromCol <= colIndex,
        UseHeaderRow = true
    }
};

Solution 2

To be more clear, I will begin at the beginning.

I will rely on the sample code found in https://github.com/ExcelDataReader/ExcelDataReader, but with some modifications to avoid inconveniences.

The following code detects the file format, either xls or xlsx.

FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader;

//1. Reading Excel file
if (Path.GetExtension(filePath).ToUpper() == ".XLS")
{
    //1.1 Reading from a binary Excel file ('97-2003 format; *.xls)
    excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
}
else
{
    //1.2 Reading from a OpenXml Excel file (2007 format; *.xlsx)
    excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
}

//2. DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();

//3. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = false;

Now we can access the file contents in a more convenient way. I use DataTable for this. The following is an example to access a specific cell, and print its value in the console:

DataTable dt = result.Tables[0];
Console.WriteLine(dt.Rows[rowPosition][columnPosition]);

If you do not want to do a DataTable, you can do the same as follows:

Console.WriteLine(result.Tables[0].Rows[rowPosition][columnPosition]);

It is important not try to read beyond the limits of the table, for this you can see the number of rows and columns as follows:

Console.WriteLine(result.Tables[0].Rows.Count);
Console.WriteLine(result.Tables[0].Columns.Count);

Finally, when you're done, you should close the reader and free resources:

//5. Free resources (IExcelDataReader is IDisposable)
excelReader.Close();

I hope you find it useful.

(I understand that the question is old, but I make this contribution to enhance the knowledge base, because there is little material about particular implementations of this library).

Solution 3

For ExcelDataReader v3.6.0 and above. I struggled a bit to iterate over the Rows. So here's a little more to the above code. Hope it helps for few atleast.

using (var stream = System.IO.File.Open(copyPath, FileMode.Open, FileAccess.Read))
                    {

                        IExcelDataReader excelDataReader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream);

                        var conf = new ExcelDataSetConfiguration()
                        {
                            ConfigureDataTable = a => new ExcelDataTableConfiguration
                            {
                                UseHeaderRow = true
                            }
                        };

                        DataSet dataSet = excelDataReader.AsDataSet(conf);
                        //DataTable dataTable = dataSet.Tables["Sheet1"];
                        DataRowCollection row = dataSet.Tables["Sheet1"].Rows;
                        //DataColumnCollection col = dataSet.Tables["Sheet1"].Columns;

                        List<object> rowDataList = null;
                        List<object> allRowsList = new List<object>();
                        foreach (DataRow item in row)
                        {
                            rowDataList = item.ItemArray.ToList(); //list of each rows
                            allRowsList.Add(rowDataList); //adding the above list of each row to another list
                        }

                    }

Solution 4

One way to do it :

FileStream stream = File.Open(@"c:\working\test.xls", FileMode.Open, FileAccess.Read);

IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);

excelReader.IsFirstRowAsColumnNames = true;

DataSet result = excelReader.AsDataSet();

The result.Tables contains the sheets and the result.tables[0].Rows contains the cell rows.

Solution 5

I found this useful to read from a specific column and row:

FileStream stream = File.Open(@"C:\Users\Desktop\ExcelDataReader.xlsx", FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
DataSet result = excelReader.AsDataSet();
excelReader.IsFirstRowAsColumnNames = true;         
DataTable dt = result.Tables[0];
string text = dt.Rows[1][0].ToString();
Share:
153,292
ChhavishJ
Author by

ChhavishJ

Updated on July 09, 2022

Comments

  • ChhavishJ
    ChhavishJ almost 2 years

    I am using ExcelDataReader to read data from my Excel workbook in C#.
    But structure of my Excel sheet is such that data to be read can start from any particular cell and not necessarily A1.

    Can any one Please suggest a way on how this can be achieved using ExcelDataReader?

  • Mr. Blond
    Mr. Blond almost 8 years
    How exactly would you read an excel starting from a specific cell? Could you please explain little more in detail @Sievajet?
  • Khalil Khalaf
    Khalil Khalaf almost 8 years
    But this keeps throwing an error saying it is not supported.. and I tried this social.msdn.microsoft.com/Forums/en-US/…
  • cesAR
    cesAR over 7 years
    The question is focused on ExelDataReaer, which is a library used to avoid using OLEDB. ExelDataReaer works reading them as binary files.
  • LuisEduardox
    LuisEduardox almost 6 years
    Note that in new version (v3.4) is not necessary validate file extension, because of ExcelDataReader validate for us. In case file Extension is invalid, it will generate a exception. Reference
  • shA.t
    shA.t almost 6 years
    @LuisEduardox tnx, I update my answer to show your comment ;).
  • Alex Gordon
    Alex Gordon over 5 years
    thanks so much for this great function. i have a date in XLSX format 10/08/2018 for some reason when i convert this to CSV format, it's inserting a time as well 10/08/2018 12:00:00AM i've attempted attacking this by setting UseColumnDataType = false however, that yielded the same result
  • Alex Gordon
    Alex Gordon over 5 years
    is it possible to do this on HttpRequest body rather than a filestream? i have a payload that's coming in via the body of a POST request
  • Alex Gordon
    Alex Gordon over 5 years
    Would love an example
  • cesAR
    cesAR over 5 years
    @l--''''''---------'''''''''''' It's been a long time since I used the library, but I read the documentation now, and apparently it only supports FileStream. One idea that occurs to me, is to use GetResponseStream() and store your payload in a file, then perform the classic procedure.
  • shA.t
    shA.t over 5 years
    @l--''''''---------'''''''''''' I found your question and add an example as an answer, HTH ;).
  • Grant Shannon
    Grant Shannon almost 5 years
    In the above code - just in case you were wondering - the Sheets in a Workbook have index starting at zero - so: var dataTable = dataSet.Tables[0] references Sheet 1 in a workbook and var dataTable = dataSet.Tables[1] references Sheet2 in a Workbook etc etc ...
  • shA.t
    shA.t over 4 years
    @GrantShannon tnx, and I add a comment to code to notify that -HTH ;).
  • Michael Welch
    Michael Welch over 3 years
    I don't know if this is correct or not, but since this is a new answer to a pretty old question with other highly voted answers, it would be useful to summarize what is different about your approach and why you wanted to share it.
  • toha
    toha over 3 years
    what is the underscore named? _ =>
  • shA.t
    shA.t over 3 years
    @toha in C# when we need to pass a variable to an arrow function (lamda) and we know that we will never use it in that function, we usually use underscore as a variable name [An old naming convention] -HTH ;)
  • toha
    toha over 3 years
    @shA.t Correct. It is variable name which is will be send to arrow function.. Writer does not care what is name of variable. so he put underscore instead of x or anything else name of it. stackoverflow.com/questions/18300654/…