Reading Excel files from C#

Solution 1

// Requires: using System.Data; using System.Data.OleDb; using System.IO;
var fileName = Path.Combine(Directory.GetCurrentDirectory(), "fileNameHere");
var connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=Excel 8.0;", fileName);

// The worksheet is addressed as [SheetName$] in the query.
var adapter = new OleDbDataAdapter("SELECT * FROM [workSheetNameHere$]", connectionString);
var ds = new DataSet();

// Fill opens and closes the connection itself.
adapter.Fill(ds, "anyNameHere");

DataTable data = ds.Tables["anyNameHere"];

This is what I usually use. It is a little different because I usually stick an AsEnumerable() at the end of the tables:

var data = ds.Tables["anyNameHere"].AsEnumerable();

as this lets me use LINQ to search and build structs from the fields.

var query = data.Where(x => !string.IsNullOrEmpty(x.Field<string>("Phone Number"))).Select(x =>
                new MyContact
                    {
                        firstName = x.Field<string>("First Name"),
                        lastName = x.Field<string>("Last Name"),
                        phoneNumber = x.Field<string>("Phone Number"),
                    });
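
Field<string> only succeeds on columns that the driver typed as text; on a column typed as double it throws an InvalidCastException. A minimal sketch of a type-agnostic way to read a cell as text follows. The CellReader class, the "Cost" column and the in-memory table standing in for a filled worksheet are all hypothetical, chosen only for illustration.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class CellReader
{
    // Hypothetical helper: returns the cell as text whatever type the
    // driver inferred for the column, and null for empty cells (DBNull).
    public static string AsText(DataRow row, string column)
    {
        object value = row[column];
        return value == DBNull.Value
            ? null
            : Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    // In-memory stand-in for a worksheet whose "Cost" column was typed as double.
    public static DataTable SampleTable()
    {
        var table = new DataTable("anyNameHere");
        table.Columns.Add("Cost", typeof(double));
        table.Rows.Add(12.5);          // a numeric cell
        table.Rows.Add(DBNull.Value);  // an empty cell
        return table;
    }
}
```

With this table, CellReader.AsText(table.Rows[0], "Cost") gives "12.5" and the empty row gives null. When the numeric value itself is wanted, x.Field<double?>("Cost") is the typed alternative.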

Solution 2

If the Excel file contains only simple data, you can read it via ADO.NET. See the connection strings listed here:

http://www.connectionstrings.com/?carrier=excel2007 or http://www.connectionstrings.com/?carrier=excel

-Ryan

Update: you can then read the worksheet with a query such as SELECT * FROM [Sheet1$]
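
The two links correspond to two different providers. As a rough sketch, assuming the Jet 4.0 provider for .xls files and the ACE 12.0 provider for .xlsx files, the choice could be made from the file extension. ExcelConnection.For is a hypothetical helper name, not part of any library.

```csharp
using System;
using System.IO;

public static class ExcelConnection
{
    // Hypothetical helper: chooses the OLE DB provider from the file extension.
    public static string For(string fileName)
    {
        string ext = Path.GetExtension(fileName).ToLowerInvariant();
        switch (ext)
        {
            case ".xls":   // Excel 97-2003 workbook, read through Jet
                return string.Format(
                    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=Excel 8.0;",
                    fileName);
            case ".xlsx":  // Excel 2007+ workbook, read through ACE
                return string.Format(
                    "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=Excel 12.0 Xml;",
                    fileName);
            default:
                throw new NotSupportedException("Unrecognised Excel extension: " + ext);
        }
    }
}
```

The returned string can be passed to OleDbConnection or OleDbDataAdapter. Whether a given provider is actually installed on the target machine still has to be checked separately.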

Solution 3

The ADO.NET approach is quick and easy, but it has a few quirks which you should be aware of, especially regarding how DataTypes are handled.

This excellent article will help you avoid some common pitfalls: http://blog.lab49.com/archives/196
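
One commonly cited mitigation for the data type quirk is the IMEX=1 setting in Extended Properties, which asks the driver to read columns with mixed content as text. HDR states whether the first row holds column names. The sketch below only builds such a connection string; ExcelProperties.Build is a hypothetical helper, and the actual behaviour still depends on driver and registry settings, so treat it as a starting point rather than a guaranteed fix.

```csharp
using System;

public static class ExcelProperties
{
    // Hypothetical helper: builds a connection string whose Extended Properties
    // carry HDR (first row holds column names) and IMEX=1 (import mode).
    public static string Build(string fileName, bool firstRowIsHeader)
    {
        string extended = string.Format(
            "Excel 8.0;HDR={0};IMEX=1",
            firstRowIsHeader ? "Yes" : "No");

        // The value contains semicolons, so it has to be wrapped in quotes.
        return string.Format(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"{1}\";",
            fileName, extended);
    }
}
```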

Solution 4

This is what I used for Excel 2003:

Dictionary<string, string> props = new Dictionary<string, string>();
props["Provider"] = "Microsoft.Jet.OLEDB.4.0";
props["Data Source"] = repFile;
props["Extended Properties"] = "Excel 8.0";

StringBuilder sb = new StringBuilder();
foreach (KeyValuePair<string, string> prop in props)
{
    sb.Append(prop.Key);
    sb.Append('=');
    sb.Append(prop.Value);
    sb.Append(';');
}
string properties = sb.ToString();

// columnNames, worksheet and tableName are supplied by the caller.
// The DataSet is declared outside the using block so it is still available afterwards.
DataSet ds = new DataSet();
using (OleDbConnection conn = new OleDbConnection(properties))
{
    conn.Open();
    string columns = String.Join(",", columnNames.ToArray());
    using (OleDbDataAdapter da = new OleDbDataAdapter(
        "SELECT " + columns + " FROM [" + worksheet + "$]", conn))
    {
        DataTable dt = new DataTable(tableName);
        da.Fill(dt);
        ds.Tables.Add(dt);
    }
}

Solution 5

How about Excel Data Reader?

http://exceldatareader.codeplex.com/

I've used it in anger, in a production environment, to pull large amounts of data from a variety of Excel files into SQL Server Compact. It works very well and it's rather robust.

Author: dbkk

Updated on July 08, 2022

Comments

  • dbkk
    dbkk almost 2 years

    Is there a free or open source library to read Excel files (.xls) directly from a C# program?

    It does not need to be too fancy, just to select a worksheet and read the data as strings. So far, I've been using Export to Unicode text function of Excel, and parsing the resulting (tab-delimited) file, but I'd like to eliminate the manual step.

  • StingyJack
    StingyJack over 15 years
    This way is by far the fastest.
  • Adam Ralph
    Adam Ralph over 15 years
    Yes, but that would involve creating an Excel.Application instance, loading the xls file, etc. If the requirement is purely to read some data from the file then it's much easier and far more lightweight to use one of the ADO.NET methods described in the other answers.
  • hitec
    hitec almost 15 years
    Couldn't agree more Cherian. This code is many years old... before I even was proficient with Resharper :)
  • Admin
    Admin over 14 years
    Of course that's not true, Stingy. You have to sift through all the data and write crappy DB code (hand craft your models, map columns to properties, yadda yadda). The quickest way is to let some other poor SOB do this for you. That's why people use frameworks instead of writing everything from the bottom up.
  • Admin
    Admin over 14 years
    I won't down you, but I recently started using FileHelpers and was shocked at how ... crappy it is. For instance, the only way to map columns in a csv to properties... excuse me, FIELDS, of a model is to create the fields in the order of the columns. I don't know about you, but I wouldn't rely on a quirk of the compiler for one of the most central design considerations of my f8king framework.
  • Mark Anthony Ogsimer
    Mark Anthony Ogsimer about 14 years
    Besides that, I have had times where it didn't give me the right results due to localization problems... the never-ending fight of separators
  • Triynko
    Triynko almost 14 years
    Worthless method! Truncates text columns to 255 characters when read. Beware! See: stackoverflow.com/questions/1519288/… ACE engine does same thing!
  • melih
    melih almost 14 years
    Triynko, it has been a super long time since I used this method, but IIRC you can get around the 255 char limit by defining an ODBC DSN for the spreadsheet and then define the columns as longer in length and then use the DSN to connect to the spreadsheet. It's a pain to do that, but I believe that gets around that.
  • kenny
    kenny almost 14 years
    I would suspect you can protect it from Excel, but not from man with compiler...like anything...it's just bytes.
  • Kevin Le - Khnle
    Kevin Le - Khnle almost 14 years
    It seems like the Select in this approach tries to guess the data type of the column and forces that guessed data type on you. For example, if you have a column with mostly double values, it won't like you passing x.Field<string>, but expects x.Field<double>. Is this true?
  • Kevin Le - Khnle
    Kevin Le - Khnle almost 14 years
    You answered my question (in the form of a comment above).
  • Robin Robinson
    Robin Robinson almost 14 years
    Just looked it up on MSDN. Looks like the <T> is just used to attempt to cast the contents of the column to a type. In this example it is just casting the data in the columns to strings. If you wanted a double you would need to call double.Parse(x.Field<string>("Cost")) or something like that. Field is an extension method for DataRow and it looks like there aren't any non-generic versions.
  • Sam
    Sam almost 14 years
    The code is ugly, but it shows how to get the sheet names, great!
  • David Keaveny
    David Keaveny over 13 years
    I'll second Excel Data Reader; it has also led to the incredibly useful Excel Data Driven Tests library, which uses NUnit 2.5's TestCaseSource attribute to make data-driven tests using Excel spreadsheets ridiculously easy. Just beware that Resharper doesn't yet support TestCaseSource, so you have to use the NUnit runner.
  • shen
    shen over 13 years
    Does adding a double.Parse to the Linq query slow it down much?
  • shen
    shen over 13 years
    Too slow, using Office PIA as the baseline, everything else is faster - even just using an Object array passed from .Value2 property. Which is still using the PIA.
  • shen
    shen over 13 years
    Hard to justify when there are so many simple and effective ways (for free) of reading from and writing to Excel.
  • shen
    shen over 13 years
    @gsvirdi, post a separate question on Excel file security, this question is on performance.
  • xanadont
    xanadont over 13 years
    @Anonymous-type I did read the question and was offering a helpful alternative to a desired OSS implementation ... because, well, I was pretty sure there was nothing available. And, judging by the accepted answer, a requirement of having Office installed is not an issue.
  • Robin Robinson
    Robin Robinson over 13 years
    Not that I have noticed. I haven't done any real performance on this. For our uses, it isn't being done a lot.
  • zihotki
    zihotki over 13 years
    Be aware that using ADO.NET to read data from Excel requires Microsoft Access or the Microsoft Access Database Engine Redistributable to be installed.
  • mena talla mostafa
    mena talla mostafa about 13 years
    There is already a mention of ExcelDataReader here: stackoverflow.com/questions/15828/reading-excel-files-from-c/… Why do you think we need another answer? You should comment with the link rather than create long-thread garbage.
  • Brian Low
    Brian Low about 13 years
    The driver will also guess at the column types based on the first several rows. If you have a column with what looks like integers in the first rows, you will encounter an error when you hit a non-integer (e.g. a float or a string).
  • Jeremy Holovacs
    Jeremy Holovacs almost 13 years
    worksheet isn't defined... seems a bit odd to me after clearly defining everything else.
  • aquinas
    aquinas over 12 years
    This also will not work at ALL if you are running in a 64 bit process. forums.asp.net/p/1128266/1781961.aspx
  • Andreas Grech
    Andreas Grech about 12 years
    Note that if you're reading xlsx, you need to use this connection string instead: string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0}; Extended Properties=Excel 12.0;", fileName)
  • martinstoeckli
    martinstoeckli about 12 years
    It was helpful, especially the part about reading the sheetnames.
  • Duncan
    Duncan almost 12 years
    Sadly the Jet.OLEDB driver is not 64-bit compatible; you will need to switch to target x86 rather than Any CPU (if you still want to go ahead with this method). Alternatively install the 64-bit ACE driver and change the conn string to use this driver (as indicated by Andreas) - microsoft.com/en-us/download/…
  • David Burton
    David Burton over 11 years
    Doesn't look particularly active any more, compared to, say, NPOI
  • Chad
    Chad over 11 years
    FYI: I tried it and it didn't meet my need to be able to read an encrypted file.
  • Ian1971
    Ian1971 over 11 years
    Unfortunately, there are some issues with this library that we've just encountered. Firstly we've had some currency fields coming out as dates. Secondly it is crashing if the workbook has any empty sheets in it. So, although it was very easy to integrate we are now re-evaluating whether to keep using this library. It does not seem to be being actively developed.
  • Engr.MTH
    Engr.MTH over 11 years
    Cannot install the 64 bit ACE driver if the target machine has a 32 bit version of office installed.
  • Drewmate
    Drewmate over 11 years
    This is a really great little library. It just converts everything into Lists of Lists of strings, which is just fine for the kind of work I needed it for.
  • RichieHindle
    RichieHindle over 11 years
    It also assumes the presence of some optional elements in xlsx file that cause it to fail to read the data if they're absent.
  • RegisteredUser
    RegisteredUser over 11 years
    If this helps anyone, the Jet driver works fine in Win7 64bit... as long as I actually have the document open in Excel.
  • kingfleur
    kingfleur over 11 years
    We're having problems with Excel files coming from SQL Server Reporting Services. They just don't work, unless you open them and save them (even unedited). @RichieHindle: what optional elements are you talking about (hoping this might help me with my SSRS Excel files)?
  • RichieHindle
    RichieHindle over 11 years
    @Peter: I think it was a missing <dimension> element in the <worksheet> that was causing trouble for me.
  • Ian1971
    Ian1971 over 11 years
    As an update to my comment above. We did keep going with this library, and in fact I and another guy have become developers on the project and it is now actively being worked on again. The issues I mentioned have now been fixed, as has open office support and hopefully SSRS (need someone to test it).