Can a CSV file have a comment?


Solution 1

No. The CSV "standard" (such as it is) does not dictate how comments should be handled; it's up to the application to establish a convention and stick with it.

Solution 2

In engineering data, it is common to see the # symbol in the first column used to signal a comment.

I use the Ostermiller CSV parsing library for Java to read and process such files. That library lets you set the comment character. After the parse operation you get an array containing only the real data, with the comments stripped out.
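If your parser has no comment-character option, a rough equivalent is to drop comment lines before parsing. Here is a minimal Python sketch of that idea (this is not the Ostermiller API). The function name `read_csv_skipping_comments` and the sample data are made up for illustration, and it assumes comments occupy whole lines that start with `#`.

```python
import csv
import io


def read_csv_skipping_comments(text, comment_char="#"):
    """Parse CSV text, ignoring lines whose first character is comment_char."""
    # Filter the comment lines out before the csv module ever sees them.
    lines = (
        line for line in io.StringIO(text)
        if not line.startswith(comment_char)
    )
    return list(csv.reader(lines))


# Hypothetical sample: two comment lines mixed in with the data.
sample = "# units: years\nname,age\n# mid-file note\nJohn,24\n"
rows = read_csv_skipping_comments(sample)
print(rows)  # [['name', 'age'], ['John', '24']]
```

Because the filter works on physical lines, a quoted field that spans several lines and has a continuation line starting with `#` would be misread, so this only suits files without such fields.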

Solution 3

No, CSV doesn't specify any way of tagging comments - they will just be loaded by programs like Excel as additional cells containing text.

The closest you can get (when the CSV is being imported into a specific application such as Excel) is to define a special way of tagging comments that the application will not display. For Excel, you can "hide" the comment (to a limited degree) by embedding it in a formula. For example, try importing the following CSV file into Excel:

=N("This is a comment and will appear as a simple zero value in excel")
John, Doe, 24

You still end up with a cell in the spreadsheet that displays the number 0, but the comment is hidden.

Alternatively, you can hide the text by simply padding it out with spaces so that it isn't displayed in the visible part of the cell:

                              This is a sort-of hidden comment!,
John, Doe, 24

Note that you need to follow the comment text with a comma so that Excel fills the following cell and thus hides any part of the text that doesn't fit in the cell.

Nasty hacks, which will only work with Excel, but they may suffice to make your output look a little bit tidier after importing.

Solution 4

I think the best way to add comments to a CSV file would be to add a "Comments" field or record right into the data.

Most CSV-parsing applications that I've used implement both field-mapping and record-choosing. So, to comment on the properties of a field, add a record just for field descriptions. To comment on a record, add a field at the end of it (well, all records, really) just for comments.
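As a rough illustration of both ideas, here is a minimal Python sketch. The file layout is an assumption made up for this example: the second record holds the field descriptions, and the last field, named `comments`, holds per-record notes.

```python
import csv
import io

# Hypothetical file: line 1 is the header, line 2 describes each field,
# and the trailing "comments" field carries free-text notes per record.
sample = (
    "name,age,comments\n"
    "given name,age in years,free-text note\n"
    "John,24,verified by phone\n"
    "Jane,31,\n"
)

reader = csv.reader(io.StringIO(sample))
header = next(reader)
# Record-choosing: pull the description record out before reading data.
descriptions = dict(zip(header, next(reader)))

records = []
notes = {}
for fields in reader:
    row = dict(zip(header, fields))
    # Field-mapping: separate the comment field from the real data.
    comment = row.pop("comments")
    if comment:
        notes[row["name"]] = comment
    records.append(row)

print(descriptions["age"])  # age in years
print(records)
print(notes)
```

A consumer that does not know about this convention will still load the file, but it will treat the description record as data, which is exactly the validation problem with numeric fields.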

These are the only two reasons I can think of to comment a CSV file. The only problem I can foresee is programs that refuse to accept the file at all if any single record fails some validation rule. In that case, you'd have trouble writing a string-type field description record for any numeric fields.

I am by no means an expert, though, so feel free to point out any mistakes in my theory.

Solution 5

A Comma Separated File is really just a text file where the lines consist of values separated by commas.

There is no standard which defines the contents of a CSV file, so there is no defined way of indicating a comment. It depends on the program which will be importing the CSV file.

Of course, this is usually Excel. You should ask yourself: how does Excel define a comment? In other words, what would make Excel ignore a line (or part of a line) in the CSV file? I'm not aware of anything that would do this.


Pure.Krome
Author by

Pure.Krome

Just another djork trying to ply his art in this mad mad world. Tech stack I prefer to use: Language: C# / .NET Core / ASP.NET Core. Editors: Visual Studio / VS Code. Persistence: RavenDB, SqlServer (MSSql or Postgres). Source control: Github. Containers: Docker & trying to learn K8s. Cloud Platform: Azure. Caching/CDN: Cloudflare. Finally: a Tauntaun sleeping bag is what I've always wanted. spaces > tabs

Updated on August 21, 2021

Comments

  • Pure.Krome
    Pure.Krome almost 2 years

    Is there any official way to allow a CSV formatted file to allow comments, either on its own line OR at the end of a line?

    I tried checking Wikipedia on this and also RFC 4180, but neither mentions anything, which leads me to believe that it's not part of the file format. So it's bad luck for me, and I should use a separate ReadMe.txt file to explain the file.

    Lastly, I know it's easy for me to add my own comments, but I was hoping that something like Excel could import the file straight away, with no need for a consumer to customize the import process.

    So, thoughts?

    • Square Rig Master
      Square Rig Master over 13 years
      What would you comment on? The values in each line or the file itself? Is an XML file an alternative for you?
    • new123456
      new123456 almost 12 years
      The proposal was shot down for Python.
    • hunt
      hunt over 9 years
      Maybe a version string for the data @SquareRigMaster . Just like I am trying to do now?
    • Richard Smith
      Richard Smith about 3 years
      @SquareRigMaster – Or a copyright statement.
  • vipw
    vipw almost 12 years
    RFC 4180 is the standard now.
  • Tyler Mumford
    Tyler Mumford almost 11 years
    Aaand, I just read that you didn't want to customize the import process. Sorry 'bout that. Hopefully somebody finds this useful, then.
  • Qix - MONICA WAS MISTREATED
    Qix - MONICA WAS MISTREATED over 8 years
    "There is no standard which defines the contents of a CSV file" False.
  • Paul Weibert
    Paul Weibert over 8 years
    RFC 4180 is not a standard. RFC 4180 itself says: "This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited."
  • Alien Technology
    Alien Technology over 8 years
    @Qix - from section 2 of the referenced document: "While there are various specifications and implementations for the CSV format (for ex. [4], [5], [6] and [7]), there is no formal specification in existence"
  • Marco Sulla
    Marco Sulla about 8 years
    OK, can we say it is a de facto standard?
  • usr-local-ΕΨΗΕΛΩΝ
    usr-local-ΕΨΗΕΛΩΝ almost 8 years
    All RFCs are memos not intended to provide any Internet standard AFAIK
  • Steve Hole
    Steve Hole almost 8 years
    Yah ... that's not true. There are standards track documents and non-standard track (informational) documents. The entire process, including descriptions, processes and rules for IETF issued documents is defined by RFC2026 with some follow on amendments. Every RFC will specify at the beginning which track it is on.
  • maurice
    maurice over 7 years
    Since there will probably be a block of code whose main purpose is to load this csv, maybe that's the best place to put comments relating to the csv data. Or in the commit messages when checking csv changes into source control.
  • IAmNaN
    IAmNaN over 5 years
    RFC is an acronym that stands for "Request For Comments," meaning it is intended to gather feedback from the community. That being said, almost the entire internet runs on unratified RFCs, or less. The CSV "standard" itself is essentially undefined without RFC 4180. It is the most definitive model we have, although it might change someday. As it stands, RFC 4180 has no provisions for inserting comments. If you add your own commenting mechanism to the format, don't expect interoperability with other readers/writers that follow RFC 4180.
  • Ben Hershey
    Ben Hershey almost 4 years
    Good post. Another reason I can think of for why you might want comments is to add some metadata about the file as a whole. Adding a whole column or row just for one cell with this info feels a bit awkward.
  • Crog
    Crog over 2 years
    Some parsers (MATLAB included) support detecting lines that start with a prefix character and handling them differently, as comments etc. For example, adding some form of 'meta' versioning to optimise or guide the code interpreting the data can be achieved via a comment, and '#' is what I have most often seen and used:

        #Csv/Version 1.9
        Time,ValueA,ValueB
        0.0, 123, 456
        0.1, 123, 349
  • dat
    dat over 2 years
    With emacs, csv-comment-start defaults to #
  • Chiarcos
    Chiarcos almost 2 years
    The use of # is also a de facto standard in TSV formats ("CoNLL formats") in language technology. These formats pre-date the current CSV spec by more than a decade. The main difference from CSV is that they require the separator to be TAB (or, earlier, SPACE) rather than a comma, but technically, that's still regarded as a CSV format.
  • Greg Wittmeyer
    Greg Wittmeyer about 1 year
    Microsoft IIS log files use the # for comments.
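The `#Csv/Version 1.9` header idea from the comments above could be read along these lines. This is a minimal Python sketch, not any parser's built-in behaviour: the function name `split_meta_and_rows` is hypothetical, and it assumes each `#` line has the form `#key value`.

```python
import csv
import io


def split_meta_and_rows(text, prefix="#"):
    """Separate '#key value' lines from the CSV data that follows them."""
    meta = {}
    data_lines = []
    for line in io.StringIO(text):
        if line.startswith(prefix):
            # Split "#Csv/Version 1.9" into key "Csv/Version", value "1.9".
            key, _, value = line[len(prefix):].strip().partition(" ")
            meta[key] = value
        else:
            data_lines.append(line)
    # skipinitialspace drops the blank after each comma in "0.0, 123, 456".
    return meta, list(csv.reader(data_lines, skipinitialspace=True))


sample = "#Csv/Version 1.9\nTime,ValueA,ValueB\n0.0, 123, 456\n0.1, 123, 349\n"
meta, rows = split_meta_and_rows(sample)
print(meta)     # {'Csv/Version': '1.9'}
print(rows[0])  # ['Time', 'ValueA', 'ValueB']
```

A reader that does not know the convention will see the `#` line as a one-field data row, so this only works when you control both ends of the exchange.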