Precautions when outputting CSV in C#


CSV is convenient, isn’t it?

When you want to export data to spreadsheet software such as Excel, generating an .xlsx file is possible but quite a hassle, so you end up writing CSV instead.
If the data contains only digits and ASCII letters, you can simply write


File.WriteAllLines(FileName, CSVData);

and be done.
However, if CSVData contains non-ASCII characters, Excel will not read it correctly.
The file was written as UTF-8, but Excel reads it as Shift JIS, so the characters come out garbled.
(Importing the file from within Excel works, but opening it through the .csv file association produces garbled text.)

Now for the solution. The Japanese version of Excel assumes Shift JIS by default, and the English version apparently assumes ANSI. If you export UTF-8 with a BOM, however, Excel detects the encoding and reads the file as UTF-8.
In short: without a BOM, Excel falls back to the default code page; with a BOM, it reads the file according to the BOM.
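The rule above can be sketched in a few lines. This is only an illustration of the decision as described here, not Excel's actual implementation; the class BomRule and the method GuessCsvEncoding are hypothetical names introduced for this example.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomRule
{
    // Hypothetical helper: mimics the rule "BOM present -> UTF-8,
    // otherwise -> default code page". Not Excel's real logic.
    public static string GuessCsvEncoding(byte[] head)
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // the three bytes EF BB BF
        bool hasBom = head.Length >= bom.Length
                      && head.Take(bom.Length).SequenceEqual(bom);
        return hasBom ? "UTF-8" : "default code page";
    }

    public static void Main()
    {
        byte[] body = Encoding.UTF8.GetBytes("名前,値");
        byte[] withBom = Encoding.UTF8.GetPreamble().Concat(body).ToArray();

        Console.WriteLine(GuessCsvEncoding(body));    // default code page
        Console.WriteLine(GuessCsvEncoding(withBom)); // UTF-8
    }
}
```

Note that the bytes of the text itself are identical in both cases; only the three leading bytes differ.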

How to export with UTF-8 (BOM)

Huh? you might think.
File.WriteAllLines does have an overload that takes an Encoding argument. But since the method writes UTF-8 by default, it is natural to assume that omitting the Encoding argument is the same as passing Encoding.UTF8 (System.Text.Encoding.UTF8).
Yes, that is what I thought too.

In fact, the Microsoft documentation says this:

The default behavior of the WriteAllLines method is to write out data using UTF-8 encoding without a byte order mark (BOM). If it is necessary to include a UTF-8 identifier, such as a byte order mark, at the beginning of a file, use the WriteAllLines(String, String[], Encoding) method overload with UTF8 encoding.
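The difference comes down to the preamble each encoding object reports. As a small sketch, assuming that the default behavior corresponds to a UTF8Encoding constructed with encoderShouldEmitUTF8Identifier set to false, the two can be compared directly; the class name PreambleDemo is made up for this example.

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Length of the BOM (preamble) that the given encoding would emit.
    public static int PreambleLength(Encoding encoding)
    {
        return encoding.GetPreamble().Length;
    }

    public static void Main()
    {
        // UTF-8 without a BOM, matching the documented default behavior.
        Console.WriteLine(PreambleLength(new UTF8Encoding(false))); // 0

        // Encoding.UTF8 emits the 3-byte BOM.
        Console.WriteLine(PreambleLength(Encoding.UTF8)); // 3
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetPreamble())); // EF-BB-BF
    }
}
```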

Yes, the overload called with Encoding.UTF8 and the overload called without an Encoding argument behave differently.

In other words,


File.WriteAllLines(FileName, CSVData, Encoding.UTF8);

is the correct call. Excel now detects the character encoding and reads the file properly. I’m happy.
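To check this without opening Excel, you can write the same lines with both overloads and look at the first three bytes of each file. The following is a minimal sketch; the class CsvBomDemo, its helper methods, and the sample data are all made up for illustration, and it writes to temporary files rather than a real FileName.

```csharp
using System;
using System.IO;
using System.Text;

public static class CsvBomDemo
{
    // Writes the lines to a temporary file and returns the raw bytes.
    // Passing null for encoding uses the overload without an Encoding argument.
    public static byte[] WriteAndReadBack(string[] lines, Encoding encoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            if (encoding == null)
                File.WriteAllLines(path, lines);
            else
                File.WriteAllLines(path, lines, encoding);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    // True if the bytes start with the UTF-8 BOM (EF BB BF).
    public static bool HasUtf8Bom(byte[] b)
    {
        return b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF;
    }

    public static void Main()
    {
        string[] csvData = { "名前,値", "りんご,100" };

        Console.WriteLine(HasUtf8Bom(WriteAndReadBack(csvData, null)));          // False
        Console.WriteLine(HasUtf8Bom(WriteAndReadBack(csvData, Encoding.UTF8))); // True
    }
}
```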