Tuesday, August 28, 2012

C# Code to change the encoding of a text file to the desired encoding

Note: I started writing this post a few months ago and forgot to publish it. Anyway, better late than never. Enjoy.

This code converts every file in a directory to the desired encoding. I wrote this utility because the source control system my client uses doesn't support anything other than 7-bit ASCII. The function below should be sufficient for most needs, but I have shared the entire code as well.

Function to change the encoding:


        private static void ChangeEncoding(string FolderPath, string TargetEncoding)
        {
            try
            {
                DirectoryInfo info = new DirectoryInfo(FolderPath);
                FileInfo[] Fi = info.GetFiles();
                StreamWriter swriter;
                StreamReader sreader;
                string FileName;
                foreach (FileInfo obj in Fi)
                {
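                    FileName = obj.FullName;
                    // Clear read-only and other attributes so the file can be moved and deleted.
                    File.SetAttributes(FileName, FileAttributes.Normal);
                    // Rename the original to a temporary ".proback" copy and read from that copy.
                    File.Move(FileName, FileName + ".proback");
                    sreader = new StreamReader(FileName + ".proback");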
                    if (TargetEncoding.Equals("BigEndianUnicode"))
                    {
                        swriter = new StreamWriter(FileName, false, Encoding.BigEndianUnicode);
                    }
                    else if (TargetEncoding.Equals("Unicode"))
                    {
                        swriter = new StreamWriter(FileName, false, Encoding.Unicode);
                    }
                    else if (TargetEncoding.Equals("UTF32"))
                    {
                        swriter = new StreamWriter(FileName, false, Encoding.UTF32);
                    }
                    else if (TargetEncoding.Equals("UTF7"))
                    {
                        swriter = new StreamWriter(FileName, false, Encoding.UTF7);
                    }
                    else if (TargetEncoding.Equals("UTF8"))
                    {
                        swriter = new StreamWriter(FileName, false, Encoding.UTF8);
                    }
                    else if (TargetEncoding.Equals("ASCII"))
                    {
                        swriter = new StreamWriter(FileName, false, Encoding.ASCII);
                    }
                    else
                    {
                        // Unrecognized names fall back to the system default (ANSI) code page.
                        swriter = new StreamWriter(FileName, false, Encoding.Default);
                    }

                    while (!sreader.EndOfStream)
                    {
                        swriter.WriteLine(sreader.ReadLine());
                    }

                    sreader.Close();
                    swriter.Close();
                    sreader.Dispose();
                    swriter.Dispose();
                    File.Delete(FileName + ".proback");
                   
                }
            }
            catch (Exception)
            {
                // Rethrow with "throw;" so the original stack trace is preserved.
                throw;
            }
        }
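
One thing to keep in mind with the ASCII target: Encoding.ASCII replaces every character it cannot represent with '?', so anything outside the 7-bit range is lost without warning. Below is a minimal sketch of a check that could be run before converting. The AsciiCheck class and its method names are made up for illustration and are not part of the downloadable source.

```csharp
using System.IO;
using System.Linq;

static class AsciiCheck
{
    // True when every character fits in 7 bits (code points 0..127).
    public static bool IsPureAscii(string text)
    {
        return text.All(c => c <= 127);
    }

    // Convenience wrapper: reads the whole file and checks its contents.
    // File.ReadAllText decodes as UTF-8 unless it finds a byte order mark.
    public static bool FileIsPureAscii(string path)
    {
        return IsPureAscii(File.ReadAllText(path));
    }
}
```

A file for which FileIsPureAscii returns false will come out of an ASCII conversion with question marks in place of the characters that did not fit.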


Downloads:


Note: You need .NET Framework 4

Complete Source - https://docs.google.com/open?id=0B1z9Rc2ld5VPSExPSE9sUEh1SkE


(Once the link opens, press Ctrl+S to download)

1 comment:

aodennison said...

If the parameter string TargetEncoding were Encoding encoding instead, most of your code would not be needed. Also, your code assumes the input is ASCII, which is fine for your purposes but will make a mess of other encodings. Next, consider writing to a temp file and then, on success, renaming the source and then renaming the temp file to the source file name. That way an error will not leave you with a corrupt or incomplete source file. You also want to avoid converting or overwriting your backup files, since if you convert your files twice you will have lost your originals. See https://en.wikipedia.org/wiki/Idempotence.
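
A rough sketch of how these suggestions might fit together is shown here. It is not the code from the post: the SafeTranscoder class, the ".tmp" suffix, and the explicit sourceEncoding parameter are assumptions made for illustration, and it loads each file fully into memory, which is only reasonable for small text files.

```csharp
using System;
using System.IO;
using System.Text;

static class SafeTranscoder
{
    const string BackupSuffix = ".proback"; // same suffix the post uses
    const string TempSuffix = ".tmp";       // hypothetical temp-file suffix

    public static void ChangeEncoding(string folderPath, Encoding sourceEncoding, Encoding targetEncoding)
    {
        foreach (string path in Directory.GetFiles(folderPath))
        {
            // Never convert the backups themselves.
            if (path.EndsWith(BackupSuffix, StringComparison.OrdinalIgnoreCase))
                continue;

            string tempPath = path + TempSuffix;
            string backupPath = path + BackupSuffix;

            // Write the converted text to a temp file first, so a failure
            // here leaves the source file untouched.
            string text = File.ReadAllText(path, sourceEncoding);
            File.WriteAllText(tempPath, text, targetEncoding);

            if (File.Exists(backupPath))
                File.Delete(path);            // keep the earliest backup intact
            else
                File.Move(path, backupPath);  // first run: source becomes the backup

            File.Move(tempPath, path);
        }
    }
}
```

A call such as SafeTranscoder.ChangeEncoding(folder, Encoding.UTF8, Encoding.ASCII) can then be repeated on the same folder without touching the original backups, which is the idempotence point made above.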

Consider that the preferred way of closing and disposing is the using {...} statement.
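
As a minimal sketch of that suggestion, the copy loop could look something like this. The UsingSketch class and the Transcode signature are hypothetical, and sourcePath and targetPath are assumed to be different files.

```csharp
using System.IO;
using System.Text;

static class UsingSketch
{
    public static void Transcode(string sourcePath, string targetPath, Encoding targetEncoding)
    {
        // Both streams are closed and disposed when the block exits,
        // even if ReadLine or WriteLine throws.
        using (var reader = new StreamReader(sourcePath))
        using (var writer = new StreamWriter(targetPath, false, targetEncoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
            }
        }
    }
}
```

With this shape the separate Close() and Dispose() calls are no longer needed, and the file handles are released on the error path as well.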

Finally, consider adding rules for not touching binary files. JPGs and EXEs will be corrupted by changing the encoding.
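
One simple rule of this kind is an extension allow-list, sketched below. The TextFileFilter class and the particular extensions are assumptions chosen for illustration; the right list depends on what actually lives in the folder. Sniffing file contents for NUL bytes is another common heuristic, but it misfires on UTF-16 and UTF-32 text, which this tool can itself produce.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TextFileFilter
{
    // Hypothetical allow-list; compared case-insensitively.
    static readonly HashSet<string> TextExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".cs", ".txt", ".sql", ".xml", ".config"
        };

    // True only for files whose extension is on the allow-list.
    public static bool LooksLikeText(string path)
    {
        return TextExtensions.Contains(Path.GetExtension(path));
    }
}
```

Because only the final extension is checked, a name like Program.cs.proback is also rejected, so the same filter keeps backup files out of the conversion.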