how can i correctly parse a text file delimited by white space

左心房为你撑大大i 提交于 2019-12-19 21:39:31

问题


Below is my sample text file

{

Here is my schema file

[Sample File.txt]
ColNameHeader=True
Format=TabDelimited
CharacterSet=ANSI

And here is the code i have so far writen to try and read the above sample file, the data rows read from the text file above is supposed to be returned for display in a dataGridView control. The problem is, its being returned as single column, yet i want to use those white spaces as the column delimiters. I have tried different character delimiters with out success.

public DataSet LoadCSV(int numberOfRows)
    {
        DataSet ds = new DataSet();
            // Creates and opens an ODBC connection
            string strConnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + this.dirCSV.Trim() + ";Extensions=asc,csv,tab,txt;Persist Security Info=False";

            string sql_select;
            OdbcConnection conn;
            conn = new OdbcConnection(strConnString.Trim());
            conn.Open();

            //Creates the select command text
            if (numberOfRows == -1)
            {
                sql_select = "select * from [" + this.FileNevCSV.Trim() + "]";
            }
            else
            {
                sql_select = "select top " + numberOfRows + " * from [" + this.FileNevCSV.Trim() + "]";
            }

            //Creates the data adapter
            OdbcDataAdapter obj_oledb_da = new OdbcDataAdapter(sql_select, conn);

            //Fills dataset with the records from CSV file
            obj_oledb_da.Fill(ds, "csv");

            //closes the connection
            conn.Close();

        return ds;
    }

And setting the dataGridView's data source like to

    // loads the first 500 rows from CSV file
this.dataGridView_preView.DataSource = LoadCSV(500);
this.dataGridView_preView.DataMember = "csv";

i, get this in the datagridview, i get one column yet i expect to see the data returned as 7 columns.

Plus, i have no idea where F2 and F3 columns are coming from


回答1:


I would probably do this a different way. I would use a StreamReader, and read in the file line by line, break the string up into object properties, and store the objects in a list. Then you bind the list to the datagridviews datasource. I demonstrate two quick ways to do this.

1 -Tab separated data

If the file is tab separated, as it seems to be, split the line into an array and assign each index with to a property like so.

public partial class Form1 : Form
{
    private void Form1_Load(object sender, EventArgs e)
    {
        var rows = new List<Row>();
        var sr = new StreamReader(@"C:\so_test.txt");
        while (!sr.EndOfStream)
        {
            string s = sr.ReadLine();
            if (!String.IsNullOrEmpty(s.Trim()))
            {
                rows.Add(new Row(s));
            }
        }
        sr.Close();
        dataGridView1.DataSource = rows;
    }
}

public class Row
{
    public double Number1 { get; set; }
    public double Number2 { get; set; }
    public double Number3 { get; set; }
    public double Number4 { get; set; }
    public double Number5 { get; set; }
    public double Number6 { get; set; }
    public double Number7 { get; set; }
    public string Date1 { get; set; }

    public Row(string str)
    {
        string[] separator = { "\t" };
        var arr = str.Split(separator, StringSplitOptions.None);
        Number1 = Convert.ToDouble(arr[0]);
        Number2 = Convert.ToDouble(arr[1]);
        Number3 = Convert.ToDouble(arr[2]);
        Number4 = Convert.ToDouble(arr[3]);
        Number5 = Convert.ToDouble(arr[4]);
        Number6 = Convert.ToDouble(arr[5]);
        Number7 = Convert.ToDouble(arr[6]);
        Date1 = arr[7];
    }
}

2 -Hard Start points and lengths

If the data is tab separated, but conforms to strict start and endpoints for each column, you could declare the startpoints and lengths for each column as constants and get those via substring. This would only need a change in code in your Row class, like this. I have left of the constants from brevity, and just hardcoded them.

    public Row(string str)
    {
        Number1 = Convert.ToDouble(str.Substring(4, 6));
        Number2 = Convert.ToDouble(str.Substring(16, 6));
        Number3 = Convert.ToDouble(str.Substring(28, 7));
        Number4 = Convert.ToDouble(str.Substring(40, 7));
        Number5 = Convert.ToDouble(str.Substring(52, 6));
        Number6 = Convert.ToDouble(str.Substring(64, 6));
        Number7 = Convert.ToDouble(str.Substring(76, 6));
        Date1 = str.Substring(88, 24);
    }




回答2:


This modified schema.ini file does seem to solve the problem.

[Sample File.txt]
Format=FixedLength
Format=TabDelimited
MaxScanRows=25
CharacterSet=ANSI
Col1=Col1 Memo Width 10
Col2=Col2 Memo Width 15
Col3=Col3 Memo Width 11
Col4=Col4 Memo Width 12
Col5=Col5 Memo Width 10
Col6=Col6 Memo Width 11
Col7=Col7 Memo Width 150


来源:https://stackoverflow.com/questions/11720787/how-can-i-correctly-parse-a-text-file-delimited-by-white-space

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!