C++ program for reading a CSV file of unknown size (filled only with floats) with a constant (but unknown) number of columns into an array

别那么骄傲 2021-01-16 09:24

I was wondering if someone could give me a hand. I'm trying to build a program that reads a large block of floats of unknown size from a CSV file. I already wrote this i

4 answers
  •  伪装坚强ぢ
    2021-01-16 10:00

    I would, obviously, just use IOStreams. Reading a homogeneous array of arrays from a CSV file without having to bother with any quoting is fairly trivial:

    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>
    
    std::istream& comma(std::istream& in)
    {
        if ((in >> std::ws).peek() != std::char_traits<char>::to_int_type(',')) {
            in.setstate(std::ios_base::failbit);
        }
        return in.ignore();
    }
    
    int main()
    {
        std::vector<std::vector<double>> values;
        std::istringstream in;
        for (std::string line; std::getline(std::cin, line); )
        {
            in.clear();
            in.str(line);
            std::vector<double> tmp;
            for (double value; in >> value; in >> comma) {
                tmp.push_back(value);
            }
            values.push_back(tmp);
        }
    
        for (auto const& vec: values) {
            for (auto val: vec) {
                std::cout << val << ", ";
            }
            std::cout << "\n";
        }
    }
    

    Given the simple structure of the file, the logic can actually be simplified: instead of reading the values individually, each line can be viewed as a sequence of values if the separators are read automatically. Since a comma won't be skipped automatically, the commas are replaced by spaces before creating the string stream for each line. The corresponding code becomes

    #include <algorithm>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <sstream>
    #include <string>
    #include <vector>
    
    int main()
    {
        std::vector<std::vector<double> > values;
        std::ifstream fin("textread.csv");
        for (std::string line; std::getline(fin, line); )
        {
            std::replace(line.begin(), line.end(), ',', ' ');
            std::istringstream in(line);
            values.push_back(
                std::vector<double>(std::istream_iterator<double>(in),
                                    std::istream_iterator<double>()));
        }
    
        for (std::vector<std::vector<double> >::const_iterator
                 it(values.begin()), end(values.end()); it != end; ++it) {
            std::copy(it->begin(), it->end(),
                      std::ostream_iterator<double>(std::cout, ", "));
            std::cout << "\n";
        }
    }
    

    Here is what happens:

    1. The destination values is defined as a vector of vectors of double. There isn't anything guaranteeing that the different rows are the same size but this is trivial to check once the file is read.
    2. An std::ifstream is defined and initialized with the file. It may be worth checking the file after construction to see if it could be opened for reading (if (!fin) { std::cout << "failed to open...\n"; }).
    3. The file is processed one line at a time. The lines are simply read into a std::string using std::getline(). When std::getline() fails, no further line could be read and the conversion ends.
    4. Once the line is read, all commas are replaced by spaces.
    5. From the modified line a string stream for reading the line is constructed. The original code reused a std::istringstream which was declared outside the loop to save the cost of constructing the stream each time. Since the stream goes bad when the line is completed, it first needed to be in.clear()ed before its content was set with in.str(line).
    6. The individual values are iterated using an std::istream_iterator which just reads a value from the stream it is constructed with. The iterator constructed from in is the start of the sequence and the default constructed iterator is the end of the sequence.
    7. The sequence of values produced by the iterators is used to immediately construct a temporary std::vector representing a row.
    8. The temporary vector is pushed to the end of the target array.

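    In the first program, everything after the reading loop simply prints the content of the produced matrix using C++11 features (range-based for and variables with automatically deduced type). The second program prints the same content with explicit iterators, std::copy() and std::ostream_iterator instead.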
