libRDataFrame
0.815
A library with R style datatypes and associated utilities
A library containing the consolidated collection of routines to support R-style data structures in C++.
see Project page
Some components of this project are of use outside of the R data structure use case, such as the utility to read and write CSV files in C++. The links to these are included below.
This project has the goal of developing an R-like dataframe class and other data structures (such as an R factor data type, called a category in C++) for similar data analysis applications written entirely in C++.
The necessary building blocks of this dataframe analog are a Variable composite data structure, which includes a vector class holding the raw data for a column of the dataframe, and a vector class to contain the Variable objects. Of note, R arrays are column-major memory structures, so this design simulates such a layout to provide at least a virtual variable-oriented structure.
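As a rough illustration of the building blocks described above, the following is a minimal sketch, not the library's actual classes. The names Variable (as a plain struct), Frame, add, at, column and make_example are assumptions made up for this example, and the raw data is assumed to be double for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: one column of the dataframe (a "variable").
struct Variable {
    std::string name;          // column label
    std::vector<double> data;  // raw values, stored contiguously per column
};

// Hypothetical sketch: the dataframe as a vector of Variable objects,
// which gives a column-oriented layout similar in spirit to R.
class Frame {
public:
    // Append a column; every column must have the same number of rows.
    void add(Variable v) {
        assert(columns_.empty() || v.data.size() == nrow());
        columns_.push_back(std::move(v));
    }

    std::size_t ncol() const { return columns_.size(); }
    std::size_t nrow() const {
        return columns_.empty() ? 0 : columns_.front().data.size();
    }

    // Numeric (row, column) access; at() throws std::out_of_range
    // if either index is invalid.
    double at(std::size_t row, std::size_t col) const {
        return columns_.at(col).data.at(row);
    }

    const Variable& column(std::size_t col) const { return columns_.at(col); }

private:
    std::vector<Variable> columns_;
};

// Illustration only: build a small two-column frame.
inline Frame make_example() {
    Frame f;
    f.add({"height", {1.5, 1.7, 1.8}});
    f.add({"weight", {50.0, 65.0, 80.0}});
    return f;
}
```

Keeping each column in its own vector means a whole variable can be handed to a routine as one contiguous block, which is the property the column-major remark above is after.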
The R dataframe is a list derivative whose elements are the column vectors. As noted below, this is a bit untenable in C++. Elements (rows and columns) are referenced primarily by the usual numeric indices. Associative reference by variable (column) and observation (row) is also supported. This is accomplished with a vector< vector< Node > > structure.
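The Node type is not shown on this page, so the following sketch does not attempt to reproduce it. It only illustrates one common way to offer both numeric and associative reference over a vector of vectors, assuming a name-to-index map per axis. NamedTable, atIndex and atName are hypothetical names, and double cells are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: cells stored column by column, with optional
// lookup by column (variable) name and row (observation) name.
class NamedTable {
public:
    NamedTable(const std::vector<std::string>& colNames,
               const std::vector<std::string>& rowNames)
        : cells_(colNames.size(), std::vector<double>(rowNames.size(), 0.0)) {
        for (std::size_t i = 0; i < colNames.size(); ++i) colIndex_[colNames[i]] = i;
        for (std::size_t i = 0; i < rowNames.size(); ++i) rowIndex_[rowNames[i]] = i;
        // Duplicate names would collapse into one map entry.
        assert(colIndex_.size() == colNames.size());
        assert(rowIndex_.size() == rowNames.size());
    }

    // Base reference: numeric (row, column) indices.
    double& atIndex(std::size_t row, std::size_t col) {
        return cells_.at(col).at(row);
    }

    // Associative reference: translate names to indices, then reuse atIndex.
    // unordered_map::at throws std::out_of_range for an unknown name.
    double& atName(const std::string& row, const std::string& col) {
        return atIndex(rowIndex_.at(row), colIndex_.at(col));
    }

private:
    std::vector<std::vector<double>> cells_;  // cells_[col][row]
    std::unordered_map<std::string, std::size_t> colIndex_;
    std::unordered_map<std::string, std::size_t> rowIndex_;
};
```

With this arrangement, a table built with columns "height" and "weight" and rows "obs1" to "obs3" can be addressed either as atIndex(1, 1) or as atName("obs2", "weight"), and both refer to the same cell.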
I am relatively new to C++, and this is my first major project in the language. My breadth and depth of understanding of C++ is growing, but its limits are at times an impediment to finding the best way to do things. Each revision provides new experience. Suggestions and comments are welcome.
Initially, a list of vectors was the basic plan, in keeping with the R structure. However, R lists and C++ STL lists are different enough to handicap any attempt to mimic R structures in C++. Hence the move to implementing the data.frame in C++ as a vector of vectors.
The principal issue is the need for random access to the variables (columns) of the data.frame in the C++ implementation. The C++ list is implemented as a sequentially accessed container, suited to actions applied to the elements across the list. Random access to individual list members is not provided by the list structure.
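The difference can be seen in a small sketch. The helper name nth is made up for this example. std::list has no operator[], so reaching the i-th element means stepping through i links, whereas std::vector indexes directly.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// std::list: no operator[]; std::next walks the links one at a time,
// so the cost grows linearly with i.
template <typename T>
const T& nth(const std::list<T>& xs, std::size_t i) {
    assert(i < xs.size());
    return *std::next(xs.begin(), static_cast<std::ptrdiff_t>(i));
}

// std::vector: direct indexing in constant time.
template <typename T>
const T& nth(const std::vector<T>& xs, std::size_t i) {
    assert(i < xs.size());
    return xs[i];
}
```

For a dataframe whose columns are picked out by position or by name all the time, the constant-time indexing of the vector is the better fit, which matches the choice of a vector of vectors described above.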
Excluding the category data type class, all the relevant getters, setters and display functions are implemented and tested in the Unit Test main() module.
See related project, UtilCSV RDataframe7
Author web page, http://crunches-data.appspot.com Email, medma tix@ gmail .com
Copyright 2016 D.A.York