OpenML-CC18 Machine Learning Dataset

OpenML-CC18 is a machine learning benchmark suite consisting of 72 classification datasets, carefully selected from thousands of datasets on OpenML. The suite is complemented by a standardized OpenML-based interface and software toolkits written in Python, Java, and R, which demonstrate how to perform comprehensive benchmark studies using standardized benchmark suites. Its main features are ease of use (through standardized data formats, APIs, and existing clients), machine-readable meta-information on the suite content, and online sharing of results. Together, these features enable large-scale comparisons.
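
As a rough illustration of the client-based access described above, the following Python sketch assumes the third-party openml package is installed and that the suite is reachable under the alias "OpenML-CC18". The helper names load_cc18_task_ids and mean_score_per_task are hypothetical and chosen only for this example; the second helper is plain Python and simply averages per-fold scores that a benchmark run might produce.

```python
from statistics import mean


def load_cc18_task_ids(suite_alias="OpenML-CC18"):
    """Return the task IDs listed in the suite (requires network access).

    Assumption: the suite is registered on OpenML under this alias.
    """
    # Imported lazily so the rest of this sketch works without the client installed.
    import openml

    suite = openml.study.get_suite(suite_alias)
    return list(suite.tasks)


def mean_score_per_task(fold_scores):
    """Average per-fold scores for each task ID, skipping tasks with no scores.

    fold_scores maps a task ID to a list of scores (for example, accuracy per fold).
    """
    return {
        task_id: mean(scores)
        for task_id, scores in fold_scores.items()
        if scores
    }
```

Calling load_cc18_task_ids() should return one task ID per dataset in the suite; each ID can then be passed to openml.tasks.get_task to obtain the data and its predefined splits. Keeping the aggregation step separate from the download step makes it possible to compare results collected across many tasks without contacting the server again.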