
A popular dataset for binary classification tasks, predicting the presence of heart disease based on patient attributes.

Classic introductory dataset for predicting survival on the Titanic. Good for learning data cleaning and feature engineering.

A collection of 60,000 32x32 color images in 10 classes, with 6,000 images per class. Widely used for image classification.

Dataset for binary sentiment classification, containing 25,000 highly polar movie reviews for training and 25,000 for testing.

Comprehensive open-source database of terrorist events around the world from 1970 through 2017, with periodic updates.

Highly imbalanced dataset containing credit card transactions made by European cardholders in September 2013.
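On a highly imbalanced dataset like this one, plain accuracy can look excellent while the model catches no positive cases at all. The sketch below illustrates that with made-up labels: the counts (1,000 transactions, 2 of them fraudulent) and the helper names are hypothetical and are not taken from the actual dataset.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    # Fraction of true positives that were found; 0.0 if there are none to find.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 998 legitimate (0) and 2 fraudulent (1) transactions.
y_true = [0] * 998 + [1] * 2

# A trivial baseline that predicts "legitimate" for every transaction.
y_pred = [0] * len(y_true)

counts = confusion_counts(y_true, y_pred)
baseline_accuracy = accuracy(*counts)
baseline_recall = recall(*counts)

print(f"accuracy={baseline_accuracy:.3f} recall={baseline_recall:.3f}")
```

The baseline scores very high on accuracy yet has zero recall on the fraud class, which is why recall, precision, or precision-recall curves are usually the more informative choices for data this skewed.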

Famous dataset for multiclass classification. Contains 3 classes of 50 instances each, where each class refers to a type of iris plant.

A large database of handwritten digits that is commonly used for training various image processing systems.

A dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from one of 10 classes.
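The image sizes quoted in this list translate directly into feature counts and rough memory needs. The following is a minimal sketch of that arithmetic for the two image datasets whose dimensions are stated above. It assumes that "color" means 3 RGB channels and that each pixel value is stored in one byte; the dictionary keys and the `summarize` helper are illustrative names only.

```python
from math import prod

# Figures taken from the descriptions above.
# Assumption: color images have 3 channels, grayscale images have 1.
datasets = {
    "color_32x32": {"examples": 60_000, "shape": (32, 32, 3), "classes": 10},
    "grayscale_28x28": {"examples": 60_000 + 10_000, "shape": (28, 28, 1), "classes": 10},
}


def summarize(spec, bytes_per_value=1):
    """Compute flattened feature count and raw size for one dataset spec."""
    features = prod(spec["shape"])
    return {
        "features_per_example": features,
        # Only meaningful if the classes are evenly sized.
        "examples_per_class_if_balanced": spec["examples"] // spec["classes"],
        # Assumption: one byte per pixel value, no compression or metadata.
        "raw_megabytes": spec["examples"] * features * bytes_per_value / 1e6,
    }


summaries = {name: summarize(spec) for name, spec in datasets.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Flattening a 32x32 color image gives 3,072 features and a 28x28 grayscale image gives 784, so both datasets fit comfortably in memory on ordinary hardware under these assumptions.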

Two datasets are included, related to red and white vinho verde wine samples from the north of Portugal. The goal is to model wine quality based on physicochemical tests.