{ "cells": [ { "source": [ "---\n", "description: Create your Continual Learning Benchmark and Start Prototyping\n", "---\n", "\n", "# Benchmarks\n", "\n", "Welcome to the \"_benchmarks_\" tutorial of the \"_From Zero to Hero_\" series. In this part we will present the functionalities offered by the `Benchmarks` module." ], "cell_type": "markdown", "metadata": { "collapsed": true, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "!pip install git+https://github.com/ContinualAI/avalanche.git" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## 🎯 Nomenclature\n", "\n", "First off, let's clarify a bit the nomenclature we are going to use, introducing the following terms: `Datasets`, `Scenarios`, `Benchmarks` and `Generators`.\n", "\n", "* By `Dataset` we mean a **collection of examples** that can be used for training or testing purposes but not already organized to be processed as a stream of batches or tasks. Since Avalanche is based on Pytorch, our Datasets are [torch.utils.Datasets](https://pytorch.org/docs/stable/_modules/torch/utils/data/dataset.html#Dataset) objects.\n", "* By `Scenario` we mean a **particular setting**, i.e. specificities about the continual stream of data, a continual learning algorithm will face.\n", "* By `Benchmark` we mean a well-defined and carefully thought **combination of a scenario with one or multiple datasets** that we can use to asses our continual learning algorithms.\n", "* By `Generator` we mean a function that **given a specific scenario and a dataset can generate a Benchmark**.\n", "\n", "## 📚 The Benchmarks Module\n", "\n", "The `bechmarks` module offers 3 types of utils:\n", "\n", "* **Datasets**: all the Pytorch datasets plus additional ones prepared by our community and particularly interesting for continual learning.\n", "* **Classic Benchmarks**: classic benchmarks used in CL litterature ready to be used with great flexibility.\n", "* **Benchmarks Generators**: a set of functions you can use to create your own benchmark starting from any kind of data and scenario. In particular, we distinguish two type of generators: `Specific` and `Generic`. The first ones will let you create a benchmark based on a clear scenarios and Pytorch dataset\\(s\\); the latters, instead, are more generic and flexible, both in terms of scenario definition then in terms of type of data they can manage.\n", " * _Specific_:\n", " * **nc\\_benchmark**: given one or multiple datasets it creates a benchmark instance based on scenarios where _New Classes_ \\(NC\\) are encountered over time. Notable scenarios that can be created using this utility include _Class-Incremental_, _Task-Incremental_ and _Task-Agnostic_ scenarios.\n", " * **ni\\_benchmark**: it creates a benchmark instance based on scenarios where _New Instances_ \\(NI\\), i.e. new examples of the same classes are encountered over time. 
Notable scenarios that can be created using this utility include _Domain-Incremental_ scenarios.\n", "  * _Generic_:\n", "    * **filelist\_benchmark**: It creates a benchmark instance given a list of filelists.\n", "    * **paths\_benchmark**: It creates a benchmark instance given a list of file paths and class labels.\n", "    * **tensors\_benchmark**: It creates a benchmark instance given a list of tensors.\n", "    * **dataset\_benchmark**: It creates a benchmark instance given a list of Pytorch datasets.\n", "\n", "But let's see how we can use this module in practice!\n", "\n", "## 🖼️ Datasets\n", "\n", "Let's start with the `Datasets`. As we previously hinted, in _Avalanche_ you'll find all the standard Pytorch Datasets available in the torchvision package, as well as a few others that are useful for continual learning but not already officially available within the Pytorch ecosystem." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "id": "standard-kruger", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import torch\n", "import torchvision\n", "from avalanche.benchmarks.datasets import MNIST, FashionMNIST, KMNIST, EMNIST, \\\n", "QMNIST, FakeData, CocoCaptions, CocoDetection, LSUN, ImageNet, CIFAR10, \\\n", "CIFAR100, STL10, SVHN, PhotoTour, SBU, Flickr8k, Flickr30k, VOCDetection, \\\n", "VOCSegmentation, Cityscapes, SBDataset, USPS, Kinetics400, HMDB51, UCF101, \\\n", "CelebA, CORe50, TinyImagenet, CUB200, OpenLORIS\n", "\n", "# As we would simply do with any Pytorch dataset, we can create the train and\n", "# test sets from it. We could use any of the above imported Datasets, but let's\n", "# just try to use the standard MNIST.\n", "train_MNIST = MNIST(\n", " './data/mnist', train=True, download=True, transform=torchvision.transforms.ToTensor()\n", ")\n", "test_MNIST = MNIST(\n", " './data/mnist', train=False, download=True, transform=torchvision.transforms.ToTensor()\n", ")\n", "\n", "# Given these two sets, we can simply iterate them to get the examples one by one\n", "for i, example in enumerate(train_MNIST):\n", " pass\n", "print(\"Num. examples processed: {}\".format(i))\n", "\n", "# or use a Pytorch DataLoader\n", "train_loader = torch.utils.data.DataLoader(\n", " train_MNIST, batch_size=32, shuffle=True\n", ")\n", "for i, (x, y) in enumerate(train_loader):\n", " pass\n", "print(\"Num. mini-batches processed: {}\".format(i))" ] }, { "cell_type": "markdown", "id": "employed-psychiatry", "metadata": {}, "source": [ "Of course, the basic utilities `ImageFolder` and `DatasetFolder` can also be used. These are two classes that you can use to create a Pytorch Dataset directly from your files \\(following a particular structure\\). You can read more about these in the official Pytorch documentation [here](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder).\n", "\n", "We also provide the additional `FilelistDataset` and `AvalancheDataset` classes. The former constructs a dataset from a filelist [\\(caffe style\\)](https://ceciliavision.wordpress.com/2016/03/08/caffedata-layer/) pointing to files anywhere on the disk. The latter augments the basic Pytorch Dataset functionalities with an extension to better deal with a stack of transformations to be used during train and test."
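, "\n", "\n", "For example, here is a minimal sketch of how `AvalancheDataset` can wrap a standard Pytorch dataset to attach a task label to all of its examples; we reuse the `train_MNIST` dataset created above, and the same pattern will come in handy later with the generic benchmark generators." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from avalanche.benchmarks.utils import AvalancheDataset\n", "\n", "# A minimal sketch (assuming the train_MNIST dataset created above):\n", "# wrap a plain Pytorch dataset and attach a constant task label\n", "# to all of its examples.\n", "mnist_task0 = AvalancheDataset(train_MNIST, task_labels=0)\n", "\n", "# The wrapped object can still be used as a regular Pytorch dataset\n", "print('Num. examples in the wrapped dataset:', len(mnist_task0))"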
] }, { "cell_type": "code", "execution_count": null, "id": "lesser-coordinator", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from avalanche.benchmarks.utils import ImageFolder, DatasetFolder, FilelistDataset, AvalancheDataset" ] }, { "cell_type": "markdown", "id": "cosmetic-zoning", "metadata": { "pycharm": { "name": "#%%\n" } }, "source": [ "## 🛠️ Benchmarks Basics\n", "\n", "The _Avalanche_ benchmarks \\(instances of the _Scenario_ class\\), contains several attributes that characterize the benchmark. However, the most important ones are the `train` and `test streams`.\n", "\n", "In _Avalanche_ we often suppose to have access to these **two parallel stream of data** \\(even though some benchmarks may not provide such feature, but contain just a unique test set\\).\n", "\n", "Each of these `streams` are _iterable_, _indexable_ and _sliceable_ objects that are composed of unique **experiences**. Experiences are batch of data \\(or \"_tasks_\"\\) that can be provided with or without a specific task label.\n", "\n", "#### Efficiency\n", "\n", "It is worth mentioning that all the data belonging to a _stream_ are not loaded into the RAM beforehand. Avalanche actually loads the data when a specific _mini-batches_ are requested at training/test time based on the policy defined by each `Dataset` implementation.\n", "\n", "This means that memory requirements are very low, while the speed is guaranteed by a multi-processing data loading system based on the one defined in Pytorch.\n", "\n", "#### Scenarios\n", "\n", "So, as we have seen, each `scenario` object in _Avalanche_ has several useful attributes that characterizes the benchmark, including the two important `train` and `test streams`. Let's check what you can get from a scenario object more in details:" ] }, { "cell_type": "code", "execution_count": null, "id": "commercial-duplicate", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from avalanche.benchmarks.classic import SplitMNIST\n", "split_mnist = SplitMNIST(n_experiences=5, seed=1)\n", "\n", "# Original train/test sets\n", "print('--- Original datasets:')\n", "print(split_mnist.original_train_dataset)\n", "print(split_mnist.original_test_dataset)\n", "\n", "# A list describing which training patterns are assigned to each experience.\n", "# Patterns are identified by their id w.r.t. the dataset found in the\n", "# original_train_dataset field.\n", "print('--- Train patterns assignment:')\n", "print(split_mnist.train_exps_patterns_assignment)\n", "\n", "# A list describing which test patterns are assigned to each experience.\n", "# Patterns are identified by their id w.r.t. 
the dataset found in the\n", "# original_test_dataset field\n", "print('--- Test patterns assignment:')\n", "print(split_mnist.test_exps_patterns_assignment)\n", "\n", "# the task label of each experience.\n", "print('--- Task labels:')\n", "print(split_mnist.task_labels)\n", "\n", "# train and test streams\n", "print('--- Streams:')\n", "print(split_mnist.train_stream)\n", "print(split_mnist.test_stream)\n", "\n", "# A list that, for each experience (identified by its index/ID),\n", "# stores a set of the (optionally remapped) IDs of classes of patterns\n", "# assigned to that experience.\n", "print('--- Classes in each experience:')\n", "split_mnist.classes_in_experience" ] }, { "cell_type": "markdown", "id": "broken-juice", "metadata": {}, "source": [ "#### Train and Test Streams\n", "\n", "The _train_ and _test streams_ can be used for training and testing purposes, respectively. This is what you can do with these streams:" ] }, { "cell_type": "code", "execution_count": null, "id": "intelligent-bristol", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# each stream has a name: \"train\" or \"test\"\n", "train_stream = split_mnist.train_stream\n", "print(train_stream.name)\n", "\n", "# we have access to the scenario from which the stream was taken\n", "train_stream.benchmark\n", "\n", "# we can slice and reorder the stream as we like!\n", "substream = train_stream[0]\n", "substream = train_stream[0:2]\n", "substream = train_stream[0,2,1]\n", "\n", "len(substream)" ] }, { "cell_type": "markdown", "id": "reserved-browser", "metadata": {}, "source": [ "#### Experiences\n", "\n", "Each stream can in turn be treated as an iterator that produces a unique `experience`, containing all the useful data regarding a _batch_ or _task_ in the continual stream our algorithms will face. 
Check out how you can use these experiences below:" ] }, { "cell_type": "code", "execution_count": null, "id": "velvet-entry", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# we get the first experience\n", "experience = train_stream[0]\n", "\n", "# task label and dataset are the main attributes\n", "t_label = experience.task_label\n", "dataset = experience.dataset\n", "\n", "# but you can recover additional info\n", "experience.current_experience\n", "experience.classes_in_this_experience\n", "experience.classes_seen_so_far\n", "experience.previous_classes\n", "experience.future_classes\n", "experience.origin_stream\n", "experience.benchmark\n", "\n", "# As always, we can iterate over it normally or with a Pytorch\n", "# data loader.\n", "# For instance, we can use tqdm to add a progress bar.\n", "from tqdm import tqdm\n", "for i, data in enumerate(tqdm(dataset)):\n", " pass\n", "print(\"\\nNumber of examples:\", i + 1)\n", "print(\"Task Label:\", t_label)" ] }, { "cell_type": "markdown", "id": "spatial-chase", "metadata": {}, "source": [ "## 🏛️ Classic Benchmarks\n", "\n", "Now that we know how our benchmarks work in general through scenario, stream and experience objects, in this section we are going to explore the **common benchmarks** already available for you with one line of code, yet flexible enough to allow proper tuning based on your needs:" ] }, { "cell_type": "code", "execution_count": null, "id": "phantom-musician", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from avalanche.benchmarks.classic import CORe50, SplitTinyImageNet, \\\n", "SplitCIFAR10, SplitCIFAR100, SplitCIFAR110, SplitMNIST, RotatedMNIST, \\\n", "PermutedMNIST, SplitCUB200, SplitImageNet\n", "\n", "# creating PermutedMNIST (Task-Incremental)\n", "perm_mnist = PermutedMNIST(\n", " n_experiences=2,\n", " seed=1234,\n", ")" ] }, { "cell_type": "markdown", "id": "bound-aquatic", "metadata": {}, "source": [ "Many of the classic benchmarks will automatically download the original datasets they are based on and put them under the `\"~/.avalanche/data\"` directory.\n", "\n", "### How to Use the Benchmarks\n", "\n", "Let's now see how we can use the classic benchmarks, or the ones that you can create through the generators \\(see next section\\). For example, let's try out the classic `PermutedMNIST` benchmark \\(_Task-Incremental_ scenario\\)."
] }, { "cell_type": "code", "execution_count": null, "id": "informal-disability", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# creating the benchmark instance (scenario object)\n", "perm_mnist = PermutedMNIST(\n", " n_experiences=3,\n", " seed=1234,\n", ")\n", "\n", "# recovering the train and test streams\n", "train_stream = perm_mnist.train_stream\n", "test_stream = perm_mnist.test_stream\n", "\n", "# iterating over the train stream\n", "for experience in train_stream:\n", " print(\"Start of task \", experience.task_label)\n", " print('Classes in this task:', experience.classes_in_this_experience)\n", "\n", " # The current Pytorch training set can be easily recovered through the\n", " # experience\n", " current_training_set = experience.dataset\n", " # ...as well as the task_label\n", " print('Task {}'.format(experience.task_label))\n", " print('This task contains', len(current_training_set), 'training examples')\n", "\n", " # we can recover the corresponding test experience in the test stream\n", " current_test_set = test_stream[experience.current_experience].dataset\n", " print('This task contains', len(current_test_set), 'test examples')" ] }, { "cell_type": "markdown", "id": "coral-characterization", "metadata": {}, "source": [ "## 🐣 Benchmarks Generators\n", "\n", "What if we want to create a new benchmark that is not present in the \"_Classic_\" ones? Well, in that case _Avalanche_ offer a number of utilites that you can use to create your own benchmark with maximum flexibility: the **benchmarks generators**!\n", "\n", "### Specific Generators\n", "\n", "The _specific_ scenario generators are useful when starting from one or multiple Pytorch datasets you want to create a \"**New Instances**\" or \"**New Classes**\" benchmark: i.e. 
it supports the easy and flexible creation of _Domain-Incremental_, _Class-Incremental_ or _Task-Incremental_ scenarios, among others.\n", "\n", "For the **New Classes** scenario, you can use the following function:\n", "\n", "* `nc_benchmark`\n", "\n", "For the **New Instances** scenario:\n", "\n", "* `ni_benchmark`" ] }, { "cell_type": "code", "execution_count": null, "id": "lightweight-favor", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from avalanche.benchmarks.generators import nc_benchmark, ni_benchmark" ] }, { "cell_type": "markdown", "id": "hairy-fancy", "metadata": {}, "source": [ "Let's start by creating the MNIST dataset object as we would normally do in Pytorch:" ] }, { "cell_type": "code", "execution_count": null, "id": "classified-vertex", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from torchvision.transforms import Compose, ToTensor, Normalize, RandomCrop\n", "train_transform = Compose([\n", " RandomCrop(28, padding=4),\n", " ToTensor(),\n", " Normalize((0.1307,), (0.3081,))\n", "])\n", "\n", "test_transform = Compose([\n", " ToTensor(),\n", " Normalize((0.1307,), (0.3081,))\n", "])\n", "\n", "mnist_train = MNIST(\n", " './data/mnist', train=True, download=True, transform=train_transform\n", ")\n", "mnist_test = MNIST(\n", " './data/mnist', train=False, download=True, transform=test_transform\n", ")" ] }, { "cell_type": "markdown", "id": "isolated-morning", "metadata": {}, "source": [ "Then we can, for example, create a new benchmark based on MNIST and the classic _Domain-Incremental_ scenario:" ] }, { "cell_type": "code", "execution_count": null, "id": "modified-bradley", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "scenario = ni_benchmark(\n", " mnist_train, mnist_test, n_experiences=10, shuffle=True, seed=1234,\n", " balance_experiences=True\n", ")\n", "\n", "train_stream = scenario.train_stream\n", "\n", "for experience in train_stream:\n", " t = experience.task_label\n", " exp_id = experience.current_experience\n", " training_dataset = experience.dataset\n", " print('Task {} batch {} -> train'.format(t, exp_id))\n", " print('This batch contains', len(training_dataset), 'patterns')" ] }, { "cell_type": "markdown", "id": "german-actress", "metadata": {}, "source": [ "Or, we can create a benchmark based on MNIST and the _Class-Incremental_ scenario \\(what's commonly referred to as the \"_Split-MNIST_\" benchmark\\):" ] }, { "cell_type": "code", "execution_count": null, "id": "special-beginning", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "scenario = nc_benchmark(\n", " mnist_train, mnist_test, n_experiences=10, shuffle=True, seed=1234,\n", " task_labels=False\n", ")\n", "\n", "train_stream = scenario.train_stream\n", "\n", "for experience in train_stream:\n", " t = experience.task_label\n", " exp_id = experience.current_experience\n", " training_dataset = experience.dataset\n", " print('Task {} batch {} -> train'.format(t, exp_id))\n", " print('This batch contains', len(training_dataset), 'patterns')" ] }, { "cell_type": "markdown", "id": "dried-award", "metadata": {}, "source": [ "### Generic Generators\n", "\n", "Finally, if you cannot create your ideal benchmark because it does not fit well into the aforementioned _new classes_ or _new instances_ scenarios, you can always use our **generic generators**:\n", "\n", "* **filelist\_benchmark**\n", "* **paths\_benchmark**\n", "* **dataset\_benchmark**\n", "* **tensors\_benchmark**" ] }, { "cell_type": "code", 
"execution_count": null, "id": "filled-machinery", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from avalanche.benchmarks.generators import filelist_benchmark, dataset_benchmark, \\\n", " tensors_benchmark, paths_benchmark" ] }, { "cell_type": "markdown", "id": "advance-badge", "metadata": {}, "source": [ "Let's start with the `filelist_benchmark` utility. This function is particularly useful when it is important to preserve a particular order of the patterns to be processed \\(for example if they are frames of a video\\), or in general if we have data scattered around our drive and we want to create a sequence of batches/tasks providing only a txt file containing the list of their paths.\n", "\n", "For _Avalanche_ we follow the same format of the _Caffe_ filelists \\(\"_path_ _class\\_label_\"\\):\n", "\n", "/path/to/a/file.jpg 0 \n", "/path/to/another/file.jpg 0 \n", "... \n", "/path/to/another/file.jpg M \n", "/path/to/another/file.jpg M \n", "... \n", "/path/to/another/file.jpg N \n", "/path/to/another/file.jpg N \n", "\n", "\n", "So let's download the classic \"_Cats vs Dogs_\" dataset as an example:" ] }, { "cell_type": "code", "execution_count": null, "id": "accessible-italian", "metadata": {}, "outputs": [], "source": [ "!wget -N --no-check-certificate \\\n", " https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip\n", "!unzip -q -o cats_and_dogs_filtered.zip" ] }, { "cell_type": "markdown", "id": "minus-forty", "metadata": {}, "source": [ "You can now see in the `content` directory on colab the image we downloaded. We are now going to create the filelists and then use the `filelist_benchmark` function to create our benchmark:" ] }, { "cell_type": "code", "execution_count": null, "id": "specified-muslim", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import os\n", "# let's create the filelists since we don't have it\n", "dirpath = \"cats_and_dogs_filtered/train\"\n", "\n", "for filelist, rel_dir, t_label in zip(\n", " [\"train_filelist_00.txt\", \"train_filelist_01.txt\"],\n", " [\"cats\", \"dogs\"],\n", " [0, 1]):\n", " # First, obtain the list of files\n", " filenames_list = os.listdir(os.path.join(dirpath, dir))\n", "\n", " # Create the text file containing the filelist\n", " # Filelists must be in Caffe-style, which means\n", " # that they must define path in the format:\n", " #\n", " # relative_path_img1 class_label_first_img\n", " # relative_path_img2 class_label_second_img\n", " # ...\n", " #\n", " # For instance:\n", " # cat/cat_0.png 1\n", " # dog/dog_54.png 0\n", " # cat/cat_3.png 1\n", " # ...\n", " # \n", " # Paths are relative to a root path\n", " # (specified when calling filelist_benchmark)\n", " with open(filelist, \"w\") as wf:\n", " for name in filenames_list:\n", " wf.write(\n", " \"{} {}\\n\".format(os.path.join(rel_dir, name), t_label)\n", " )\n", "\n", "# Here we create a GenericCLScenario ready to be iterated\n", "generic_scenario = filelist_benchmark(\n", " dirpath, \n", " [\"train_filelist_00.txt\", \"train_filelist_01.txt\"],\n", " [\"train_filelist_00.txt\"],\n", " task_labels=[0, 0],\n", " complete_test_set_only=True,\n", " train_transform=ToTensor(),\n", " eval_transform=ToTensor()\n", ")" ] }, { "source": [ "In the previous cell we created a benchmark instance starting from file lists. 
However, `paths_benchmark` is a better choice if you already have the list of paths directly loaded in memory:" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_experiences = []\n", "for rel_dir, label in zip(\n", "        [\"cats\", \"dogs\"],\n", "        [0, 1]):\n", "    # First, obtain the list of files\n", "    filenames_list = os.listdir(os.path.join(dirpath, rel_dir))\n", "\n", "    # Don't create a file list: instead, we create a list of\n", "    # paths + class labels\n", "    experience_paths = []\n", "    for name in filenames_list:\n", "        instance_tuple = (os.path.join(dirpath, rel_dir, name), label)\n", "        experience_paths.append(instance_tuple)\n", "    train_experiences.append(experience_paths)\n", "\n", "# Here we create a GenericCLScenario ready to be iterated\n", "generic_scenario = paths_benchmark(\n", "    train_experiences,\n", "    [train_experiences[0]], # Single test set\n", "    task_labels=[0, 0],\n", "    complete_test_set_only=True,\n", "    train_transform=ToTensor(),\n", "    eval_transform=ToTensor()\n", ")" ] }, { "cell_type": "markdown", "id": "major-strength", "metadata": {}, "source": [ "Let us see how we can use the `dataset_benchmark` utility, where we can use several PyTorch datasets as different batches or tasks. This utility expects a list of datasets for the train, test (and other custom) streams. Each dataset will be used to create an experience:" ] }, { "cell_type": "code", "execution_count": null, "id": "accredited-aside", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "train_cifar10 = CIFAR10(\n", " './data/cifar10', train=True, download=True\n", ")\n", "test_cifar10 = CIFAR10(\n", " './data/cifar10', train=False, download=True\n", ")\n", "\n", "generic_scenario = dataset_benchmark(\n", " [train_MNIST, train_cifar10],\n", " [test_MNIST, test_cifar10]\n", ")" ] }, { "source": [ "Adding task labels can be achieved by wrapping each dataset using `AvalancheDataset`. Apart from task labels, `AvalancheDataset` allows for more control over transformations and offers an ever-growing set of utilities (check the documentation for more details)." 
], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Alternatively, task labels can also be a list (or tensor)\n", "# containing the task label of each pattern\n", "\n", "train_MNIST_task0 = AvalancheDataset(train_cifar10, task_labels=0)\n", "test_MNIST_task0 = AvalancheDataset(test_cifar10, task_labels=0)\n", "\n", "train_cifar10_task1 = AvalancheDataset(train_cifar10, task_labels=1)\n", "test_cifar10_task1 = AvalancheDataset(test_cifar10, task_labels=1)\n", "\n", "scenario_custom_task_labels = dataset_benchmark(\n", " [train_MNIST_task0, train_cifar10_task1],\n", " [test_MNIST_task0, test_cifar10_task1]\n", ")\n", "\n", "print('Without custom task labels:',\n", " generic_scenario.train_stream[1].task_label)\n", "\n", "print('With custom task labels:',\n", " scenario_custom_task_labels.train_stream[1].task_label)" ] }, { "cell_type": "markdown", "id": "premier-desktop", "metadata": {}, "source": [ "And finally, the `tensors_benchmark` generator:" ] }, { "cell_type": "code", "execution_count": null, "id": "absolute-seeking", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "pattern_shape = (3, 32, 32)\n", "\n", "# Definition of training experiences\n", "# Experience 1\n", "experience_1_x = torch.zeros(100, *pattern_shape)\n", "experience_1_y = torch.zeros(100, dtype=torch.long)\n", "\n", "# Experience 2\n", "experience_2_x = torch.zeros(80, *pattern_shape)\n", "experience_2_y = torch.ones(80, dtype=torch.long)\n", "\n", "# Test experience\n", "# For this example we define a single test experience,\n", "# but \"tensors_benchmark\" allows you to define even more than one!\n", "test_x = torch.zeros(50, *pattern_shape)\n", "test_y = torch.zeros(50, dtype=torch.long)\n", "\n", "generic_scenario = tensors_benchmark(\n", " train_tensors=[(experience_1_x, experience_1_y), (experience_2_x, experience_2_y)],\n", " test_tensors=[(test_x, test_y)],\n", " task_labels=[0, 0], # Task label of each train exp\n", " complete_test_set_only=True\n", ")" ] }, { "cell_type": "markdown", "id": "excessive-lobby", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "This completes the \"_Benchmark_\" tutorial for the \"_From Zero to Hero_\" series. We hope you enjoyed it!\n", "\n", "## 🤝 Run it on Google Colab\n", "\n", "You can run _this chapter_ and play with it on Google Colaboratory: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ContinualAI/colab/blob/master/notebooks/avalanche/2.-benchmarks.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "Python3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1" } }, "nbformat": 4, "nbformat_minor": 5 }