{ "cells": [ { "cell_type": "markdown", "metadata": { "nbsphinx": "hidden" }, "source": [ "It looks like you might be running this notebook in Colab! If you want to enable GPU acceleration, ensure you select a GPU runtime in the top-right dropdown menu 🔥" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Train a model\n", "\n", "> **FYI**, you can open this documentation as a [Google Colab notebook](https://colab.research.google.com/github/jla-gardner/graph-pes/blob/main/docs/source/quickstart/quickstart.ipynb) to follow along interactively.\n", "\n", "[graph-pes-train](https://jla-gardner.github.io/graph-pes/cli/graph-pes-train/root.html) provides a unified interface to train any [GraphPESModel](https://jla-gardner.github.io/graph-pes/models/root.html#graph_pes.GraphPESModel), including those packaged within [graph_pes.models](https://jla-gardner.github.io/graph-pes/models/root.html), custom ones defined by you, and any of the wrapper interfaces that ``graph-pes`` provides to other machine learning frameworks.\n", "\n", "For more information on the ``graph-pes-train`` command and the many options you can specify in your ``config.yaml``, see the [CLI reference](https://jla-gardner.github.io/graph-pes/cli/graph-pes-train/root.html).\n", "\n", "Below, we train a lightweight [NequIP](https://jla-gardner.github.io/graph-pes/models/many-body/nequip.html) model on the [C-GAP-17](https://jla-gardner.github.io/load-atoms/datasets/C-GAP-17.html) dataset.\n", "\n", "## Installation\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Successfully installed graph-pes-0.0.34\n" ] } ], "source": [ "!pip install graph-pes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We should now have access to the ``graph-pes-train`` command. 
We can check this by running:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: graph-pes-train [-h] [args ...]\n", "\n", "Train a GraphPES model using PyTorch Lightning.\n", "\n", "positional arguments:\n", " args Config files and command line specifications. Config files\n", " should be YAML (.yaml/.yml) files. Command line specifications\n", " should be in the form my/nested/key=value. Final config is built\n", " up from these items in a left to right manner, with later items\n", " taking precedence over earlier ones in the case of conflicts.\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", "\n", "Copyright 2023-25, John Gardner\n" ] } ], "source": [ "!graph-pes-train -h" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data definition\n", "\n", "When training a model, we typically want 3 sets of data (i.e. labelled atomic structures): a training set, a validation set, and a test set.\n", "\n", "Below, we use [load-atoms](https://jla-gardner.github.io/load-atoms/) to download and split the C-GAP-17 dataset into training, validation and test datasets, and write these to `xyz` files. (``graph-pes`` supports other dataset formats too, including ``ase sqlite`` databases -- see [here](https://jla-gardner.github.io/graph-pes/data/datasets.html#useful-datasets) for more details)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4c9284a1589c41feb63442c9f5e05a53", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n" ], "text/plain": [] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import ase.io\n", "from load_atoms import load_dataset\n", "\n", "structures = load_dataset(\"C-GAP-17\")\n", "train, val, test = structures.random_split([0.8, 0.1, 0.1])\n", "\n", "ase.io.write(\"train-cgap17.xyz\", train)\n", "ase.io.write(\"val-cgap17.xyz\", val)\n", "ase.io.write(\"test-cgap17.xyz\", test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can visualise the kinds of structures we're training on using [load_atoms.view](https://jla-gardner.github.io/load-atoms/api/viz.html):" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", "