Gatlen's Opinionated Template (GOTem)

A cutting-edge, opinionated, and ambitious project builder for power users and researchers. Built on (and kept in sync with) CookieCutter Data Science (CCDS) V2, this template adds carefully selected defaults, a curated dependency stack, customizations, and contemporary best practices for Python development, research projects, and academic work.

Quickstart

Gatlen's Opinionated Template (GOTem) works on all platforms with Python 3.10+. I try my best to stay in sync with the upstream CookieCutter Data Science (CCDS) V2, but as the sole maintainer of this project I may neglect features I rarely use, and configurations that deviate from my own may not be tested as thoroughly. If you wish to change any of the defaults, I recommend forking this project.

I recommend installing GOTem with uv. GOTem is available on PyPI.

With uv (recommended)

```bash
uv pip install gatlens-opinionated-template

# From the parent directory where you want your project
gotem
```

With pipx

```bash
pipx install gatlens-opinionated-template

# From the parent directory where you want your project
gotem
```

With pip

```bash
pip install gatlens-opinionated-template

# From the parent directory where you want your project
gotem
```

With conda (coming soon!)

```bash
# conda install cookiecutter-data-science -c conda-forge

# From the parent directory where you want your project
# ccds
```

Starting a new project

Starting a new project is as easy as running this command at the command line. There is no need to create a directory first; the cookiecutter will do it for you.

gotem

The gotem command-line tool defaults to the GOTem template, but you can pass your own template as the first argument if you want. The CCDS team has built significant tooling around Cookiecutter to make it easier to use and more customizable.

Example

ccds https://github.com/drivendataorg/cookiecutter-data-science

project_name (project_name): My Analysis
repo_name (my_analysis): my_analysis
module_name (my_analysis):
author_name (Your name (or your organization/company/team)): Dat A. Scientist
description (A short description of the project.): This is my analysis of the data.
python_version_number (3.12): 3.12
Select dataset_storage
    1 - none
    2 - azure
    3 - s3
    4 - gcs
    Choose from [1/2/3/4] (1): 3
bucket (bucket-name): s3://my-aws-bucket
aws_profile (default):
Select environment_manager
    1 - uv
    2 - none
    3 - virtualenv
    4 - conda
    5 - pipenv
    Choose from [1/2/3/4/5] (1): 2
Select dependency_file
    1 - requirements.txt
    2 - environment.yml
    3 - Pipfile
    Choose from [1/2/3] (1): 1
Select pydata_packages
    1 - basic
    2 - none
    Choose from [1/2] (1): 2
Select open_source_license
    1 - No license file
    2 - MIT
    3 - BSD-3-Clause
    Choose from [1/2/3] (1): 2
Select docs
    1 - mkdocs
    2 - none
    Choose from [1/2] (1): 1
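In the example session, the suggested default for repo_name (my_analysis) appears to be derived from the project_name answer (My Analysis). A minimal sketch of that kind of derivation is below. The helper name `default_repo_name` is hypothetical, and the exact rule the template applies is an assumption based only on the prompts shown above.

```python
def default_repo_name(project_name: str) -> str:
    """Hypothetical helper: guess a repo name from a human-readable project name.

    Assumption: the default is the project name lower-cased with spaces
    replaced by underscores, which matches the example prompts.
    """
    return project_name.lower().replace(" ", "_")


print(default_repo_name("My Analysis"))  # my_analysis
```

Pressing Enter at a prompt accepts the value shown in parentheses, so a derived default like this only matters when you leave the answer blank.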

Now that you've got your project, you're ready to go! You should do the following:

Check out the directory structure below so you know what's in the project and how to use it.
Read the opinions that are baked into the project so you understand best practices and the philosophy behind the project structure.
Read the "Using the template" guide to understand how to get started on a project that uses the template.

Enjoy!

Directory structure

The directory structure of your new project will look something like this (depending on the settings that you choose):

📁 .
├── ⚙️ .cursorrules                    <- LLM instructions for Cursor IDE
├── 💻 .devcontainer                   <- Devcontainer config
├── ⚙️ .gitattributes                  <- Git LFS setup configuration
├── 🧑‍💻 .github
│   ├── ⚡️ actions
│   │   └── 📁 setup-python-env       <- Automated python setup w/ uv
│   ├── 💡 ISSUE_TEMPLATE             <- Templates for Raising Issues on GH
│   ├── 💡 pull_request_template.md   <- Template for making GitHub PR
│   └── ⚡️ workflows                  
│       ├── 🚀 main.yml               <- Automated cross-platform testing w/ uv, precommit, deptry, 
│       └── 🚀 on-release-main.yml    <- Automated mkdocs updates
├── 💻 .vscode                        <- Preconfigured extensions, debug profiles, workspaces, and tasks for VS Code/Cursor power users
│   ├── 🚀 launch.json
│   ├── ⚙️ settings.json
│   ├── 📋 tasks.json
│   └── ⚙️ '{{ cookiecutter.repo_name }}.code-workspace'
├── 📁 data
│   ├── 📁 external                      <- Data from third party sources
│   ├── 📁 interim                       <- Intermediate data that has been transformed
│   ├── 📁 processed                     <- The final, canonical data sets for modeling
│   └── 📁 raw                           <- The original, immutable data dump
├── 🐳 docker                            <- Docker configuration for reproducibility
├── 📚 docs                              <- Project documentation (using mkdocs)
├── 👩‍⚖️ LICENSE                           <- Open-source license if one is chosen
├── 📋 logs                              <- Preconfigured logging directory for
├── 👷‍♂️ Makefile                          <- Makefile with convenience commands (PyPI publishing, formatting, testing, and more)
├── 🚀 Taskfile.yml                    <- Modern alternative to Makefile w/ same functionality
├── 📁 notebooks                         <- Jupyter notebooks
│   ├── 📓 01_name_example.ipynb
│   └── 📰 README.md
├── 🗑️ out
│   ├── 📁 features                      <- Extracted Features
│   ├── 📁 models                        <- Trained and serialized models
│   └── 📚 reports                       <- Generated analysis
│       └── 📊 figures                   <- Generated graphics and figures
├── ⚙️ pyproject.toml                     <- Project configuration file w/ carefully selected dependency stacks
├── 📰 README.md                         <- The top-level README
├── 🔒 secrets                           <- Ignored project-level secrets directory to keep API keys and SSH keys safe and separate from your system (no setting up a new SSH-key in ~/.ssh for every project)
│   └── ⚙️ schema                         <- Clearly outline expected variables
│       ├── ⚙️ example.env
│       └── 🔑 ssh
│           ├── ⚙️ example.config.ssh
│           ├── 🔑 example.something.key
│           └── 🔑 example.something.pub
└── 🚰 '{{ cookiecutter.module_name }}'  <- Easily publishable source code
    ├── ⚙️ config.py                     <- Store useful variables and configuration (Preset)
    ├── 🐍 dataset.py                    <- Scripts to download or generate data
    ├── 🐍 features.py                   <- Code to create features for modeling
    ├── 📁 modeling
    │   ├── 🐍 __init__.py
    │   ├── 🐍 predict.py               <- Code to run model inference with trained models
    │   └── 🐍 train.py                 <- Code to train models
    └── 🐍 plots.py                     <- Code to create visualizations
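To make the layout above concrete, here is a minimal sketch of how code in the project module might refer to the data, out, and logs directories using pathlib. The function name `project_paths` and the dictionary keys are hypothetical, and this is not necessarily what the generated config.py contains. It only mirrors the directory names shown in the tree.

```python
from pathlib import Path


def project_paths(root: Path) -> dict[str, Path]:
    """Hypothetical helper mapping short names to the directories in the tree above."""
    data = root / "data"
    out = root / "out"
    reports = out / "reports"
    return {
        "raw": data / "raw",              # original, immutable data dump
        "interim": data / "interim",      # intermediate, transformed data
        "processed": data / "processed",  # final data sets for modeling
        "external": data / "external",    # data from third-party sources
        "features": out / "features",     # extracted features
        "models": out / "models",         # trained and serialized models
        "reports": reports,               # generated analysis
        "figures": reports / "figures",   # generated graphics and figures
        "logs": root / "logs",            # logging directory
    }


# Assumed project root for illustration; a real project would usually derive
# this from the location of the module file instead of hard-coding it.
paths = project_paths(Path("my_analysis"))
print(paths["raw"].as_posix())      # my_analysis/data/raw
print(paths["figures"].as_posix())  # my_analysis/out/reports/figures
```

Keeping the paths in one place means scripts such as dataset.py or plots.py can look directories up by name rather than repeating string literals.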