Lecture 22
Duke University
STA 313 - Spring 2026
Mini-project 2 due today at 5 pm
Project 2 peer evaluation 1 due Wednesday at 5 pm – no extensions as we’d like to share summaries before lab the next day
Project 2 presentation schedule: https://vizdata.org/projects/project-2.html#due-dates
Upcoming HW deadlines:
Join SSMU for a seminar featuring Davis Vaughan, a software engineer at Posit, as he asks the timely question: is YOUR degree worth it?! Davis works on Positron, a data science-focused IDE, as well as the R packages that make up the tidyverse. He will walk through examples of how he and his colleagues use Claude Code and other AI tools to amplify their own skills, rather than replace them.
🗓️ 4:30 PM-5:30 PM on Wednesday, April 8th
📍 Old Chem 116
R packages:
More aRtists: rtistry art gallery by Ijeamaka Anyene
A whole course on Generative Art by Danielle Navarro: https://art-from-code.netlify.app
Plot A:

Plot B:

- Introduce plotnine (a Python data visualization package based on the grammar of graphics and inspired by ggplot2) and do so in Positron, using uv for package management.
- Cover the basics of plotnine and how to create different types of plots.
- Use polars to read in data from CSV files; we will not cover how to prepare your data for visualization in Python.
there are some Python details we can’t avoid…
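uv
Python’s packaging ecosystem has historically been fragmented:
- Many packaging and environment tools: pip, virtualenv, venv, conda, poetry, pipenv, etc.
- Many dependency and metadata files: requirements.txt, setup.py, pyproject.toml, etc.
- Separate tools for managing Python versions (e.g. pyenv)
uv is a modern tool that aims to unify these concerns with a fast, Rust-based implementation.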
uv is a Python package and project manager developed by Astral.
Key features:
- Project metadata and dependencies are tracked in pyproject.toml
- Can be used in place of pip and virtualenv
uv is already installed on the departmental servers. For local installs:
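On macOS/Linux: curl -LsSf https://astral.sh/uv/install.sh | sh
or with Homebrew: brew install uv
or with pip / pipx: pip install uv (or pipx install uv)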
Once installed, you should be able to run uv --version to check which version you have.
As long as you have something higher than 0.9.* you should be fine.
uv can install and manage multiple Python versions (uv python install, uv python list) and pin one for use (uv python pin).
The pinned version is stored in ~/.python-version and will be used automatically.
Use uv init to create a new project, here with uv init my-project:
Initialized project `my-project`
total 32
drwxr-xr-x 8 mine staff 256 Apr 6 00:22 .
drwx------@ 55 mine staff 1760 Apr 6 00:22 ..
drwxr-xr-x@ 9 mine staff 288 Apr 6 00:22 .git
-rw-r--r--@ 1 mine staff 109 Apr 6 00:22 .gitignore
-rw-r--r--@ 1 mine staff 5 Apr 6 00:22 .python-version
-rw-r--r--@ 1 mine staff 88 Apr 6 00:22 main.py
-rw-r--r--@ 1 mine staff 156 Apr 6 00:22 pyproject.toml
-rw-r--r--@ 1 mine staff 0 Apr 6 00:22 README.md
This creates a pyproject.toml, a sample main.py script, and basic git infrastructure. Generally, the only file we care about is pyproject.toml; to generate just that file, use uv init --bare.
pyproject.toml
A modern project metadata file that tracks the Python version and package dependencies, among other details.
Once we have our project set up, we can add (and install) dependencies directly via uv. uv add updates pyproject.toml and installs the package (creating a venv if needed). For example, uv add plotnine:
Using CPython 3.14.3
Creating virtual environment at: .venv
Resolved 19 packages in 188ms
Installed 17 packages in 134ms
+ contourpy==1.3.3
+ cycler==0.12.1
+ fonttools==4.62.1
+ kiwisolver==1.5.0
+ matplotlib==3.10.8
+ mizani==0.14.4
+ numpy==2.4.4
+ packaging==26.0
+ pandas==3.0.2
+ patsy==1.0.2
+ pillow==12.2.0
+ plotnine==0.15.3
+ pyparsing==3.3.2
+ python-dateutil==2.9.0.post0
+ scipy==1.17.1
+ six==1.17.0
+ statsmodels==0.14.6
pyproject.toml
Virtual environments isolate project dependencies from the system Python and other projects. Packages are installed in a local folder in your project.
As we just saw, using uv add will create a new virtual environment in .venv by default if there is not an existing venv.
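If you are unsure whether your Python session is actually using a project's virtual environment, a small check like the following can help. This is a sketch using only the standard library; the helper names are hypothetical and the .venv location is the default described above.

```python
import sys
from pathlib import Path


def in_virtual_environment() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the underlying Python installation.
    return sys.prefix != sys.base_prefix


def project_has_venv(project_dir: str = ".") -> bool:
    """Return True if project_dir contains a .venv folder with a pyvenv.cfg file."""
    return (Path(project_dir) / ".venv" / "pyvenv.cfg").is_file()


print(f"Running inside a venv: {in_virtual_environment()}")
print(f"Current folder has a .venv: {project_has_venv()}")
```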
Positron automatically detects virtual environments in your project directory. When you open a folder containing a .venv directory (created by uv), Positron will:
If not automatically detected, you can manually select the interpreter via the Command Palette (Cmd+Shift+P / Ctrl+Shift+P) and searching for “Python: Select Interpreter”.
uv sync
Since the .venv folder is system-specific (and large), it is not typically committed to git. Instead, you will likely clone a repository that has only a pyproject.toml file.
Use uv sync to construct the venv and install all dependencies for the project.
New project setup:
Go to ae-16 and let’s make a simple plot with plotnine!
For now, ae-16-Python.qmd only.
plotnine is a Python visualization library that implements the grammar of graphics.
A plot is built from layers of components:
| Component | Description |
|---|---|
| Data | The dataset to visualize |
| Aesthetics | Mappings from data to visual properties |
| Geoms | Geometric objects that represent data |
| Scales | Control how data values map to visual values |
| Facets | Split data into multiple subplots |
| Coords | Coordinate system for the plot |
| Themes | Control non-data visual elements |
All plots begin with passing data to ggplot():
Tip
Plotnine works best with tidy data:
The aes() function maps data columns to visual properties:
Common aesthetic mappings:
| Aesthetic | Description |
|---|---|
| x, y | Position on axes |
| color | Color of points/lines |
| fill | Fill color of shapes |
| size | Size of points |
| shape | Shape of points |
| alpha | Transparency |
Note
In Python, wrap the entire plot expression in parentheses () to allow line breaks.
Geoms determine how data is visually represented:
| Geom | Description |
|---|---|
| geom_point() | Scatter plot |
| geom_line() | Line plot |
| geom_bar() | Bar chart |
| geom_histogram() | Histogram |
| geom_boxplot() | Box plot |
| geom_smooth() | Smoothed line |
| geom_text() | Text labels |
| geom_segment() | Line segments |
| geom_area() | Area plot |
| geom_density() | Density plot |
Components are combined using the + operator:
Note
In Python, move the + to the start of the line and add a line break before +.
Scales customize how data values map to visual values.
Naming pattern: scale_<aesthetic>_<type>
Common scale functions:
- scale_x_continuous(), scale_y_continuous() - continuous axes
- scale_x_log10(), scale_y_log10() - log-transformed axes
- scale_color_manual(), scale_fill_manual() - custom colors
- scale_color_brewer(), scale_fill_brewer() - ColorBrewer palettes
Facets split data into multiple subplots:
Coordinate functions specify the plot’s coordinate system:
Themes control non-data visual elements like fonts, colors, and grid lines.
Pre-built themes:
Use labs() to add titles and axis labels:
from plotnine import *
from plotnine.data import mpg

(
    ggplot(mpg, aes(x="cty", y="hwy"))
    + geom_point(aes(color="displ"), alpha=0.7)
    + geom_smooth(method="lm", color="blue")
    + scale_color_continuous(cmap_name="viridis")
    + facet_wrap("~drv", ncol=1)
    + labs(
        title="City vs Highway MPG",
        x="City MPG",
        y="Highway MPG",
        color="Engine\nDisplacement"
    )
    + theme_bw()
    + theme(
        figure_size=(3, 6),
        legend_position="bottom"
    )
)
| ggplot2 (R) | plotnine (Python) |
|---|---|
| aes(x = var) | aes(x="var") (quoted strings) |
| + at end of line | + at start of line (inside parens) |
| theme(legend.position = ...) | theme(legend_position=...) (underscores) |
| No parens needed | Wrap in () for multi-line plots |
| ggsave() | .save() method on plot object |
ae-16
Go to ae-16 and work on ae-16-R-and-Python.qmd.