Skip to content

Commit b462eaf

Browse files
committed
added bonus notebook on curve regression
1 parent 12d2ff0 commit b462eaf

3 files changed

Lines changed: 193 additions & 115 deletions

File tree

Lines changed: 191 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,191 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "code",
5+
"execution_count": null,
6+
"metadata": {},
7+
"outputs": [],
8+
"source": [
9+
"import pymc3 as pm\n",
10+
"import matplotlib.pyplot as plt\n",
11+
"import numpy as np\n",
12+
"from utils import ECDF, despine_traceplot\n",
13+
"from data import load_decay\n",
14+
"import pandas as pd\n",
15+
"import theano.tensor as tt\n",
16+
"\n",
17+
"%load_ext autoreload\n",
18+
"%autoreload 2\n",
19+
"%matplotlib inline\n",
20+
"%config InlineBackend.figure_format = 'retina'"
21+
]
22+
},
23+
{
24+
"cell_type": "markdown",
25+
"metadata": {},
26+
"source": [
27+
"# Introduction\n",
28+
"\n",
29+
"Now that you've learned about Bayesian estimation, we're going to explore one more topic: Bayesian curve fitting.\n",
30+
"\n",
31+
"By \"curve fitting\", we're really talking about any curve: those that are bendy, those that are straight, and those that are in between. \n",
32+
"\n",
33+
"In order to reinforce this point, rather than show you plain vanilla linear regression, we will work through an exponential decay curve example.\n",
34+
"\n",
35+
"# Problem Setup\n",
36+
"\n",
37+
"You've taken radioactive decay measurements of an unknown element in a secure facility. The measurements are noisy, though, and potentially have some bias. In the face of this, we would like to be able to characterize the decay constant of this unknown material, potentially leading to an identification of the material.\n",
38+
"\n",
39+
"Let's load in the data."
40+
]
41+
},
42+
{
43+
"cell_type": "code",
44+
"execution_count": null,
45+
"metadata": {},
46+
"outputs": [],
47+
"source": [
48+
"np.random.seed(42)\n",
49+
"\n",
50+
"df = load_decay()\n",
51+
"df.head(5)"
52+
]
53+
},
54+
{
55+
"cell_type": "markdown",
56+
"metadata": {},
57+
"source": [
58+
"**Exercise:** Plot `activity` vs. `time`."
59+
]
60+
},
61+
{
62+
"cell_type": "code",
63+
"execution_count": null,
64+
"metadata": {},
65+
"outputs": [],
66+
"source": [
67+
"ax = df['activity'].plot()\n",
68+
"# ax.set_ylim(30, 40)"
69+
]
70+
},
71+
{
72+
"cell_type": "markdown",
73+
"metadata": {},
74+
"source": [
75+
"**Discuss:** \n",
76+
"\n",
77+
"- For the scenario that we're in, what is a plausible equation that links time to activity?\n",
78+
"- What are the key parameters that we need to worry about?\n",
79+
"- What might be justifiable priors for them?\n",
80+
"\n",
81+
"**Exercise:** Implement the model."
82+
]
83+
},
84+
{
85+
"cell_type": "code",
86+
"execution_count": null,
87+
"metadata": {},
88+
"outputs": [],
89+
"source": [
90+
"with pm.Model() as model:\n",
91+
" A = pm.HalfNormal('A', sd=100)\n",
92+
" tau = pm.Exponential('tau', lam=1)\n",
93+
" C = pm.Normal('C', sd=100)\n",
94+
" \n",
95+
" sd = pm.HalfCauchy('sd', beta=1)\n",
96+
" \n",
97+
" link = A * np.exp(-df['t'].values / tau) + C\n",
98+
" \n",
99+
" like = pm.Normal('activity', mu=link, sd=sd, observed=df['activity'].values)"
100+
]
101+
},
102+
{
103+
"cell_type": "markdown",
104+
"metadata": {},
105+
"source": [
106+
"Sample from the posterior."
107+
]
108+
},
109+
{
110+
"cell_type": "code",
111+
"execution_count": null,
112+
"metadata": {},
113+
"outputs": [],
114+
"source": [
115+
"with model:\n",
116+
" trace = pm.sample(2000, tune=2000)\n",
117+
" # Note: Sampler may pause for a while after finishing"
118+
]
119+
},
120+
{
121+
"cell_type": "markdown",
122+
"metadata": {},
123+
"source": [
124+
"Check that sampling has converged."
125+
]
126+
},
127+
{
128+
"cell_type": "code",
129+
"execution_count": null,
130+
"metadata": {},
131+
"outputs": [],
132+
"source": [
133+
"traces = pm.traceplot(trace)\n",
134+
"despine_traceplot(traces)"
135+
]
136+
},
137+
{
138+
"cell_type": "markdown",
139+
"metadata": {},
140+
"source": [
141+
"# Summary\n",
142+
"\n",
143+
"- In lieu of showing you a \"straight curve\" (line) fit, you've now seen an arbitrary curve fit.\n",
144+
"- As long as you can find a way to parameterize the curve with a function, you can perform inference on the curve's parameters.\n",
145+
"- The function that you are modelling is the \"link function\" that provides the link between the parameters, data and the output.\n",
146+
"\n",
147+
"More generally, if\n",
148+
"\n",
149+
"$$y = f(x, \\theta)$$\n",
150+
"\n",
151+
"where $\\theta$ are merely a set of parameters, then you can perform inference on the curve's parameters $\\theta$. To make this clear:\n",
152+
"\n",
153+
"| curve name | functional form | parameters $\\theta$ |\n",
154+
"|------------|-----------------|---------------------|\n",
155+
"| exponential decay | $y = Ae^{-t/\\tau} + C$ | $A$, $\\tau$, $C$|\n",
156+
"| linear regression | $y = mx + c$ | $m$, $c$ |\n",
157+
"| logistic regression | $y = L(mx + c)$ | $m$, $c$ |\n",
158+
"| 4-parameter IC50 | $y = \\frac{a - i}{1 + 10^{\\beta(log(\\tau) - x)}} + i$ | $a$, $i$, $\\tau$, $\\beta$ |\n",
159+
"| deep learning | $y = f(x, \\theta)$ | $\\theta$ |"
160+
]
161+
},
162+
{
163+
"cell_type": "code",
164+
"execution_count": null,
165+
"metadata": {},
166+
"outputs": [],
167+
"source": []
168+
}
169+
],
170+
"metadata": {
171+
"kernelspec": {
172+
"display_name": "bayesian-modelling-tutorial",
173+
"language": "python",
174+
"name": "bayesian-modelling-tutorial"
175+
},
176+
"language_info": {
177+
"codemirror_mode": {
178+
"name": "ipython",
179+
"version": 3
180+
},
181+
"file_extension": ".py",
182+
"mimetype": "text/x-python",
183+
"name": "python",
184+
"nbconvert_exporter": "python",
185+
"pygments_lexer": "ipython3",
186+
"version": "3.6.6"
187+
}
188+
},
189+
"nbformat": 4,
190+
"nbformat_minor": 2
191+
}

notebooks/09-bayesian-curve-regression.ipynb

Lines changed: 0 additions & 113 deletions
This file was deleted.

notebooks/data.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -55,8 +55,8 @@ def load_kruschke():
5555

5656
# Constants for load_decay
5757
tau = 71.9 # indium decay half life
58-
A = 60 # starting magnitude
59-
C = 2 # measurement error
58+
A = 42 # starting magnitude
59+
C = 21 # measurement error
6060
noise_scale = 1
6161

6262

0 commit comments

Comments
 (0)