Known-fate mark-recapture models in R

In this tutorial, I cover:

How known-fate mark-recapture models work
Fitting basic known-fate mark-recapture models with the ‘RMark’ Package in R.

The code from this tutorial is available on my mark-recapture workshop GitHub repo.

Note: my other mark-recapture tutorials use the ‘marked’ package, but that package does not fit known-fate survival models. Using RMark requires you to have MARK installed and accessible. See: http://www.phidot.org/software/mark/rmark/

Known-fate Survival Models

Mark-recapture models are for systems that mark (or can identify) individuals, which are usually characterized by imperfect detection. However, sometimes the system allows us to have (nearly) perfect detection. This is when to use known-fate survival models. For example, in tracking studies (e.g., radio or GPS telemetry), working with sessile animals like barnacles, or estimating survival of plants.

Known-fate models are simplifications of other mark-recapture models because there is no detection sub-model. Individuals enter the study between time intervals (capture events) and then exit the study when they die (with certainty), or exit with for other reasons like having a tracking device removed. Individuals that are not yet in the study are censored and not included until their fate is known at the beginning and end of an interval.

The known-fate models fit with RMark and MARK are extensions of the Kaplan-Meier survival curves made famous in medical literature and applied widely in other fields, including ecology.

Kaplan-Meier survival curves estimate survival (S) at time t based on the number of individuals alive and dead: \[ S_t = \frac{Alive_t - Died_t}{Alive_t}\] By reframing the model as a binomial estimator, we can use likelihood-based estimates similar to other mark-recapture models based on detection data (series of 0’s and 1’s to code for alive and dead). That is a little math-y, but the key take-away is that we can model survival (S) using a linear model with a logit-link and can use the same framework of model fitting, comparison, prediction, and model-averaging as other mark-recapture models.

Let’s work through an example with a dataset included in the RMark package on American Black Ducks.

Fitting known-fate mark-recapture models

The workflow is similar to other mark-recapture models:

Create or load detection history data
Process data
Make design data
Fit one or several models

The input data are similar to the format of other detection histories, but require two digits for every capture interval:

“00” = censored (not yet in study, or already exited/removed for some reason)
“10” = alive at beginning and end of the interval
“11” = alive at the beginning of the interval, but animal died before the end of the interval

library(dplyr) # for data organization
library(ggplot2) # for plotting
library(RMark) # for mark-recapture models

# Load known-fate mark-recapture data on American Black Ducks
data("Blackduck") # 48 observations of 5 variables

# Look at first few rows to see structure
head(Blackduck)

##                 ch BirdAge Weight Wing_Len condix
## 1 1100000000000000       1   1.16     27.7   4.19
## 2 1011000000000000       0   1.16     26.4   4.39
## 3 1011000000000000       1   1.08     26.7   4.04
## 4 1010000000000000       0   1.12     26.2   4.27
## 5 1010000000000000       1   1.14     27.7   4.11
## 6 1010110000000000       1   1.20     28.3   4.24

Blackduck is a dataframe with 48 rows (one per individual) and the detection data are in the ch column. There are 16 characters in each detection history, so there are 8 capture intervals.

Let’s process the data and create design data, which works similarly to the marked package I use in other mark-recapture tutorials. Think of the these steps as creating the framework of variables (groups, individual covariates, age, cohorts, and time) to build the models. It is very helpful when setting up more complex linear models, such as events that happen in specific intervals (e.g., large climatic events like flooding or fire).

# Process data. model = "Known" for known-fate models.
# See function details for other arguments, such as unequal time intervals and age variables
Blackduck.process <- process.data(Blackduck,
                             model = "Known", 
                             groups = c("BirdAge"))

# Create design data
Blackduck.ddl <- make.design.data(Blackduck.process)

# Look at first part of design data for the only parameter "S"
head(Blackduck.ddl$S)

##   par.index model.index group age time Age Time BirdAge
## 1         1           1     0   1    2   1    0       0
## 2         2           2     0   2    3   2    1       0
## 3         3           3     0   3    4   3    2       0
## 4         4           4     0   4    5   4    3       0
## 5         5           5     0   5    6   5    4       0
## 6         6           6     0   6    7   6    5       0

Known-fate mark-recapture models use S for denoting survival between intervals, and we can build linear formulas to test what affects S. Remember there is no separate submodel for detection probability, and we are removing intervals with imperfect detection (when a fate is unknown) by censoring.

In our example, we’ll test whether survival is affected by wing length (Wing_Len) or condition index at time of marking (condix).

Blackduck_models <- function() {
S.Wing_Len <- list(formula =~Wing_Len)
S.condix <- list(formula =~condix)

#create a list of all possible models using possibilities above.  
cml <- create.model.list("Known")
results <- mark.wrapper(cml, 
                        data = Blackduck.process, 
                        ddl = Blackduck.ddl, 
                        adjust = TRUE, output = FALSE)
# adjust = TRUE since some known S parameters could be at boundaries
return(results)
}

# Execute the function
Blackduck_results <- Blackduck_models()

Running the function will fit each of the possible models, and you’ll see output and some messages flash by on your screen. Next, let’s look at the model comparison table and the summary of the best-supported model.

# View results
Blackduck_results

##          model npar     AICc DeltaAICc    weight Deviance
## 1   S(~condix)    2 136.5835   0.00000 0.6902717 132.5408
## 2 S(~Wing_Len)    2 138.1862   1.60278 0.3097283 134.1435

# Look at model summary for most-supported model
summary(Blackduck_results$S.condix)

## Output summary for Known model
## Name : S(~condix) 
## 
## Npar :  2
## -2lnL:  132.5408
## AICc :  136.5835
## 
## Beta
##                 estimate        se        lcl      ucl
## S:(Intercept) -1.4613710 3.3187394 -7.9661003 5.043358
## S:condix       0.9746188 0.7843054 -0.5626199 2.511857
## 
## 
## Real Parameter S
##                        2         3         4         5         6         7
## Group:BirdAge0 0.9384871 0.9384871 0.9384871 0.9384871 0.9384871 0.9384871
## Group:BirdAge1 0.9384871 0.9384871 0.9384871 0.9384871 0.9384871 0.9384871
##                        8         9
## Group:BirdAge0 0.9384871 0.9384871
## Group:BirdAge1 0.9384871 0.9384871

In the model summary we can see the coefficient estimates and uncertainty, plus the “real” (back-transformed) estimates of S. In this case, the real parameter estimates are all the same because the only additional coefficient in the model is an individual covariate (but group averages are presented in the print-out). The summary() function shows real estimates for each group and time interval combination, even with an intercept-only model.

Next, we’ll predict and then plot the effect of condix on survival.

predicted_S <- predict_real(model = Blackduck_results$S.condix,
                            parameter = "S",
                            df = Blackduck.ddl$S,
                            replicate = TRUE,
                            data = data.frame(condix = seq(from = min(Blackduck$condix),
                                                         to = max(Blackduck$condix),
                                                         by = 0.01),
                                              Wing_Len = mean(Blackduck$Wing_Len),
                                              BirdAge = "1",
                                              Weight = mean(Blackduck$Weight)),
                            se = TRUE)

ggplot(data = predicted_S, aes(x = condix, y = real)) +
  geom_ribbon(aes(ymin = lcl, ymax = ucl), fill = "gray80") +
  geom_smooth(col = "black", formula = y~x, method = "loess") +
  labs(x = "Body Condition Index", y = "Predicted survival") +
  theme_classic()

# Last step: clean temporary and unused files associated with RMark and MARK in working directory
# These are written whenever fitting models in MARK via RMark.
# Includes .inp, .out, .res, and .vcv
RMark::cleanup(ask = FALSE)

And that’s it! We covered how known-fate mark-recapture models work, and then fit some using the RMark package in R. Thanks for reading, and checkout the sources below for more information.

Known-fate Survival Models

Fitting known-fate mark-recapture models

Further reading and references