A model for presence-only data

This post sets up a conditional autoregressive (CAR) model with a logistic link for predicting species presence using a sampling signal and presence-only data.

Set-up

Let $latex Sp $ be a species, and let $latex Y $ and $latex X $ be two random variables corresponding, respectively, to the events "$latex Sp $ is present at location $latex x_i $" and "location $latex x_i $ has been sampled". (The variable $latex X $ and the location $latex x_i $ are unrelated, despite the similar notation.)

$latex Y $ and $latex X $ are considered to be independent binary (Bernoulli) variables conditional on the latent processes $latex S $ and $latex P $, respectively.
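As a minimal sketch of this conditional independence: once the latent probabilities are fixed, $latex Y $ and $latex X $ can be drawn cell by cell as separate Bernoulli variables. The probability values and the helper name `draw_bernoulli` below are hypothetical and only serve as illustration.

```python
import random


def draw_bernoulli(prob, rng):
    """Return 1 with probability `prob`, otherwise 0."""
    return 1 if rng.random() < prob else 0


rng = random.Random(42)  # fixed seed so the draws are reproducible

# Hypothetical latent probabilities for three locations (not from the post).
S = [0.7, 0.2, 0.5]  # probability that the species is present
P = [0.9, 0.1, 0.4]  # probability that the location has been sampled

# Given S and P, each Y_k and X_k is drawn independently of the others.
Y = [draw_bernoulli(s, rng) for s in S]
X = [draw_bernoulli(p, rng) for p in P]
```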

Latent processes

The latent processes are modeled as follows:

$$ \mathrm{logit}(S_k) = \beta_s^t d_s(x_k) + \psi_k + O_k $$

$$ \mathrm{logit}(P_k) = \beta_p^t d_p(x_k) + \psi_k + O_k $$

where $latex O_k $ is an offset term, $latex d_s(x_k) $ and $latex d_p(x_k) $ are the covariates for $latex S $ and $latex P $, respectively, and $latex \psi_k $ is a spatial effect common to both processes, modeled as a Gaussian Markov random field.
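To make the link function concrete, here is a small sketch that maps the linear predictor back to a probability with the inverse logit. The coefficients, covariates and the function names `inv_logit` and `latent_probability` are assumptions made for illustration, not part of the referenced implementation.

```python
import math


def inv_logit(eta):
    """Map a value on the logit scale to a probability in (0, 1)."""
    # Two branches keep math.exp from overflowing for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def latent_probability(beta, covariates, psi_k, offset_k=0.0):
    """Return inv_logit(beta^t d(x_k) + psi_k + O_k) for one cell k."""
    linear = sum(b * d for b, d in zip(beta, covariates))
    return inv_logit(linear + psi_k + offset_k)


# Hypothetical values for a single cell k.
beta_s, d_s_k = [0.5, -1.0], [1.0, 0.3]  # presence process S
beta_p, d_p_k = [1.2], [0.8]             # sampling process P
psi_k = 0.1                              # spatial effect shared by S and P

S_k = latent_probability(beta_s, d_s_k, psi_k)
P_k = latent_probability(beta_p, d_p_k, psi_k)
```

Both probabilities use the same $latex \psi_k $, which is how the two processes share the spatial structure.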

Common Gaussian Markov Random Field (CGMRF)

This process is modeled as follows:

$$ \psi_k = \phi_k + \theta_k $$

$$ \phi_k \mid \phi_{-k}, \mathbb{W}, \tau^2 \sim N \left( \frac{\sum_{i=1}^{K} w_{ki} \phi_i}{\sum_{i=1}^{K} w_{ki}}, \frac{\tau^2}{\sum_{i=1}^{K} w_{ki}} \right) $$

$$ \theta_k \sim N\left(0, \sigma^2\right) $$

$$ \tau^2, \sigma^2 \overset{iid}{\sim} \mathrm{Inv.Gamma}(a,b) $$

where $latex \mathbb{W} $ is the adjacency matrix of the lattice, with entries $latex w_{ki} $, and $latex \theta_k $ is an independent noise term with constant variance $latex \sigma^2 $. The hyperparameters $latex \sigma^2 $ and $latex \tau^2 $ are independent and identically distributed draws from an inverse gamma distribution.
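The full conditional of $latex \phi_k $ is simply a weighted average of its neighbours, with a variance that shrinks as the number of neighbours grows. The sketch below computes it on a hypothetical 4-cell chain lattice and then forms $latex \psi_k = \phi_k + \theta_k $. It assumes that $latex \mathbb{W} $ has a zero diagonal and that $latex Inv.Gamma(a,b) $ is parameterized by shape $latex a $ and scale $latex b $; the values of $latex a $, $latex b $ and $latex \phi $ are made up for illustration.

```python
import math
import random


def car_conditional(k, phi, W, tau2):
    """Mean and variance of phi_k given the other phi values, W and tau2."""
    weight_sum = sum(W[k])
    if weight_sum == 0:
        raise ValueError("cell %d has no neighbours" % k)
    # W[k][k] is assumed to be 0, so phi[k] does not enter its own mean.
    mean = sum(w * p for w, p in zip(W[k], phi)) / weight_sum
    return mean, tau2 / weight_sum


def draw_inverse_gamma(a, b, rng):
    """Draw from Inv-Gamma(shape=a, scale=b) as the reciprocal of a Gamma draw."""
    # random.gammavariate takes (shape, scale); a Gamma rate of b is a scale of 1/b.
    return 1.0 / rng.gammavariate(a, 1.0 / b)


# Hypothetical chain lattice: cell k is adjacent to cells k-1 and k+1.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
phi = [0.2, -0.1, 0.4, 0.0]  # current values of the spatial component

rng = random.Random(0)
tau2 = draw_inverse_gamma(2.0, 1.0, rng)    # a = 2, b = 1 are arbitrary
sigma2 = draw_inverse_gamma(2.0, 1.0, rng)

k = 1
mean_k, var_k = car_conditional(k, phi, W, tau2)
phi_k = rng.gauss(mean_k, math.sqrt(var_k))   # draw from the CAR conditional
theta_k = rng.gauss(0.0, math.sqrt(sigma2))   # independent noise term
psi_k = phi_k + theta_k
```

For cell `k = 1` the conditional mean is the average of its two neighbours, (0.2 + 0.4) / 2 = 0.3, and the conditional variance is `tau2 / 2`.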

The corresponding directed acyclic graph (DAG) can be seen in the next figure.

Implementation

A current implementation of this model can be found here:
Of particular interest is the file joint.binomial.bymCAR.R, where the joint sample can be found between lines 113 and 149.


Published by

Juan Escamilla Mólgora

I'm a mathematical and computational statistical ecologist working at the intersection of Spatial Statistics, Software Development, Machine Learning and Cloud Computing. I'm researching novel methods for integration, harmonization and modelling of big environmental data. I developed the Wild Fire Alert and Monitoring System for Mexico and Central America