A model for presence-only data

This is the set-up of a conditional auto-logistic regressive model (CAR) for predicting species presence using a sample signal and a presence-only data.

Set-up

Let $latex Sp $ be a species and $latex Y $ and $latex X $ two random variables corresponding to the events of: $latex Sp $ is in location $latex x_i$ and: location $latex x_i $ has been sampled. (The variable $latex X$ and $latex x_i$ are not related)

$latex Y $ and $latex X $ are consider to be independent binary (Bernoulli) variables conditional to the latent processes $latex S $ and $latex P$ respectively.

Latent processes

These variables are modeled as this:

$$ logit(S_k) = \beta_s^t d_s(x_k) + \psi_k +O_k $$

$$ logit(P_k) = \beta_p^t d_p(x_k) + \psi_k + O_k $$

Where $latex O_k$ is an offset term, $latex d_p(x_k), d_s(x_k)$ are the covariates for p and s respectively; and $latex \psi_k$ is modeled as a Gaussian Markov Random Field.

Common Gaussian Markov Field (CGMRF)

This process is modeled as this:

$$ \psi_k = \phi_k + \theta_k $$

$$ \phi_k | \phi_{-k}, \mathbb{W},\tau^2 \sim N \left( \frac{\sum_{i=1}^{K} w_{ki} \phi_i}{\sum_{i=1}^{K}w_{ki}}, \frac{\tau^2}{\sum_{i=1}^{K}w_{ki}}\right) $$
$$\theta_k \sim N\left(0, \sigma^2\right)$$

$$\tau^2, \sigma^2 \sim^{iid} Inv.Gamma(a,b) $$

Where $latex \mathbb{W}$ is the adjacency matrix of the lattice, $latex \theta_k$ is an independent noise term with constant variance. $latex sigma^2$ and $latex \tau^2$ are independent and identically distributed hyperparameters sampled from an inverse gamma distribution.

The corresponding Directed acyclic Graph can be seen in the next figure.

Implementation

A current implementation of this model can be found here:
Of particular interest is the file: joint.binomial.bymCAR.R where you can find the joint sample between line 113 and 149.