Species-Presence exercise 6 - Two-species-single-season example
This exercise shows how to run program PRESENCE to compute
species presence, detectability, and co-occurrence estimates
from 'presence-absence' data that include covariates.
Input data consists of 'detection-histories' of two individual species
at potential owl territories. Sample covariates have also been included.
Running the program
Start a new project in program PRESENCE.
Click menu 'File'
click menu item 'New Project'
Enter detection data...
click 'Data Input Form' button
Using Excel, open sample spreadsheet, 'sp_ba_owl.xlsx' in 'sample_data' folder
Click 'ReadMe' tab and read
Click 'spA' tab and copy all data to clipboard (Ctrl-A, Ctrl-C)
go back to PRESENCE and click in first data cell
In 'Edit' menu, select 'Paste', then 'Paste w/both'
Enter sample covariate data...
In the input cell below '#samp covs', change the value to '4' and hit 'Enter'
copy/paste 1st sample covariate data...
Click the 'SampCov1' tab, then click in the 1st cell. (All values in the table should be '1'.)
Go back to Excel and click the 'nite' tab.
Select all data except the 1st row and 1st column, then copy to clipboard.
go back to PRESENCE and click in first data cell
In 'Edit' menu, select 'Paste', then 'Paste values'
type 'nite' when the dialog box asks for a covariate name.
copy/paste 2nd sample covariate data...
Click the 'SampCov2' tab, then click in the 1st cell. (All values in the table should be '2'.)
Go back to Excel and click the 'spotcall' tab.
Select all data except the 1st row and 1st column, then copy to clipboard.
go back to PRESENCE and click in first data cell
In 'Edit' menu, select 'Paste', then 'Paste values'
type 'spotcall' when the dialog box asks for a covariate name.
copy/paste 3rd sample covariate data...
Click the 'SampCov3' tab, then click in the 1st cell. (All values in the table should be '3'.)
Go back to Excel and click the 'contcall' tab.
Select all data except the 1st row and 1st column, then copy to clipboard.
go back to PRESENCE and click in first data cell
In 'Edit' menu, select 'Paste', then 'Paste values'
type 'contcall' when the dialog box asks for a covariate name.
copy/paste 4th sample covariate data...
Click the 'SampCov4' tab, then click in the 1st cell. (All values in the table should be '4'.)
Go back to Excel and click the 'visual' tab.
Select all data except the 1st row and 1st column, then copy to clipboard.
go back to PRESENCE and click in first data cell
In 'Edit' menu, select 'Paste', then 'Paste values'
type 'visual' when the dialog box asks for a covariate name.
Save the data using the 'File/Save as' menu.
Click 'No' when asked about using the last col of data as frequency counts.
Enter a title (e.g., Spotted Owls and Barred Owls).
Select a folder to save the data file to and enter a filename.
close the data input form window (using the X in upper-right corner, or File/Close menu).
click the 'OK' button to create a project folder.
You're now presented with a 'Results Browser' window where a summary of each
model will be saved.
To run our first model:
click menu 'Run'
click menu item 'Run Analysis:single-season'
click menu item 'Two-species'
When the 'Setup Numerical Estimation Run' window appears,
a design matrix window will also appear. The parameters are grouped
by 'Occupancy' or 'Detection'. The Occupancy tab will contain 3
parameters:
psiA: Pr(site is occupied by species A, regardless of occupancy of species B)
psiBA: Pr(site is occupied by species B, given species A is present)
psiBa: Pr(site is occupied by species B, given species A is not present)
Another way of parameterizing this model is to estimate these parameters:
psiA: Pr(site is occupied by species A, regardless of occupancy of species B)
psiB: Pr(site is occupied by species B, regardless of occupancy of species A)
phi: Species Interaction Factor (SIF = psiAB/(psiA*psiB))
By default, the design matrix is set up to estimate these 3 parameters
independently. This would allow an interaction in occupancy of
the two species (non-independent occupancy). To change the model such
that occupancy of the two species is independent, simply constrain
psiBA=psiBa (or fix phi=1.0).
The Detection tab will contain 5 sets of parameters (indexed by sample):
pA: Pr(species A is detected, given only species A is present)
pB: Pr(species B is detected, given only species B is present)
rA: Pr(species A is detected, given both species are present)
rBA: Pr(species B is detected, given both species are present and species A is detected)
rBa: Pr(species B is detected, given both species are present and species A is not detected)
If the 2nd parameterization is chosen, the following 2 parameters would
be estimated in place of the last 2 above:
rB: Pr(species B is detected, given both species are present)
delta: detection interaction factor
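As a minimal sketch of how the detection parameters combine (an illustration, not PRESENCE code; the function name and the numeric values are hypothetical), the four possible outcomes of one survey at a site where both species are present follow directly from rA, rBA and rBa. Summing the two outcomes in which species B is detected gives rB:

```python
def joint_detection_probs(rA, rBA, rBa):
    """Single-survey outcome probabilities at a site occupied by both species.

    Keys are (A detected, B detected), coded as 1/0.
    """
    return {
        (1, 1): rA * rBA,              # A detected, B detected
        (1, 0): rA * (1 - rBA),        # A detected, B missed
        (0, 1): (1 - rA) * rBa,        # A missed, B detected
        (0, 0): (1 - rA) * (1 - rBa),  # neither species detected
    }


# Hypothetical values, chosen only for illustration.
probs = joint_detection_probs(rA=0.6, rBA=0.5, rBa=0.3)

# rB: Pr(species B is detected, given both species are present)
rB = probs[(1, 1)] + probs[(0, 1)]

for outcome, p in probs.items():
    print(outcome, round(p, 3))
print("rB =", round(rB, 3))
```

The four probabilities sum to 1, and rB depends on rA because detection of species B is allowed to differ according to whether species A was detected.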
Run model, "psiA(.),psiBA(.)=psiBa(.),pA(.)=rA(.),pB(.)=rBA(.)=rBa(.)"
This model is one where occupancy and detection of the two species are
independent (no interaction). Since we'll be setting psiBA=psiBa, that
means occupancy of species B is the same whether species A is present or not.
To do this in PRESENCE we need to delete the last column in the
Occupancy design matrix (left-click in the 1st row of the column to be
deleted, then right-click and select 'delete column') and enter a '1'
in the last row, 2nd column. The design matrix
should look like this:
The design matrix for detection should look like this:
Before running this model, change the model name
to "psiA(.),psiBA(.)=psiBa(.),pA(.)=rA(.),pB(.)=rBA(.)=rBa(.)".
Click 'OK to Run' to run this model.
After the analysis is complete, click 'yes' to append the output to the
results browser. The output from this model
should match the output you would get if you ran each species
separately in a single-season model.
Two species parameterisations
There are currently 3 different parameterisations available for the
two species model, which simply differ in how they quantify the level
of any co-occurrence: the underlying modeling framework is identical
in each case. In all cases we can visualise the problem of which
species are present at a unit using the following Venn diagram, where
psiA and psiB are the overall probabilities of species A and B being
present at a unit (the left and right ellipses) respectively.
Questions regarding the co-occurrence of the species essentially
examine the degree of overlap of the ellipses. Using the basic rules
of probability we have that the two species are independent if
psiA * psiB = psiAB , or
alternatively
psiAB/psiA = (psiB-psiAB)/(1-psiA) (i.e.,
Pr(species B present | species A is present) = Pr(species B present | species A is absent)).
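The equivalence of the two definitions can be checked numerically. The sketch below is an illustration only (the function name and the values are hypothetical): it computes the two conditional probabilities from psiA, psiB and psiAB and compares them.

```python
import math


def conditional_occupancy(psiA, psiB, psiAB):
    """Return (Pr(B present | A present), Pr(B present | A absent))."""
    given_present = psiAB / psiA
    given_absent = (psiB - psiAB) / (1 - psiA)
    return given_present, given_absent


# Independent case: psiAB equals psiA * psiB.
indep = conditional_occupancy(psiA=0.4, psiB=0.5, psiAB=0.4 * 0.5)

# Dependent case: more overlap than independence would give.
dep = conditional_occupancy(psiA=0.4, psiB=0.5, psiAB=0.3)

print(indep, math.isclose(indep[0], indep[1]))
print(dep, math.isclose(dep[0], dep[1]))
```

In the independent case the two conditional probabilities agree; in the dependent case species B is more likely to be present where species A is present.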
The first (original) parameterisation uses the first definition of
independence to calculate psiAB as
psiAB = phi*psiA*psiB
where 'phi' is the species
interaction factor (SIF). A value of phi = 1 implies independence; phi < 1 implies the
species are less likely to occur together than expected; and phi > 1 implies they co-occur more
often than expected. This parameterisation is reasonable for many applications
without covariates, but can cause some numerical issues once covariates are
introduced. The quantities that are estimated by PRESENCE in this
parameterisation are logit(psiA), logit(psiB), and log(phi).
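To make the link functions concrete, here is a minimal sketch (not PRESENCE code; the function names and the input values are hypothetical) that back-transforms values from the logit and log scales and then computes psiAB from the SIF.

```python
import math


def inv_logit(x):
    """Inverse of logit(p) = log(p / (1 - p))."""
    return 1.0 / (1.0 + math.exp(-x))


def sif_parameterisation(logit_psiA, logit_psiB, log_phi):
    """Back-transform the three estimated quantities and derive psiAB."""
    psiA = inv_logit(logit_psiA)
    psiB = inv_logit(logit_psiB)
    phi = math.exp(log_phi)
    psiAB = phi * psiA * psiB
    return psiA, psiB, phi, psiAB


# Hypothetical values: psiA = psiB = 0.5 and a SIF of 1.2.
psiA, psiB, phi, psiAB = sif_parameterisation(0.0, 0.0, math.log(1.2))
print(psiA, psiB, round(phi, 3), round(psiAB, 3))
```

Note that the back-transformations keep psiA and psiB between 0 and 1 and phi positive, but nothing in them prevents psiAB from exceeding psiA or psiB when phi is large.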
The second parameterisation uses the second definition of independence which
is essentially comparing the proportion of the left ellipse that is overlapped
by the right ellipse with the proportion of everything outside the left
ellipse that is overlapped with the remainder of the right ellipse. Here there
are two probabilities of occupancy for species B, depending on whether species
A is also present (psiBA) or absent (psiBa). Under this parameterisation
there is no SIF as such (although one could be derived): psiBA < psiBa implies
avoidance; psiBA = psiBa implies independence; and psiBA > psiBa implies the
species co-occur more often than expected.
The quantities that are estimated by PRESENCE in this parameterisation are
logit(psiA), logit(psiBA), and logit(psiBa).
The final parameterisation is a combination of the first two, where we wish to
estimate a SIF, although this is done in terms of an odds ratio for psiBA and
psiBa (i.e.,
the ratio of the two proportions of overlap described above). That is, a SIF
('nu') is defined as
nu = [psiBA/(1-psiBA)] / [psiBa/(1-psiBa)]
   = [psiAB/(psiA-psiAB)] / [(psiB-psiAB)/(1-psiA-psiB+psiAB)]
The main advantages of this parameterisation are that it
is more numerically stable than the first parameterisation and that it sits
naturally with how one might assess questions of co-occurrence using logistic
regression. Using logistic regression, one might be tempted to use the
presence/absence of species A as a predictor variable (or covariate) for the
presence of species B (the response variable): log(nu) is exactly analogous to the
resulting logistic regression coefficient that would be obtained, although in
this framework it has been corrected for imperfect detection of both species A
and B. Interpretation of nu is the same as phi in the first parameterisation. The
quantities that are estimated by PRESENCE in this parameterisation are
logit(psiA), logit(psiBa) and log(nu).
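The logistic-regression analogy can be sketched as follows (an illustration under the definitions above, not PRESENCE code; the function names and the values are hypothetical). Because nu is an odds ratio for psiBA and psiBa, logit(psiBA) = logit(psiBa) + log(nu), so any real value of log(nu) back-transforms to a psiBA between 0 and 1.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def psiBA_from_nu(psiBa, nu):
    """Occupancy of B given A present, from psiBa and the odds ratio nu."""
    return inv_logit(logit(psiBa) + math.log(nu))


def odds_ratio(psiBA, psiBa):
    """Odds of B given A present, divided by odds of B given A absent."""
    return (psiBA / (1.0 - psiBA)) / (psiBa / (1.0 - psiBa))


# Hypothetical values: psiBa = 0.5 and nu = 3.
psiBA = psiBA_from_nu(psiBa=0.5, nu=3.0)
print(round(psiBA, 3))
print(round(odds_ratio(psiBA, 0.5), 3))
```

Here log(nu) plays the role of the coefficient on a 'species A present' indicator in a logistic regression for species B.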
Which parameterization to use?
The answer to this will depend on the issue you're trying to address.
Both parameterizations will (usually) give the same results. Using
some algebra, estimates from one parameterization can be converted to
estimates in the other. For example, if the 1st parameterization
(psiA, psiBA, psiBa) is used, the psiB parameter in the 2nd
parameterization (psiA, psiB, phi) can be computed as:
psiB = psiA*psiBA + (1-psiA)*psiBa
and the phi parameter can be computed as:
phi = psiAB/(psiA*psiB) (where psiAB=psiA*psiBA)
So, if the parameters from the 2nd parameterization can be computed
using estimates from the 1st parameterization, why even bother with
the 2nd parameterization? The main reason would be that you may be
interested in modeling one of those parameters in the 2nd parameterization
directly as a function of covariates. This cannot be done if the
1st parameterization is used.
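The conversion from (psiA, psiBA, psiBa) to (psiA, psiB, phi) can be sketched in a few lines. This is only an illustration (the function name and the numeric values are hypothetical, not PRESENCE output); it uses the SIF definition phi = psiAB/(psiA*psiB) given earlier.

```python
import math


def to_sif(psiA, psiBA, psiBa):
    """Convert (psiA, psiBA, psiBa) to (psiA, psiB, phi)."""
    psiAB = psiA * psiBA                # both species present
    psiB = psiAB + (1 - psiA) * psiBa   # B present with or without A
    phi = psiAB / (psiA * psiB)         # species interaction factor
    return psiA, psiB, phi


# Hypothetical case with co-occurrence more often than expected.
together = to_sif(psiA=0.4, psiBA=0.75, psiBa=0.25)

# Hypothetical case with psiBA = psiBa (independent occupancy).
independent = to_sif(psiA=0.4, psiBA=0.5, psiBa=0.5)

print([round(v, 3) for v in together])
print([round(v, 3) for v in independent])
```

When psiBA = psiBa the derived phi equals 1, which matches the constraint used for the independence model earlier in this exercise.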
Note about the word 'usually' above: With the first parameterization, all
parameters are estimated as probabilities (range= 0 - 1). Regardless of
the values taken by the parameters, (psiA, psiBA, psiBa), valid values
of the parameters, (psiA, psiB, phi) will result. However, there are
values of (psiA, psiB, phi) which will result in implausible values
of (psiA, psiBA, psiBa). For example, if
So, the 2nd parameterization might produce estimates which have a
higher likelihood, but have parameter estimates which are out of
range. PRESENCE will take steps to try to avoid this, but some
data-sets may be problematic due to this.
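One way to see the problem is to check whether a given (psiA, psiB, phi) implies valid conditional probabilities. The sketch below is an illustration only: the function name and the numeric values are hypothetical and are not taken from the exercise data.

```python
def implied_conditionals(psiA, psiB, phi):
    """Return (psiBA, psiBa, valid) implied by (psiA, psiB, phi)."""
    psiAB = phi * psiA * psiB
    psiBA = psiAB / psiA
    psiBa = (psiB - psiAB) / (1 - psiA)
    # Both conditional probabilities must lie between 0 and 1.
    valid = 0.0 <= psiBA <= 1.0 and 0.0 <= psiBa <= 1.0
    return psiBA, psiBa, valid


# Hypothetical values that give plausible conditional probabilities.
ok = implied_conditionals(psiA=0.4, psiB=0.5, phi=1.2)

# Hypothetical values where phi*psiA*psiB exceeds psiA.
bad = implied_conditionals(psiA=0.8, psiB=0.8, phi=1.5)

print(ok)
print(bad)
```

In the second case each of psiA, psiB and phi is individually in range, yet the implied psiBA is above 1 and the implied psiBa is negative.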
Another parameterization
In an attempt to avoid the trouble mentioned above with using the
(psiA,psiB,phi) parameterization, another parameterization was developed
using an 'odds-ratio' for the SIF. The parameters for this would be
(psiA, psiBa, nu), where nu is the odds ratio (estimated on the log
scale) describing how occupancy of species B changes with the presence
of species A.