Single-Season study design exercise using GENPRES

This exercise shows how to generate (simulate) occupancy data according to Design 2 from the last lecture. The diagram below shows the sampling scheme for each of the study designs:

Example: Distribution of sampling effort across 4 biweekly survey periods (xx = two surveys in that period, -- = not surveyed)

Design 1                 Design 2                 Design 3
______________________   ______________________   ______________________
Num                      Num                      Num
of                       of                       of
Sites   1   2   3   4    Sites   1   2   3   4    Sites   1   2   3   4
12     xx  --  --  --     6     xx  xx  xx  xx     6     xx  --  xx  --
12     --  xx  --  --     6     xx  --  --  --     6     --  xx  --  xx
12     --  --  xx  --     6     --  xx  --  --     6     xx  --  --  --
12     --  --  --  xx     6     --  --  xx  --     6     --  xx  --  --
                          6     --  --  --  xx     6     --  --  xx  --
                                                   6     --  --  --  xx
                                                   
Total    s=48 sites               s=30 sites               s=36 sites

Each biweekly period contains two surveys, so there are 8 surveys in all. In Design 2, 6 sites were sampled on all 8 surveys. Four other groups of 6 sites were each sampled in a single biweekly period: one group on surveys 1 and 2 only, one on surveys 3 and 4 only, one on surveys 5 and 6 only, and one on surveys 7 and 8 only. This gives a total of 30 different sites surveyed.

From a pilot study, we assume p(biweek) = (0.30 0.53 0.75 0.88), and ψ = 0.60.
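
The Design 2 layout and the pilot values can be written down compactly. The following Python sketch is one way to do it; the names `P_SURVEY`, `GROUPS`, and so on are illustrative only and are not part of GENPRES. It expands the four biweekly detection probabilities to the eight surveys and records which surveys each group of 6 sites receives.

```python
# Pilot-study values from the text.
PSI = 0.60
P_BIWEEK = [0.30, 0.53, 0.75, 0.88]

# Two surveys per biweekly period, so each value is repeated twice.
P_SURVEY = [p for p in P_BIWEEK for _ in range(2)]

SITES_PER_GROUP = 6

# Surveys (0-based) conducted for each group of sites in Design 2:
# one group surveyed on all 8 occasions, then one group per biweekly period.
GROUPS = [list(range(8))] + [[2 * k, 2 * k + 1] for k in range(4)]

total_sites = SITES_PER_GROUP * len(GROUPS)
total_site_visits = SITES_PER_GROUP * sum(len(g) for g in GROUPS)

print(P_SURVEY)
print(total_sites, total_site_visits)
```

The site total reproduces the s=30 shown in the diagram. The visit total (site-by-survey combinations) can be compared with the same count for Designs 1 and 3 when judging whether the designs use equal effort.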

Running the program

Start program GENPRES and change the model-type to 'single-season'. Next, change the number of surveys to 8, change ψ to 0.6, and change the number of sites to 6.

Next, change detection probabilities (P(i)) to .30 for the first two surveys, .53 for surveys 3 and 4, .75 for surveys 5 and 6, and .88 for surveys 7 and 8.

Your screen should look like this:

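Next, add the other 4 groups of sites, one for each pair of surveys. Each group has 6 sites, and its detection probabilities are set to 0 for the surveys on which those sites are not visited (for example, .30 for surveys 1 and 2 and 0 for surveys 3 through 8 in the second group):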

The screen should now look like this:

Now, select a model to analyze the data with, generate data and run MARK to calculate estimates under the chosen model.

Output

After MARK finishes, a Notepad window should appear with the program output. If you did not deselect the first model and so ended up with two models chosen for the analysis, GENPRES will run both models and produce a spreadsheet file with the results.

Here is the output for the model we just ran.

  Program  MARK  - Survival Rate Estimation with Capture-Recapture Data
   Compaq(Win32) Vers. 5.0 Dec 2007     17-Apr-2008 15:26:34    Page  001
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 

 * *  WARNING  * *   Lines per page set to 50. 

note input parameters for Group1 in following line:
  INPUT --- proc title simulated data 8 6 .6 .3 .3 .53 .53 .75 .75 .88 
  INPUT --- .88 0 0 0 0 0 0 0 0 0 0 0 0 0 0;

     Time in seconds for last procedure was 0.00


  INPUT --- proc chmatrix occasions=8 groups=1 etype=Occupancy hist=272;

  INPUT ---    glabel(1)=Group 1;

  INPUT ---    time interval 1 1 1 1 1 1 1;
  INPUT ---       /* 8 6 .6 .3 .3 .53 .53 .75 .75 .88 .88 0 0 0 0 0 0 0 0 0 0 
  INPUT ---       0 0 0 0 */
  INPUT ---       00000000 2.400351;
  INPUT ---       00000001 0.002572;
  INPUT ---       00000010 0.002572;
  INPUT ---       00000011 0.018860;
  INPUT ---       00000100 0.001052;
  INPUT ---       00000101 0.007715;
  INPUT ---       00000110 0.007715;
  :   :            :       :
  :   :            :       :
  INPUT ---       11111100 0.000737;
  INPUT ---       11111101 0.005406;
  INPUT ---       11111110 0.005406;
  INPUT ---       11111111 0.039645;
note input parameters for Group2 in following line:
  INPUT ---       /* 8 6 .6 .3 .3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
  INPUT ---       00...... 4.164000;
  INPUT ---       01...... 0.756000;
  INPUT ---       10...... 0.756000;
  INPUT ---       11...... 0.324000;
note input parameters for Group3 in following line:
  INPUT ---       /* 8 6 .6 0 0 .53 .53 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
  INPUT ---       ..00.... 3.195240;
  INPUT ---       ..01.... 0.896760;
  INPUT ---       ..10.... 0.896760;
  INPUT ---       ..11.... 1.011240;
note input parameters for Group4 in following line:
  INPUT ---       /* 8 6 .6 0 0 0 0 .75 .75 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
  INPUT ---       ....00.. 2.625000;
  INPUT ---       ....01.. 0.675000;
  INPUT ---       ....10.. 0.675000;
  INPUT ---       ....11.. 2.025000;
note input parameters for Group5 in following line:
  INPUT ---       /* 8 6 .6 0 0 0 0 0 0 .88 .88 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
  INPUT ---       ......00 2.451840;
  INPUT ---       ......01 0.380160;
  INPUT ---       ......10 0.380160;
  INPUT ---       ......11 2.787840;

      Number of unique encounter histories read was 272.

      Number of individual covariates read was 0.
      Time interval lengths are all equal to 1.

      Data type is Occupancy Estimation with Detection < 1.

     Time in seconds for last procedure was 0.16
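
The fractional frequencies in the listing above are expected values rather than simulated counts. As a rough check, they can be reproduced from the standard single-season occupancy likelihood: a history containing at least one detection has probability ψ times the product of p or (1 - p) over the surveys conducted, and the all-zero history adds (1 - ψ) for unoccupied sites. The Python sketch below assumes that formulation; the function `expected_frequencies` and the use of `None` for an unsurveyed occasion are conventions made up for this illustration, not GENPRES or MARK code.

```python
from itertools import product


def expected_frequencies(n_sites, psi, p):
    """Expected count of each detection history for one group of sites.

    p[j] is the detection probability on survey j, or None if the group
    is not surveyed on that occasion (written as '.' in the history).
    """
    surveyed = [j for j, pj in enumerate(p) if pj is not None]
    freqs = {}
    for outcome in product("01", repeat=len(surveyed)):
        # Probability of this history at an occupied site.
        prob = psi
        for j, h in zip(surveyed, outcome):
            prob *= p[j] if h == "1" else 1.0 - p[j]
        # An all-zero history also arises at every unoccupied site.
        if "1" not in outcome:
            prob += 1.0 - psi
        history = ["."] * len(p)
        for j, h in zip(surveyed, outcome):
            history[j] = h
        freqs["".join(history)] = n_sites * prob
    return freqs


# Group 1: 6 sites surveyed on all 8 occasions.
group1 = expected_frequencies(
    6, 0.6, [0.30, 0.30, 0.53, 0.53, 0.75, 0.75, 0.88, 0.88]
)
# Group 2: 6 sites surveyed on occasions 1 and 2 only.
group2 = expected_frequencies(6, 0.6, [0.30, 0.30] + [None] * 6)

print(round(group1["00000000"], 6), round(group1["11111111"], 6))
print(round(group2["00......"], 6), round(group2["11......"], 6))
```

Group 1 contributes 2^8 = 256 histories and each of the four two-survey groups contributes 2^2 = 4, which accounts for the 272 unique encounter histories MARK reports.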

Output - MARK model definition

  INPUT --- proc estimate link=Sin NOLOOP varest=2ndPart;


  INPUT --- model={psi,p(t)};

  INPUT ---    group=1 Psi rows=1 cols=1 Square;
  INPUT ---        1;

  INPUT ---    group=1 p Session 1 rows=1 cols=8 Square;
note: different number below for each of the 8 surveys, indicating survey-specific detection probabilities.
  INPUT ---        2 3 4 5 6 7 8 9;

  INPUT ---    design matrix constraints=9 covariates=9 identity;

Output - MARK model parameter estimates

                          Real Function Parameters of {psi,p(t)}
                                                              95% Confidence Interval
 Parameter                  Estimate       Standard Error      Lower           Upper
 -------------------------  --------------  --------------  --------------  --------------
note: estimates virtually identical to values which generated the data -> bias=0.
    1:Psi                   0.5999997       0.1073033       0.3844160       0.7827516                           
    2:p(1)                  0.3000005       0.1778780       0.0753348       0.6927267                           
    3:p(2)                  0.3000005       0.1778780       0.0753348       0.6927267                           
    4:p(3)                  0.5300007       0.1979084       0.1920009       0.8425540                           
    5:p(4)                  0.5300006       0.1979083       0.1920009       0.8425539                           
    6:p(5)                  0.7499998       0.1699285       0.3367734       0.9465931                           
    7:p(6)                  0.7499997       0.1699285       0.3367733       0.9465931                           
    8:p(7)                  0.8800013       0.1246124       0.4205634       0.9866835                           
    9:p(8)                  0.8800014       0.1246124       0.4205634       0.9866835                           

  
Note that the bias of the estimates is zero: the calculated estimates equal the input parameter values, apart from small numerical differences in the last digits. This was one of the questions addressed in the Bailey et al. paper. Another question concerned the bias when an incorrect model is chosen to analyze the data. If we chose model 'psi,p(.)' instead of 'psi,p(t)', we could examine the bias under a model that does not correspond to how the data were generated.
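
To make the bias statement concrete, the short Python sketch below takes bias to be the estimate minus the generating value and applies that to the estimates in the table above. The dictionary names are illustrative, and the definition of bias used here is an assumption of this sketch rather than something taken from the GENPRES output.

```python
# Generating (input) values from the exercise.
truth = {
    "Psi": 0.60,
    "p(1)": 0.30, "p(2)": 0.30,
    "p(3)": 0.53, "p(4)": 0.53,
    "p(5)": 0.75, "p(6)": 0.75,
    "p(7)": 0.88, "p(8)": 0.88,
}

# Estimates copied from the MARK output for model {psi,p(t)}.
estimates = {
    "Psi": 0.5999997,
    "p(1)": 0.3000005, "p(2)": 0.3000005,
    "p(3)": 0.5300007, "p(4)": 0.5300006,
    "p(5)": 0.7499998, "p(6)": 0.7499997,
    "p(7)": 0.8800013, "p(8)": 0.8800014,
}

# Bias = estimate - generating value, for each parameter.
bias = {name: estimates[name] - truth[name] for name in truth}
largest = max(bias, key=lambda name: abs(bias[name]))

for name, b in bias.items():
    print(f"{name:5s} bias = {b:+.7f}")
print("largest absolute bias:", largest)
```

The same calculation applied to output from a misspecified model would show how far its estimates drift from the generating values.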

The main question, however, was which design produced the 'best' estimates, meaning the estimates with the smallest standard errors. This was answered by repeating this exercise for Design 1 and Design 3 and comparing the output.

Another use would be to check the confidence-interval coverage of the estimates. To do that, we would simulate data (instead of analyzing expected values) under a model, say 500 times, then use the output spreadsheet to count the number of times the input parameter value was greater than the lower 95% bound and less than the upper 95% bound. The proportion of simulations in which this happens is the estimated coverage.
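
The counting step can be sketched in a few lines of Python. This assumes the lower and upper 95% bounds for one parameter have already been read from the output spreadsheet into two lists; the spreadsheet layout is not described here, so the function `coverage` and the three pairs of bounds below are made-up placeholders for illustration, not GENPRES results.

```python
def coverage(true_value, lower_bounds, upper_bounds):
    """Fraction of intervals that strictly contain true_value."""
    pairs = list(zip(lower_bounds, upper_bounds))
    if not pairs:
        raise ValueError("no intervals supplied")
    hits = sum(1 for lo, hi in pairs if lo < true_value < hi)
    return hits / len(pairs)


# Placeholder bounds for psi from three imaginary simulated datasets.
# A real check would use one pair per simulation (e.g. 500 pairs).
lowers = [0.38, 0.45, 0.61]
uppers = [0.78, 0.83, 0.90]

print(coverage(0.60, lowers, uppers))
```

With real simulation output, a result well below 0.95 would suggest that the nominal 95% intervals are too narrow for that design.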