This exercise shows how to generate (simulate) occupancy data according to 'Design2' from the last lecture. The diagram below shows the sampling scheme for each of the study designs:
       Design 1                    Design 2                    Design 3
______________________      ______________________      ______________________
Num                         Num                         Num
of                          of                          of
Sites   1   2   3   4       Sites   1   2   3   4       Sites   1   2   3   4
 12    xx  --  --  --         6    xx  xx  xx  xx         6    xx  --  xx  --
 12    --  xx  --  --         6    xx  --  --  --         6    --  xx  --  xx
 12    --  --  xx  --         6    --  xx  --  --         6    xx  --  --  --
 12    --  --  --  xx         6    --  --  xx  --         6    --  xx  --  --
                              6    --  --  --  xx         6    --  --  xx  --
                                                          6    --  --  --  xx
Total  s=48 sites                  s=30 sites                  s=36 sites
In Design2, 6 sites were sampled on all 8 surveys. 6 other sites were sampled only on the first two surveys, 6 others were only sampled on surveys 3 and 4, 6 others were only sampled on surveys 5 and 6, and 6 others were only sampled on surveys 7 and 8, giving a total of 30 different sites surveyed.
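The Design2 layout described above can be written down as a site-by-survey table. The following is a minimal sketch in plain Python, not part of the MARK workflow. The names `design2_mask`, `N_SURVEYS` and `SITES_PER_GROUP` are made up for illustration.

```python
N_SURVEYS = 8          # 4 biweekly periods x 2 surveys each
SITES_PER_GROUP = 6    # every group in Design2 has 6 sites


def design2_mask():
    """Return one row per site; True marks a survey on which the site is sampled."""
    rows = []
    # Group 1: sampled on all 8 surveys.
    rows += [[True] * N_SURVEYS for _ in range(SITES_PER_GROUP)]
    # Groups 2-5: sampled only on surveys (1,2), (3,4), (5,6) or (7,8).
    for period in range(4):
        sampled = {2 * period, 2 * period + 1}   # 0-based survey indices
        row = [k in sampled for k in range(N_SURVEYS)]
        rows += [list(row) for _ in range(SITES_PER_GROUP)]
    return rows


mask = design2_mask()
n_sites = len(mask)
n_site_surveys = sum(sum(row) for row in mask)   # total number of site visits
print(n_sites, n_site_surveys)
```

The row count reproduces the 30 sites stated above. The second number is the total number of site visits the design requires.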
From a pilot study, we assume biweekly detection probabilities p(biweek) = (0.30, 0.53, 0.75, 0.88) and an occupancy probability ψ = 0.60.
Next, change detection probabilities (P(i)) to .30 for the first two surveys, .53 for surveys 3 and 4, .75 for surveys 5 and 6, and .88 for surveys 7 and 8.
Your screen should look like this:
Next, add the other 4 groups of sites:
Now, select a model to analyze the data with, generate data and run MARK to calculate estimates under the chosen model.
Here is the output for the model we just ran.
Program MARK - Survival Rate Estimation with Capture-Recapture Data
Compaq(Win32) Vers. 5.0 Dec 2007    17-Apr-2008 15:26:34    Page 001
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
* * WARNING * * Lines per page set to 50.

note: input parameters for Group1 in following line:
INPUT --- proc title simulated data 8 6 .6 .3 .3 .53 .53 .75 .75 .88
INPUT --- .88 0 0 0 0 0 0 0 0 0 0 0 0 0 0;

Time in seconds for last procedure was 0.00

INPUT --- proc chmatrix occasions=8 groups=1 etype=Occupancy hist=272;
INPUT --- glabel(1)=Group 1;
INPUT --- time interval 1 1 1 1 1 1 1;
INPUT --- /* 8 6 .6 .3 .3 .53 .53 .75 .75 .88 .88 0 0 0 0 0 0 0 0 0 0
INPUT --- 0 0 0 0 */
INPUT --- 00000000 2.400351;
INPUT --- 00000001 0.002572;
INPUT --- 00000010 0.002572;
INPUT --- 00000011 0.018860;
INPUT --- 00000100 0.001052;
INPUT --- 00000101 0.007715;
INPUT --- 00000110 0.007715;
: : : : : : : :
INPUT --- 11111100 0.000737;
INPUT --- 11111101 0.005406;
INPUT --- 11111110 0.005406;
INPUT --- 11111111 0.039645;

note: input parameters for Group2 in following line:
INPUT --- /* 8 6 .6 .3 .3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
INPUT --- 00...... 4.164000;
INPUT --- 01...... 0.756000;
INPUT --- 10...... 0.756000;
INPUT --- 11...... 0.324000;

note: input parameters for Group3 in following line:
INPUT --- /* 8 6 .6 0 0 .53 .53 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
INPUT --- ..00.... 3.195240;
INPUT --- ..01.... 0.896760;
INPUT --- ..10.... 0.896760;
INPUT --- ..11.... 1.011240;

note: input parameters for Group4 in following line:
INPUT --- /* 8 6 .6 0 0 0 0 .75 .75 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
INPUT --- ....00.. 2.625000;
INPUT --- ....01.. 0.675000;
INPUT --- ....10.. 0.675000;
INPUT --- ....11.. 2.025000;

note: input parameters for Group5 in following line:
INPUT --- /* 8 6 .6 0 0 0 0 0 0 .88 .88 0 0 0 0 0 0 0 0 0 0 0 0 0 0 */
INPUT --- ......00 2.451840;
INPUT --- ......01 0.380160;
INPUT --- ......10 0.380160;
INPUT --- ......11 2.787840;

Number of unique encounter histories read was 272.
Number of individual covariates read was 0.
Time interval lengths are all equal to 1.
Data type is Occupancy Estimation with Detection < 1.

Time in seconds for last procedure was 0.16

INPUT --- proc estimate link=Sin NOLOOP varest=2ndPart;
INPUT --- model={psi,p(t)};
INPUT --- group=1 Psi rows=1 cols=1 Square;
INPUT --- 1;
INPUT --- group=1 p Session 1 rows=1 cols=8 Square;

note: different number below for each of the 8 surveys, indicating survey-specific detection probabilities.
INPUT --- 2 3 4 5 6 7 8 9;
INPUT --- design matrix constraints=9 covariates=9 identity;
Real Function Parameters of {psi,p(t)}
                                               95% Confidence Interval
Parameter      Estimate      Standard Error    Lower         Upper
-------------  ------------  --------------    ------------  ------------
1:Psi          0.5999997     0.1073033         0.3844160     0.7827516
2:p(1)         0.3000005     0.1778780         0.0753348     0.6927267
3:p(2)         0.3000005     0.1778780         0.0753348     0.6927267
4:p(3)         0.5300007     0.1979084         0.1920009     0.8425540
5:p(4)         0.5300006     0.1979083         0.1920009     0.8425539
6:p(5)         0.7499998     0.1699285         0.3367734     0.9465931
7:p(6)         0.7499997     0.1699285         0.3367733     0.9465931
8:p(7)         0.8800013     0.1246124         0.4205634     0.9866835
9:p(8)         0.8800014     0.1246124         0.4205634     0.9866835

note: estimates virtually identical to values which generated the data -> bias=0.

Note that the bias of the estimates is zero (the calculated estimates equal the input parameter values). This was one of the questions addressed in the Bailey et al. paper. Another question concerned the bias when an incorrect model is chosen to analyze the data. If we chose model 'psi,p(.)' instead of 'psi,p(t)', we could examine the bias of that model (which doesn't correspond to how the data were generated).
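The fractional frequencies on the INPUT lines of the listing are expected counts rather than observed counts. Each is the number of sites in the group multiplied by the probability of that detection history under ψ = 0.60 and the survey-specific p values. The sketch below reproduces that arithmetic in plain Python; it is an illustration of the calculation, not MARK's own code, and the function name `expected_frequencies` is hypothetical.

```python
from itertools import product

PSI = 0.60
P = [0.30, 0.30, 0.53, 0.53, 0.75, 0.75, 0.88, 0.88]  # detection prob. per survey
N_SITES = 6                                            # sites in each group


def expected_frequencies(surveyed):
    """Expected number of sites showing each history, for one group.

    `surveyed` lists the 0-based surveys on which the group is sampled;
    surveys that are not sampled appear as '.' in the history string.
    """
    freqs = {}
    for outcome in product("01", repeat=len(surveyed)):
        # Probability of this detection pattern at an occupied site.
        prob = PSI
        for k, d in zip(surveyed, outcome):
            prob *= P[k] if d == "1" else 1.0 - P[k]
        # An all-zero history can also come from an unoccupied site.
        if "1" not in outcome:
            prob += 1.0 - PSI
        history = ["."] * len(P)
        for k, d in zip(surveyed, outcome):
            history[k] = d
        freqs["".join(history)] = N_SITES * prob
    return freqs


group1 = expected_frequencies(list(range(8)))   # sampled on all 8 surveys
group2 = expected_frequencies([0, 1])           # sampled on surveys 1 and 2 only
print(round(group2["00......"], 6), round(group2["11......"], 6))
print(round(group1["00000000"], 6), round(group1["11111111"], 6))
```

For Group 2 this gives 6 × (0.6 × 0.7 × 0.7 + 0.4) = 4.164 for history 00...... and 6 × 0.6 × 0.3 × 0.3 = 0.324 for 11......, matching the listing. Group 1 contributes 256 histories and each of the other four groups contributes 4, which accounts for the 272 histories MARK reports.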
The main question, however, was which design produced the 'best' estimates (estimates with the smallest standard errors). This was answered by duplicating this exercise for Design1 and Design3, and comparing the output.
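When comparing standard errors across designs, it helps to know how much survey effort each design uses. The sketch below tallies sites and site visits from the diagram. It assumes every "xx" stands for two surveys in one biweekly period, which the text states for Design2 and is assumed here for Design1 and Design3. The `DESIGNS` table and the `effort` function are illustrative names only.

```python
# Each entry: (number of sites, biweekly periods in which those sites are sampled)
DESIGNS = {
    "Design1": [(12, [1]), (12, [2]), (12, [3]), (12, [4])],
    "Design2": [(6, [1, 2, 3, 4]), (6, [1]), (6, [2]), (6, [3]), (6, [4])],
    "Design3": [(6, [1, 3]), (6, [2, 4]), (6, [1]), (6, [2]), (6, [3]), (6, [4])],
}
SURVEYS_PER_PERIOD = 2  # assumption: each "xx" in the diagram is two surveys


def effort(groups):
    """Return (total sites, total site visits) for one design."""
    sites = sum(n for n, _ in groups)
    visits = sum(n * len(periods) * SURVEYS_PER_PERIOD for n, periods in groups)
    return sites, visits


for name, groups in DESIGNS.items():
    print(name, effort(groups))
```

Under that assumption the three designs differ in the number of sites (48, 30 and 36) but not in the total number of site visits, so differences in standard errors reflect how the visits are allocated rather than how many are made.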
Another use would be to check the confidence-interval coverage of the estimates. To do that, we would simulate random data under a model (instead of analyzing the expected values), say 500 times, then use the output spreadsheet to count the number of times the input parameter value was greater than the lower 95% bound and less than the upper 95% bound.
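The counting step can be sketched as follows, assuming the lower and upper 95% bounds for one parameter have already been copied out of the output spreadsheet into two lists. The `coverage` function and the four pairs of bounds are hypothetical, made-up values for illustration, not MARK results.

```python
def coverage(true_value, lower_bounds, upper_bounds):
    """Fraction of replicates whose interval strictly contains true_value."""
    if len(lower_bounds) != len(upper_bounds):
        raise ValueError("need one lower and one upper bound per replicate")
    if not lower_bounds:
        raise ValueError("no replicates supplied")
    hits = sum(
        1 for lo, hi in zip(lower_bounds, upper_bounds) if lo < true_value < hi
    )
    return hits / len(lower_bounds)


# Made-up bounds for psi from four replicates; a real check would use ~500.
lower = [0.38, 0.41, 0.62, 0.35]
upper = [0.78, 0.80, 0.90, 0.59]
psi_coverage = coverage(0.60, lower, upper)
print(psi_coverage)
```

With a well-behaved interval estimator and enough replicates, the returned fraction should be close to 0.95.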