Bayesian state-space models with multiple CPUE data : the case of a mullet fishery

1 Universidade do Vale do Itajaí. Rua Uruguai, 458, Bloco E2 Sala 114 Centro, Itajaí, Santa Catarina, Brazil. (RS) (Corresponding author) E-mail: rsantana@univali.br. ORCID-iD: http://orcid.org/0000-0003-2252-6014 (PRS) E-mail: schwingel@univali.br. ORCID-iD: http://orcid.org/0000-0003-2524-9550 2 Universidade Federal de Rio Grande. Av. Itália, km 8, Carreiros, Rio Grande, Rio Grande do Sul, Brazil. (PGK) E-mail: pgkinas@gmail.com. ORCID-iD: http://orcid.org/0000-0002-1294-9518 (JPC) E-mail: castellojpc@gmail.com. ORCID-iD: http://orcid.org/0000-0002-1962-6471 (JPV) E-mail: vieira@mikrus.com.br. ORCID-iD: http://orcid.org/0000-0002-0669-9444 3 Instituto de Pesca de São Paulo, Av. Bartolomeu de Gusmão, 192, Aparecida, Santos, São Paulo, Brazil. (LVM) E-mail: lauravillwockmiranda@gmail.com. ORCID-iD: http://orcid.org/0000-0002-1750-8973


INTRODUCTION
Stochastic versions of biomass dynamic models are called state-space models.These models have a hierarchical structure which simultaneously accounts for uncertainties in the time and space dynamics of biomass production and for errors in the observational process of some abundance indices (e.g.Catch per Unit Effort, CPUE) that relate data to the (unknown or latent) biomass.
Surplus Production (SP) models are simple but robust non-linear models for stock assessment that are widely used to model biomass dynamics in state-space models for exploited fish populations (Millar and Meyer 2000).A useful feature of SP models is that they do not require analytic detailing about specific biological characteristics of target stocks under survey (Gulland 1983).This is important because detailed information about population dynamics may not be available for analysis of stock sustainability (Hilborn andWalters 1992, Chen andAndrew 1998).Furthermore, application of SP models seeks to determine optimal levels of fishing effort that can reach predefined goals within a scenario of sustainability (Gulland 1983, Hilborn and Walters 1992, Sparre and Venema 1997).
The search for improvements in SP models has grown over the last few years.Revisions have pointed towards limitations related mostly to nonconformities in utilized data rather than the predictive capability of the underlying model (Hilborn and Walters 1992).Hilborn (1979) observed that the disparity between fishing effort and stock abundance, often caused by problems in data gathering, increased the uncertainties in analyses with SP or more elaborate age-structured models.However, SP models have been shown to be capable of providing better estimates of relevant reference points than analyses based on age-structured models, even when important growth and vulnerability parameters are known in advance (Ludwig andWalters 1985, 1989).
It is an essential premise in the observational component of state-space models that CPUE has a known proportionality relation to stock biomass.However, stock concentration profile, changes in fishing power, gear type, season and fishing ground can all affect this relation over time and space and seriously bias biomass dynamic predictions if not properly accounted for (Clark 1985).Standardization of CPUE is one recommended path to reconcile data (Hilborn and Walters 1992), but this requires data to be collected at the same time and location to make transformations possible.
Another path has been to ignore fishing effort all together and rely on catch-only data (Vasconcellos and Cochrane 2005).Although very attractive, this approach needs to replace missing effort information with strong premises about effort patterns over time.
One motivation for this study is to try a new alternative for dealing with fisheries assessment in data-limited situations.It consists in retaining CPUEs of various fishing fleets (characterized by differences in gear type, fishing operation and fishing ground) and model them simultaneously as multiple but integrated observation models.The only requirement is that all fleets exploit the same stock biomass, even if time-windows within a year or over time do not coincide.
The Bayesian approach to perform inference in fishery assessment and management is very appealing because, conditional on the proposed model, it provides direct estimates of biological reference points while automatically retaining and integrating all sources of (data and process) uncertainties (Kinas 1996).This is achieved by presenting the output of estimation and prediction in the form of posterior distributions which are usually easy to read and very convenient elements in the process of decision making under uncertaintyan essential premise in effective risk analysis.Since the state-space model is structured as a hierarchical Bayesian model, posterior distributions can be obtained with relative ease using Markov Chain Monte Carlo (MCMC) stochastic simulation.
The lebranche mullet Mugil liza is a pelagic species that is abundant in coastal marine and estuarine environments and sometimes aggregates into dense schools (Menezes et al. 2003).A population of M. liza genetically identified as the "southern population" (Mai et al. 2014) is distributed from the coast of the State of São Paulo to Argentina but most catches occur off the Brazilian States of Rio Grande do Sul and Santa Catarina (Vieira 1991).This species is often considered to be catadromous, owing to its predictable migrations from freshwater and estuarine habitats into marine spawning areas, with its reproductive period from April to July and peak spawning in June in the southern states of Brazil (Santa Catarina and Paraná) (Vieira andScalabrin 1991, Lemos et al. 2014).This is also one of the most frequent and abundant fish species in the south and southeast regions of Brazil, representing an important cultural and historical artisanal fishery for this region (Seckendorff and Azevedo 2007, Vieira et al. 2008, Lemos et al. 2014).Because of the high value of mullet gonads in the international market, the industrial and artisanal fisheries have directed their fishing effort towards supplying these markets (Pina and Chaves 2005, Miranda et al. 2006, 2011).Furthermore, these efforts are mainly applied on highly aggregated schools during the reproductive migration (Vieira et al. 2008, Garbin et al. 2014), thus posing a significant threat to conservation and sustainability.
Since 2004 the species has been classified as overexploited by the Brazilian Government (MMA 2004).However, the current management plan for M. liza is controversial due to lack of basic information (Lemos et al. 2016), which is available only in the form of timediscontinuous official landings and unreliable information about the overall fishing effort directed at this resource.Given these multiple sources of uncertainty, it is important to collect all available data, integrating them as effectively as possible with other relevant knowledge so that appropriate biological reference points can be estimated as well as possible.We believe that this strategy currently seems to be the fastest and most feasible path to producing useful recommendations for the management of this important fishing resource in the short term.
This paper assesses the current status of the lebranche mullet southern population distributed along the southern and southeastern shelf regions of Brazil, using hierarchically structured Bayesian state-space models.Given the need to integrate various data sources within a single structure, we divide this goal into three steps: (i) develop a robust hierarchical Bayesian state-space biomass model capable of integrating multiple CPUEs; (ii) use this model to estimate the reference points together with the associated uncertainties; and (iii) pro-vide a modelling structure to make predictions that can help integrate management actions among all fisheries targeting the mullet (Mugil liza).

Model structure
The proposed state-space models were based on the Bayesian approach, considering the integration of uncertainties involved in the latent process dynamic (biomass) and errors associated with the observational component consisting of various CPUE data collected from a multitude of fishery statistics (see Meyer and Millar 1999).
The deterministic component of the process dynamic was defined in discrete time (annual) variation, where the biomass (B t ) at the start of year t, is a known function of the previous year's biomass (B t−1 ) and total catch (C t−1 ), parameterized by the intrinsic population growth rate (r), the average unfished stock size or support capacity (K) and a shape parameter (z).
This equation is known as the Pella and Tomlinson SP model (Pella and Tomlinson 1969).If z=1, the equation reduces to the classic Schaefer (1957) SP model.Parameter (z) determines the level of asymmetry in the SP function, and is interpreted as a measure of densitydependence of the population (Chaloupka and Balazs 2007).A value of z between 0 and 1 (0<z<1) causes the SP to reach its maximum at some biomass below half the support capacity K/2.A z>1 shifts this maximum to values above K/2.For computational convenience and to reduce parameter confounding, the model was reparametrized in terms of relative abundance (B/K=P) (Meyer andMillar 1999, Brodziak andIshmura 2011).This maximizes the efficiency of the MCMC algorithm that will be used to estimate model parameters.Stochastic components µ t were further included as independent, identically distributed (iid) Gaussian random variables with mean zero and process variance σ µ 2 .
( ) The observational component in the state-space model was kept explicit for the CPUE (I ti ) information provided by each fishery i in year t.These fisheries, each with a specific gear and fishing strategy but all exploiting the same stock biomass, were classified in terms of gear (purse seine and bottom gill nets), area of operation (estuaries and sea), and type of labour relation (artisanal and industrial).Hence, it was assumed that all observed CPUEs relate to the same overall stock, B t , but each with a fishery-specific catchability coefficient, q i , for i = 1, 2, ..., F, where F is the total number of fisheries.
( ) A stochastic component was further included with the random quantities v ti which, conditional on B t , are assumed iid Gaussian with mean zero and process variance σ 2 .The lognormal multiplicative structure used in both process and observation models has been used in this form by other authors as well (McAllister and Kirkwood 1998, Meyer and Millar 1999, Brodziak and Ishmura 2011).
All model parameters were estimated using MCMC implemented in BUGS code, using the rjags package (Plummer 2013) in the R statistical computing environment (R Core Team 2015).Reference points for the Schaeffer model were estimated based on Hilborn and Walters (1992) and generalized to the Pella and Tomlinson model by Brodziak and Ishmura (2011).Models were compared with the deviance information criterion (DIC) (Spiegelhalter et al. 2002), with smaller values indicating a better fit.

Data sources
All information considered is composed of data gathered from official and unofficial bulletins published by governmental and private research agencies, as follows: the Research and Management Centre of Fishing Resources in South and Southeast Coastline (CEPSUL/ICMBio), the Federation of Fishermen of Santa Catarina (FEPESC), the Research and Management Centre of Estuary and Lagoon Fishery Resources (CEPERG), the Fishery Institute of São Paulo (IP/SP), the Fishery Studies Group from the University of Vale do Itajaí (UNIVALI/GEP) and the IBAMA Regional Office of Rio de Janeiro (IBAMA/RJ).
The time window for this study covers an overall period of 13 years, from 2000 to 2012.Annual total catches come from different fishing methods in industrial (purse seine and bottom gill net

Prior distributions
A key component in a Bayesian analysis is the inclusion of previous knowledge about model parameters in the form of informed prior distributions.Regarding the SP model parameters, the required priors consist of the probability distribution for the support capac-ity (K), the maximum intrinsic growth rate (r) and the shape parameter (z).Based on recommendations by McAllister and Kirkwood (1998) and Millar and Meyer (2000), vague priors for all parameters of the model were assumed, as shown in Table 1.A vague prior is a probability distribution carrying very little information, which is selected primarily for structural convenience (e.g.Gaussian) to warrant model stability and convergence rather than convey strong influence over the posterior distribution.
The prior distribution for K has its lower limit fixed at 15000 t, since this is close to the historical maximum reported landings of 13375 t.A prior 95% credibility interval ranging from about 15500 to 60400 t (and a mean at 28600 t) covers the range of most plausible support capacity from a biological standpoint.
Similarly, for r, the proposed prior 95% credibility intervals ranges from about 0.6 to 3.9 (with a mean about r=1.7) and covers all reasonable growth rates for this species.The uniform priors for q i only establish the order of magnitude for the relation between CPUEs (measured in tonnes per unit effort) and biomass (measured in thousands of tonnes).Finally, uniform prior distributions are defined for the precision parameters rather than variances (i.e.precision = 1/variance) because this is the parameterization used in JAGS.In all cases, posterior estimates remained well within the prior windows, away from the extremes, suggesting that these prior specifications did not conflict with information provided by the data likelihood.

RESULTS
The Schaefer (1957) and Pella and Tomlinson (1969) SP models were both adjusted to evaluate their predictive performance and sensitivity to the partially informative priors.Overall, both models showed rather similar behaviours, with close values of DIC (∆DIC<2) and with pD changing less than half a unit when the shape parameter z is included (Table 2).Differences this small indicate that the Pella and Tomlinson generalization is unable to substantially improve the model fit.Therefore, we concentrate further descriptions mostly on the conventional Schaeffer model only, but retain both outputs in tables and figures for comparison.
The documented time series of lebranche mullet catches is summarized in Figure 1, which shows an increasing trend until its maximum in 2007, followed by a decreasing trend from then.Analysing the time history of CPUE for all seven fisheries (Fig. 2), we observe a reduction between the first and last reported year in five fisheries (C, D, E, F and G), while two (A and B) show no apparent trend.
In Figures 3 and 4, some convergence diagnostics are displayed to show that acceptable convergence criteria have been met.After some preliminary tests, we settled on running the MCMC algorithm with 4 (250000) parallel chains using a burn-in of 5000 steps to eliminate the influence of starting values.A thinning of 50 steps was chosen to reduce autocorrelation in the posterior sample.
Posterior predictions of CPUE (Fig. 5) help to further evaluate model performance.In line with DIC, the distinction between the two models is minimal and of no practical relevance.In terms of coverage by the predictive 95% credibility intervals, the overall performance is acceptable for all fisheries, but only for E and G is this coverage complete.In all other cases, the models were unable to produce accurate predictions for most extreme data points.
Posterior distributions for model parameters (Fig. 6, Table 3) show strong agreement for the virgin biomass, K, in both models, while the intrinsic growth rate, r, favours somewhat larger values for the Schaeffer model.The Pella and Tomlinson model is incapable of giving a precise estimate for the shape parameter z; however, Catchability Uniform (0, 0.00001) Process error variance Inverse-Uniform (0, 5) Observation error variance Inverse-Uniform (0, 5)     the mode near unity suggests high plausibility for the Schaeffer model.Estimated reference points are obtained in the form of marginal posterior distributions, which are summarized and listed in Table 3.While the support capacity (K) is estimated to be around 30000 t, the maximum sustainable yield (MSY) is close to 7000 t.Estimates are, however, highly uncertain, as can be seen from the wide 95% credibility intervals or high coefficients of variation (CV): 0.36 for K and 0.45 for MSY (Schaeffer model).
To assess the impact of all fisheries combined on the historic evolution of the stock, the time dynamics of the ratios B t /B MSY (Fig. 7) and F t /F MSY (Fig. 8) are of central interest.The first ratio is of concern whenever it falls below one, while the second is of concern when values are above unity.Regarding the biomass ratio (Fig. 7), both models show a decreasing trend from 2007.While posterior biomass was estimated to be below B MSY from 2010 for the Pella and Tomlinson model (posterior means as point estimates), the threshold is estimated to have been crossed only in the last year of the observed series for the Schaeffer model.However, there are wide credibility intervals covering the threshold line in both cases.This indicates high uncertainty and weak statistical evidence to support this conclusion in either of them.Regarding the estimated exploitation rate, both models suggest that in recent years these ratios are lower than one.However, these conclusions are also weak since marginal credibility intervals are wide and do not support conclusive statistical evidence.Hilborn and Walters (1992) alert to the fact that estimates of reference points with biomass dynamic models can be misleading when the available time series of data is statistically uninformative.For CPUE data, lack of information can have multiple sources.For instance, it can originate from short time series and from changes in fishing power over space or time that are not accounted for.In fitting state-space models, an especially difficult feature arises when, over time, fishing data show a steady increase in effort linked to a steady decrease in CPUE.This "one-way-trip" time series is, however, very common with fishing data since this is the usual way in which almost any fishery evolves.

DISCUSSION
Analysing the plots of CPUE versus effort for the seven fisheries (not shown), we might identify weak one-way-trip-type behaviour in fisheries E, F and G, while in fisheries A and B no such pattern is apparent.This evaluation is, of course, quite subjective.It would further be hard to justify among stakeholders why data from some particular fisheries should by ignored in any assessment exercise.
To partially circumvent limitations due to the relatively short time window (13 years) and also to minimize unreliability in estimation due to the one-way-trip phenomenon, we chose to incorporate seven different fisheries simultaneously into a single model.Although all these fisheries explore the same stock, each has its own characteristics, so the impact of some of the unwanted features is expected to be diminished.
For illustrative purposes, let us suppose for a moment that the requirements for a regression-based CPUE standardization among fisheries are not met.Furthermore, let us suppose that a preliminary visualization of all available CPUE series shows a decreasing pattern over time for many, while some are stable and a few others show a reversed increasing pattern.Which CPUE series should we select for assessment?This would be a difficult call to make and to justify politically.
By simply including all CPUEs, as we did in our proposed model, the dominant pattern will eventually drive the fitting process and biomass dynamic predictions.Discrepancies among different CPUE patterns will affect the uncertainties (i.e.posterior variance) for estimated parameters and derived reference points such as MSY.The more discrepancies, the larger posterior variances will become, and vice-versa.All very much in line with what common sense would dictate.
The historic development of lebranche mullet exploitation is summarized with the phase plot in Figure 9, starting with a large biomass (B t >B MSY ) and a low exploitation rate (F t <F MSY ) in year 2000.With the exception of 2001 and 2002, when biomass showed a sudden (unexpected) decrease, the stock continued to display a large biomass until 2009, while the exploitation rate increased steadily.From 2009 to 2012, biomass stayed below B MSY , with the exploitation rate remaining high (F t >F MSY ) until 2012, when it fell below F MSY .This suggests that this reduction in exploitation rate was possibly linked mostly to economic considerations rather than to low stock density.Otherwise, since stock density has been falling, one would expect a still further increase in the exploitation rate in order to sustain catches.The phase plot suggests that care must be taken to guarantee sustainability for the leb-  ranche mullet stock.If low exploitation rates could be maintained for a longer period, one might expect stock biomass to gradually move again towards the region in the graph were it was in the year 2000.Since mullet performs reproductive aggregations at the time of its fishing season, CPUE can remain artificially stable while the stock is in fact being fished down.This phenomenon, known as hyperstability, can misinform about the stock status.It becomes a risk for stock collapse in cases where the density of stock aggregation remains above the economic threshold (Rose andKulka 1999, Erisman et al. 2011).It is important to consider that hyperstability may be occurring when M. liza stock is assessed.For instance, this might be the occurring in a very efficient fishery aiming mostly at high-valued mullet roe but less so in other fisheries targeting the fish for local consumption.The erratic behaviour of CPUE mentioned above for fisheries A and B could be an expression of some level of hyperstability, whereas it is absent from the other fisheries.
Once again, our attempt to prevent hyperstability from contaminating the estimates reinforces the argument for including CPUEs for different fisheries simultaneously into the model.After all, if we combine seven mullet fisheries from various regions, some using different gears, fishing practices and fishing grounds but all exploring the same stock, a Bayesian hierarchical model becomes a very handy tool for integrating them all into a single, consistent structure for estimating biomass time dynamics.By doing so, we expect to be able to produce more robust and reliable reference point estimates to guide fisheries management.In this specific case study, the wide posterior credibility intervals on key reference points indicate that conclusive evidence cannot be drawn yet, although the estimated trends suggest reasons for concern.
) and artisanal (bottom gill net) fisheries.Based on reliability considerations, we only used data from the industrial and artisanal fishery landings monitored in the states of São Paulo and Santa Catarina as follows: A, industrial purse seine fleet from Santa Catarina State; B, Industrial fishery fleets from Santa Catarina State (except purse seine); C, all industrial and artisanal fishery fleet from São Paulo State (except the fleet classified as D); D, all the industrial and artisanal fishery fleet from São Paulo State that operates only off the southeastern and southern coast of Brazil; E, the artisanal fishery fleet from São Paulo State that operates only in estuarine waters (except the fleet classified as G); F, the industrial and artisanal gill net fleet from São Paulo State that operates only off the southeastern and southern coast of Brazil; and G, the Artisanal gill net fleet from São Paulo State that operates only in estuarine waters.

Fig. 1 .
Fig. 1. -Time series of mullet (Mugil liza) total landings off the southeastern and southern coast of Brazil.

Fig. 2 .
Fig. 2. -CPUE time series of the mullet (Mugil liza) from different fisheries operating off the southeastern and southern coast of Brazil.A, industrial purse seine fleet from Santa Catarina State (t/fishing day); B, industrial fishery fleets from Santa Catarina State (except purse seine) (t/fishing trip); C, all industrial and artisanal fishery fleet from São Paulo State (except fleet classified as D) (t/fishing trip); D, all industrial and artisanal fishing fleets that operated in the coastal zone and landed on the coast of São Paulo (t/fishing trip); E, all artisanal fishing fleets that operated and landed only in estuarine waters from São Paulo State (t/fishing trip); F, industrial and artisanal fishing fleet of gillnets that operated in the coastal zone and landed on the coast of São Paulo (t/fishing day); G, artisanal fishing fleet of gillnets that operated and landed only in estuarine waters of São Paulo State (t/fishing day).

Fig. 3 .
Fig. 3. -Trace plot diagnostic for assessing convergence of MCMC Chains for the two Bayesian state-space surplus production models.Top, Pella-Tomlinson model; bottom, Schaeffer model.

Fig. 4 .
Fig. 4. -Gelman plot diagnostic for assessing convergence of MCMC Chains for the two Bayesian state-space surplus production models.Top, Pella-Tomlinson model; bottom, Schaeffer model.

Fig. 7 .
Fig. 7. -Time series of ratios between estimated biomass and biomass at maximum sustainable yield (B t /B MSY ) for two Bayesian state-space surplus production models.Left, Pella-Tomlinson model; Right, Schaeffer model.

Fig. 8 .
Fig. 8. -Time series of ratios between estimated exploitation rates and exploitation rate at maximum sustainable yield (F t /F MSY ) for two Bayesian state-space surplus production models.Left, Pella-Tomlinson model; Right, Schaeffer model.

Fig. 9 .
Fig. 9. -Phase plot for the mullet (Mugil liza) fishery off the southeastern and southern coast of Brazil, considering only the Pella-Tomlinson model and the seven fisheries simultaneously.

Table 1 .
-Summary of prior distributions for all model parameters in the Schaeffer and Pella-Tomlinson Bayesian state-space surplus production models.

Table 2 .
-Deviance information criteria (DIC) and effective number of estimated parameters (pD) for the Bayesian state-space surplus production models

Table 3 .
-Posterior mean, standard deviation and selected percentiles of parameters and management reference points for the Bayesian statespace surplus production models.