Skip to contents

Morefi Morefi website

Shield: CC BY 4.0

Morefi © 2024 by Hugo Aguirre Villaseñor is licensed under a Creative Commons Attribution 4.0 International License.

CC BY 4.0

Morefi

The Morefi package: Morphological Relationships Fitted by Robust Regression. It is a methodological package developed in R to analyze the submitted article:

Aguirre-Villaseñor, H., Morales-Bojórquez, E., Cisneros-Mata (FISH13711). Biometric relationships as a fisheries management tool: A case study of the bullseye puffer (Sphoeroides annulatus. Tetraodontidae). Fisheries Research.

In fisheries monitoring, body length is the most commonly measured parameter because it is quick and easy to obtain. In contrast, measuring weight requires a level and stable scale, which can be difficult to secure in field sampling. Biometric relationships are crucial in fisheries biology. When accurately calculated, these relationships can be very useful for management purposes, especially for estimating an organism’s total length or weight based on other body measurements.

Many species are marketed through artisanal fishing in various commercial forms. However, there are currently no biometric relationships that allow for predicting live weight (the total weight of the fish) from the different categories of landed weight, such as fillet weight, gutted weight, or frozen weight.

The objective of this package is to provide quantitative models for various morphological relationships that help predict: a) the expected live weight of different landed weight categories, b) the expected fillet yield from various commercial presentations, and c) testing the suitability of fillet yield as a reference point in managing the target species.

For this purpose, some functions and a vignette were created to explain the process step by step. Its implementation streamlines the methodology and enhances the clarity and impact of both the results and graphical presentations (tables and figures are personalized).

The functions included in the Morefi package enable the evaluation of length-length, weight-weight and length-weight biometric relationships using data that exhibit high variability and do not meet the assumptions necessary for adjustment via least squares. Given this variability, a robust regression method is employed for analysis. The “robustbase” package (version 0.99-2) was utilized to fit robust regression models, using the functions lmrob() for linear and nlrob() for non-linear regression (Maechler et al., 2024).

Installation

You can install the development version of Morefi from GitHub using one of the following options:

Using the pak package

# install.packages("pak")
pak::pak("Macrurido/Morefi")

or using the devtools package

library(devtools)
install_github("Macrurido/Morefi")

Data available in the package

Bullseye puffer measures

To demonstrate how the package functions, we utilize the dataset botete, containing 1,397 fish across 7 variables: the total length(LT), standard length (SL), body trunk length (LB), total weight (WT), body trunk weight (WB), and fillet weight (Wfi) of the bullseye puffer (Sphoeroides annulatus), collected from the Eastern Central Pacific. In this dataset, the landing category is included in the “Fleet” variable, which is categorized as follows: 1 indicates Fresh, while 2 denotes Frozen-thawed.

To access the data file, the data frame is stored in an object, such as ‘mydata’.

mydata <- Morefi::Botete

Bullseye puffer fish landings

A second dataset Botete_land, provided the Mexican fishing records of bullseye puffer landed on the Pacific coast in 2023 and their live weight corresponding live weight for each weight category (kg): total (WT), body trunk (WB) or fillet (Wfi), either Fresh or Frozen-thawed (SIPESCA, 2024).

To access the data file, the data frame is stored in catch.

catch <- Morefi::Botete_land

Morefi functions

The Morefi package includes functions that facilitate data analysis and ensure reproducibility of results.

fn_ARSS

The function fn_ARSS perform the Coincident Curves Test, to determine if there are significant differences between the fitted curves for each database. It is based on the Analysis of the Residual Sum of Squares (ARSS) (Chen et al. 1992).

F=RSSpRSSs3(K1)RSSsN3KF=\frac{\frac{RSS_{p}-RSS_{s}}{3\bullet \left( K-1 \right)}}{\frac{RSS_{s}}{N-3\bullet K}}

RSSp=RSSRSS_{p}= RSS of each regression fitted by pooled data, RSSsRSS_{s}= sum of the RSSRSS of each regression fitted for each individual sample, NN= total sample size, and KK = number of samples in the comparison.

The residual sum of squares RSSRSS and the degrees of freedom DFDF for each fitted regression are previously stored in the List_TCCT list. For each regression, the calculations are stored in a data frame T1, which is stored iteratively using a loop for in a list T.

Inside the function, the RSS and DF for the joined sample are calculated to perform the F test for two tails α/2\alpha/2. The decision criteria is performed: “*” if pvalueαp-value\le \alpha or “NS” if the pval>αp-val\gt \alpha.

The function requires defining:

  • List_TCCT: A list with fitted regression results.
  • i: An integer value indicating the ith regression analyzed.
  • alfa: A numerical value that defines the significance level. The default number is 0.05.

Examples

The Total length (LT) - Total weight (WT) was estimated for the bullseye puffer Sphoeroides annulatus for landed categories: Fresh, Frozen-thawed (Frozen), Total (All sample) and Joined (sum of values of Fresh and Frozen). The Residual Sum of Squares (RSS) and the degrees of freedom (DF) are provided for each data source. In the table the first row displays the Analysis of Residual Sum of Squares (ARSS), the p-value (p), and the decision criteria for the ARSS test (Criteria).

The adjusted models show the following data: Fresh SSR= and DF= 742; Frozen SSR= 1280131.81 and DF= 651; and the total sample SSR= 6115874.53 and DF= 1395. Values are stored in the table Table_CC, this is stored in a list, and the name of each item is built with the acronyms of the model variables (e.g. LTWT).

Table_CC <- data.frame(matrix(NA,nrow=4,ncol=8))
Table_CC[1,1] <- "Lt-WT" 
Table_CC[,2] <- c("Fresh","Frozen","Total","Joined")
colnames(Table_CC) <- c("Model","Category","RSS","DF","ARSS","F-table","p-value","Criteria")

Table_CC[1,3] <- 4424418.33
Table_CC[1,4] <-  742 
Table_CC[2,3] <- 1280131.81
Table_CC[2,4] <-  651 
Table_CC[3,3] <- 6115874.53
Table_CC[3,4] <-  1395 

List_ARSS <- list(LTWT=Table_CC)

i <- 1

ARSS <- fn_ARSS(List_ARSS, i,  alfa= 0.05)
Table of the Coincident Curve Test.
Model Category RSS DF ARSS F-table p-value Criteria
LT vs. WT Fresh 4424418 742 0.0721 1.0921 0.05007 NS
NA Frozen 1280132 651 NA NA NA NA
NA Total 6115875 1395 NA NA NA NA
NA Joined 5704550 1393 NA NA NA NA

fn_dfa

This function uses the augment() function from the broom package to extract the observed values of the independent variable (x) and dependent variable (y), along with the weights (wi), fitted values (fitt), and residuals (ei) from the summary of the fitted model. It then turns these components into tidy tibbles.

The function augment() does not provide the weights column for the lmrob() function. The function fn_dfa() contains a conditional statement that includes this variable in the output data frame of the linear adjustments.

In order to homogenizes the results, the columns names were renamed as “y”, “wi”,“x”,“fitt”,and “ei”.

An additional column has been included that codes errors using a scale based on weighted values: unweighted (u), weighted (w), and outliers (o) dfa$scale <- ifelse(df$wi < 0.25, "o", ifelse(df$wi<1, "w", "u")).

The function requires defining:

  • eq: Summary of the equation fitted.

Examples

For the bullseye puffer Sphoeroides annulatus, the length-weight relationship is modeled using the robustbase::nlm function, with an ordinate input value of a = 0.1 and a slope of b = 3. The data frame mydata is sourced from the Morefi::botete package.

df <- mydata[,c(1,4)]
colnames(df) <- c("x1", "y1")

a=0.01
b=3

eq <- robustbase::nlrob(y1 ~ a*x1^b, data= df,
                        start = list(a= a, b= b),
                        trace = FALSE)

dfa <- fn_dfa(eq)
The initial six rows of the table are displayed.
y wi x fitt ei
189.1 1.0000000 21.0 209.81549 -20.715491
213.4 1.0000000 22.0 242.19890 -28.798897
33.4 1.0000000 12.0 37.32321 -3.923211
236.7 1.0000000 21.0 209.81549 26.884508
36.0 1.0000000 12.0 37.32321 -1.323211
457.0 0.5098231 28.4 532.50610 -75.506104

fn_dfa:

fn_fig_cs: This function creates an individual plot displaying the fitted model alongside the observed values, which are colored according to a weighted color scale.

fn_fig_e: The function creates a graph that displays residuals on the vertical axis and either the independent variable or predicted values on the horizontal axis, as determined by the researcher. The residuals are color-coded using a weighted scale.

fn_fig_fw: The fitted values of the models for a landed presentation category were displayed as a multi-panel plot. The observed data points for each fitted relationship were categorized according to a weighted color scale.

fn_fig_w: The residual structure was analyzed by graphing
residuals against weighted values. A custom multi-panel plot illustrates the structure of each fitted relationship, categorized by a color-weighted scale of values.

fn_figs: Creates a customized scatter plot with the observed values (points), fitted regression (solid line), and its confidence interval (shaded area).

fn_freq: The function calculates a frequency distribution of data.

fn_freqw This is a methodological function to calculate the percentage frequencies of weights by model adjusted using the robust regression approach.

fn_fyield: Calculates the fillet yield by dividing a defined weight reference point bm by the mean, lower, and upper confidence interval of 95% (IC95%) of the estimated fillet weight, respectively.

fn_intervals: Calculates a non-parametric confidence and predicted intervals using the function predFit() from the package investr (version 1.4.2).

fn_R2RV: Calculates a robust version of the coefficient of determination RRV2R^{2}_{RV} (Renaud & Victoria-Feser, 2010).

fn_summary: Customizes and stores the summary of each fitted regression.

fn_Wlive: The live weight, which is the total weight of an organism, is estimated based on the weights of different landing categories, such as eviscerated weight and fillet weight. These estimates are derived using regression parameters that relate total weight to the weight of each landing category.

fn_xseq: Generates a data frame with a sequence for independent variables (including minimum and maximum values) and selects it according to the model.

More details

For more information, please refer to the respective vignettes, which offer detailed descriptions of each function, its operations, and examples.

Example

To demonstrate how the package works, please follow the step-by-step process outlined in the “Morefi_steps” vignette, which reconstructs the results presented in the article by Aguirre-Villaseñor et al. (FISH13711).

References

Aguirre-Villaseñor, H., Morales-Bojórquez, E., Cisneros-Mata (FISH13711). Biometric relationships as a fisheries management tool: A case study of the bullseye puffer (Sphoeroides annulatus. Tetraodontidae). Fisheries Research.

Chen, Y., Jackson, D. A., Harvey, H. H. 1992. A comparison of von Bertalanffy and polynomial functions in modelling fish growth data. Canadian Journal of Fisheries and Aquatic Sciences 49(6): 1228–1235. https://doi.org/10.1139/f92-13

Maechler M, Rousseeuw P, Croux C, Todorov V, Ruckstuhl A, Salibian-Barrera M, Verbeke T, Koller M, Conceicao EL, Anna di Palma M (2024). robustbase: Basic Robust Statistics. R package version 0.99-4-1, http://robustbase.r-forge.r-project.org/.

Renaud, O., Victoria-Feser, M. P. (2010). A robust coefficient of determination for regression. Journal of Statistical Planning and Inference. 140(7), 1852-1862. doi: 10.1016/j.jspi.2010.01.008.

SIPESCA. 2024. Sistema de Información de Pesca y Acuacultura – SIPESCA. Comisión Nacional de Pesca y Acuacultura. https://sipesca.conapesca.gob.mx (accessed 7 February 2024).