
Morefi
25 June 2025
Morefi © 2024 by Hugo Aguirre Villaseñor is licensed under a Creative Commons Attribution 4.0 International License.
Morefi (Morphological Relationships Fitted by Robust Regression) is a methodological R package developed to carry out the analyses in the submitted article:
Aguirre-Villaseñor, H., Morales-Bojórquez, E., Cisneros-Mata (FISH13711). Biometric relationships as a fisheries management tool: A case study of the bullseye puffer (Sphoeroides annulatus. Tetraodontidae). Fisheries Research.
In fisheries monitoring, body length is the most commonly measured parameter because it is quick and easy to obtain. In contrast, measuring weight requires a level and stable scale, which can be difficult to secure in field sampling. Biometric relationships are crucial in fisheries biology. When accurately calculated, these relationships can be very useful for management purposes, especially for estimating an organism’s total length or weight based on other body measurements.
Many species are marketed through artisanal fishing in various commercial forms. However, there are currently no biometric relationships that allow for predicting live weight (the total weight of the fish) from the different categories of landed weight, such as fillet weight, gutted weight, or frozen weight.
The objective of this package is to provide quantitative models for various morphological relationships that help to: a) predict the expected live weight from different landed weight categories, b) predict the expected fillet yield from various commercial presentations, and c) test the suitability of fillet yield as a reference point in managing the target species.
For this purpose, a set of functions and a vignette were created to explain the process step by step. Their implementation streamlines the methodology and enhances the clarity and impact of both the results and the graphical presentations (tables and figures are customized).
The functions included in the Morefi package enable the evaluation of length-length, weight-weight and length-weight biometric relationships using data that exhibit high variability and do not meet the assumptions required for least-squares fitting. Given this variability, a robust regression method is employed for the analysis. The “robustbase” package (version 0.99-2) was used to fit the robust regression models, with the functions lmrob() for linear and nlrob() for non-linear regression (Maechler et al., 2024).
Installation
You can install the development version of Morefi from GitHub using one of the following options:
Using the pak package
# install.packages("pak")
pak::pak("Macrurido/Morefi")
or using the devtools package
# install.packages("devtools")
devtools::install_github("Macrurido/Morefi")
Data available in the package
Bullseye puffer measures
To demonstrate how the package works, we use the dataset `Botete`, which contains 1,397 fish and 7 variables: the total length (LT), standard length (SL), body trunk length (LB), total weight (WT), body trunk weight (WB), and fillet weight (Wfi) of the bullseye puffer (Sphoeroides annulatus), collected in the Eastern Central Pacific. The landing category is recorded in the “Fleet” variable, coded as 1 for Fresh and 2 for Frozen-thawed.
To access the data file, the data frame is stored in an object, such as ‘mydata’.
mydata <- Morefi::Botete
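Because the landing category is stored as a numeric code, it can help to attach labels before tabulating or plotting. The following is a minimal sketch on a made-up vector: the object names `fleet`, `fleet_lab` and `counts` are illustrative, and only the 1/2 coding comes from the dataset description.

```r
# Hypothetical Fleet codes for six fish: 1 = Fresh, 2 = Frozen-thawed
fleet <- c(1, 2, 2, 1, 1, 2)

# Attach readable labels to the numeric codes
fleet_lab <- factor(fleet, levels = c(1, 2),
                    labels = c("Fresh", "Frozen-thawed"))

# Number of fish in each landing category
counts <- table(fleet_lab)
counts
```

With the real data, the same call would be applied to the Fleet column of `mydata` instead of the toy vector.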
Bullseye puffer fish landings
A second dataset, `Botete_land`, provides the Mexican fishing records of bullseye puffer landed on the Pacific coast in 2023 and the corresponding live weight for each weight category (kg): total (WT), body trunk (WB), or fillet (Wfi), either Fresh or Frozen-thawed (SIPESCA, 2024).
To access the data file, the data frame is stored in `catch`.
catch <- Morefi::Botete_land
Morefi functions
The Morefi package includes functions that facilitate data analysis and ensure reproducibility of results.
fn_ARSS
The function `fn_ARSS` performs the Coincident Curves Test to determine whether there are significant differences between the curves fitted to each dataset. It is based on the Analysis of the Residual Sum of Squares (ARSS) (Chen et al. 1992).
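The test statistic compares $RSS_p$, the RSS of the regression fitted to the pooled data, with $RSS_s$, the sum of the RSS of the regressions fitted to each individual sample, taking into account $N$, the total sample size, and $K$, the number of samples in the comparison.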
The residual sum of squares (RSS) and the degrees of freedom (DF) of each fitted regression are stored beforehand in the `List_TCCT` list. For each regression, the calculations are stored in a data frame `T1`, which is in turn stored iteratively in a list `T` using a `for` loop.
Inside the function, the RSS and DF for the joined sample are calculated to perform the two-tailed F test. The decision criterion is then assigned: “*” if the p-value is below `alfa`, or “NS” otherwise.
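The bookkeeping for the joined sample can be sketched in a few lines of base R. This is an illustration only, not the body of `fn_ARSS`: it assumes the joined RSS and DF are the sums of the per-sample values (which agrees with the Joined row of the example table in this section), and the helper `criteria_label` and its `p < alfa` rule are hypothetical.

```r
# RSS and DF of the regressions fitted separately to each sample
rss <- c(Fresh = 4424418.33, Frozen = 1280131.81)
df  <- c(Fresh = 742, Frozen = 651)

# Joined sample: sums over the individual fits (assumption, see lead-in)
rss_joined <- sum(rss)
df_joined  <- sum(df)

# Hypothetical helper: label p-values against the significance level alfa
criteria_label <- function(p, alfa = 0.05) ifelse(p < alfa, "*", "NS")

rss_joined
df_joined
criteria_label(c(0.001, 0.05007))
```

The joined values are then compared with the RSS and DF of the regression fitted to the total sample to obtain the F statistic.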
The function requires defining:
- `List_TCCT`: A list with fitted regression results.
- `i`: An integer value indicating the ith regression analyzed.
- `alfa`: A numerical value that defines the significance level. The default is 0.05.
Examples
The Total length (LT) - Total weight (WT) relationship was estimated for the bullseye puffer Sphoeroides annulatus for the landed categories Fresh, Frozen-thawed (Frozen), Total (all samples) and Joined (sum of the Fresh and Frozen values). The Residual Sum of Squares (RSS) and the degrees of freedom (DF) are provided for each data source. The first row of the table displays the ARSS statistic (ARSS), the p-value (p), and the decision criterion of the ARSS test (Criteria).
The fitted models give the following values: Fresh RSS = 4424418.33 and DF = 742; Frozen RSS = 1280131.81 and DF = 651; and the total sample RSS = 6115874.53 and DF = 1395. The values are stored in the table `Table_CC`, which is in turn stored in a list; the name of each list item is built from the acronyms of the model variables (e.g. LTWT).
# Empty 4 x 8 table: one row per category
Table_CC <- data.frame(matrix(NA, nrow = 4, ncol = 8))
colnames(Table_CC) <- c("Model", "Category", "RSS", "DF", "ARSS", "F-table", "p-value", "Criteria")
Table_CC[1, 1] <- "Lt-WT"
Table_CC[, 2] <- c("Fresh", "Frozen", "Total", "Joined")
# RSS and DF of the Fresh, Frozen and Total fits (the Joined row is filled by fn_ARSS)
Table_CC[1:3, 3] <- c(4424418.33, 1280131.81, 6115874.53)
Table_CC[1:3, 4] <- c(742, 651, 1395)
# Store the table in a list named after the model variables
List_ARSS <- list(LTWT = Table_CC)
# Index of the regression to analyze
i <- 1
ARSS <- Morefi::fn_ARSS(List_ARSS, i, alfa = 0.05)
Model | Category | RSS | DF | ARSS | F-table | p-value | Criteria |
---|---|---|---|---|---|---|---|
LT vs. WT | Fresh | 4424418 | 742 | 0.0721 | 1.0921 | 0.05007 | NS |
NA | Frozen | 1280132 | 651 | NA | NA | NA | NA |
NA | Total | 6115875 | 1395 | NA | NA | NA | NA |
NA | Joined | 5704550 | 1393 | NA | NA | NA | NA |
fn_dfa
This function uses the `augment()` function from the `broom` package to extract the observed values of the independent variable (x) and dependent variable (y), along with the weights (wi), fitted values (fitt), and residuals (ei) from the summary of the fitted model. It then turns these components into tidy tibbles.
The function `augment()` does not return the weights column for models fitted with `lmrob()`. The function `fn_dfa()` therefore contains a conditional statement that adds this variable to the output data frame for linear fits.
To homogenize the results, the columns are renamed as “y”, “wi”, “x”, “fitt”, and “ei”.
An additional column, `scale`, codes each observation according to its weight: unweighted (u, wi = 1), weighted (w, 0.25 ≤ wi < 1), and outliers (o, wi < 0.25):
dfa$scale <- ifelse(dfa$wi < 0.25, "o", ifelse(dfa$wi < 1, "w", "u"))
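The coding rule can be tried on a small vector of weights without fitting a model. This is a minimal sketch that assumes the same thresholds as the `ifelse()` call above; the weights in `wi` are made up, and the percentage table only illustrates the kind of summary a function such as `fn_freqw` reports, not its actual code.

```r
# Hypothetical robustness weights, as they would appear in the `wi` column
wi <- c(1, 1, 0.5098231, 0.20, 0.9, 1, 0.05, 0.25)

# wi < 0.25 -> outlier ("o"); 0.25 <= wi < 1 -> weighted ("w"); wi = 1 -> unweighted ("u")
scale <- ifelse(wi < 0.25, "o", ifelse(wi < 1, "w", "u"))

# Percentage of observations in each class, in a fixed class order
pct <- round(100 * prop.table(table(factor(scale, levels = c("u", "w", "o")))), 1)

scale
pct
```

Note that a weight of exactly 0.25 falls in the weighted class, because the outlier test is a strict inequality.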
The function requires defining:
- `eq`: Summary of the equation fitted.
Examples
For the bullseye puffer Sphoeroides annulatus, the length-weight relationship is modeled using the `robustbase::nlrob` function, with starting values of a = 0.01 for the coefficient and b = 3 for the exponent. The data frame `mydata` is taken from the `Morefi::Botete` dataset.
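# Columns 1 and 4: the length and weight used in this example
df <- mydata[, c(1, 4)]
colnames(df) <- c("x1", "y1")
# Starting values for the power model y1 = a * x1^b
a <- 0.01
b <- 3
# Robust non-linear fit
eq <- robustbase::nlrob(y1 ~ a * x1^b, data = df,
                        start = list(a = a, b = b),
                        trace = FALSE)
# Tidy data frame with the columns y, wi, x, fitt and ei
dfa <- Morefi::fn_dfa(eq)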
y | wi | x | fitt | ei |
---|---|---|---|---|
189.1 | 1.0000000 | 21.0 | 209.81549 | -20.715491 |
213.4 | 1.0000000 | 22.0 | 242.19890 | -28.798897 |
33.4 | 1.0000000 | 12.0 | 37.32321 | -3.923211 |
236.7 | 1.0000000 | 21.0 | 209.81549 | 26.884508 |
36.0 | 1.0000000 | 12.0 | 37.32321 | -1.323211 |
457.0 | 0.5098231 | 28.4 | 532.50610 | -75.506104 |
- `fn_fig_cs`: Creates an individual plot displaying the fitted model alongside the observed values, which are colored according to a weighted color scale.
- `fn_fig_e`: Creates a graph that displays residuals on the vertical axis and either the independent variable or the predicted values on the horizontal axis, as chosen by the researcher. The residuals are color-coded using a weighted scale.
- `fn_fig_fw`: Displays the fitted values of the models for a landed presentation category as a multi-panel plot. The observed data points of each fitted relationship are colored according to a weighted color scale.
- `fn_fig_w`: Shows the residual structure by plotting residuals against weighted values. A custom multi-panel plot illustrates the structure of each fitted relationship, categorized by a weighted color scale.
- `fn_figs`: Creates a customized scatter plot with the observed values (points), the fitted regression (solid line), and its confidence interval (shaded area).
- `fn_freq`: Calculates a frequency distribution of the data.
- `fn_freqw`: A methodological function that calculates the percentage frequencies of weights for each model fitted using the robust regression approach.
- `fn_fyield`: Calculates the fillet yield by dividing a defined weight reference point bm by the mean and by the lower and upper limits of the 95% confidence interval (95% CI) of the estimated fillet weight, respectively.
- `fn_intervals`: Calculates non-parametric confidence and prediction intervals using the function predFit() from the package investr (version 1.4.2).
- `fn_R2RV`: Calculates a robust version of the coefficient of determination (Renaud & Victoria-Feser, 2010).
- `fn_summary`: Customizes and stores the summary of each fitted regression.
- `fn_Wlive`: Estimates the live weight, which is the total weight of an organism, from the weights of different landing categories, such as eviscerated weight and fillet weight. The estimates are derived using regression parameters that relate total weight to the weight of each landing category.
- `fn_xseq`: Generates a data frame with a sequence for the independent variables (including minimum and maximum values) and selects the sequence that corresponds to the model.
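To give an idea of the kind of quantity a robust coefficient of determination measures, the weighted squared-correlation form proposed by Renaud & Victoria-Feser (2010) can be sketched in base R. This is written from the published formula and is not taken from the Morefi source, so `fn_R2RV` may differ (for example, by applying a consistency correction); the function name `r2_weighted`, the toy data and the weights are hypothetical.

```r
# Weighted R^2: squared weighted correlation between observed and fitted values
r2_weighted <- function(y, fitted, w) {
  ybar <- sum(w * y) / sum(w)            # weighted mean of the observations
  fbar <- sum(w * fitted) / sum(w)       # weighted mean of the fitted values
  num  <- sum(w * (y - ybar) * (fitted - fbar))^2
  den  <- sum(w * (y - ybar)^2) * sum(w * (fitted - fbar)^2)
  num / den
}

# Toy data with one aberrant last observation
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8, 30)
fit <- lm(y ~ x)

# With unit weights the expression reduces to the ordinary R^2
r2_unit <- r2_weighted(y, fitted(fit), rep(1, length(y)))

# Down-weighting the last point (made-up weight) changes the value
r2_down <- r2_weighted(y, fitted(fit), c(1, 1, 1, 1, 1, 0.1))
```

In practice the weights would be the robustness weights of the fitted robust regression, such as the `wi` column returned by `fn_dfa`.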
More details
For more information, please refer to the respective vignettes, which offer detailed descriptions of each function, its operations, and examples.
Example
To demonstrate how the package works, please follow the step-by-step process outlined in the “Morefi_steps” vignette, which reconstructs the results presented in the article by Aguirre-Villaseñor et al. (FISH13711).
References
Aguirre-Villaseñor, H., Morales-Bojórquez, E., Cisneros-Mata (FISH13711). Biometric relationships as a fisheries management tool: A case study of the bullseye puffer (Sphoeroides annulatus. Tetraodontidae). Fisheries Research.
Chen, Y., Jackson, D. A., Harvey, H. H. 1992. A comparison of von Bertalanffy and polynomial functions in modelling fish growth data. Canadian Journal of Fisheries and Aquatic Sciences 49(6): 1228–1235. https://doi.org/10.1139/f92-13
Maechler M, Rousseeuw P, Croux C, Todorov V, Ruckstuhl A, Salibian-Barrera M, Verbeke T, Koller M, Conceicao EL, Anna di Palma M (2024). robustbase: Basic Robust Statistics. R package version 0.99-4-1, http://robustbase.r-forge.r-project.org/.
Renaud, O., Victoria-Feser, M. P. (2010). A robust coefficient of determination for regression. Journal of Statistical Planning and Inference. 140(7), 1852-1862. doi: 10.1016/j.jspi.2010.01.008.
SIPESCA. 2024. Sistema de Información de Pesca y Acuacultura – SIPESCA. Comisión Nacional de Pesca y Acuacultura. https://sipesca.conapesca.gob.mx (accessed 7 February 2024).