Modelos lineales mix

Author

Jefferson Ivan Lalangui

Pasos para análisis de datos

1.Crear mi diseño experimental => modelo estadístico 2.Colectar los datos 3.Importar y verificar los factores. 4.Análisis de Variancia (aov() o lm()) 5.Comparación de medias 6.Gráfico

Librerias

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.1
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(googlesheets4)
library(lme4)

Cargando paquete requerido: Matrix

Adjuntando el paquete: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

library(emmeans)

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

Cargar base de datos

https://docs.google.com/spreadsheets/d/15r7ZwcZZHbEgltlF6gSFvCTFA-CFzVBWwg3mFlRyKPs/edit?gid=1263018298#gid=1263018298

url <- "https://docs.google.com/spreadsheets/d/15r7ZwcZZHbEgltlF6gSFvCTFA-CFzVBWwg3mFlRyKPs/edit?gid=172957346#gid=172957346"

gs <- url %>% 
  as_sheets_id()

fb <- gs %>% 
  range_read("fb") %>% 
  mutate(across(1:bloque, ~as.factor(.)))

! Using an auto-discovered, cached token.

  To suppress this message, modify your code or options to clearly consent to
  the use of a cached token.

  See gargle's "Non-interactive auth" vignette for more details:

  <https://gargle.r-lib.org/articles/non-interactive-auth.html>

ℹ The googlesheets4 package is using a cached token for
  '7636340121@untrm.edu.pe'.

✔ Reading from "LA MOLINA 2014 POTATO WUE (FB)".

✔ Range ''fb''.

str(fb)

tibble [150 × 18] (S3: tbl_df/tbl/data.frame)
 $ riego  : Factor w/ 2 levels "irrigado","sequia": 2 2 1 2 1 1 1 1 2 2 ...
 $ geno   : Factor w/ 15 levels "G01","G02","G03",..: 1 2 1 2 3 4 1 5 6 5 ...
 $ block  : Factor w/ 5 levels "1","2","3","4",..: 2 4 3 1 2 5 1 4 2 1 ...
 $ bloque : Factor w/ 5 levels "I","II","III",..: 2 4 3 1 2 5 1 4 2 1 ...
 $ spad_29: num [1:150] 56.3 52.7 49.2 55.5 58.2 43.5 57.4 56.1 61 60.3 ...
 $ spad_83: num [1:150] 41.1 47.9 41.6 44.2 32.6 37.8 42.5 35.9 57.5 41.8 ...
 $ rwc_84 : num [1:150] 61.5 63.2 67.7 64.9 74.5 ...
 $ op_84  : num [1:150] -2.43 -3.03 -2.5 -2.4 -2.27 ...
 $ leafdw : num [1:150] 13.28 9.42 18.22 8.84 14.55 ...
 $ stemdw : num [1:150] 14.87 8.63 24.19 6.58 12.63 ...
 $ rootdw : num [1:150] 3.83 2.1 3.16 2 1.83 2.83 2.28 3.65 4.04 4.17 ...
 $ tubdw  : num [1:150] 19.8 17.7 38 13.5 51.1 ...
 $ biomdw : num [1:150] 51.8 37.8 83.6 30.9 80.2 ...
 $ hi     : num [1:150] 0.45 0.43 0.455 0.437 0.638 ...
 $ ttrans : num [1:150] 4.5 3.54 8.39 2.9 7.37 ...
 $ wue    : num [1:150] 11.51 10.69 9.97 10.65 10.88 ...
 $ twue   : num [1:150] 4.4 4.99 4.53 4.65 6.94 ...
 $ lfa    : num [1:150] 2900 2619 7579 2450 5413 ...

Modelo

\[ tubdw = \mu + bloque + riego + geno + riego*geno + e\]

Analisis de variancia

md <- lmer(formula = twue ~ 0 + (1|bloque) + riego*geno
           , data = fb)

anova(md)

Analysis of Variance Table
           npar Sum Sq Mean Sq  F value
riego         2 497.54 248.769 367.2518
geno         14 412.99  29.500  43.5494
riego:geno   14  16.06   1.147   1.6933

library(car)

Cargando paquete requerido: carData


Adjuntando el paquete: 'car'

The following object is masked from 'package:dplyr':

    recode

The following object is masked from 'package:purrr':

    some

Anova(md)

Analysis of Deviance Table (Type II Wald chisquare tests)

Response: twue
             Chisq Df Pr(>Chisq)    
riego      202.675  2     <2e-16 ***
geno       609.692 14     <2e-16 ***
riego:geno  23.706 14     0.0497 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#> chatgtp completo

library(lme4)
library(emmeans)
library(multcomp)

Cargando paquete requerido: mvtnorm

Cargando paquete requerido: survival

Cargando paquete requerido: TH.data

Cargando paquete requerido: MASS


Adjuntando el paquete: 'MASS'

The following object is masked from 'package:dplyr':

    select


Adjuntando el paquete: 'TH.data'

The following object is masked from 'package:MASS':

    geyser

library(multcompView)
library(ggplot2)
library(dplyr)

# Modelo mixto
md <- lmer(twue ~ 0 + (1|bloque) + riego*geno, data = fb)

# ANOVA tipo II o III según interés
library(car)
Anova(md, type = 3)

Analysis of Deviance Table (Type III Wald chisquare tests)

Response: twue
             Chisq Df Pr(>Chisq)    
riego      199.999  2     <2e-16 ***
geno       261.502 14     <2e-16 ***
riego:geno  23.706 14     0.0497 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Medias ajustadas para la interacción riego:geno
emm <- emmeans(md, ~ riego:geno)

# Comparación de medias con ajuste de Tukey
comp <- pairs(emm, adjust = "tukey")

# Generación de letras de significancia
letras <- multcomp::cld(emm, adjust = "tukey", Letters = letters)

Note: adjust = "tukey" was changed to "sidak"
because "tukey" is only appropriate for one set of pairwise comparisons

letras

 riego    geno emmean    SE   df lower.CL upper.CL .group           
 sequia   G06    1.52 0.409 58.3    0.171     2.86  a               
 irrigado G06    2.11 0.409 58.3    0.760     3.45  ab              
 irrigado G13    2.91 0.409 58.3    1.566     4.26  abc             
 sequia   G08    3.31 0.409 58.3    1.960     4.65  abcd            
 sequia   G13    3.42 0.409 58.3    2.070     4.76  abcde           
 irrigado G08    3.77 0.409 58.3    2.428     5.12   bcdef          
 irrigado G14    3.96 0.409 58.3    2.610     5.30   bcdefg         
 sequia   G12    3.98 0.409 58.3    2.630     5.32   bcdefg         
 irrigado G02    4.01 0.409 58.3    2.664     5.36   bcdefg         
 sequia   G02    4.28 0.409 58.3    2.930     5.62    cdefgh        
 irrigado G01    4.31 0.409 58.3    2.968     5.66    cdefgh        
 irrigado G12    4.57 0.409 58.3    3.226     5.92    cdefghi       
 sequia   G01    4.61 0.409 58.3    3.265     5.96    cdefghi       
 sequia   G14    4.63 0.409 58.3    3.284     5.98    cdefghij      
 irrigado G10    5.07 0.409 58.3    3.727     6.42     defghijk     
 sequia   G05    5.27 0.409 58.3    3.929     6.62     defghijkl    
 irrigado G05    5.35 0.409 58.3    4.004     6.70      efghijklm   
 sequia   G04    5.42 0.409 58.3    4.073     6.77       fghijklm   
 irrigado G04    5.52 0.409 58.3    4.174     6.87       fghijklmn  
 sequia   G09    5.78 0.409 58.3    4.431     7.12        ghijklmno 
 irrigado G09    6.13 0.409 58.3    4.788     7.48         hijklmno 
 irrigado G07    6.53 0.409 58.3    5.185     7.88          ijklmno 
 sequia   G10    6.61 0.409 58.3    5.267     7.96           jklmnop
 irrigado G03    6.73 0.409 58.3    5.388     8.08            klmnop
 sequia   G15    7.10 0.409 58.3    5.750     8.44             lmnop
 sequia   G07    7.12 0.409 58.3    5.779     8.47             lmnop
 irrigado G11    7.32 0.409 58.3    5.971     8.66              mnop
 irrigado G15    7.47 0.409 58.3    6.124     8.82               nop
 sequia   G03    7.64 0.409 58.3    6.297     8.99                op
 sequia   G11    8.59 0.409 58.3    7.241     9.93                 p

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Conf-level adjustment: sidak method for 30 estimates 
P value adjustment: tukey method for comparing a family of 30 estimates 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping symbol,
      then we cannot show them to be different.
      But we also did not show them to be the same.

# Limpieza del data frame con letras
letras_df <- letras %>%
  as.data.frame() %>%
  mutate(riego = factor(riego, levels = unique(fb$riego)),
         geno = factor(geno, levels = unique(fb$geno)))

ggplot(letras_df, aes(x = geno, y = emmean, fill = riego)) +
  geom_col(position = position_dodge(width = 0.8)) +
  geom_errorbar(aes(ymin = emmean - SE, ymax = emmean + SE),
                width = 0.2, position = position_dodge(width = 0.8)) +
  geom_text(aes(label = .group, y = emmean + SE + 0.05),
            position = position_dodge(width = 0.8),
            size = 3, vjust = 0, angle = 90) +
  labs(x = "Genotipos", y = "TWUE (eficiencia del uso del agua total)",
       fill = "Riego") +
  theme_classic(base_size = 14) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

library(inti)

Cargando paquete requerido: shiny

letras_df %>% 
  plot_smr(x = "geno", y = "emmean", group = "riego"
           , error = "SE", sig = ".group")

Warning in geom_col(position = position_dodge2(preserve = "single"), colour =
"black", : Ignoring unknown parameters: `size`