/r/rstats

A subreddit for all things related to the R Project for Statistical Computing. Questions, news, and comments about R programming, R packages, RStudio, and more.

PLEASE READ THIS BEFORE POSTING

Welcome to /r/rstats - the subreddit for all things R (the programming language)!

For code problems, Stack Overflow is a better platform. For short questions, the Twitter #rstats tag is a good place. For longer questions or discussions, RStudio Community is another great resource.

If your account is new, your post may be automatically flagged and removed. If you don't see your post show up, please message the mods and we'll manually approve it.

Rules:

  1. Be polite and good to each other.
  2. Post only R-related content. This also means no "Why is Other Language better than R?" threads.
  3. No blatant self-promotion ("subscribe to my channel!"). This includes affiliate links!
  4. No memes (for that, go to /r/rstatsmemes/)

You can also check out our sister sub /r/Rlanguage

/r/rstats

87,273 Subscribers

0

Hi, everyone! In my work, I have a problem that I need to solve. Basically, I have to determine the minimum number of past events needed to classify a new event with confidence.

Imagine we have had only three such events in four years, and all of them had the same cause. If I only consider classical probability, I would classify every new event the same way. However, that doesn't seem right to me, because I have very few events to base that conclusion on. Do you have any theoretical material to help me with this question?
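One way to see why three-for-three is weak evidence is a simple Beta-Binomial calculation; a minimal sketch with illustrative numbers:

# 3 events observed, all with the same cause.
# With a uniform Beta(1, 1) prior, the posterior for the probability p
# that the next event has that cause is Beta(1 + 3, 1 + 0) = Beta(4, 1).
alpha <- 1 + 3
beta  <- 1 + 0

# Posterior mean ("rule of succession"): (3 + 1) / (3 + 2) = 0.8
alpha / (alpha + beta)

# The 95% credible interval is still very wide with only 3 observations
qbeta(c(0.025, 0.975), alpha, beta)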

0 Comments
2024/11/09
21:50 UTC

5

Google Earth integration into a Shiny app

Hi everyone! Is there an R Shiny fan who can help? Is it possible to integrate a Google Earth window into a Shiny application to view KML files?
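Embedding Google Earth itself is not straightforward, but if an interactive map is acceptable, a common workaround is to read the KML with sf and render it with leaflet inside Shiny; a minimal sketch, with "sites.kml" as a placeholder file name:

library(shiny)
library(sf)
library(leaflet)

ui <- fluidPage(
  leafletOutput("map", height = 600)
)

server <- function(input, output, session) {
  # sf reads KML via GDAL's KML driver
  kml <- st_read("sites.kml", quiet = TRUE)

  output$map <- renderLeaflet({
    leaflet() |>
      addTiles() |>                   # OpenStreetMap background
      addPolygons(data = st_zm(kml))  # use addMarkers()/addPolylines()
                                      # if the KML holds points or lines
  })
}

shinyApp(ui, server)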

2 Comments
2024/11/08
15:34 UTC

2

How to create new column based on a named vector lookup?

Say I have a data frame and I'd like to add a column to it based on a mapping I already have, e.g.:

df <- data.frame(Col1 = c(1.1, 2.3, 5.4, 0.4), Col2 = c('A','B','C','D'))
vec <- c('A' = 4, 'B' = 3, 'C' = 2, 'D' = 1)

What I'd like to get is this:

> df
Col1 Col2 Col3
1 1.1 A 4
2 2.3 B 3
3 5.4 C 2
4 0.4 D 1

I know I can use case_when() in dplyr, but that seems long-winded. Is there a more efficient way using the named vector? I'm sure there must be, but Google is failing me.

edit: formatting
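Since the names of vec match the values in Col2, base R subsetting can do the lookup directly; a minimal sketch with the data above:

df <- data.frame(Col1 = c(1.1, 2.3, 5.4, 0.4), Col2 = c('A','B','C','D'))
vec <- c('A' = 4, 'B' = 3, 'C' = 2, 'D' = 1)

# Index the named vector by the character column, then drop the names
df$Col3 <- unname(vec[df$Col2])
df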

4 Comments
2024/11/08
10:27 UTC

3

help with ggplot reordering columns

Hello! I'm trying to order a set of stacked columns in a ggplot plot and I can't figure it out. Everywhere online says to use a factor, which only works if your plot draws on one data set as far as I can tell :(. Can anyone help me reorder these columns so that "Full Group" is first and "IMM" is last? Thank you!

Here is the graph I'm trying to change and the code:

https://preview.redd.it/f2fzkg7zzhzd1.png?width=865&format=png&auto=webp&s=416254b96348a7f5b1bcf972bc989595fccfb6de

print(ggplot()
  + geom_col(data = C, aes(y = Freq/sum(Freq), x = Group, color = Var1, fill = Var1))
  + geom_col(data = `C Split`[["GLM"]], aes(y = Freq/sum(Freq), x = Var2, color = Var1, fill = Var1))
  + geom_col(data = `C Split`[["PLM"]], aes(y = Freq/sum(Freq), x = Var2, color = Var1, fill = Var1))
  + geom_col(data = `C Split`[["PLF"]], aes(y = Freq/sum(Freq), x = Var2, color = Var1, fill = Var1))
  + geom_col(data = `C Split`[["IMM"]], aes(y = Freq/sum(Freq), x = Var2, color = Var1, fill = Var1))
  + xlab("Age/Sex Category")
  + ylab("Frequency")
  + labs(fill = "Behaviour")
  + labs(color = "Behaviour")
  + ggtitle("C Group Activity")
  + scale_fill_manual(values = viridis(n=5))
  + scale_color_manual(values = viridis(n=5))
  + theme_classic()
  + theme(plot.title = element_text(hjust = 0.5))
  + theme(text=element_text(family = "crimson-pro"))
  + theme(text=element_text(face = "bold"))
  + scale_y_continuous(limits = c(0,1), expand = c(0, 0)))
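One way to control the drawing order that works even when several data frames feed the same plot is to set the discrete x-axis limits explicitly; a small self-contained sketch (the real plot would list all five categories between "Full Group" and "IMM"):

library(ggplot2)

# Two small data frames standing in for the separate summaries in the post
d1 <- data.frame(grp = "Full Group", freq = c(0.6, 0.4), beh = c("Rest", "Feed"))
d2 <- data.frame(grp = "IMM",        freq = c(0.3, 0.7), beh = c("Rest", "Feed"))

ggplot() +
  geom_col(data = d1, aes(x = grp, y = freq, fill = beh)) +
  geom_col(data = d2, aes(x = grp, y = freq, fill = beh)) +
  # Explicit limits fix the order of a discrete axis even when the layers
  # come from different data frames; unlisted categories are dropped.
  scale_x_discrete(limits = c("Full Group", "IMM"))
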
6 Comments
2024/11/07
15:30 UTC

1

R: VGLM to Fit a Partial Proportional Odds model, unable to specify which variable to hold to proportional odds

Hi all,

My dependent variable is an ordered factor, gender is a 0/1 factor, and my main variable of interest (listed first) is my primary concern; the proportional odds assumption holds only for it according to the Brant test.

When I try to fit the model with vglm() and specify that only this variable should be held to proportional odds, but not the others, I've had no joy.

> logit_model <- vglm(dep_var ~ primary_indep_var + 
+                       gender + 
+                       var_3 + var_4 + var_5,
+                     
+                     family = cumulative(parallel = c(TRUE ~ 1 + primary_indep_var), 
+                                         link = "cloglog"), 
+                     data = temp)

Error in x$terms %||% attr(x, "terms") %||% stop("no terms component nor attribute") : 
  no terms component nor attribute

Any help would be appreciated!

With thanks
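The VGAM documentation describes parallel as accepting a logical or a formula rather than a vector, so one thing worth trying is dropping the c() and passing the formula directly; a sketch using the variable names from the post (check ?cumulative for the exact formula semantics):

library(VGAM)

# Sketch only: "TRUE ~ primary_indep_var - 1" asks for the proportional-odds
# (parallel) constraint on primary_indep_var only, leaving the remaining
# predictors with category-specific coefficients. Data and variable names
# are taken from the post.
logit_model <- vglm(
  dep_var ~ primary_indep_var + gender + var_3 + var_4 + var_5,
  family = cumulative(parallel = TRUE ~ primary_indep_var - 1,
                      link = "cloglog"),
  data = temp
)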

0 Comments
2024/11/07
14:08 UTC

4

lavaan probit model: calculation of simple slopes and std. error

I am trying to implement the calculation of simple slopes for probit models in lavaan, as it is currently not supported in semTools (I will cross-post).

The idea is to be able to plot the slope of a regression coefficient and the corresponding CI. So far, we can achieve this in lavaan + emmeans using a linear probability model.


library(semTools)
library(lavaan)
library(emmeans)
library(ggplot2)
# Load the PoliticalDemocracy dataset
data("PoliticalDemocracy")

# Create a binary indicator for dem60 (here using the mean of y1 as the threshold)
PoliticalDemocracy$dem60_bin <- ifelse(PoliticalDemocracy$y1 >= mean(PoliticalDemocracy$y1), 1, 0)


### LPM

model <- '
  # Latent variable definition
  ind60 =~ y1 + y2 + y3 + y4
  # Linear regression (LPM) with ind60 predicting dem60_bin
  dem60_bin ~ b*ind60

'

# Fit the model treating dem60_bin as continuous (linear probability model)
fit <- sem(model, data = PoliticalDemocracy, meanstructure=TRUE)

summary(fit)

slope <- emmeans(fit, "ind60", 
                 lavaan.DV = "dem60_bin", 
                 at = list(ind60 = seq(-2, 2, 0.01)),
                 rg.limit = 60000)|> data.frame()

# Plot the marginal effect of the latent variable ind60 with standard errors
ggplot(slope, aes(x = ind60, y = emmean)) +
  geom_line(color = "blue") +
  geom_ribbon(aes(ymin = asymp.LCL, ymax = asymp.UCL), 
              alpha = 0.2, fill = "lightblue") +
  labs(
    title = "Marginal Effect of ind60 (Latent Variable) on the Predicted Probability of dem60_bin",
    x = "ind60 (Latent Variable)",
    y = "Marginal Effect of ind60"
  ) +
  theme_minimal()

However, semTools does not support any link function at this point, so I have to rely on manual calculations to obtain the predicted probabilities. So far, I am able to estimate the change in probability for the slope and the marginal probabilities. However, I am pretty sure that the way I am calculating the SEs is wrong, as they are too small compared to the LPM model. Any advice on this is highly appreciated.


### PROBIT LINK 
# Define the probit model with ind60 as a latent variable
model <- '
  # Latent variable definition
  ind60 =~ y1 + y2 + y3 + y4
  # intercept/threshold
  dem60_bin|"t1"*t1

  # Probit regression with ind60 predicting dem60_bin
  dem60_bin ~ b*ind60
  # the slope expressed in probabilities
  slope :=  pnorm(-t1)*b

'

# Fit the model using WLSMV estimator for binary outcomes
fit <- sem(model, data = PoliticalDemocracy, ordered = "dem60_bin", estimator = "WLSMV")
summary(fit)
# Extract model coefficients
coef_fit <- coef(fit)
intercept <- coef_fit["t1"]
beta_ind60 <- coef_fit["b"]
params <- parameterEstimates(fit)
se_beta_ind60 <- params[params$label == "b", "se"]

# Define a range of ind60 values for the marginal effect calculation
# Here, we will use the predicted values from the latent variable
ind60_seq <- seq(-2, 2, length.out = 100)  # Assuming a standard range for latent variable

# Calculate marginal effects for each value of ind60 in ind60_seq
marginal_effects_ind60 <- sapply(ind60_seq, function(ind60_value) {
  # Linear predictor for given ind60_value
  linear_predictor <- -intercept + (beta_ind60 * ind60_value)
  pdf_value <- pnorm(linear_predictor)
  #marginal_effect <- pdf_value * beta_ind60
  return(pdf_value)
})


# Standard errors for marginal effects using the Delta Method
se_marginal_effects <- sapply(ind60_seq, function(ind60_value) {
  # Linear predictor for given ind60_value
  linear_predictor <- -intercept + beta_ind60 * ind60_value
  pdf_value <- dnorm(linear_predictor)
  
  # Delta Method: SE = |f'(x)| * SE(beta)
  marginal_effect_se <- abs(pdf_value) * se_beta_ind60
  return(marginal_effect_se)
})


plot_data <- data.frame(ind60 = ind60_seq, marginal_effect = marginal_effects_ind60, se_marginal_effect = se_marginal_effects)

# Plot the marginal effect of the latent variable ind60 with standard errors
ggplot(plot_data, aes(x = ind60, y = marginal_effect)) +
  geom_line(color = "blue") +
  geom_ribbon(aes(ymin = marginal_effect - 1.96 * se_marginal_effect, ymax = marginal_effect + 1.96 * se_marginal_effect), 
              alpha = 0.2, fill = "lightblue") +
  labs(
    title = "Marginal Effect of ind60 (Latent Variable) on the Predicted Probability of dem60_bin",
    x = "ind60 (Latent Variable)",
    y = "Marginal Effect of ind60"
  ) +
  theme_minimal()
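One likely reason the manual SEs look too small is that they ignore the uncertainty in the threshold and the covariance between t1 and b. A sketch of a fuller delta method, continuing the script above and assuming vcov(fit) uses the same labels as coef(fit):

# Covariance matrix of the two parameters entering pnorm(-t1 + b*x)
V <- vcov(fit)[c("t1", "b"), c("t1", "b")]

se_marginal_effects2 <- sapply(ind60_seq, function(ind60_value) {
  eta  <- -intercept + beta_ind60 * ind60_value
  # gradient of pnorm(-t1 + b*x) with respect to (t1, b)
  grad <- dnorm(eta) * c(-1, ind60_value)
  sqrt(drop(t(grad) %*% V %*% grad))
})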
0 Comments
2024/11/06
23:28 UTC

2

Saving a chunk as docx

Hi!

Is there any way I can have a single chunk produce a Word doc? I've been googling, but the only solution I find is to render the whole file as a Word doc via the output format, and that is not what I need. I just want the one chunk to become a Word doc. TIA
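One hedged option is the officer package, which can write a standalone .docx from inside a single chunk; a minimal sketch with placeholder content and file name:

library(officer)

# Build a .docx directly from within one chunk
doc <- read_docx()
doc <- body_add_par(doc, "Results from this chunk", style = "heading 1")
doc <- body_add_table(doc, value = head(mtcars), style = "table_template")

print(doc, target = "chunk_output.docx")  # writes the file next to the Rmd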

8 Comments
2024/11/05
23:22 UTC

0

Keras and TensorFlow in Posit (RStudio) not working

I'm trying to construct a neural network using keras and tensorflow in R, but when I try to create the model it tells me "valid installation of tensorflow not found". Anyone know any fixes, or do I just need to try Python?
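A commonly suggested first step for that message is to let the R packages set up their own Python environment; a sketch, not a guaranteed fix:

install.packages(c("keras", "tensorflow"))

library(keras)
install_keras()          # installs TensorFlow + Keras into a dedicated environment

# afterwards, confirm that R can see the installation
tensorflow::tf_config()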

9 Comments
2024/11/05
10:28 UTC

11

Going for my first useR group meeting - any advice?

I am going for my first useR meeting, and I am super hyped to meet fellow nerds - but what "should I do"?

I am primarily there to find connections and like-minded people. Should I bring my laptop? Or is everybody there to expand their connections?

What was your experience the first time you went, if you ever did?

Best,

5 Comments
2024/11/05
07:26 UTC

0

R for android 14 Samsung tab s10+

Hi, I'm looking for a reliable R app for my Samsung Tab S10+ for uni.

Does anyone know a good app?

4 Comments
2024/11/05
06:46 UTC

1

Extracting coordinates using Magick - Map is rotated?

Hey guys! See my code below.

I'm trying to have an "automated" script to get coordinates for sampling sites from various maps in different articles, as I'm building a mega dataset for my MSc. I know I could use QGIS, but we're R lovers in the lab so it would be better to use R, and I find it easier and more intuitive. The pixel coordinates were found with GIMP (very straightforward), and I simply picked 4 very identifiable points on the map as references (such as the state line). I feel I am so close to having this perfect, but the points and the output map are squished and inverted?

Please help :(

EDIT: The code below was indeed cleaned up with ChatGPT, as I wanted to get rid of all the superficial notes and other stuff I had in my code so it would be a more straightforward read for you guys. I'm not lazy; I worked hard on this and exhausted all resources and mental energy before reaching out to Reddit. I was told to do a reprex, which I will, but in the meantime if anyone has any info that could help, please do leave a kind comment. Cheers!

# Load necessary libraries
library(imager)
library(dplyr)
library(ggplot2)
library(maps)

### Step 1: Load and display the map image ###
img <- load.image("C:/Users/Dell/Desktop/Raw_maps/Gauthier_Morone_saxatilis_georef.png")

# Get the image dimensions to set proper plot limits
image_width <- dim(img)[2]  # Width of the image in pixels
image_height <- dim(img)[1] # Height of the image in pixels

# Step 2: Plot the image with correct aspect ratio
plot.new()
plot.window(xlim = c(0, image_width), ylim = c(0, image_height), asp = image_width / image_height)
rasterImage(as.raster(img), 0, 0, image_width, image_height)

### Step 3: Define 4 reference points with known lat/lon coordinates ###
pixel_1 <- c(208, 0)                  # Pixel coordinates
latlon_1 <- c(39.724106, -75.790962)  # Latitude and longitude
pixel_2 <- c(95, 370)                 # Pixel coordinates
latlon_2 <- c(36.961640, -76.416772)  # Latitude and longitude
pixel_3 <- c(307, 171)                # Pixel coordinates
latlon_3 <- c(38.454054, -75.051513)  # Latitude/longitude
pixel_4 <- c(27, 182)                 # Pixel coordinates
latlon_4 <- c(37.660713, -77.139555)  # Latitude/longitude

# Step 4: Create a data frame for all four reference points
ref_points <- data.frame(
  X = c(pixel_1[1], pixel_2[1], pixel_3[1], pixel_4[1]),
  Y = c(pixel_1[2], pixel_2[2], pixel_3[2], pixel_4[2]),
  Longitude = c(latlon_1[2], latlon_2[2], latlon_3[2], latlon_4[2]), # Longitudes
  Latitude = c(latlon_1[1], latlon_2[1], latlon_3[1], latlon_4[1])   # Latitudes
)

### Step 5: Apply Pixel Scale Factor for 100 km ###
# Replace this with the pixel length of the 100 km scale bar measured in GIMP
scale_bar_pixels <- 124                # Replace with actual value from your map
km_per_pixel <- 100 / scale_bar_pixels # Calculate kilometers per pixel

# Latitude conversion: 1 degree of latitude is approximately 111 km
scale_lat <- km_per_pixel / 111 # Degrees of latitude per pixel

# Longitude conversion depends on latitude (adjust for the latitude of your region)
avg_lat <- mean(c(latlon_1[1], latlon_2[1], latlon_3[1], latlon_4[1])) # Average central latitude
km_per_lon_degree <- 111.32 * cos(avg_lat * pi / 180)                  # Adjust for Earth's curvature
scale_lon <- km_per_pixel / km_per_lon_degree                          # Degrees of longitude per pixel

### Step 6: Build affine transformation models for lat/lon ###
# Linear models for longitude and latitude transformation using the four reference points
lon_model <- lm(Longitude ~ X + Y, data = ref_points)
lat_model <- lm(Latitude ~ X + Y, data = ref_points)

### Step 7: Select sampling site coordinates on the image ###
cat("Click on the sampling sites on the image...\n")

# Click on the sampling sites to get their pixel coordinates
sampling_site_coords <- locator(n = 26) # Adjust the number of points as needed

# Store the sampling site pixel coordinates in a data frame
sampling_sites <- data.frame(
  X = sampling_site_coords$x,
  Y = sampling_site_coords$y
)

### Step 8: Apply the transformation models to get lat/lon for sampling sites ###
sampling_sites <- sampling_sites %>%
  mutate(
    Longitude = predict(lon_model, newdata = sampling_sites),
    Latitude = predict(lat_model, newdata = sampling_sites)
  )

# Print the georeferenced coordinates
print("Georeferenced sampling sites:")
print(sampling_sites)

### Step 9: Save the georeferenced sampling sites to a CSV file ###
write.csv(sampling_sites, "georeferenced_sampling_sites_with_scale_and_4_points.csv", row.names = FALSE)
cat("Georeferenced sampling sites saved to 'georeferenced_sampling_sites_with_scale_and_4_points.csv'.\n")

### Step 10: Validation Process (Plot and Check the Coordinates) ###
# Load the CSV file with georeferenced sampling sites
sampling_sites <- read.csv("georeferenced_sampling_sites_with_scale_and_4_points.csv")

# Inspect the data to make sure it's loaded correctly
print("Loaded sampling site data:")
print(head(sampling_sites))

# Automatically detect limits for latitude and longitude
lat_limits <- range(sampling_sites$Latitude, na.rm = TRUE)
lon_limits <- range(sampling_sites$Longitude, na.rm = TRUE)

# Print the detected limits for validation
cat("Detected latitude limits:", lat_limits, "\n")
cat("Detected longitude limits:", lon_limits, "\n")

# Plot the sampling sites on a world map using ggplot2 with auto-detected limits
world_map <- map_data("world")

ggplot() +
  # Plot the world map
  geom_polygon(data = world_map, aes(x = long, y = lat, group = group),
               fill = "grey", color = "black") +
  # Plot the sampling sites from the CSV file
  geom_point(data = sampling_sites, aes(x = Longitude, y = Latitude),
             color = "red", size = 2) +
  # Customize the plot and set the detected limits
  labs(title = "Georeferenced Sampling Sites on the World Map",
       x = "Longitude", y = "Latitude") +
  # Set the latitude and longitude limits based on the detected range
  coord_fixed(ratio = 1.3, xlim = lon_limits, ylim = lat_limits)

https://preview.redd.it/lesngwj2kxyd1.png?width=1762&format=png&auto=webp&s=879b85d58f56c726180fba71bbe1b30d8a922b1c
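One likely cause of the squished/inverted result is that GIMP pixel coordinates have their origin at the top-left (y grows downward), while plot() and locator() use a bottom-left origin (y grows upward). A sketch of the adjustment, continuing the script above:

# Flip the reference Y values so both coordinate systems agree before
# fitting the affine models
ref_points$Y <- image_height - ref_points$Y

lon_model <- lm(Longitude ~ X + Y, data = ref_points)
lat_model <- lm(Latitude ~ X + Y, data = ref_points)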

12 Comments
2024/11/04
18:44 UTC

4

Under/over-dispersion in R with count data while fitting Poisson regression

There are more than 200 data points, but only 64 are non-zero. There are 8 explanatory variables, and the data is over-dispersed (including the zeros). I tried zero-inflated Poisson regression, but the output shows singularity. I tried generalized Poisson regression using the VGAM package, but it has a Hauck-Donner effect on the intercept and one variable. Meanwhile, I checked VIF for multicollinearity; the VIF is less than 2 for all variables. Next I tried dropping the zero data points, and now the data is under-dispersed; I tried generalized Poisson regression again, and even though the Hauck-Donner effect is not detected, the model output is shady. I'm lost; if you have any ideas, please let me know. Thank you.
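With many zeros and over-dispersion, a hurdle negative binomial is a commonly suggested alternative to zero-inflated Poisson; a sketch with placeholder names (y, x1..x8, mydata):

library(pscl)

# Hurdle model: a binary part for zero vs non-zero, and a truncated
# negative binomial for the positive counts
fit <- hurdle(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8,
              dist = "negbin", data = mydata)
summary(fit)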

11 Comments
2024/11/04
02:00 UTC

8

Pak vs Pacman

To manage packages in our reports and scripts, my team and I have historically been using the package pacman. However, I have recently been seeing a lot of packages and outside code seemingly using the pak package. Our team uses primarily PCs, but my grand-boss is a Mac user, and we are open to team members using either OS if they prefer.

Do these two packages differ meaningfully, or is this more of a preference thing? Are there advantages of one over the other?
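For concreteness, the day-to-day calls look like this; pacman attaches packages, pak only installs them:

# pacman: install-if-missing and attach, aimed at interactive scripts
pacman::p_load(dplyr, ggplot2)

# pak: fast, parallel dependency resolution; installs but does not attach
pak::pak(c("dplyr", "ggplot2"))
library(dplyr)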

6 Comments
2024/11/03
21:35 UTC

5

Open source us presidential election polling data api?

Anyone know of an open source (free) api to access historical polling data?

7 Comments
2024/11/02
21:50 UTC

6

Calculate date for weekday after weekend?

I've cobbled together a function that changes a date that falls on a weekend day to the next Monday. It seems to work, but I'm sure there is a better way. The use of sapply() bugs me a little bit.

Any suggestions?

Input: date, a vector of dates

Output: a vector of dates, where all dates falling on a Saturday/Sunday are adjusted to the next Monday.

library(dplyr)  # for case_when()

adjust_to_weekday <- function(date) {
    # Saturday -> +2 days, Sunday -> +1 day, so both land on the next Monday
    adj <- sapply(weekdays(date), \(d) {
        case_when(
            d == "Saturday" ~ 2,
            d == "Sunday" ~ 1,
            TRUE ~ 0
        )
    })
    date + lubridate::days(adj)
}
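A vectorised alternative avoids sapply() and the locale dependence of weekdays() by working with numeric day-of-week codes; a sketch:

adjust_to_weekday <- function(date) {
  wd <- lubridate::wday(date)            # default: 1 = Sunday, 7 = Saturday
  date + (wd == 7) * 2 + (wd == 1) * 1   # Sat -> +2 days, Sun -> +1 day
}

adjust_to_weekday(as.Date(c("2024-11-01", "2024-11-02", "2024-11-03")))
# Friday stays put; Saturday and Sunday both move to Monday 2024-11-04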
4 Comments
2024/11/01
15:39 UTC

6

FedData R package removed from CRAN

Hi all,

Do people still use the FedData R package even though it was removed from CRAN?

I appreciated the access to NLCD rasters and developed a few workflows that I thought were pretty good!

Should I spend time looking for a workaround to the FedData package, or is using the archived version of the package a robust option?

Thanks

3 Comments
2024/10/31
19:20 UTC

1

Network analysis Forest Plot

Hey guys, I am new to R and have a question about whether this is possible. I am comparing two medications, A and B, and want to show in a forest plot which one is better. The problem is, I have studies that compare A and B directly, and some that compare A or B with placebo/sham. So I guess a network meta-analysis is the right thing to do. Do you have a script that would do this? Thanks so much.
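The netmeta package handles exactly this mix of direct and placebo-controlled comparisons; a minimal sketch with made-up contrast-level data (TE/seTE would come from the extracted effect sizes):

library(netmeta)

# Each row is one study comparison with its effect (TE) and standard error
dat <- data.frame(
  studlab = c("Study1", "Study2", "Study3"),
  treat1  = c("DrugA",  "DrugA",   "DrugB"),
  treat2  = c("DrugB",  "Placebo", "Placebo"),
  TE      = c(-0.20, -0.50, -0.30),   # e.g. log odds ratios
  seTE    = c(0.10, 0.15, 0.12)
)

net <- netmeta(TE, seTE, treat1, treat2, studlab,
               data = dat, sm = "OR", reference.group = "Placebo")

forest(net)   # forest plot of each treatment versus the reference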

1 Comment
2024/10/31
17:35 UTC

0

Can anyone help me with my R code, please?

ggplot() +
  geom_polygon(data = states, aes(x = long, y = lat, group = group),
               fill = "white", color = "black") +
  filter(flights3, dest != "ANC", dest != "HNL") +
  geom_point(data = flights3, aes(x = lon, y = lat, color = avg_delay_mean)) +
  coord_map()

This code keeps giving me the error:

" Warning: Incompatible methods ("+.gg", "Ops.data.frame") for "+"
Error in ggplot() + geom_polygon(data = states, aes(x = long, y = lat, :
non-numeric argument to binary operator"

I'm not sure what I am doing wrong :(
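filter() is a data operation, not a ggplot layer, so it cannot be added with +. A sketch of how that snippet is usually restructured, using the objects from the post:

library(dplyr)
library(ggplot2)

# Filter the data first, then pass the filtered data to the point layer
flights_filtered <- flights3 %>%
  filter(dest != "ANC", dest != "HNL")

ggplot() +
  geom_polygon(data = states, aes(x = long, y = lat, group = group),
               fill = "white", color = "black") +
  geom_point(data = flights_filtered,
             aes(x = lon, y = lat, color = avg_delay_mean)) +
  coord_map()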

9 Comments
2024/10/31
03:24 UTC

0

RStudio Coordinates Help

Hi there, I am working on a research project and I need to calculate the distance between the geographic location of a town's city center and its MLB stadium. I have lat/longs for every ballpark and city center that I need, but I don't know a good package to use. It would also be great if I didn't have to enter them individually, as I am calculating the distance for dozens of observations.

Does anyone know an efficient way to do this?
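The geosphere package computes great-circle distances from paired lat/longs in one vectorised call; a sketch with placeholder column names:

library(geosphere)

# distHaversine() expects (longitude, latitude) and returns metres,
# computed row by row for paired points
city <- cbind(stadiums$city_lon,    stadiums$city_lat)
park <- cbind(stadiums$stadium_lon, stadiums$stadium_lat)

stadiums$dist_km <- distHaversine(city, park) / 1000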

8 Comments
2024/10/31
01:50 UTC

1

Is it possible to nest tryCatch() with some errors and not with others?

TL;DR - I am trying to create a nested tryCatch, but the error I intentionally catch in the inner tryCatch is also being caught by the outer tryCatch unintentionally. Somewhat curiously, this seems to depend on the kind of error. Are there different kinds of errors and how do I treat them correctly to make a nested tryCatch work?

I have a situation where I want to nest two tryCatch() blocks without letting an error condition of the inner tryCatch() affect the execution of the outer one.

Some context: In my organization we have an R script that periodically runs a dummy query against a list of views in our data warehouse. We want to detect the views that have a problem (e.g., they reference tables that have been deleted since the view's creation). The script looks something like this:

library(purrr)  # for map_chr()

con_prd <- DBI::dbConnect(...)
vectorOfViews <- c("db1.sampleview1", "db2.sampleview2", "db3.sampleview3")

checkViewErrorStatus <- function(view, connection) {
  tryCatch(
    {
      DBI::dbGetQuery(
        conn = connection,
        paste("EXPLAIN SELECT TOP 1 1 FROM", view))
      "No error"
    },
    # return the message as character so map_chr() gets a string back
    error = function(e) conditionMessage(e)
  )
}

vectorOfErrors <- map_chr(vectorOfViews, checkViewErrorStatus, connection = con_prd)

results <- data.frame(viewName = vectorOfViews, errorStatus = vectorOfErrors)

DBI::dbWriteTable(
  con_prd,
  DBI::SQL("mydb.table_with_view_errors"),
  results,
  append = TRUE,
  overwrite = FALSE)

Instead of running this script directly, we source it from a wrapper Rmd file that runs on our server. The purpose of the wrapper Rmd file, which is used for all of our R scripts, is to create error logs when a script doesn't run properly.

tryCatch({
  source("checkViewsScript.R")
},
error = function(e){
  createErrorLog()
})

When checkViewErrorStatus() inside checkViewsScript.R catches an error, that is intended; that's why I am using a tryCatch() in that function. However, when something else goes wrong, for example when DBI::dbConnect() fails for some reason, then that's a proper error that the outer tryCatch() should catch. Unfortunately, any error inside checkViewsScript.R seems to bubble up and get caught by the outer tryCatch(), even if that error was already handled by another tryCatch() inside a function.

Here is the weird thing though: When I try to create a nested tryCatch() using stop() it works without any issues:

tryCatch(
  {
    message("The inner tryCatch will start")
    tryCatch({
      stop("An inner error has occurred.")
    }, error = function(e) {
      message(paste("Inner error msg:", e))
    })
    message("The inner tryCatch has finished.")
    message("The outer error will be thrown.")
    stop("An outer error has occurred.")
    message("The script has finished.")
  },
  error = function(ee) {
    message(paste("Outer error msg:", ee))
  }
)

The inner tryCatch will start
Inner error msg: Error in doTryCatch(return(expr), name, parentenv, handler): An inner error has occurred.

The inner tryCatch has finished.
The outer error will be thrown.
Outer error msg: Error in doTryCatch(return(expr), name, parentenv, handler): An outer error has occurred.

When I look at the error thrown by DBI::dbGetQuery() I see the following:

List of 3
 $ message : chr "nanodbc/nanodbc.cpp:1526: 42S02: [Teradata][ODBC Teradata Driver][Teradata Database](-3807)Object 'XXXESTV_LAB_"| __truncated__
 $ call    : NULL
 $ cppstack: NULL
 - attr(*, "class")= chr [1:4] "nanodbc::database_error" "C++Error" "error" "condition"

By contrast, an error created through stop() looks like this:

> stop("this is an error") %>% rlang::catch_cnd() %>% str
List of 2
 $ message: chr "this is an error"
 $ call   : language force(expr)
 - attr(*, "class")= chr [1:3] "simpleError" "error" "condition"

So here is my question: Are there different types of errors? Is it possible that some errors bubble up when using a nested tryCatch whereas others don't?
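Conditions do carry a class vector, and tryCatch() handlers are matched by class name, so one option is to register the inner handler only for the nanodbc class shown in the str() output above; a sketch, assuming the driver keeps signalling that class:

# A handler registered only for the nanodbc error class catches failed view
# queries, while any other error (e.g. a failed DBI::dbConnect()) propagates
# untouched to the outer tryCatch() in the wrapper Rmd.
checkViewErrorStatus <- function(view, connection) {
  tryCatch(
    {
      DBI::dbGetQuery(connection,
                      paste("EXPLAIN SELECT TOP 1 1 FROM", view))
      "No error"
    },
    `nanodbc::database_error` = function(e) conditionMessage(e)
  )
}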

2 Comments
2024/10/30
16:47 UTC

1

Mixed effects cox model coxme() confidence intervals (or equivalent)

Hi rstats,

I'm running different 'repeated events' cox models on some data, and I need some help with interpretation.

Using coxph() from the survival package, I can fairly easily obtain 95% confidence intervals, and I can run cox.zph() and/or plot residuals to see if and how badly I am violating proportional hazards assumptions. I am using coxph() to run the following 'flavours' of repeated events models (I have a reason to do all of these: I favour a (?stratified) frailty model to answer my research question, and someone else would like to use the PWP gap-time model to answer a different question; etc.).

  • Andersen-Gill (AG)
  • Marginal means/rates (I don't have time-varying covariates, so this gives the same as AG)
  • Prentice, Williams and Peterson (PWP) -total time
  • PWP -gap time

However, I saw that to run frailty (aka random effects) models I should use coxme() for computational reasons, according to the survival package documentation. And I believe it - my machine didn't like it much!

So using coxme() is fine, and it returns the coefficients, hazard ratios, standard errors etc... but firstly, is there a way to extract confidence intervals from coxme(), or is that a really dumb thing to ask? Secondly, I guess I can plot residuals to visually check if I'm violating assumptions? But is there a way I should be interpreting ranef()? A giant printout of the matrix with [level of random effect] & [value] doesn't mean anything to me.

Many thanks in advance for helping out a physiologist who is trying their best :)
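Wald-type intervals can be assembled by hand from the fixed effects and their variance matrix; a sketch, where fit is the fitted coxme object:

library(coxme)

# 95% Wald CIs on the hazard ratio scale
est <- fixef(fit)
se  <- sqrt(diag(vcov(fit)))

data.frame(
  HR    = exp(est),
  lower = exp(est - 1.96 * se),
  upper = exp(est + 1.96 * se)
)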

5 Comments
2024/10/30
04:25 UTC

115

This is Ridian: R in Obsidian

This is a beta release of R in Obsidian, a plugin to run R in Obsidian. It's very much in development/unstable, only validated on my MacBook and on Win11 on my MacBook. Installation instructions are in the GitHub readme. Looking forward to all your crash/issue reports so I can make this better.

https://github.com/MichelNivard/Ridian

15 Comments
2024/10/29
12:47 UTC

0

Downloading R/R studio on Mac Sequoia

I am struggling to install R and RStudio on my Mac with the Sequoia update. I have tried numerous versions and it keeps saying they can't be installed on this computer. Any help would be welcome. Thanks!

4 Comments
2024/10/28
13:05 UTC

8

In R, how has your experience been using an AMD GPU with ZLUDA/ROCm?

So I remember that Nvidia's CUDA has been the standard for PyTorch for a few years, but I'm just wondering what your experience has been recently with reticulate-related R coding.

Beyond that, what are some options without using reticulate?

3 Comments
2024/10/28
12:06 UTC

0

Seeking creativity and insight into modelling

I am using the lavaan package to do an SEM, or really a path analysis (I think that is the better term, given I am not working with any latent factors, but I am open to using CFA/latent factors). N = 189. My data is as follows:

  • 3 diet patterns
  • 1 mediator (cerebral elasticity in the front of the brain)
  • 4 outcome variables (cognition)
  • 5 covariates (age + sex + education + moderate-to-vigorous physical activity + total energy intake)

All effects of diet are attenuated in my most conservative model with all 5 covariates. I am wondering if I should consider a different approach to how I model or introduce them. I also have a range of other data points around health and demographics. I'm kind of lacking specialised support at the moment and feeling like I need something outside of the box!

model_adjusted2 <- '
# Mediator model
prefx_mca_avg ~ plant.diet + meat.diet + western.diet + age + sex + educationtotal + totalmvpa + kJwithDF

# Outcome models including the mediator and direct paths + age + sex + educationtotal + totalmvpa + kJwithDF
LongTermMem ~ prefx_mca_avg + plant.diet + meat.diet + western.diet + age + sex + educationtotal + totalmvpa + kJwithDF
ProcSpeed ~ prefx_mca_avg + plant.diet + meat.diet + western.diet + age + sex + educationtotal + totalmvpa + kJwithDF
ExecFunc ~ prefx_mca_avg + plant.diet + meat.diet + western.diet + age + sex + educationtotal + totalmvpa + kJwithDF
ShortTermMem ~ prefx_mca_avg + plant.diet + meat.diet + western.diet + age + sex + educationtotal + totalmvpa + kJwithDF

# Covariances among diet variables
plant.diet ~~ meat.diet + western.diet
meat.diet ~~ western.diet
'

9 Comments
2024/10/28
05:42 UTC

0

WillyWeather API

Having trouble getting this to work properly, as I'm not getting the correct response. Has anyone used the WillyWeather API in R before?

2 Comments
2024/10/28
01:52 UTC

3

[Q] multiple imputation and MICE in R

Has anyone managed to run a PCA on multiple imputed datasets using MICE?

mice.dat = mice(dat[-1], m = 50, seed = 27)

prcomp(complete(mice.dat))

This code works, but mice.dat includes more variables than the ones I want to use in the PCA; a lot of variables were just included as auxiliary for the MI. Does anyone know how to make this work?

I then want to extract participant-level scores, so I'm not just concerned with averaging the loadings.
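One approach is to run the PCA on just the analysis variables within each imputed dataset and then combine the participant-level scores; a sketch, with pca_vars as a placeholder for the variables actually wanted in the PCA:

library(mice)

pca_vars <- c("var1", "var2", "var3")

# One PCA per completed dataset, restricted to the analysis variables
scores_list <- lapply(seq_len(mice.dat$m), function(i) {
  d <- complete(mice.dat, action = i)[, pca_vars]
  prcomp(d, center = TRUE, scale. = TRUE)$x
})

# Element-wise average of the score matrices (one row per participant).
# Note: component signs can flip between imputations and may need aligning
# before averaging.
avg_scores <- Reduce(`+`, scores_list) / length(scores_list)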

2 Comments
2024/10/27
01:54 UTC

3

NFL ML Co-Project

Hi, I am trying to learn machine learning and have been reading on it, but I always found I learn better by doing projects. Is anyone else learning and wanting to collaborate? I would love to do a collaborative project with someone. Would love to have fun with it. Please DM me if you are interested. Doing it purely for fun and trying to learn ML. Let me know!

1 Comment
2024/10/26
00:34 UTC
