xtractor_error_handling.Rmd
Use case:
Setup script:
library(fxtract)
xtractor = Xtractor$new("xtractor")
xtractor$add_data(iris, group_by = "Species")
Let’s write two functions, each of which fails on a different dataset:
fun1 = function(data) {
  if ("versicolor" %in% data$Species) stop("fun1 not compatible on versicolor")
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}
fun2 = function(data) {
  if ("virginica" %in% data$Species) stop("fun2 not compatible on virginica")
  c(mean_petal_length = mean(data$Petal.Length),
    sd_petal_length = sd(data$Petal.Length))
}
xtractor$add_feature(fun1)
xtractor$add_feature(fun2)
xtractor$calc_features()
We can still get the resulting data frame, with missing values for the failed calculations:
xtractor$results
##      Species mean_sepal_length sd_sepal_length mean_petal_length sd_petal_length
## 1     setosa             5.006       0.3524897             1.462        0.173664
## 2 versicolor                NA              NA             4.260        0.469911
## 3  virginica             6.588       0.6358796                NA              NA
The error messages of the failed calculations are available as well:
xtractor$error_messages
##   feature_function         id                     error_message
## 1             fun1 versicolor fun1 not compatible on versicolor
## 2             fun2  virginica  fun2 not compatible on virginica
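As a rough mental model of this behaviour, the following base R sketch applies a function to each group and records errors instead of aborting. It is not fxtract's actual implementation; `safe_apply` is a hypothetical helper introduced only for illustration, and `fun1` is repeated so the block is self-contained.

```r
# Hypothetical helper (not part of fxtract): apply `fun` to each group of `data`
# and capture errors per group instead of stopping at the first failure.
safe_apply = function(data, group_by, fun) {
  groups = split(data, data[[group_by]], drop = TRUE)
  lapply(groups, function(d) {
    tryCatch(
      list(result = fun(d), error = NA_character_),
      error = function(e) list(result = NULL, error = conditionMessage(e))
    )
  })
}

# Same failing feature function as in the tutorial.
fun1 = function(data) {
  if ("versicolor" %in% data$Species) stop("fun1 not compatible on versicolor")
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}

out = safe_apply(iris, "Species", fun1)

# One entry per group: NA where the call succeeded, the message where it failed.
errors = vapply(out, function(x) x$error, character(1))
failed = errors[!is.na(errors)]
failed
```

Groups that succeed keep their computed values in `out`, while the failing group only carries its error message, which mirrors the `NA` entries and the error table shown above.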
You can retrieve a failing feature function and the dataset belonging to the failing ID like this:
fun = xtractor$get_feature("fun1")
df = xtractor$get_data("versicolor")
Now you can debug the function on the dataset on which it failed.
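One way to reproduce the failure outside the extractor is sketched below. So that the block runs on its own, `fun` and `df` are rebuilt with base R as stand-ins; this assumes that `get_data("versicolor")` returns the rows of that group, which is an assumption and not verified here.

```r
# Stand-in for xtractor$get_feature("fun1"): the failing function defined earlier.
fun = function(data) {
  if ("versicolor" %in% data$Species) stop("fun1 not compatible on versicolor")
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}

# Stand-in for xtractor$get_data("versicolor"), assumed to hold that group's rows.
df = iris[iris$Species == "versicolor", ]

# Reproduce the failure and keep the message instead of aborting the session.
res = tryCatch(fun(df), error = function(e) conditionMessage(e))
res

# For interactive stepping, base R offers debugonce(fun); the next call to
# fun(df) then opens the browser at the first line of the function body.
```

Capturing the message with `tryCatch` confirms that the problem comes from the function and this particular dataset before any changes are made.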
Once the functions are fixed, remove the failing versions from the extractor and add the corrected ones:
fun1_fixed = function(data) {
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}
fun2_fixed = function(data) {
  c(mean_petal_length = mean(data$Petal.Length),
    sd_petal_length = sd(data$Petal.Length))
}
xtractor$remove_feature(fun1)
xtractor$remove_feature(fun2)
xtractor$add_feature(fun1_fixed)
xtractor$add_feature(fun2_fixed)