Use case:

  • When a feature function fails on some datasets, we get an individual error message for each failure, so the problem can be located and fixed.

Setup script:

library(fxtract)
xtractor = Xtractor$new("xtractor")
xtractor$add_data(iris, group_by = "Species")

Let’s write two functions that each fail on a different dataset:

fun1 = function(data) {
  if ("versicolor" %in% data$Species) stop("fun1 not compatible on versicolor")
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}

fun2 = function(data) {
  if ("virginica" %in% data$Species) stop("fun2 not compatible on virginica")
  c(mean_petal_length = mean(data$Petal.Length),
    sd_petal_length = sd(data$Petal.Length))
}

xtractor$add_feature(fun1)
xtractor$add_feature(fun2)
xtractor$calc_features()

We can still get the resulting data frame; failed calculations show up as missing values (NA):

xtractor$results
##      Species mean_sepal_length sd_sepal_length mean_petal_length
## 1     setosa             5.006       0.3524897             1.462
## 2 versicolor                NA              NA             4.260
## 3  virginica             6.588       0.6358796                NA
##   sd_petal_length
## 1        0.173664
## 2        0.469911
## 3              NA

Get Error Messages

xtractor$error_messages
##   feature_function         id                     error_message
## 1             fun1 versicolor fun1 not compatible on versicolor
## 2             fun2  virginica  fun2 not compatible on virginica
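
The idea behind such an error table can be illustrated in base R. The following is only a minimal sketch, assuming one wraps each call in tryCatch; it is not fxtract's implementation, and the names groups, outcomes and error_log are made up for this illustration.

```r
# Base-R sketch of per-group error capture (NOT fxtract's internals).
fun1 = function(data) {
  if ("versicolor" %in% data$Species) stop("fun1 not compatible on versicolor")
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}

# One data frame per species, named by species.
groups = split(iris, iris$Species)

# Run the feature on each group; on error, keep the message instead of aborting.
outcomes = lapply(groups, function(d) {
  tryCatch(
    list(value = fun1(d), error = NA_character_),
    error = function(e) list(value = NULL, error = conditionMessage(e))
  )
})

# Collect the captured messages into a small data frame (hypothetical layout).
errors = vapply(outcomes, function(o) o$error, character(1))
failed = !is.na(errors)
error_log = data.frame(id = names(errors)[failed],
                       error_message = unname(errors[failed]),
                       row.names = NULL)
print(error_log)
```

Because each group is evaluated inside its own tryCatch, a failure on one group does not stop the calculation on the others.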

Debugging

You can retrieve a feature function by name and the dataset belonging to an ID like this:

fun = xtractor$get_feature("fun1")
df = xtractor$get_data("versicolor")

Now you can debug the function on the dataset on which it failed.
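
A debugging session might look like the sketch below. To keep it runnable without the xtractor object, fun and df are recreated here in plain R as stand-ins for the objects retrieved above; that substitution is an assumption of this example.

```r
# Stand-ins for the retrieved objects (assumption: plain-R equivalents).
fun = function(data) {
  if ("versicolor" %in% data$Species) stop("fun1 not compatible on versicolor")
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}
df = iris[iris$Species == "versicolor", ]

# Reproduce the failure and keep the condition object for inspection.
err = tryCatch(fun(df), error = function(e) e)
print(inherits(err, "error"))
print(conditionMessage(err))

# In an interactive session, debugonce(fun) followed by fun(df)
# steps through the function body line by line.
```

Capturing the condition object with tryCatch keeps the session alive, so the message and the call can be examined before changing the function.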

Fix Features

fun1_fixed = function(data) {
  c(mean_sepal_length = mean(data$Sepal.Length),
    sd_sepal_length = sd(data$Sepal.Length))
}

fun2_fixed = function(data) {
  c(mean_petal_length = mean(data$Petal.Length),
    sd_petal_length = sd(data$Petal.Length))
}

xtractor$remove_feature(fun1)
xtractor$remove_feature(fun2)
xtractor$add_feature(fun1_fixed)
xtractor$add_feature(fun2_fixed)

Calculate Again

xtractor$calc_features()
xtractor$results
##      Species mean_sepal_length sd_sepal_length mean_petal_length
## 1     setosa             5.006       0.3524897             1.462
## 2 versicolor             5.936       0.5161711             4.260
## 3  virginica             6.588       0.6358796             5.552
##   sd_petal_length
## 1       0.1736640
## 2       0.4699110
## 3       0.5518947
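
As a cross-check, the values that were previously NA can be recomputed directly with base R. This sketch uses only the built-in iris data; the variable names are chosen for this example, and the numbers should agree with the versicolor and virginica rows of the table above.

```r
# Recompute the formerly missing entries without fxtract.
versicolor = iris[iris$Species == "versicolor", ]
virginica = iris[iris$Species == "virginica", ]

check = c(
  mean_sepal_length = mean(versicolor$Sepal.Length),  # was NA for versicolor
  sd_sepal_length = sd(versicolor$Sepal.Length),      # was NA for versicolor
  mean_petal_length = mean(virginica$Petal.Length),   # was NA for virginica
  sd_petal_length = sd(virginica$Petal.Length)        # was NA for virginica
)
print(check)
```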