predict.kproto() does not work on data with a single row #3

@galen-ft

Issue description

predict() fails for a kproto model when the input data has only a single row.

The cause is the way distances for categorical variables are computed in predict.kproto().

Whenever the input data x has only one row, sapply simplifies its result to a logical vector instead of a matrix, and the subsequent call to rowSums fails because it requires an input with at least two dimensions.

d2 <- sapply(which(catvars), function(j) return(x[,j] != rep(protos[i,j], nrows)) )
d2[is.na(d2)] <- FALSE
if(length(lambda) == 1) d2 <- lambda * rowSums(d2) # <---- rowSums fails
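
The simplification behaviour can be seen in isolation, without the package. The following is a minimal sketch using toy stand-ins: the data frame `x`, the vector `proto` and the helper `mismatch()` are hypothetical names for illustration and are not the objects used inside predict.kproto().

```r
# Toy data: two categorical columns stored as character vectors (assumption for illustration)
x <- data.frame(a = c("u", "v", "u"), b = c("p", "p", "q"), stringsAsFactors = FALSE)
# Hypothetical prototype: one value per categorical column
proto <- c(a = "u", b = "p")

# Same pattern as in the issue: one sapply() call over the categorical columns
mismatch <- function(df) sapply(names(proto), function(j) df[, j] != proto[[j]])

multi  <- mismatch(x)        # each column yields 3 values, so sapply() builds a 3 x 2 matrix
single <- mismatch(x[1, ])   # each column yields 1 value, so sapply() returns a plain vector

dim(multi)    # 3 2
dim(single)   # NULL

# rowSums() needs at least two dimensions, so the single-row case errors
res <- try(rowSums(single), silent = TRUE)
inherits(res, "try-error")   # TRUE
```

With several rows each column contributes a vector of length greater than one, so sapply() binds them into a matrix. With one row every result has length one, and sapply() collapses them into a vector with no dim attribute.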

Reproducible example

library(clustMixType)

set.seed(123)
model <- kproto(x = iris, k = 4)
predict(model, iris[1, ])
# which yields
# Error in rowSums(d2) : 'x' must be an array of at least two dimensions

Suggested fix

One quick fix would be to convert d2 to a matrix whenever x has only one row:

d2 <- sapply(which(catvars), function(j) return(x[,j] != rep(protos[i,j], nrows)) )
if (NROW(x) == 1) d2 <- matrix(data = d2, nrow = 1, byrow = TRUE, dimnames = list(NULL, names(x)[catvars])) # <- FIX
d2[is.na(d2)] <- FALSE
if(length(lambda) == 1) d2 <- lambda * rowSums(d2) # <---- rowSums now succeeds
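
An alternative that avoids the special case entirely might be to build the mismatch matrix with lapply() and cbind() instead of sapply(), since cbind() returns a matrix even when every column has length one. This is only a sketch on toy data: `x`, `proto`, `lambda` and `cat_dist()` are hypothetical stand-ins, not the package's internals, and the case with no categorical columns at all (where cbind() would return NULL) is not handled here.

```r
# Toy data and prototype (assumptions for illustration only)
x <- data.frame(a = c("u", "v", "u"), b = c("p", "p", "q"), stringsAsFactors = FALSE)
proto <- c(a = "u", b = "p")
lambda <- 0.5   # single weight for the categorical part

cat_dist <- function(df) {
  # One logical vector per categorical column
  cols <- lapply(names(proto), function(j) df[, j] != proto[[j]])
  # cbind() always yields a matrix, including the 1 x p case
  d2 <- do.call(cbind, cols)
  d2[is.na(d2)] <- FALSE
  lambda * rowSums(d2)
}

cat_dist(x)        # 0.0 0.5 0.5
cat_dist(x[1, ])   # 0
```

Because the result keeps its row dimension regardless of the number of rows, no check on NROW(x) is needed in this variant.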
