Question about the evaluation of ALFWorld #9

@sunnychenxiwang

When I evaluated ALFWorld using the vanilla ReAct method, the success rate for every model was 1. Moreover, even when the final trajectory did not actually complete the task, the environment feedback returned:

<tool_call_result>Nothing happens.

Congratulations! You have completed the task!
</tool_call_result>

This caused the final state to be marked as success. How can I resolve this issue?
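
To help narrow this down, here is a minimal sketch of a check that compares two ways of deciding success: matching the congratulation text in the observation, and reading a structured flag from the environment. Everything in it is an assumption for illustration: `StepResult`, its `done` and `won` fields, and the helper names are hypothetical and not taken from this repository. It only shows the idea that a text banner and a real success flag can disagree, for example if the banner were appended whenever an episode ends.

```python
from dataclasses import dataclass
from typing import List

# Banner text copied from the environment feedback quoted above.
SUCCESS_BANNER = "Congratulations! You have completed the task!"


@dataclass
class StepResult:
    """Hypothetical record of one environment step (not a repository type)."""
    observation: str  # text returned to the agent
    done: bool        # assumed: episode has ended for any reason
    won: bool         # assumed: structured flag, True only if the goal was met


def success_from_text(result: StepResult) -> bool:
    """Decide success by searching the observation for the banner."""
    return SUCCESS_BANNER in result.observation


def success_from_flag(result: StepResult) -> bool:
    """Decide success from the structured flag instead of the text."""
    return result.done and result.won


def find_mismatches(trajectory: List[StepResult]) -> List[int]:
    """Return indices of steps where the two success checks disagree."""
    return [
        i for i, step in enumerate(trajectory)
        if success_from_text(step) != success_from_flag(step)
    ]


if __name__ == "__main__":
    trajectory = [
        StepResult("You open the drawer 1.", done=False, won=False),
        # Final step mirrors the feedback above: failed action plus banner.
        StepResult("Nothing happens.\n\n" + SUCCESS_BANNER, done=True, won=False),
    ]
    print(find_mismatches(trajectory))
```

If a check like this reports mismatches on real trajectories, the success metric is probably being derived from the banner text rather than from the environment's own completion signal.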
