feat: add as_generic_chat_history function to convert any Context to …#1007
feat: add as_generic_chat_history function to convert any Context to …#1007akihikokuroda wants to merge 5 commits intogenerative-computing:mainfrom
Conversation
…Message Signed-off-by: Akihiko Kuroda <[email protected]>
|
The PR description has been updated. Please fill out the template for your PR to be reviewed. |
| return c.parsed_repr | ||
| # Use value if parsed_repr is None or not a Message | ||
| content = ( | ||
| str(c.value) if c.parsed_repr is None else formatter(c.parsed_repr) |
There was a problem hiding this comment.
I think we also need to consider the case if c.parsed_repr is a string (I believe that's what a MOT will return after a completed CBlock action (?) )
quick way to test:
mot = ModelOutputThunk(value='reply text')
mot.parsed_repr = 'reply text'
as_generic_chat_history(ChatContext().add(Message('user','hi')).add(mot))
# → WARNING: Unknown component type str ...
There was a problem hiding this comment.
Yes, it was an issue. I fix it. Thanks!
There was a problem hiding this comment.
should we add a test case for it too?
Signed-off-by: Akihiko Kuroda <[email protected]>
|
there's a typo in the PR title: |
Signed-off-by: Akihiko Kuroda <[email protected]>
|
|
||
| def as_generic_chat_history( | ||
| ctx: Context, formatter: Callable[[Any], str] | None = None | ||
| ) -> list[Message]: |
There was a problem hiding this comment.
Minor / style: Small typing issue here — both _default_formatter and the formatter parameter use Any, which the project convention asks us to avoid unless an external library forces it (nothing does here). object is the right type: it accepts everything, is covariant, and keeps mypy happy with callers that pass typed formatters.
def _default_formatter(obj: object) -> str: ...
formatter: Callable[[object], str] | None = NoneThe test helpers already use obj: object, so it is just the library code that needs updating.
| # Use value if parsed_repr is None or some other type | ||
| content = ( | ||
| str(c.value) if c.parsed_repr is None else formatter(c.parsed_repr) | ||
| ) |
There was a problem hiding this comment.
Two things worth noting on this line — one minor, one more significant.
Minor / style: When parsed_repr is a dict or similar, the default formatter will log "Unknown component type dict" — but dict is not an unexpected component here, it is a parsed form of a ModelOutputThunk we already recognise. That message will be confusing to anyone monitoring logs. A dedicated log line would be clearer:
_logger.warning(
"ModelOutputThunk.parsed_repr is %s, not a Message; falling back to value.",
type(c.parsed_repr).__name__,
)More significant: If both parsed_repr and value are None (a ModelOutputThunk that has not been evaluated yet), str(None) silently produces the literal string "None" in the chat history with no warning logged. A backend receiving that will have no indication anything went wrong. Worth guarding explicitly:
if c.value is None:
raise ValueError("ModelOutputThunk has no value and no parsed_repr — was it evaluated?")| return Message(role="assistant", content=content) | ||
| case CBlock(): | ||
| return Message(role="user", content=str(c)) | ||
| case _: |
There was a problem hiding this comment.
More significant: One tricky edge case: ImageBlock inherits from CBlock (see core/base.py:79), so it will fall into this arm and produce Message(role="user", content=str(image_block)) — meaning a raw base64 PNG string ends up in the chat history as a text message. This is a reachable code path if a context with image content is passed in, and the corruption would be invisible downstream.
A simple guard fixes it:
case CBlock():
if type(c) is not CBlock:
return Message(role="user", content=formatter(c))
return Message(role="user", content=str(c))That way any CBlock subclasses not explicitly handled go through the formatter instead.
planetf1
left a comment
There was a problem hiding this comment.
Generally looks good - just came across 2 possible corruption issues. 1 is an edge case, the other affects images. 2 other minors.
Signed-off-by: Akihiko Kuroda <[email protected]>
Co-authored-by: Paul Schweigert <[email protected]>
Misc PR
Type of PR
Description
Add as_generic_chat_history() function to handle heterogeneous contexts with
configurable formatter. Unlike as_chat_history(), this gracefully handles
arbitrary component types by converting them to strings, with optional logging
for unknown types.
Testing
Attribution