
Add __get_pydantic_core_schema__ to domain classes. #273

Draft
TeresiaOlsson wants to merge 1 commit into main from pydantic-arbitrary-types


@TeresiaOlsson

Contributor

I have opened this PR to explore using __get_pydantic_core_schema__ to make arbitrary types behave as if they were types natively supported by Pydantic.

So far I have only modified Element to test the idea.

Idea:

  • Make the constructor of the domain classes the source of truth for the attributes and type hints.
  • Make the domain classes support validation and JSON schema generation by adding __get_pydantic_core_schema__ using a decorator.

Examples:

Build the element without validation

from pyaml.common.element import Element

element1 = Element('Q3M2D1R', 'Q3M2D1R', 'test description')
# >> Element(description='test description', name='Q3M2D1R', lattice_names='Q3M2D1R', peer=None)

# No validation happens here: an int is accepted as the name.
element2 = Element(45687, 'Q3M2D1R', 'test description')
# >> Element(description='test description', name=45687, lattice_names='Q3M2D1R', peer=None)

Do validation before building the element

from pydantic import TypeAdapter

data1 = {
    "name": "Q3M2D1R",
    "description": "test description",
}

data2 = {
    "name": 45557,
    "description": "test description",
}

element_adapter = TypeAdapter(Element)

element1 = element_adapter.validate_python(data1)
# >> Element(description='test description', name='Q3M2D1R', lattice_names='Q3M2D1R', peer=None)

# The name in data2 is an int, so validation rejects it.
element2 = element_adapter.validate_python(data2)
# >> ValidationError...

Generate JSON schema

import json

schema = element_adapter.json_schema()
print(json.dumps(schema, indent=2))

>>
{
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "lattice_names": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Lattice Names"
    },
    "description": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Description"
    }
  },
  "required": [
    "name"
  ],
  "type": "object"
}

This is what I have found so far:

Pros:

  • A ConfigModel is no longer needed. You can create an object of a domain class directly.

  • You can use the domain classes directly in nested models and the validation will automatically work as for types already supported by pydantic. For example:

    class Magnet:
        model: MagnetModel

  • Generating JSON schemas works as if the type were already supported by pydantic.

  • The user can first write the domain class and when they are happy with the business logic they can think about the validation.

  • Automatically adding __get_pydantic_core_schema__ to a class is easy with the decorator. This also makes pydantic a loose dependency since you can choose to not use the decorator and the domain class will be pydantic free.
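The decorator described in the last point could look roughly like the sketch below. It is not the code from this PR: the decorator name `pydantic_from_init` and the reduced `Element` class are hypothetical, and the sketch assumes pydantic v2 and a constructor whose parameters are all annotated keyword-capable arguments.

```python
import inspect
from typing import Any, Optional

from pydantic import GetCoreSchemaHandler, TypeAdapter, ValidationError
from pydantic_core import core_schema


def pydantic_from_init(cls):
    """Hypothetical decorator: derive a core schema from cls.__init__ annotations."""
    params = [
        p for name, p in inspect.signature(cls.__init__).parameters.items()
        if name != "self"
    ]
    for p in params:
        if p.annotation is inspect.Parameter.empty:
            raise TypeError(f"{cls.__name__}.__init__: '{p.name}' has no annotation")

    def get_schema(klass, source_type: Any, handler: GetCoreSchemaHandler):
        # One typed-dict field per constructor parameter; parameters
        # without a default value are required.
        fields = {
            p.name: core_schema.typed_dict_field(
                handler.generate_schema(p.annotation),
                required=p.default is inspect.Parameter.empty,
            )
            for p in params
        }
        # Validate a dict of constructor arguments, then build the object.
        return core_schema.no_info_after_validator_function(
            lambda data: klass(**data),
            core_schema.typed_dict_schema(fields),
        )

    cls.__get_pydantic_core_schema__ = classmethod(get_schema)
    return cls


# Reduced stand-in for the real Element class (assumption, for illustration).
@pydantic_from_init
class Element:
    def __init__(self, name: str, description: Optional[str] = None):
        self.name = name
        self.description = description


adapter = TypeAdapter(Element)
element = adapter.validate_python({"name": "Q3M2D1R"})

try:
    adapter.validate_python({"name": 45557})  # int is not accepted as str
except ValidationError:
    rejected = True
else:
    rejected = False
```

Leaving out the decorator keeps the class free of any pydantic import, which is the "loose dependency" property mentioned above.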

Cons:

  • For validation of nested structures to work the same way as for other types, __get_pydantic_core_schema__ must not only return the schema but also perform the validation and build an object of the class. It is possible to have __get_pydantic_core_schema__ return only the schema, but the class will then not fully behave as pydantic expects. Instead of

    class Magnet:
        model: MagnetModel

    it will give

    class Magnet:
        model: dict

This means that __get_pydantic_core_schema__ not only adds validation functionality to the class but also starts acting as a factory, which weakens the separation of concerns. It is possible to ignore this and just return the schema, but it might become confusing to have some types that follow the principles of pydantic and some that do not.

  • The decorator relies on the constructor having annotations. It raises an error if a parameter does not have one, but this can still be a source of bugs since it is only caught at runtime.

  • Not sure how well this will work for more complicated constructors. However, it is always possible for a user to not use the decorator and instead write their own __get_pydantic_core_schema__ in the class. But that requires some pydantic knowledge.
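One way to surface the last two concerns earlier is a small check that can run in a unit test or at decoration time. The following is only a sketch using the standard library; the helper name `constructor_problems` and the two example classes are hypothetical and not part of this PR.

```python
import inspect


def constructor_problems(cls):
    """Hypothetical helper: list reasons cls.__init__ cannot be mapped to a schema."""
    problems = []
    for name, p in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
            # *args / **kwargs carry no per-field type information.
            problems.append(f"{name}: variadic parameters are not supported")
        elif p.annotation is inspect.Parameter.empty:
            problems.append(f"{name}: missing type annotation")
    return problems


class Good:
    def __init__(self, name: str, length: float = 0.0):
        self.name = name
        self.length = length


class Bad:
    def __init__(self, name, *extras, unit: str = "m"):
        self.name = name
        self.extras = extras
        self.unit = unit


good_report = constructor_problems(Good)
bad_report = constructor_problems(Bad)
```

Running such a check over all decorated classes in the test suite would turn a late runtime error into a test failure.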

@TeresiaOlsson TeresiaOlsson marked this pull request as draft June 8, 2026 13:30
@TeresiaOlsson

TeresiaOlsson commented Jun 8, 2026

Contributor Author

@JeanLucPons and @gupichon Can you take a look at what I tested? I only made the changes for Element to get a limited case and see how well it works.

I think this approach has many advantages, but there is one major disadvantage which I feel might make it a bad idea. I discovered that for arbitrary types to act the same as types that are already supported by pydantic, you can't just return the schema; you also need to build an object of the class. Otherwise nested structures are not resolved. Validation and creation now become mixed together.
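The difference can be reproduced in isolation. This is a minimal sketch assuming pydantic v2; the class names `SchemaOnlyModel`, `BuildingModel`, `MagnetA` and `MagnetB` are made up for the illustration and do not exist in the code base.

```python
from pydantic import BaseModel
from pydantic_core import core_schema


def _name_only_schema():
    # Schema for a dict with a single required string field "name".
    return core_schema.typed_dict_schema(
        {"name": core_schema.typed_dict_field(core_schema.str_schema())}
    )


class SchemaOnlyModel:
    """Returns only the schema: validation yields a plain dict."""

    def __init__(self, name: str):
        self.name = name

    @classmethod
    def __get_pydantic_core_schema__(cls, source_type, handler):
        return _name_only_schema()


class BuildingModel:
    """Validates and then builds the object, i.e. also acts as a factory."""

    def __init__(self, name: str):
        self.name = name

    @classmethod
    def __get_pydantic_core_schema__(cls, source_type, handler):
        return core_schema.no_info_after_validator_function(
            lambda data: cls(**data), _name_only_schema()
        )


class MagnetA(BaseModel):
    model: SchemaOnlyModel


class MagnetB(BaseModel):
    model: BuildingModel


a = MagnetA(model={"name": "QF"})
b = MagnetB(model={"name": "QF"})
```

Here `a.model` is a validated `dict`, while `b.model` is a `BuildingModel` instance; only the second variant behaves like a type pydantic supports natively in a nested structure.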

Also, I think it's quite confusing for the user because Element('Q3M2D1R','Q3M2D1R','test description') gives an Element object, but element_adapter.validate_python(data1) does too. It's not easy to understand that this happens, or why.

What do you think? My feeling at the moment is that it is better to keep validation and building objects as two separate concerns and not mix them together like this.

@JeanLucPons

Contributor

What about dynamically creating two schemas (one for validation, one for the JSON schema) as an aggregate of the class, as I did in the other PR?
It should fulfill all requirements. However, it will be more difficult to handle typed lists (or dicts) of arbitrary types, but I think we can restrict our use cases accordingly.

@TeresiaOlsson

TeresiaOlsson commented Jun 8, 2026

Contributor Author

Yes, I think that is a better option. I don't really like the __get_pydantic_core_schema__ approach anymore. What I also like about your suggestion is that I think it allows for flexibility. I think it can be implemented so that you can dynamically create the schema from the constructor but can also have a separate schema class if you prefer, for example if you want to use some special validation functionality from pydantic. So creating it dynamically can be the default, and the separate class an optional possibility if needed.

What do you think about what I did for the __repr__? My thinking was to automatically generate it from the public attributes and the properties and in that way also make the class the source of truth for it.
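As a rough illustration of that idea (not the implementation in this PR), a generated __repr__ could collect public instance attributes and public properties as below. The decorator name `auto_repr` and the reduced `Element` class are assumptions made for the example.

```python
import inspect
from typing import Optional


def auto_repr(cls):
    """Hypothetical decorator: build __repr__ from public attributes and properties."""

    def __repr__(self):
        # Public instance attributes, in the order they were assigned.
        values = {k: v for k, v in vars(self).items() if not k.startswith("_")}
        # Public properties defined on the class (getmembers sorts by name).
        for name, _ in inspect.getmembers(type(self), lambda m: isinstance(m, property)):
            if not name.startswith("_"):
                values[name] = getattr(self, name)
        args = ", ".join(f"{k}={v!r}" for k, v in values.items())
        return f"{type(self).__name__}({args})"

    cls.__repr__ = __repr__
    return cls


@auto_repr
class Element:
    def __init__(self, name: str, description: Optional[str] = None):
        self._name = name  # private, exposed through the property below
        self.description = description

    @property
    def name(self) -> str:
        return self._name


text = repr(Element("Q3M2D1R", "test description"))
# text == "Element(description='test description', name='Q3M2D1R')"
```

Note that evaluating properties inside __repr__ runs their getters, so this only suits properties that are cheap and free of side effects.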

@gupichon

gupichon commented Jun 8, 2026

Contributor

OK, there are some very good pros!
Getting rid of ConfigModel is a good thing in my opinion, but requiring the final element to be built for validation is very annoying to me.
Also, I'm not sure it's a good thing to rely on a Pydantic method such as __get_pydantic_core_schema__, which could become deprecated or subject to frequent behavior changes.
I completely agree with you about separating validation and object building. This will allow the configuration to live its own life and be easily adopted by users by being improved independently of the core. Also, it will make it possible to directly build the final object for facilities that already have models with all necessary data.

@TeresiaOlsson

Contributor Author

With @JeanLucPons's suggestion I think we can get rid of the ConfigModel and make the domain class the source of truth. We can have a register_schema decorator which dynamically generates the schema from the constructor if you give it no input, and which registers a separate schema class if you give it one. Then you can choose which option to go for depending on the use case for the specific class and how much custom validation it needs.

I will make a new PR tomorrow for that option.

@JeanLucPons

Contributor

Yes, I think that is a better option....

Yes, it allows flexibility, we get rid of ConfigModel, users need no pydantic knowledge, and it is easy to disable, but:

  • Some potential difficulty implementing the JSON schema, especially for compound arbitrary types.
  • Possible loss of completion in IPython. It relies on the constructed object, so overriding the constructor may end up with a (*args, **kwargs) signature. To be tested.
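The signature concern can be checked with the standard library alone. The sketch below uses two hypothetical decorators that both replace __init__; it assumes that interactive tools rely on introspection along the lines of `inspect.signature`, which would still need to be confirmed for IPython itself.

```python
import functools
import inspect


def wrap_init_plain(cls):
    """Replace __init__ with a generic wrapper: the signature is lost."""
    original = cls.__init__

    def __init__(self, *args, **kwargs):
        original(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


def wrap_init_preserving(cls):
    """Same wrapper, but functools.wraps records the original via __wrapped__."""
    original = cls.__init__

    @functools.wraps(original)
    def __init__(self, *args, **kwargs):
        original(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


@wrap_init_plain
class ElementA:
    def __init__(self, name: str, description: str = ""):
        self.name = name
        self.description = description


@wrap_init_preserving
class ElementB:
    def __init__(self, name: str, description: str = ""):
        self.name = name
        self.description = description


plain_params = list(inspect.signature(ElementA.__init__).parameters)
kept_params = list(inspect.signature(ElementB.__init__).parameters)
# plain_params == ['self', 'args', 'kwargs']
# kept_params == ['self', 'name', 'description']
```

`inspect.signature` follows the `__wrapped__` attribute set by `functools.wraps`, so the second variant still reports the original parameters.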
