Description
Confirm this is an issue with the Python library and not an underlying OpenAI API
- This is an issue with the Python library
Describe the bug
Handling of pydantic v2 is incorrect for structured output.
To Reproduce
The first example in the structured outputs guide (https://platform.openai.com/docs/guides/structured-outputs) is this:
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

response = client.responses.parse(
    model="gpt-4o-2024-08-06",
    input=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    text_format=CalendarEvent,
)

event = response.output_parsed
However, running this with pydantic v2 installed in the virtual environment fails with:
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'CalendarEvent': In context=(), 'additionalProperties' is required to be supplied and to be false.", 'type': 'invalid_request_error', 'param': 'text.format.schema', 'code': 'invalid_json_schema'}}
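To make the error concrete: it says the root object of the schema (context `()`) must set `additionalProperties` to false. The following is a minimal sketch of a local check for that condition. The helper name `find_open_objects` and the hand-written schema are assumptions for illustration, not part of the OpenAI library or the API's actual validator, and the sketch only follows `properties` and `items`, not `$defs` or `anyOf`.

```python
from typing import Any


def find_open_objects(schema: dict[str, Any], path: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
    """Return the paths of object schemas that do not set additionalProperties to False."""
    problems: list[tuple[str, ...]] = []
    if schema.get("type") == "object" and schema.get("additionalProperties") is not False:
        problems.append(path)
    # Recurse into named properties and array items only (hypothetical, simplified walk).
    for name, sub in schema.get("properties", {}).items():
        if isinstance(sub, dict):
            problems.extend(find_open_objects(sub, path + ("properties", name)))
    items = schema.get("items")
    if isinstance(items, dict):
        problems.extend(find_open_objects(items, path + ("items",)))
    return problems


# Hand-written schema roughly matching CalendarEvent (an assumption, not the library's exact output).
calendar_event_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
        "participants": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "date", "participants"],
}
strict_schema = {**calendar_event_schema, "additionalProperties": False}

print(find_open_objects(calendar_event_schema))  # prints [()]
print(find_open_objects(strict_schema))  # prints []
```

The empty tuple in the first result corresponds to the `context=()` in the error message: the offending object is the schema root.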
I traced this down to the library's BaseModel, which contains this code:
if PYDANTIC_V1:

    @property
    @override
    def model_fields_set(self) -> set[str]:
        # a forwards-compat shim for pydantic v2
        return self.__fields_set__  # type: ignore

    class Config(pydantic.BaseConfig):  # pyright: ignore[reportDeprecated]
        extra: Any = pydantic.Extra.allow  # type: ignore

    @override
    def __repr_args__(self) -> ReprArgs:
        # we don't want these attributes to be included when something like `rich.print` is used
        return [arg for arg in super().__repr_args__() if arg[0] not in {"_request_id", "__exclude_fields__"}]
else:
    model_config: ClassVar[ConfigDict] = ConfigDict(
        extra="allow", defer_build=coerce_boolean(os.environ.get("DEFER_PYDANTIC_BUILD", "true"))
    )
That else block looks wrong: it sets extra="allow", whereas the API error asks for additionalProperties to be false.
The workaround to get the example working is:
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

    model_config = ConfigDict(
        extra="forbid", defer_build=coerce_boolean(os.environ.get("DEFER_PYDANTIC_BUILD", "true"))
    )
With that change, the example runs.
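A plausible explanation for why the workaround helps is that pydantic v2 maps the `extra` setting onto `additionalProperties` in the generated JSON schema. The sketch below assumes pydantic v2 is installed. The two model names are made up for illustration, and it only inspects pydantic's own `model_json_schema()` output, not whatever post-processing the OpenAI library applies before sending the request.

```python
from pydantic import BaseModel, ConfigDict


class AllowExtra(BaseModel):
    # Same `extra` value as the library's pydantic v2 branch quoted above.
    model_config = ConfigDict(extra="allow")
    name: str


class ForbidExtra(BaseModel):
    # Same `extra` value as the workaround.
    model_config = ConfigDict(extra="forbid")
    name: str


allow_schema = AllowExtra.model_json_schema()
forbid_schema = ForbidExtra.model_json_schema()

print(allow_schema.get("additionalProperties"))   # prints True
print(forbid_schema.get("additionalProperties"))  # prints False
```

Only the `extra="forbid"` model produces the `additionalProperties: false` that the 400 error asks for.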
I guess the code samples should all be run in two different environments during CI: pydantic v1 and v2.
Code snippets
OS
macOS
Python version
3.12.6
Library version
1.109.0