Chat Module

The Chat module provides a comprehensive chat completion API with history management, formatting, streaming capabilities, function calling, and multimodal support.

Core Classes

Chat Client

class lexilux.chat.client.Chat(*, base_url, api_key=None, model=None, timeout_s=60.0, connect_timeout_s=None, read_timeout_s=None, max_retries=0, headers=None, proxies=None, rate_limit=None)[source]

Bases: BaseAPIClient

Chat API client.

Provides a simple, function-like API for chat completions with support for both non-streaming and streaming responses.

Important: Chat is STATELESS - each call is independent. For multi-turn conversations, use ChatHistory to manage context and pass it via the history parameter.

Method Overview:
  • chat() / acall(): Single request (may be truncated)

  • stream() / astream(): Streaming response (may be truncated)

  • complete() / acomplete(): Auto-continue if truncated

  • complete_stream() / acomplete_stream(): Streaming + auto-continue

Related Classes:
  • ChatHistory: Manages conversation state (pass via history parameter)

  • Conversation: Low-level utility for handling truncated responses (use chat.complete() instead for simplicity)

Examples

>>> # Simple single-turn query
>>> chat = Chat(base_url="...", api_key="...", model="gpt-4")
>>> result = chat("Hello, world!")
>>> print(result.text)
>>> # Streaming
>>> for chunk in chat.stream("Tell me a joke"):
...     print(chunk.delta, end="")
>>> # Multi-turn conversation (use ChatHistory)
>>> from lexilux import ChatHistory
>>> history = ChatHistory(system="You are helpful")
>>> history.add_user("My name is Alice")
>>> result = chat(history.get_messages())
>>> history.add_assistant(result.text)
>>> history.add_user("What's my name?")
>>> result = chat(history.get_messages())  # AI remembers!
>>> # Long content (auto-continue)
>>> result = chat.complete("Write an essay", max_tokens=100)
__init__(*, base_url, api_key=None, model=None, timeout_s=60.0, connect_timeout_s=None, read_timeout_s=None, max_retries=0, headers=None, proxies=None, rate_limit=None)[source]

Initialize Chat client.

Parameters:
  • base_url (str) – Base URL for the API (e.g., “https://api.openai.com/v1”).

  • api_key (str | None) – API key for authentication (optional if provided in headers).

  • model (str | None) – Default model to use (can be overridden in __call__).

  • timeout_s (float) – Request timeout in seconds (default for both connect and read).

  • connect_timeout_s (float | None) – Connection timeout in seconds (overrides timeout_s).

  • read_timeout_s (float | None) – Read timeout in seconds (overrides timeout_s).

  • max_retries (int) – Maximum number of retries for failed requests (default: 0).

  • headers (dict[str, str] | None) – Additional headers to include in requests.

  • proxies (dict[str, str] | None) – Optional proxy configuration dict (e.g., {"http": "http://proxy:port"}). If None, uses environment variables (HTTP_PROXY, HTTPS_PROXY). To disable proxies, pass {}.

  • rate_limit (tuple[int, float] | None) – Optional rate limiting as (max_rate, time_period) tuple. Example: (10, 60.0) for 10 requests per 60 seconds. Requires aiolimiter to be installed.
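As a rough illustration of how the three timeout arguments relate, the sketch below resolves them into a (connect, read) pair. resolve_timeouts is a hypothetical helper written for this page, not part of lexilux; it only mirrors the override rule described in the parameter list above.

```python
from __future__ import annotations


def resolve_timeouts(
    timeout_s: float = 60.0,
    connect_timeout_s: float | None = None,
    read_timeout_s: float | None = None,
) -> tuple[float, float]:
    """Return (connect, read) timeouts; a specific value overrides timeout_s."""
    connect = timeout_s if connect_timeout_s is None else connect_timeout_s
    read = timeout_s if read_timeout_s is None else read_timeout_s
    return (connect, read)


# Only timeout_s given: it is used for both phases.
print(resolve_timeouts())
# A short connect timeout combined with a longer shared default.
print(resolve_timeouts(30.0, connect_timeout_s=5.0))
```

A short connect timeout with a longer read timeout is a common choice for streaming calls, where the first byte should arrive quickly but the full response may take a while.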

Note

Each HTTP request creates a new connection that closes after completion.

property timeout_s: float

Backward compatibility property for timeout.

Returns the timeout value (or read timeout if tuple).

__call__(messages, *, history=None, model=None, system=None, temperature=None, top_p=None, max_tokens=None, stop=None, presence_penalty=None, frequency_penalty=None, logit_bias=None, user=None, n=None, tools=None, tool_choice=None, parallel_tool_calls=None, params=None, extra=None, reasoning=None, return_raw=False)[source]

Make a single chat completion request.

History is read-only - used for context but never modified.

stream(messages, *, history=None, model=None, system=None, temperature=None, top_p=None, max_tokens=None, stop=None, presence_penalty=None, frequency_penalty=None, logit_bias=None, user=None, tools=None, tool_choice=None, parallel_tool_calls=None, params=None, extra=None, reasoning=None, include_usage=True, return_raw_events=False, include_reasoning=False)[source]

Stream a single chat completion response.

History is read-only - used for context but never modified.

async acall(messages, *, history=None, model=None, system=None, temperature=None, top_p=None, max_tokens=None, stop=None, presence_penalty=None, frequency_penalty=None, logit_bias=None, user=None, n=None, tools=None, tool_choice=None, parallel_tool_calls=None, params=None, extra=None, reasoning=None, return_raw=False)[source]

Make an async chat completion request.

History is read-only - used for context but never modified.

async astream(messages, *, history=None, model=None, system=None, temperature=None, top_p=None, max_tokens=None, stop=None, presence_penalty=None, frequency_penalty=None, logit_bias=None, user=None, tools=None, tool_choice=None, parallel_tool_calls=None, params=None, extra=None, reasoning=None, include_usage=True, return_raw_events=False, include_reasoning=False)[source]

Stream an async chat completion response.

History is read-only - used for context but never modified.

complete(messages, *, history=None, max_continues=5, ensure_complete=True, continue_prompt='continue', on_progress=None, continue_delay=0.0, on_error='raise', on_error_callback=None, **params)[source]

Ensure a complete response, automatically handling truncation.

Behavior: Automatically continues generation if the response is truncated, ensuring the returned result is complete (or raises an exception).

History Immutability: If history is provided, a clone is created and used internally. The original history is never modified.

History Management:
  • If history is provided, uses it (for multi-turn conversations)

  • If history is None, creates a new history internally (for single-turn conversations)

  • The history is automatically updated with the prompt and response

Use this when:
  • You need a complete response (e.g., JSON extraction)

  • You cannot accept partial responses

  • Reliability is more important than performance

For single responses (even if truncated), use chat() instead.
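The auto-continue behavior can be pictured with a small stand-alone loop. This is a conceptual sketch only: FakeResult, fake_chat and complete_sketch are hypothetical stand-ins, no network call is made, and the sketch assumes that a truncated response is one whose finish_reason is "length". The real method also handles history, delays and error callbacks.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in for ChatResult with only the fields this sketch needs."""
    text: str
    finish_reason: str


def fake_chat(prompt: str, parts: list) -> FakeResult:
    """Pretend API call: returns the next piece of a pre-split answer."""
    piece = parts.pop(0)
    return FakeResult(piece, "length" if parts else "stop")


def complete_sketch(prompt, parts, max_continues=5, ensure_complete=True):
    result = fake_chat(prompt, parts)
    text = result.text
    continues = 0
    # Assumption: a response counts as truncated when finish_reason == "length".
    while result.finish_reason == "length" and continues < max_continues:
        continues += 1
        result = fake_chat("continue", parts)
        text += result.text  # merge each continuation into one text
    if result.finish_reason == "length" and ensure_complete:
        raise RuntimeError("still truncated after max_continues")
    return FakeResult(text, result.finish_reason)


final = complete_sketch("Write JSON", ['{"a": ', '1, "b"', ': 2}'])
print(final.text)
```

With ensure_complete=False the same loop returns whatever text was collected when max_continues runs out, which matches the partial-result behavior described for that flag.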

Parameters:
  • messages (str | Sequence[str | dict[str, str] | dict[str, Any]]) – Input messages.

  • history (ChatHistory | None) – Optional ChatHistory instance. If None, creates a new one internally.

  • max_continues (int) – Maximum number of continuation attempts.

  • ensure_complete (bool) – If True, raises ChatIncompleteResponseError if result is still truncated after max_continues. If False, returns partial result.

  • continue_prompt (str | Callable) – User prompt for continuation requests. Can be a string or a callable with signature: (count: int, max_count: int, current_text: str, original_prompt: str) -> str

  • on_progress (Callable | None) – Optional progress callback function with signature: (count: int, max_count: int, current_result: ChatResult, all_results: List[ChatResult]) -> None

  • continue_delay (float | tuple[float, float]) – Delay between continue requests (seconds). Can be a float (fixed delay) or tuple (min, max) for random delay. Delay is only applied after the first continue.

  • on_error (str) – Error handling strategy: “raise” (default) or “return_partial”.

  • on_error_callback (Callable | None) – Optional error callback function with signature: (error: Exception, partial_result: ChatResult) -> dict

  • params (Any) – Additional parameters to pass to chat and continue requests.
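The two callback signatures above can be satisfied with plain functions. The sketch below defines one of each and calls them directly so that it runs without an API key; the prompt wording and the progress_log list are illustrative choices, not library defaults.

```python
def continue_prompt(count: int, max_count: int, current_text: str, original_prompt: str) -> str:
    """Build the user prompt for continuation attempt `count` of `max_count`."""
    tail = current_text[-40:]  # show the model where the text stopped
    return (
        f"Continue the answer to: {original_prompt!r} "
        f"(part {count}/{max_count}). Resume right after: {tail!r}"
    )


progress_log = []


def on_progress(count, max_count, current_result, all_results) -> None:
    """Record each continuation; the last two arguments carry ChatResult objects."""
    progress_log.append(f"{count}/{max_count}")


# Called directly here to show what they produce. In real use they would be
# passed as chat.complete(..., continue_prompt=continue_prompt, on_progress=on_progress).
print(continue_prompt(1, 5, '{"items": [1, 2', "Write JSON"))
on_progress(1, 5, None, [])
print(progress_log)
```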

Returns:

Complete ChatResult. It can still be truncated if max_continues is exhausted with ensure_complete=False.

Raises:

ChatIncompleteResponseError – If ensure_complete=True and result is still truncated after max_continues.

Return type:

ChatResult

Examples

Single-turn conversation (no history needed):

>>> result = chat.complete("Write a long JSON", max_tokens=100)
>>> import json
>>> json_data = json.loads(result.text)  # Response is complete

Multi-turn conversation (provide history):

>>> history = ChatHistory()
>>> result1 = chat.complete("First question", history=history)
>>> result2 = chat.complete("Follow-up question", history=history)

With progress tracking:

>>> def on_progress(count, max_count, current, all_results):
...     print(f"Continuing generation {count}/{max_count}...")
>>> result = chat.complete("Write JSON", on_progress=on_progress)

complete_stream(messages, *, history=None, max_continues=5, ensure_complete=True, continue_prompt='continue', on_progress=None, continue_delay=0.0, on_error='raise', on_error_callback=None, **params)[source]

Stream a complete response, automatically handling truncation.

Behavior: Automatically continues streaming if the response is truncated, ensuring the final result is complete (or raises an exception).

History Immutability: If history is provided, a clone is created and used internally. The original history is never modified.

History Management:
  • If history is provided, uses it (for multi-turn conversations)

  • If history is None, creates a new history internally (for single-turn conversations)

  • The history is automatically updated with the prompt and response

Use this when:
  • You need a complete response with real-time output

  • You cannot accept partial responses

  • You want both streaming and completeness

For single streaming responses (even if truncated), use chat.stream() instead.

Parameters:
  • messages (str | Sequence[str | dict[str, str] | dict[str, Any]]) – Input messages.

  • history (ChatHistory | None) – Optional ChatHistory instance. If None, creates a new one internally.

  • max_continues (int) – Maximum number of continuation attempts.

  • ensure_complete (bool) – If True, raises ChatIncompleteResponseError if result is still truncated after max_continues. If False, returns partial result.

  • continue_prompt (str | Callable) – User prompt for continuation requests. Can be a string or a callable with signature: (count: int, max_count: int, current_text: str, original_prompt: str) -> str

  • on_progress (Callable | None) – Optional progress callback function with signature: (count: int, max_count: int, current_result: ChatResult, all_results: List[ChatResult]) -> None

  • continue_delay (float | tuple[float, float]) – Delay between continue requests (seconds). Can be a float (fixed delay) or tuple (min, max) for random delay. Delay is only applied after the first continue.

  • on_error (str) – Error handling strategy: “raise” (default) or “return_partial”.

  • on_error_callback (Callable | None) – Optional error callback function with signature: (error: Exception, partial_result: ChatResult) -> dict

  • params (Any) – Additional parameters to pass to chat and continue requests.
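The two accepted shapes of continue_delay can be handled with one small function. This sketch is not the library's code: pick_delay is a hypothetical helper, and it reads "only applied after the first continue" as meaning that the first continuation request is sent without waiting.

```python
import random


def pick_delay(continue_delay, continue_index: int) -> float:
    """Return seconds to wait before continuation number `continue_index` (1-based)."""
    # Interpretation used here: no wait before the first continue.
    if continue_index <= 1:
        return 0.0
    if isinstance(continue_delay, tuple):
        low, high = continue_delay
        return random.uniform(low, high)  # random delay within [low, high]
    return float(continue_delay)  # fixed delay


print(pick_delay(2.0, 1))
print(pick_delay(2.0, 2))
print(pick_delay((1.0, 3.0), 3))
```

A random range is mainly useful when many clients continue at the same time and fixed delays would make their retries line up.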

Returns:

Iterator that yields ChatStreamChunk objects from the initial request and all continue requests. Access the accumulated result via iterator.result.

Return type:

StreamingIterator

Raises:

ChatIncompleteResponseError – If ensure_complete=True and result is still truncated after max_continues.

Examples

Single-turn conversation (no history needed):

>>> iterator = chat.complete_stream("Write a long JSON", max_tokens=100)
>>> for chunk in iterator:
...     print(chunk.delta, end="", flush=True)
>>> result = iterator.result.to_chat_result()
>>> import json
>>> json_data = json.loads(result.text)  # Response is complete

Multi-turn conversation (provide history):

>>> history = ChatHistory()
>>> iterator1 = chat.complete_stream("First question", history=history)
>>> iterator2 = chat.complete_stream("Follow-up", history=history)

async acomplete(messages, *, history=None, max_continues=5, ensure_complete=True, continue_prompt='continue', on_progress=None, continue_delay=0.0, on_error='raise', on_error_callback=None, **params)[source]

Async version of complete().

Ensure a complete response asynchronously, automatically handling truncation.

Behavior: Automatically continues generation if the response is truncated, ensuring the returned result is complete (or raises an exception).

History Immutability: If history is provided, a clone is created and used internally. The original history is never modified.

Parameters:
  • messages (str | Sequence[str | dict[str, str] | dict[str, Any]]) – Input messages.

  • history (ChatHistory | None) – Optional ChatHistory instance.

  • max_continues (int) – Maximum number of continuation attempts.

  • ensure_complete (bool) – If True, raises ChatIncompleteResponseError if result is still truncated after max_continues.

  • continue_prompt (str | Callable) – User prompt for continuation requests.

  • on_progress (Callable | None) – Optional progress callback function.

  • continue_delay (float | tuple[float, float]) – Delay between continue requests (seconds).

  • on_error (str) – Error handling strategy: “raise” (default) or “return_partial”.

  • on_error_callback (Callable | None) – Optional error callback function.

  • params (Any) – Additional parameters to pass to chat and continue requests.

Returns:

Complete ChatResult. It can still be truncated if max_continues is exhausted with ensure_complete=False.

Return type:

ChatResult

Examples

>>> result = await chat.acomplete("Write a long JSON", max_tokens=100)
>>> import json
>>> json_data = json.loads(result.text)  # Response is complete
async acomplete_stream(messages, *, history=None, max_continues=5, ensure_complete=True, continue_prompt='continue', on_progress=None, continue_delay=0.0, on_error='raise', on_error_callback=None, **params)[source]

Async version of complete_stream().

Stream a complete response asynchronously, automatically handling truncation.

Behavior: Automatically continues streaming if the response is truncated, ensuring the final result is complete (or raises an exception).

History Immutability: If history is provided, a clone is created and used internally. The original history is never modified.

Parameters:
  • messages (str | Sequence[str | dict[str, str] | dict[str, Any]]) – Input messages.

  • history (ChatHistory | None) – Optional ChatHistory instance.

  • max_continues (int) – Maximum number of continuation attempts.

  • ensure_complete (bool) – If True, raises ChatIncompleteResponseError if result is still truncated after max_continues.

  • continue_prompt (str | Callable) – User prompt for continuation requests.

  • on_progress (Callable | None) – Optional progress callback function.

  • continue_delay (float | tuple[float, float]) – Delay between continue requests (seconds).

  • on_error (str) – Error handling strategy: “raise” (default) or “return_partial”.

  • on_error_callback (Callable | None) – Optional error callback function.

  • params (Any) – Additional parameters to pass to chat and continue requests.

Returns:

Async iterator that yields ChatStreamChunk objects.

Return type:

AsyncStreamingIterator

Examples

>>> iterator = await chat.acomplete_stream("Write JSON")
>>> async for chunk in iterator:
...     print(chunk.delta, end="", flush=True)
>>> result = iterator.result.to_chat_result()
chat_with_history(history, message=None, **params)[source]

Make a chat completion request using history.

This is a convenience method. You can also use:

>>> chat(message, history=history, **params)

Parameters:
  • history (ChatHistory) – ChatHistory instance to use.

  • message (str | dict | None) – Optional new message to add. If None, uses history as-is.

  • **params – Additional parameters to pass to __call__.

Returns:

ChatResult from the API call.

Return type:

ChatResult

Examples

>>> history = ChatHistory.from_messages("Hello")
>>> result = chat.chat_with_history(history, temperature=0.7)
>>> # Or with a new message:
>>> result = chat.chat_with_history(history, "Continue", temperature=0.7)
stream_with_history(history, message=None, **params)[source]

Make a streaming chat completion request using history.

This is a convenience method. You can also use:

>>> chat.stream(message, history=history, **params)

Parameters:
  • history (ChatHistory) – ChatHistory instance to use.

  • message (str | dict | None) – Optional new message to add. If None, uses history as-is.

  • **params – Additional parameters to pass to stream().

Returns:

StreamingIterator for the streaming response.

Return type:

StreamingIterator

Examples

>>> history = ChatHistory.from_messages("Hello")
>>> iterator = chat.stream_with_history(history, temperature=0.7)
>>> # Or with a new message:
>>> iterator = chat.stream_with_history(history, "Continue", temperature=0.7)
>>> for chunk in iterator:
...     print(chunk.delta, end="")

Result Models

class lexilux.chat.models.ChatResult(*, text, usage, finish_reason=None, tool_calls=None, raw=None, reasoning=None)[source]

Bases: ResultBase

Chat completion result (non-streaming).

text

The generated text content.

tool_calls

List of function/tool calls initiated by the model.

finish_reason

Reason why the generation stopped. Possible values:
  • “stop”: Model stopped naturally or hit stop sequence

  • “length”: Reached max_tokens limit

  • “content_filter”: Content was filtered

  • “tool_calls”: Model initiated tool call(s)

  • None: Unknown or not provided

usage

Usage statistics.

raw

Raw API response.

Important Notes:
  • finish_reason is only available when the API successfully returns a response.

  • If network connection is interrupted, an exception will be raised (requests.RequestException, ConnectionError, TimeoutError, etc.) and no ChatResult will be returned.

  • To distinguish network errors from normal completion:

    * Network error: Exception is raised, no ChatResult returned
    * Normal completion: ChatResult returned with finish_reason set

  • Tool calls: When tool_calls is non-empty, text may be empty or contain supplementary text alongside the function calls.
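One way to act on these values is a small decision function that runs after a successful call. The sketch below is an illustrative pattern rather than a lexilux feature: next_action and the action names it returns are hypothetical, and only the finish_reason values come from the list above.

```python
def next_action(finish_reason, has_tool_calls: bool = False) -> str:
    """Map a finish_reason (and tool-call presence) to a follow-up step."""
    if has_tool_calls or finish_reason == "tool_calls":
        return "run_tools"        # execute the requested functions
    if finish_reason == "length":
        return "continue"         # response hit max_tokens and is truncated
    if finish_reason == "content_filter":
        return "report_filtered"  # content was filtered by the provider
    if finish_reason == "stop":
        return "done"             # natural stop or stop sequence
    return "inspect_raw"          # None or an unrecognized provider value


for reason in ("stop", "length", "tool_calls", None):
    print(reason, "->", next_action(reason))
```

Network failures never reach this function, because in that case an exception is raised and no ChatResult exists to inspect.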

Examples

>>> result = chat("Hello")
>>> print(result.text)
Hello! How can I help you?
>>> print(result.usage.total_tokens)
42
>>> print(result.finish_reason)
stop
>>> # Handling tool calls:
>>> result = chat("What's the weather in Paris?", tools=[get_weather_tool])
>>> if result.has_tool_calls:
...     for tc in result.tool_calls:
...         print(f"Call: {tc.name} with args: {tc.get_arguments()}")
>>> # Handling network errors:
>>> import requests
>>> try:
...     result = chat("Hello")
...     print(f"Finished: {result.finish_reason}")
... except requests.RequestException as e:
...     print(f"Network error: {e}")
...     # No finish_reason available - connection failed
__init__(*, text, usage, finish_reason=None, tool_calls=None, raw=None, reasoning=None)[source]

Initialize ChatResult.

Parameters:
  • text (str) – Generated text content.

  • usage (Usage) – Usage statistics.

  • finish_reason (str | None) – Reason why generation stopped.

  • tool_calls (list[ToolCall] | None) – List of tool calls initiated by the model.

  • raw (dict[str, Any] | None) – Raw API response.

  • reasoning (str | None) – Reasoning/thinking content (for models with extended thinking).

property has_tool_calls: bool

Check if result contains tool calls.

Returns:

True if tool_calls is non-empty.

Examples

>>> result = chat("...", tools=[tool])
>>> if result.has_tool_calls:
...     # Handle tool calls
...     pass
property has_reasoning: bool

Check if result contains reasoning content.

Returns:

True if reasoning is non-empty.

Examples

>>> result = chat("...", reasoning=True)
>>> if result.has_reasoning:
...     print(result.reasoning)
__str__()[source]

Return the text content when converted to string.

__repr__()[source]

Return string representation.

class lexilux.chat.models.ChatStreamChunk(*, delta, usage, done, finish_reason=None, tool_calls=None, streaming_tool_calls=None, raw=None, reasoning_content=None, reasoning_tokens=None)[source]

Bases: ResultBase

Chat streaming chunk.

Each chunk in a streaming response contains:

  • delta: The incremental text content (may be empty)

  • tool_calls: Incremental tool call data (may be empty)

  • done: Whether this is the final chunk

  • finish_reason: Reason why generation stopped (only set when done=True). Possible values:

    - “stop”: Model stopped naturally or hit stop sequence
    - “length”: Reached max_tokens limit
    - “content_filter”: Content was filtered
    - “tool_calls”: Model initiated tool call(s)
    - None: Still generating (intermediate chunks), [DONE] message, or unknown

  • usage: Usage statistics (may be empty/None for intermediate chunks, complete only in the final chunk when include_usage=True)

delta

Incremental text content.

tool_calls

List of incremental tool call data (for streaming tool calls).

done

Whether this is the final chunk.

finish_reason

Reason why generation stopped (None for intermediate chunks).

usage

Usage statistics (may be incomplete for intermediate chunks).

raw

Raw chunk data.

Important Notes:
  • finish_reason is only available when the API successfully completes.

  • If network connection is interrupted, an exception will be raised (requests.RequestException, ConnectionError, TimeoutError, etc.) and no chunk with finish_reason will be received.

  • To distinguish network errors from normal completion:

    * Network error: Exception is raised, no done=True chunk received
    * Normal completion: done=True chunk received with finish_reason set
    * Incomplete stream: Exception raised after receiving some chunks

  • Tool calls in streaming: Tool call data is streamed incrementally. Multiple chunks may be needed to assemble complete tool calls.
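To make the incremental assembly concrete, the sketch below joins argument fragments and treats the tool call as complete once the joined string parses as JSON. The fragment list is invented for illustration and assemble_arguments is a hypothetical helper; the library exposes its own state through streaming_tool_calls.

```python
import json


def assemble_arguments(fragments):
    """Join argument fragments; return the parsed dict, or None while incomplete."""
    buffer = "".join(fragments)
    try:
        return json.loads(buffer)
    except json.JSONDecodeError:
        return None  # JSON is not complete yet


# Fragments as they might arrive across three chunks (made-up data).
fragments = ['{"loc', 'ation": "Pa', 'ris"}']
seen = []
for fragment in fragments:
    seen.append(fragment)
    parsed = assemble_arguments(seen)
    print(len("".join(seen)), parsed)
```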

Examples

>>> for chunk in chat.stream("Hello"):
...     print(chunk.delta, end="")
...     if chunk.done:
...         print(f"\nUsage: {chunk.usage.total_tokens}")
...         print(f"Finish reason: {chunk.finish_reason}")
>>> # Handling tool calls in streaming:
>>> for chunk in chat.stream("What's the weather?", tools=[tool]):
...     if chunk.has_tool_calls:
...         for tc in chunk.tool_calls:
...             print(f"Tool call: {tc.name}")
>>> # Handling network errors:
>>> import requests
>>> try:
...     iterator = chat.stream("Hello")
...     for chunk in iterator:
...         if chunk.done:
...             break
... except requests.RequestException as e:
...     print(f"\nNetwork error: {e}")
__init__(*, delta, usage, done, finish_reason=None, tool_calls=None, streaming_tool_calls=None, raw=None, reasoning_content=None, reasoning_tokens=None)[source]

Initialize ChatStreamChunk.

Parameters:
  • delta (str) – Incremental text content.

  • usage (Usage) – Usage statistics.

  • done (bool) – Whether this is the final chunk.

  • finish_reason (str | None) – Reason why generation stopped.

  • tool_calls (list[ToolCall] | None) – List of complete tool calls (valid JSON arguments).

  • streaming_tool_calls (list[StreamingToolCall] | None) – List of streaming tool call states (may be incomplete).

  • raw (dict[str, Any] | None) – Raw chunk data.

  • reasoning_content (str | None) – Reasoning/thinking content (OpenAI o1/Claude 3.5/DeepSeek).

  • reasoning_tokens (int | None) – Token count for reasoning content.

property has_content: bool

Check if chunk contains text content.

Returns:

True if delta is non-empty.

Examples

>>> chunk = ChatStreamChunk(delta="Hello", usage=Usage(), done=False)
>>> chunk.has_content
True
property has_tool_calls: bool

Check if chunk contains complete tool call data.

Returns:

True if tool_calls is non-empty.

Examples

>>> chunk = ChatStreamChunk(
...     delta="",
...     usage=Usage(),
...     done=False,
...     tool_calls=[ToolCall(...)]
... )
>>> chunk.has_tool_calls
True
property reasoning: str

Get reasoning content delta (alias for reasoning_content).

Returns:

Reasoning delta string (empty if none).

Examples

>>> for chunk in chat.stream("...", reasoning=True):
...     if chunk.reasoning:
...         print(chunk.reasoning, end="")
property has_reasoning: bool

Check if chunk contains reasoning content.

Returns:

True if reasoning_content is non-empty.

Examples

>>> for chunk in chat.stream("...", reasoning=True):
...     if chunk.has_reasoning:
...         print(f"Reasoning: {chunk.reasoning}")
property has_streaming_tool_calls: bool

Check if chunk contains streaming tool call data.

Returns:

True if streaming_tool_calls is non-empty.

Examples

>>> for chunk in chat.stream(..., tools=[...]):
...     if chunk.has_streaming_tool_calls:
...         for stc in chunk.streaming_tool_calls:
...             print(f"Tool: {stc.name}, progress: {stc.arguments_length}")
__repr__()[source]

Return string representation.

class lexilux.chat.models.ToolCall(id, call_id, name, arguments)[source]

Bases: object

Represents a function/tool call initiated by the model.

When the model decides to call a function, it returns one or more ToolCall objects that specify which function to call and with what arguments.

Examples

>>> tool_call = ToolCall(
...     id="call_abc123",
...     call_id="call_abc123",
...     name="get_weather",
...     arguments='{"location": "Paris", "units": "celsius"}'
... )
>>> args = tool_call.get_arguments()
>>> args
{'location': 'Paris', 'units': 'celsius'}
id: str
call_id: str
name: str
arguments: str
get_arguments()[source]

Parse and return the arguments as a dictionary.

Returns:

Parsed arguments dictionary.

Raises:

json.JSONDecodeError – If arguments string is not valid JSON.

Return type:

dict[str, Any]

Examples

>>> tc = ToolCall(
...     id="call_1",
...     call_id="call_1",
...     name="get_weather",
...     arguments='{"location": "Paris"}'
... )
>>> tc.get_arguments()
{'location': 'Paris'}
to_dict()[source]

Convert to API format.

Returns:

Dictionary in OpenAI tool call format.

Return type:

dict[str, Any]

Examples

>>> tc = ToolCall(
...     id="call_1",
...     call_id="call_1",
...     name="get_weather",
...     arguments='{"location": "Paris"}'
... )
>>> tc.to_dict()
{'id': 'call_1', 'type': 'function', 'function': {'name': 'get_weather', 'arguments': '{"location": "Paris"}'}}
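A dictionary in this format is enough to route a call to local code. The following sketch assumes a hypothetical get_weather function and REGISTRY mapping; only the dictionary layout is taken from the to_dict() output shown above.

```python
import json


def get_weather(location: str, units: str = "celsius") -> str:
    """Hypothetical local function that the model is allowed to call."""
    return f"Weather for {location} in {units}"


# Maps tool names to the Python callables that implement them.
REGISTRY = {"get_weather": get_weather}


def dispatch(tool_call: dict) -> str:
    """Look up the named function and call it with the decoded arguments."""
    function = tool_call["function"]
    handler = REGISTRY[function["name"]]
    kwargs = json.loads(function["arguments"])  # arguments arrive as a JSON string
    return handler(**kwargs)


call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'},
}
print(dispatch(call))
```

In production code the lookup and the json.loads call would normally be wrapped in error handling, since the model can name an unknown tool or emit invalid JSON.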
__init__(id, call_id, name, arguments)

Parameter Configuration

class lexilux.chat.params.ChatParams(temperature=0.7, top_p=1.0, max_tokens=None, stop=None, presence_penalty=0.0, frequency_penalty=0.0, logit_bias=None, user=None, n=1, tools=None, tool_choice=None, parallel_tool_calls=None, reasoning=None, extra=None, param_aliases=None)[source]

Bases: object

Standard parameters for chat completion requests.

This class defines the most commonly used parameters for OpenAI-compatible chat completion APIs. All parameters are optional and have sensible defaults.
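The value ranges documented for the attributes of this class can be checked on the client before a request is sent. The helper below is a hypothetical sketch, not part of lexilux, and it is not known whether the library performs such validation itself; the limits are copied from the attribute descriptions in this section.

```python
def check_sampling_params(
    temperature=0.7,
    top_p=1.0,
    presence_penalty=0.0,
    frequency_penalty=0.0,
    stop=None,
    logit_bias=None,
):
    """Return a list of problems; an empty list means every range holds."""
    problems = []
    if not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be between 0 and 2")
    if not 0.0 <= top_p <= 1.0:
        problems.append("top_p must be between 0.0 and 1.0")
    for name, value in (
        ("presence_penalty", presence_penalty),
        ("frequency_penalty", frequency_penalty),
    ):
        if not -2.0 <= value <= 2.0:
            problems.append(f"{name} must be between -2.0 and 2.0")
    # stop may be a single string or a sequence of strings.
    stop_list = [stop] if isinstance(stop, str) else list(stop or [])
    if len(stop_list) > 4:
        problems.append("stop accepts at most 4 sequences")
    for token_id, bias in (logit_bias or {}).items():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias[{token_id}] must be between -100 and 100")
    return problems


print(check_sampling_params(temperature=0.5, stop=["\n\n", "Human:"]))
print(check_sampling_params(temperature=2.5, logit_bias={50256: -150}))
```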

temperature

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Default: 0.7

Type:

float

top_p

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. Range: 0.0 to 1.0. Default: 1.0

Type:

float

max_tokens

The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length. Default: None (no limit, up to model’s maximum)

Type:

int | None

stop

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. Can be a single string or a list of strings. Default: None

Type:

str | Sequence[str] | None

presence_penalty

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics. Default: 0.0

Type:

float

frequency_penalty

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim. Default: 0.0

Type:

float

logit_bias

Modify the likelihood of specified tokens appearing in the completion. Accepts a dictionary mapping token IDs (integers) to an associated bias value from -100 to 100. Values around -100 should decrease the likelihood of the token appearing, while values around 100 should increase it. Default: None (empty dict)

Type:

dict[int, float] | None

user

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. This is useful for tracking and rate limiting. Default: None

Type:

str | None

n

How many chat completion choices to generate for each input message. Note: Most implementations return only the first choice. This parameter is included for compatibility but may not be fully supported by all providers. Default: 1

Type:

int

tools

List of tools (functions) that the model may call. Enables function calling capabilities. When provided, the model can decide to call these functions instead of or in addition to generating text. Default: None (no tools)

Type:

list[Tool] | None

tool_choice

Controls when the model uses tools. Can be “auto” (model decides), “required” (must call tools), a specific tool name, or a ToolChoice object. Default: None (auto mode)

Type:

str | ToolChoice | None

parallel_tool_calls

Whether to enable parallel function calling. When True, the model may call multiple functions in a single turn. Default: None (provider default)

Type:

bool | None

extra

Additional custom parameters for OpenAI-compatible servers that may accept non-standard parameters. These will be merged into the request payload.

Common use cases: - Provider-specific experimental features - Custom provider options not in OpenAI standard - Specialized model behaviors (e.g., response format, seed)

For parameter name mapping, use param_aliases instead of extra when the provider uses standard OpenAI-compatible parameters with different keys.

Default: None (empty dict)

Type:

dict[str, Any] | None

param_aliases

Parameter name mapping for edge cases where providers use different names for standard OpenAI parameters. Most providers won’t need this feature.

Maps standard OpenAI parameter names to provider-specific names. Applied after standard parameter processing but before extra merging.

Example (rare cases):
>>> params = ChatParams(
...     temperature=0.7,
...     param_aliases={"temperature": "temp"}
... )
# Sends: {"temp": 0.7} instead of {"temperature": 0.7}

Default: None (no mapping needed for standard providers)

Type:

dict[str, str] | None
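The ordering described for param_aliases and extra (rename first, merge extra last) can be sketched with plain dictionaries. build_payload is a hypothetical function that illustrates that ordering; it is not the implementation behind to_dict().

```python
def build_payload(params: dict, param_aliases=None, extra=None, exclude_none=True) -> dict:
    """Drop None values, rename aliased keys, then merge extra on top."""
    aliases = param_aliases or {}
    payload = {}
    for name, value in params.items():
        if exclude_none and value is None:
            continue  # unset parameters are left out of the request
        payload[aliases.get(name, name)] = value
    payload.update(extra or {})  # extra is merged last
    return payload


payload = build_payload(
    {"temperature": 0.7, "top_p": 1.0, "max_tokens": None},
    param_aliases={"temperature": "temp"},
    extra={"seed": 12345},
)
print(payload)
```

Because extra is merged last in this sketch, a key in extra would overwrite a standard parameter of the same name, which is one reason to prefer param_aliases for simple renames.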

Examples

Basic usage with defaults:

>>> params = ChatParams()
>>> # temperature=0.7, top_p=1.0, etc.

Custom temperature and max_tokens:

>>> params = ChatParams(temperature=0.5, max_tokens=100)

With stop sequences:

>>> params = ChatParams(stop=["\n\n", "Human:"])

With penalties:

>>> params = ChatParams(
...     presence_penalty=0.6,
...     frequency_penalty=0.3
... )

With tools:

>>> from lexilux.chat.tools import FunctionTool
>>> params = ChatParams(
...     tools=[
...         FunctionTool(
...             name="get_weather",
...             description="Get current weather",
...             parameters={"type": "object", "properties": {...}}
...         )
...     ]
... )

With custom provider features:

>>> params = ChatParams(
...     temperature=0.8,
...     extra={
...         "response_format": {"type": "json_object"},
...         "seed": 12345,
...         "logprobs": True,
...         "top_logprobs": 5
...     }
... )

With parameter aliases (rare cases):

>>> params = ChatParams(
...     temperature=0.7,
...     param_aliases={"temperature": "temp"}
... )

temperature: float = 0.7
top_p: float = 1.0
max_tokens: int | None = None
stop: str | Sequence[str] | None = None
presence_penalty: float = 0.0
frequency_penalty: float = 0.0
logit_bias: dict[int, float] | None = None
user: str | None = None
n: int = 1
tools: list[Tool] | None = None
tool_choice: str | ToolChoice | None = None
parallel_tool_calls: bool | None = None
reasoning: bool | dict[str, Any] | None = None
extra: dict[str, Any] | None = None
param_aliases: dict[str, str] | None = None
to_dict(exclude_none=True)[source]

Convert parameters to dictionary for API request.

Parameters:

exclude_none (bool) – Whether to exclude None values from the output. Default: True

Returns:

Dictionary of parameters ready for API request.

Return type:

dict[str, Any]

Examples

Basic usage:

>>> params = ChatParams(temperature=0.5, max_tokens=100)
>>> params.to_dict()
{'temperature': 0.5, 'top_p': 1.0, 'max_tokens': 100, ...}

With parameter aliases:

>>> params = ChatParams(
...     temperature=0.8,
...     param_aliases={"temperature": "temp"}  # Some providers use "temp"
... )
>>> params.to_dict()
{'temp': 0.8, 'top_p': 1.0, ...}

__init__(temperature=0.7, top_p=1.0, max_tokens=None, stop=None, presence_penalty=0.0, frequency_penalty=0.0, logit_bias=None, user=None, n=1, tools=None, tool_choice=None, parallel_tool_calls=None, reasoning=None, extra=None, param_aliases=None)
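
The processing order described for param_aliases (standard parameters first, then alias renaming, then merging of extra) can be illustrated with a standalone sketch. It does not import lexilux: build_request_params is a hypothetical helper, not the library's implementation, and letting extra win on a key collision is an assumption.

```python
from __future__ import annotations

from typing import Any


def build_request_params(
    standard: dict[str, Any],
    *,
    param_aliases: dict[str, str] | None = None,
    extra: dict[str, Any] | None = None,
    exclude_none: bool = True,
) -> dict[str, Any]:
    """Hypothetical stand-in for the documented processing order."""
    # Step 1: standard parameter processing (optionally drop None values).
    params = {
        key: value
        for key, value in standard.items()
        if not (exclude_none and value is None)
    }
    # Step 2: rename standard keys to provider-specific names.
    if param_aliases:
        params = {param_aliases.get(key, key): value for key, value in params.items()}
    # Step 3: merge extra last, so aliases are never applied to extra keys.
    # Assumption: an extra entry overrides an earlier key of the same name.
    if extra:
        params.update(extra)
    return params


request = build_request_params(
    {"temperature": 0.7, "top_p": 1.0, "max_tokens": None},
    param_aliases={"temperature": "temp"},
    extra={"seed": 12345},
)
print(request)  # {'temp': 0.7, 'top_p': 1.0, 'seed': 12345}
```

Because aliasing runs before the merge, a key supplied through extra is sent under exactly the name given.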

History Management

class lexilux.chat.history.ChatHistory(messages=None, system=None)[source]

Bases: MutableSequence

Conversation history manager.

Implements the MutableSequence protocol, allowing array-like operations:
  • Index access: history[0]

  • Slicing: history[1:5] (returns new ChatHistory)

  • Iteration: for msg in history

  • Length: len(history)

  • Membership: msg in history

ChatHistory can be automatically built from messages or Chat results, eliminating the need for manual history maintenance.

Examples

>>> # Auto-extract from Chat call
>>> result = chat("Hello")
>>> history = ChatHistory.from_chat_result("Hello", result)
>>> # Auto-extract from messages
>>> messages = [{"role": "user", "content": "Hello"}]
>>> history = ChatHistory.from_messages(messages)
>>> # Manual construction (optional)
>>> history = ChatHistory(system="You are helpful")
>>> history.add_user("What is Python?")
>>> result = chat(history.get_messages())
>>> history.append_result(result)
>>> # Array-like operations
>>> msg = history[0]  # Get first message
>>> first_3 = history[:3]  # Get first 3 messages (new ChatHistory)
>>> for msg in history:  # Iterate
...     print(msg)
>>> len(history)  # Get length
>>> msg in history  # Check membership

__init__(messages=None, system=None)[source]

Initialize conversation history.

Parameters:
  • messages (list[dict[str, str]] | None) – Message list (optional, can be extracted from anywhere).

  • system (str | None) – System message (optional).

Note

The messages list is deep copied to prevent external modifications.

system
messages: list[dict[str, str]]
metadata: dict[str, Any]
classmethod from_messages(messages, system=None)[source]

Automatically build from message list (supports all Chat-supported formats).

Parameters:
Returns:

ChatHistory instance.

Return type:

ChatHistory

Examples

>>> history = ChatHistory.from_messages("Hello")
>>> history = ChatHistory.from_messages([{"role": "user", "content": "Hello"}])
classmethod from_chat_result(messages, result)[source]

Automatically build complete history from Chat call and result.

Parameters:
Returns:

ChatHistory instance with complete conversation.

Return type:

ChatHistory

Examples

>>> result = chat("Hello")
>>> history = ChatHistory.from_chat_result("Hello", result)
classmethod from_dict(data)[source]

Deserialize from dictionary.

Parameters:

data (dict) – Dictionary containing history data.

Returns:

ChatHistory instance.

Return type:

ChatHistory

classmethod from_json(json_str)[source]

Deserialize from JSON string.

Parameters:

json_str (str) – JSON string containing history data.

Returns:

ChatHistory instance.

Return type:

ChatHistory

add_user(content)[source]

Add user message.

add_assistant(content)[source]

Add assistant message.

add_message(role, content)[source]

Add message with specified role.

add_system(content)[source]

Add system message (updates system attribute).

remove_last()[source]

Remove and return the last message.

Returns:

The removed message dict, or None if history is empty.

Return type:

dict[str, str] | None

remove_at(index)[source]

Remove and return message at specified index.

Parameters:

index (int) – Index of message to remove.

Returns:

The removed message dict, or None if index is out of range.

Return type:

dict[str, str] | None

replace_at(index, role, content)[source]

Replace message at specified index.

Parameters:
  • index (int) – Index of message to replace.

  • role (str) – New role.

  • content (str) – New content.

Raises:

IndexError – If index is out of range.

get_user_messages()[source]

Get all user messages.

Returns:

List of user message contents.

Return type:

list[str]

get_assistant_messages()[source]

Get all assistant messages.

Returns:

List of assistant message contents.

Return type:

list[str]

get_last_message()[source]

Get the last message.

Returns:

Last message dict, or None if history is empty.

Return type:

dict[str, str] | None

get_last_user_message()[source]

Get the last user message content.

Returns:

Last user message content, or None if no user messages exist.

Return type:

str | None

clone()[source]

Create a deep copy of this history.

Returns:

New ChatHistory instance with copied messages.

Return type:

ChatHistory

clear()[source]

Clear all messages (keep system message).

get_messages(include_system=True)[source]

Get messages list.

Parameters:

include_system (bool) – Whether to include system message.

Returns:

List of message dictionaries.

Return type:

list[dict[str, str]]

to_dict()[source]

Serialize to dictionary.

Returns:

Dictionary containing history data.

Return type:

dict[str, Any]

to_json(**kwargs)[source]

Serialize to JSON string.

Parameters:

**kwargs – Additional arguments for json.dumps.

Returns:

JSON string.

Return type:

str

count_tokens(tokenizer)[source]

Count total tokens in history.

This is a convenience method that returns only the total token count. For detailed analysis, use analyze_tokens() instead.

Parameters:

tokenizer (Tokenizer) – Tokenizer instance.

Returns:

Total token count across all messages (including system message).

Return type:

int

Examples

>>> from lexilux import ChatHistory, Tokenizer
>>> tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct")
>>> history = ChatHistory.from_messages("Hello")
>>> total = history.count_tokens(tokenizer)
>>> print(f"Total tokens: {total}")

See also

analyze_tokens() - For detailed token analysis

count_tokens_per_round(tokenizer)[source]

Count tokens per round.

This method returns a simple list of (round_index, total_tokens) tuples. For more detailed per-round analysis (including user/assistant breakdown), use analyze_tokens() instead.

Parameters:

tokenizer (Tokenizer) – Tokenizer instance.

Returns:

List of (round_index, total_tokens) tuples, where round_index is 0-based.

Return type:

list[tuple[int, int]]

Examples

>>> from lexilux import ChatHistory, Tokenizer
>>> tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct")
>>> history = ChatHistory.from_messages("Hello")
>>> history.add_assistant("Hi!")
>>> round_tokens = history.count_tokens_per_round(tokenizer)
>>> for idx, tokens in round_tokens:
...     print(f"Round {idx}: {tokens} tokens")

See also

analyze_tokens() - For detailed per-round analysis with role breakdown
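
A rough sketch of per-round counting, independent of lexilux. It assumes that each user message opens a new round and that the system message belongs to no round; the whitespace-based count_tokens is a stand-in for a real Tokenizer, and both helper names are hypothetical.

```python
from __future__ import annotations


def count_tokens(text: str) -> int:
    """Stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def split_into_rounds(messages: list[dict[str, str]]) -> list[list[dict[str, str]]]:
    """Group messages so that each user message opens a new round (assumption)."""
    rounds: list[list[dict[str, str]]] = []
    for message in messages:
        if message["role"] == "system":
            continue  # the system message is not part of any round here
        if message["role"] == "user" or not rounds:
            rounds.append([])
        rounds[-1].append(message)
    return rounds


def tokens_per_round(messages: list[dict[str, str]]) -> list[tuple[int, int]]:
    """Return (round_index, total_tokens) tuples with 0-based indices."""
    return [
        (index, sum(count_tokens(m["content"]) for m in round_messages))
        for index, round_messages in enumerate(split_into_rounds(messages))
    ]


messages = [
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "Python is a programming language."},
    {"role": "user", "content": "Thanks"},
]
print(tokens_per_round(messages))  # [(0, 8), (1, 1)]
```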

count_tokens_by_role(tokenizer)[source]

Count tokens grouped by role (system, user, assistant).

Parameters:

tokenizer (Tokenizer) – Tokenizer instance.

Returns:

Dictionary mapping role to total token count for that role. Keys: “system”, “user”, “assistant”

Return type:

dict[str, int]

Examples

>>> from lexilux import ChatHistory, Tokenizer
>>> tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct")
>>> history = ChatHistory(system="You are helpful")
>>> history.add_user("Hello")
>>> history.add_assistant("Hi!")
>>> role_tokens = history.count_tokens_by_role(tokenizer)
>>> print(f"User tokens: {role_tokens['user']}")
>>> print(f"Assistant tokens: {role_tokens['assistant']}")
analyze_tokens(tokenizer)[source]

Perform comprehensive token analysis on conversation history.

This method provides detailed token statistics including:
  • Total tokens and breakdown by role

  • Per-message token counts with content previews

  • Per-round analysis with user/assistant breakdown

  • Statistical metrics (averages, min, max)

  • Token distribution by role

Parameters:

tokenizer (Tokenizer) – Tokenizer instance.

Returns:

TokenAnalysis object containing comprehensive token statistics.

Return type:

TokenAnalysis

Examples

Basic usage:

>>> from lexilux import ChatHistory, Tokenizer
>>> tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct")
>>> history = ChatHistory(system="You are helpful")
>>> history.add_user("What is Python?")
>>> history.add_assistant("Python is a programming language.")
>>> analysis = history.analyze_tokens(tokenizer)
>>> print(f"Total: {analysis.total_tokens}")
>>> print(f"User: {analysis.user_tokens}, Assistant: {analysis.assistant_tokens}")

Detailed analysis:

>>> analysis = history.analyze_tokens(tokenizer)
>>> # Per-message breakdown
>>> for role, preview, tokens in analysis.per_message:
...     print(f"{role}: {preview[:30]}... ({tokens} tokens)")
>>> # Per-round breakdown
>>> for idx, total, user, assistant in analysis.per_round:
...     print(f"Round {idx}: total={total}, user={user}, assistant={assistant}")
>>> # Distribution
>>> print(f"Distribution: {analysis.token_distribution}")

Export analysis:

>>> analysis_dict = analysis.to_dict()
>>> import json
>>> print(json.dumps(analysis_dict, indent=2))

truncate_by_rounds(tokenizer, max_tokens, keep_system=True)[source]

Truncate by rounds, keeping the most recent rounds within max_tokens limit.

Parameters:
  • tokenizer (Tokenizer) – Tokenizer instance.

  • max_tokens (int) – Maximum token count.

  • keep_system (bool) – Whether to keep system message.

Returns:

New ChatHistory instance (does not modify original).

Return type:

ChatHistory
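
One way to picture round-based truncation is the standalone sketch below. It is not the library's code: the whitespace tokenizer, the rule that a user message opens a round, and the choice to charge the system message against max_tokens are all assumptions.

```python
from __future__ import annotations


def count_tokens(text: str) -> int:
    """Stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def truncate_by_rounds(
    messages: list[dict[str, str]], max_tokens: int, keep_system: bool = True
) -> list[dict[str, str]]:
    """Return a new list holding the most recent rounds that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rounds: list[list[dict[str, str]]] = []
    for message in messages:
        if message["role"] == "system":
            continue
        if message["role"] == "user" or not rounds:
            rounds.append([])  # assumption: a user message opens a round
        rounds[-1].append(message)

    budget = max_tokens
    if keep_system:
        # Assumption: the system message counts against the limit.
        budget -= sum(count_tokens(m["content"]) for m in system)

    kept: list[list[dict[str, str]]] = []
    # Walk from newest to oldest and stop at the first round that does not fit.
    for round_messages in reversed(rounds):
        cost = sum(count_tokens(m["content"]) for m in round_messages)
        if cost > budget:
            break
        kept.insert(0, round_messages)
        budget -= cost

    result = list(system) if keep_system else []
    for round_messages in kept:
        result.extend(round_messages)
    return result


messages = [
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "Python is a programming language."},
    {"role": "user", "content": "And what is Rust?"},
    {"role": "assistant", "content": "Rust is a systems language."},
]
trimmed = truncate_by_rounds(messages, max_tokens=12)
print([m["role"] for m in trimmed])  # ['system', 'user', 'assistant']
```

Rounds are kept or dropped whole, so a user message is never separated from its reply, and the input list is left unmodified.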

get_last_n_rounds(n)[source]

Get last N rounds.

Parameters:

n (int) – Number of rounds to get.

Returns:

New ChatHistory instance with last N rounds.

Return type:

ChatHistory

remove_last_round()[source]

Remove the last round (user + assistant pair).

append_result(result)[source]

Append ChatResult as assistant message.

update_last_assistant(content)[source]

Update the last assistant message content (useful for continue scenarios).

__len__()[source]

Return the number of messages.

__getitem__(key)[source]

Get message(s) by index or slice.

Parameters:

key (int | slice) – Index (int) or slice.

Returns:

Single message dict (index) or new ChatHistory instance (slice).

Return type:

dict[str, str] | ChatHistory

Examples

>>> history[0]  # Get first message
>>> history[1:3]  # Get messages at index 1-2, returns new ChatHistory
>>> history[:5]  # Get first 5 messages
>>> history[-3:]  # Get last 3 messages
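
The slicing behaviour (an index returns a message dict, a slice returns a new container) follows a common MutableSequence pattern. The sketch below shows that pattern with a hypothetical MiniHistory class; it is a simplified illustration, not the ChatHistory implementation.

```python
from __future__ import annotations

from collections.abc import MutableSequence
from copy import deepcopy


class MiniHistory(MutableSequence):
    """Hypothetical, simplified stand-in for a message container."""

    def __init__(self, messages: list[dict[str, str]] | None = None) -> None:
        # Deep copy so later changes by the caller do not leak in.
        self._messages = deepcopy(messages) if messages else []

    def __len__(self) -> int:
        return len(self._messages)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # A slice yields a new container rather than a bare list.
            return MiniHistory(self._messages[key])
        return self._messages[key]

    def __setitem__(self, key, value) -> None:
        self._messages[key] = value

    def __delitem__(self, key) -> None:
        del self._messages[key]

    def insert(self, index: int, value: dict[str, str]) -> None:
        if not isinstance(value, dict):
            raise TypeError("message must be a dict")
        self._messages.insert(index, value)


history = MiniHistory([
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi!"},
    {"role": "user", "content": "How are you?"},
])
first_two = history[:2]
print(type(first_two).__name__, len(first_two))  # MiniHistory 2
print(history[-1]["content"])  # How are you?
```

Defining these five methods is enough for MutableSequence to supply iteration, membership tests, append, and the other list-style operations.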
__setitem__(key, value)[source]

Set message(s) by index or slice.

Parameters:
Raises:

TypeError – If value type is invalid.

__delitem__(key)[source]

Delete message(s) by index or slice.

Parameters:

key (int | slice) – Index (int) or slice.

insert(index, value)[source]

Insert message at specified index.

Parameters:
  • index (int) – Index to insert at.

  • value (dict[str, str]) – Message dict to insert.

Raises:

TypeError – If value is not a dict.

__iter__()[source]

Iterate over messages.

__contains__(item)[source]

Check if message is in history.

__add__(other)[source]

Merge two histories (concatenate messages).

Parameters:

other (ChatHistory) – Another ChatHistory instance.

Returns:

New ChatHistory instance with merged messages. System message from self is used.

Return type:

ChatHistory

Examples

>>> history1 = ChatHistory.from_messages("Hello")
>>> history2 = ChatHistory.from_messages("How are you?")
>>> combined = history1 + history2
__repr__()[source]

Return string representation.

class lexilux.chat.history.TokenAnalysis(total_tokens, system_tokens, user_tokens, assistant_tokens, total_messages, system_messages, user_messages, assistant_messages, per_message, per_round, average_tokens_per_message, average_tokens_per_round, max_message_tokens, min_message_tokens, token_distribution)[source]

Bases: object

Detailed token analysis result for conversation history.

Provides comprehensive token statistics including totals, per-role breakdown, per-message details, and per-round analysis.

total_tokens

Total number of tokens across all messages.

Type:

int

system_tokens

Number of tokens in system message (if present).

Type:

int

user_tokens

Total tokens in all user messages.

Type:

int

assistant_tokens

Total tokens in all assistant messages.

Type:

int

total_messages

Total number of messages analyzed.

Type:

int

system_messages

Number of system messages (0 or 1).

Type:

int

user_messages

Number of user messages.

Type:

int

assistant_messages

Number of assistant messages.

Type:

int

per_message

List of (role, content_preview, tokens) tuples for each message.

Type:

list[tuple[str, str, int]]

per_round

List of (round_index, round_tokens, user_tokens, assistant_tokens) tuples.

Type:

list[tuple[int, int, int, int]]

average_tokens_per_message

Average tokens per message.

Type:

float

average_tokens_per_round

Average tokens per round.

Type:

float

max_message_tokens

Maximum tokens in a single message.

Type:

int

min_message_tokens

Minimum tokens in a single message.

Type:

int

token_distribution

Dict mapping role to total tokens for that role.

Type:

dict[str, int]

Examples

>>> from lexilux import ChatHistory, Tokenizer
>>> tokenizer = Tokenizer("Qwen/Qwen2.5-7B-Instruct")
>>> history = ChatHistory.from_messages("Hello")
>>> analysis = history.analyze_tokens(tokenizer)
>>> print(f"Total tokens: {analysis.total_tokens}")
>>> print(f"User tokens: {analysis.user_tokens}")
>>> print(f"Assistant tokens: {analysis.assistant_tokens}")
total_tokens: int
system_tokens: int
user_tokens: int
assistant_tokens: int
total_messages: int
system_messages: int
user_messages: int
assistant_messages: int
per_message: list[tuple[str, str, int]]
per_round: list[tuple[int, int, int, int]]
average_tokens_per_message: float
average_tokens_per_round: float
max_message_tokens: int
min_message_tokens: int
token_distribution: dict[str, int]
__repr__()[source]

Return string representation.

to_dict()[source]

Convert to dictionary for serialization.

__init__(total_tokens, system_tokens, user_tokens, assistant_tokens, total_messages, system_messages, user_messages, assistant_messages, per_message, per_round, average_tokens_per_message, average_tokens_per_round, max_message_tokens, min_message_tokens, token_distribution)

Formatting

class lexilux.chat.formatters.ChatHistoryFormatter[source]

Bases: object

Chat history formatter.

Provides static methods to format ChatHistory into various output formats.

static to_markdown(history, *, show_round_numbers=True, show_timestamps=False, highlight_system=True)[source]

Format history as Markdown.

Parameters:
  • history (ChatHistory) – ChatHistory instance to format.

  • show_round_numbers (bool) – Whether to show round numbers. Default: True

  • highlight_system (bool) – Whether to highlight system message. Default: True

  • show_timestamps (bool) – Whether to show timestamps (if available). Default: False

Returns:

Markdown formatted string.

Return type:

str

Examples

>>> history = ChatHistory.from_chat_result("Hello", result)
>>> md = ChatHistoryFormatter.to_markdown(history)
>>> print(md)
static to_html(history, *, theme='default', show_round_numbers=True, show_timestamps=False)[source]

Format history as HTML with embedded CSS styling.

Parameters:
  • history (ChatHistory) – ChatHistory instance to format.

  • theme (str) – Theme name (“default”, “dark”, “minimal”). Default: “default”

  • show_round_numbers (bool) – Whether to show round numbers. Default: True

  • show_timestamps (bool) – Whether to show timestamps (if available). Default: False

Returns:

HTML formatted string with embedded CSS.

Return type:

str

Examples

>>> history = ChatHistory.from_chat_result("Hello", result)
>>> html = ChatHistoryFormatter.to_html(history, theme="dark")
static to_text(history, *, show_round_numbers=True, width=80)[source]

Format history as plain text (console-friendly).

Parameters:
  • history (ChatHistory) – ChatHistory instance to format.

  • show_round_numbers (bool) – Whether to show round numbers. Default: True

  • width (int) – Text width for wrapping. Default: 80

Returns:

Plain text formatted string.

Return type:

str

Examples

>>> history = ChatHistory.from_chat_result("Hello", result)
>>> text = ChatHistoryFormatter.to_text(history, width=100)
static to_json(history, **kwargs)[source]

Format history as JSON (program-friendly).

Parameters:
  • history (ChatHistory) – ChatHistory instance to format.

  • **kwargs – Additional arguments for json.dumps (e.g., indent=2).

Returns:

JSON formatted string.

Return type:

str

Examples

>>> history = ChatHistory.from_chat_result("Hello", result)
>>> json_str = ChatHistoryFormatter.to_json(history, indent=2)
static save(history, filepath, format='auto', **options)[source]

Save history to file (automatically selects format based on extension).

Parameters:
  • history (ChatHistory) – ChatHistory instance to save.

  • filepath (str) – Path to save file.

  • format (str) – Format to use (“auto”, “markdown”, “html”, “text”, “json”). If “auto”, format is determined by file extension.

  • **options (Any) – Additional options for formatters.

Examples

>>> history = ChatHistory.from_chat_result("Hello", result)
>>> ChatHistoryFormatter.save(history, "conversation.md")
>>> ChatHistoryFormatter.save(history, "conversation.html", theme="dark")
>>> ChatHistoryFormatter.save(history, "conversation.txt", width=100)
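
Extension-based selection for format="auto" might look like the sketch below. The EXTENSION_FORMATS table and the resolve_format helper are hypothetical; the extensions the library actually recognises, and how it treats unknown ones, may differ.

```python
from pathlib import Path

# Hypothetical extension table; the library's actual mapping may differ.
EXTENSION_FORMATS = {
    ".md": "markdown",
    ".html": "html",
    ".txt": "text",
    ".json": "json",
}


def resolve_format(filepath: str, format: str = "auto") -> str:
    """Return the explicit format, or infer one from the file extension."""
    if format != "auto":
        return format
    suffix = Path(filepath).suffix.lower()
    try:
        return EXTENSION_FORMATS[suffix]
    except KeyError:
        # Assumption: an unrecognised extension is reported as an error.
        raise ValueError(f"cannot infer format from extension {suffix!r}") from None


print(resolve_format("conversation.md"))  # markdown
print(resolve_format("conversation.html"))  # html
print(resolve_format("export.dat", format="json"))  # json
```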

Streaming

class lexilux.chat.streaming.StreamingResult[source]

Bases: object

Streaming accumulated result (can be used as ChatResult).

Accumulates text automatically during streaming; the content is updated on each iteration. Can be used as a string or converted to a ChatResult.

__init__()[source]

Initialize accumulated result.

update(chunk)[source]

Update accumulated content (internal call).

property text: str

Get currently accumulated text (can be used as string).

property finish_reason: str | None

Get finish_reason.

property usage: Usage

Get usage.

property done: bool

Whether streaming is done.

set_result(text, finish_reason, usage)[source]

Set complete result directly (for merged streaming results).

This method properly sets all attributes according to __slots__, avoiding dynamic attribute creation.

Parameters:
  • text (str) – Complete text content.

  • finish_reason (str | None) – Reason why generation stopped.

  • usage (Usage) – Usage statistics.

to_chat_result()[source]

Convert to ChatResult (for history).

__str__()[source]

Use as string.

__repr__()[source]

Return string representation.

class lexilux.chat.streaming.StreamingIterator(chunk_iterator)[source]

Bases: object

Streaming iterator (wraps original iterator, provides accumulated result).

Automatically updates the accumulated result on each iteration, so the current state can be accessed at any time.

__init__(chunk_iterator)[source]

Initialize.

__iter__()[source]

Iterate chunks.

property result: StreamingResult

Get currently accumulated result (accessible at any time).
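
The wrap-and-accumulate idea can be sketched without lexilux as follows. Chunk and AccumulatingIterator are hypothetical names; the delta attribute mirrors the chunk.delta used in the streaming examples, while the done flag on each chunk is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Chunk:
    """Hypothetical chunk shape: a text delta plus an end-of-stream flag."""

    delta: str
    done: bool = False


class AccumulatingIterator:
    """Wraps a chunk iterator and keeps a running copy of the text."""

    def __init__(self, chunks: Iterable[Chunk]) -> None:
        self._chunks = iter(chunks)
        self.text = ""
        self.done = False

    def __iter__(self) -> Iterator[Chunk]:
        for chunk in self._chunks:
            # Update the accumulated state before handing the chunk out,
            # so the caller sees consistent state inside the loop body.
            self.text += chunk.delta
            self.done = chunk.done
            yield chunk


stream = AccumulatingIterator([Chunk("Hel"), Chunk("lo"), Chunk("!", done=True)])
seen = []
for chunk in stream:
    seen.append(stream.text)
print(seen)  # ['Hel', 'Hello', 'Hello!']
```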

Continue Functionality

lexilux.chat.conversation.Conversation

alias of _ResponseContinuer

Function Calling

class lexilux.chat.tools.FunctionTool(name, description, parameters=<factory>, strict=False, type='function')[source]

Bases: object

Function tool definition.

Represents a function that the model can call during chat completion.

Examples

>>> tool = FunctionTool(
...     name="get_weather",
...     description="Get current weather for a location",
...     parameters={
...         "type": "object",
...         "properties": {
...             "location": {
...                 "type": "string",
...                 "description": "City name, e.g. Paris"
...             }
...         },
...         "required": ["location"]
...     }
... )
name: str
description: str
parameters: dict[str, Any]
strict: bool = False
type: Literal['function'] = 'function'
to_dict()[source]

Convert to API request format.

Returns:

Dictionary in OpenAI tool format with nested ‘function’ field.

Return type:

dict[str, Any]

Examples

>>> tool = FunctionTool(name="get_weather", description="...", parameters={})
>>> tool.to_dict()
{'type': 'function', 'function': {'name': 'get_weather', 'description': '...', 'parameters': {}, 'strict': False}}
__init__(name, description, parameters=<factory>, strict=False, type='function')
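
For readers who want to see how the flat fields map onto the nested request format, here is a standalone dataclass sketch that reproduces the to_dict() output shown above. SketchFunctionTool is a hypothetical name and the code is an illustration, not the library source.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchFunctionTool:
    """Hypothetical mirror of the documented fields and defaults."""

    name: str
    description: str
    parameters: dict[str, Any] = field(default_factory=dict)
    strict: bool = False
    type: str = "function"

    def to_dict(self) -> dict[str, Any]:
        # Everything except "type" is nested under the "function" key.
        return {
            "type": self.type,
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
                "strict": self.strict,
            },
        }


tool = SketchFunctionTool(name="get_weather", description="...")
print(tool.to_dict())
```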
class lexilux.chat.tools.ToolChoice(type, name=None, tools=None)[source]

Bases: object

Tool choice strategy configuration.

Controls when and how the model uses tools.

Examples

>>> # Auto mode (let model decide)
>>> choice = ToolChoice(type="auto")
>>>
>>> # Require tool calls
>>> choice = ToolChoice(type="required")
>>>
>>> # Force specific function
>>> choice = ToolChoice(type="function", name="get_weather")
>>>
>>> # Restrict to specific tools
>>> choice = ToolChoice(
...     type="allowed_tools",
...     tools=[FunctionTool(name="get_weather", ...)]
... )
type: Literal['auto', 'required', 'function', 'allowed_tools']
name: str | None = None
tools: list[FunctionTool] | None = None
to_dict()[source]

Convert to API request format.

Returns:

String for simple modes, dict for complex modes.

Return type:

dict[str, Any] | str

Examples

>>> ToolChoice(type="auto").to_dict()
'auto'
>>> ToolChoice(type="required").to_dict()
'required'
>>> ToolChoice(type="function", name="get_weather").to_dict()
{'type': 'function', 'function': {'name': 'get_weather'}}
__init__(type, name=None, tools=None)
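
The string-versus-dict return of to_dict() can be sketched as a plain function. tool_choice_to_request is a hypothetical helper covering only the three modes whose output is shown above; the request format for "allowed_tools" is not documented here, so the sketch leaves it out.

```python
from __future__ import annotations

from typing import Any


def tool_choice_to_request(type: str, name: str | None = None) -> str | dict[str, Any]:
    """Hypothetical conversion for the 'auto', 'required' and 'function' modes."""
    if type in ("auto", "required"):
        # Simple modes are sent as a bare string.
        return type
    if type == "function":
        if name is None:
            raise ValueError("type='function' requires a function name")
        # Forcing a specific function uses a nested dict.
        return {"type": "function", "function": {"name": name}}
    raise ValueError(f"mode not covered by this sketch: {type!r}")


print(tool_choice_to_request("auto"))  # auto
print(tool_choice_to_request("function", name="get_weather"))
```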
class lexilux.chat.tool_helpers.ToolCallHelper(functions)[source]

Bases: object

Helper class for managing tool calling workflows.

Provides a higher-level abstraction for common tool calling patterns including automatic execution and conversation continuation.

functions

Dictionary mapping function names to callables.

Examples

>>> def get_weather(location: str) -> str:
...     return f"Weather in {location}: 22°C"
>>>
>>> helper = ToolCallHelper({"get_weather": get_weather})
>>>
>>> result = chat("What's the weather in Paris?", tools=[tool])
>>> if result.has_tool_calls:
...     final_result = helper.continue_conversation(
...         chat=chat,
...         messages=[{"role": "user", "content": "..."}],
...         tool_result=result,
...         tools=[tool]
...     )
__init__(functions)[source]

Initialize ToolCallHelper.

Parameters:

functions (dict[str, Callable]) – Dictionary mapping function names to callable functions.

execute_tool_calls(result)[source]

Execute tool calls from a chat result.

Parameters:

result (ChatResult) – ChatResult containing tool calls.

Returns:

List of tool response messages.

Return type:

list[dict[str, Any]]

continue_conversation(chat, messages, tool_result, tools=None)[source]

Continue conversation after tool execution.

Executes tool calls, builds conversation history, and sends a follow-up request to get the final response.

Parameters:
  • chat (Any) – Chat client instance.

  • messages (list[dict[str, Any]]) – Original messages that led to tool calls.

  • tool_result (ChatResult) – ChatResult containing tool calls.

  • tools (list[Any] | None) – Tools to pass to follow-up request.

Returns:

Final ChatResult after tool execution.

Return type:

ChatResult

Examples

>>> helper = ToolCallHelper({"get_weather": get_weather})
>>> result = chat("What's the weather?", tools=[tool])
>>> if result.has_tool_calls:
...     final = helper.continue_conversation(
...         chat=chat,
...         messages=[{"role": "user", "content": "What's the weather?"}],
...         tool_result=result,
...         tools=[tool]
...     )
>>> print(final.text)

Tool Helpers

lexilux.chat.tool_helpers.execute_tool_calls(result, functions)[source]

Execute tool calls from a chat result.

Takes a ChatResult that contains tool calls and executes them using the provided functions dictionary. Returns tool response messages that can be sent back to the model.

Parameters:
  • result (ChatResult) – ChatResult containing tool calls.

  • functions (dict[str, Callable]) – Dictionary mapping function names to callable functions. The function signature should match the arguments in the tool call.

Returns:

List of tool response messages (role=”tool”) with results.

Raises:

ValueError – If an unknown function name is encountered.

Return type:

list[dict[str, Any]]

Examples

>>> def get_weather(location: str, units: str = "celsius") -> str:
...     return f"Weather in {location}: 22°{units}"
>>>
>>> result = chat("What's the weather in Paris?", tools=[tool])
>>> tool_responses = execute_tool_calls(
...     result,
...     {"get_weather": get_weather}
... )
>>>
>>> # Continue conversation with tool results
>>> final_result = chat(
...     messages=history + tool_responses,
...     tools=[tool]
... )
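
The dispatch step can be sketched independently of lexilux. The tool-call shape used here (an id plus a function entry with a name and JSON-encoded arguments, as in OpenAI-style responses) and the tool_call_id field on the reply are assumptions; run_tool_calls is a hypothetical helper, not the library function.

```python
from __future__ import annotations

import json
from typing import Any, Callable


def run_tool_calls(
    tool_calls: list[dict[str, Any]],
    functions: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Execute each call and return role="tool" response messages."""
    responses = []
    for call in tool_calls:
        name = call["function"]["name"]
        if name not in functions:
            raise ValueError(f"Unknown function: {name}")
        # Assumption: arguments arrive as a JSON-encoded object string.
        arguments = json.loads(call["function"]["arguments"] or "{}")
        output = functions[name](**arguments)
        responses.append(
            {"role": "tool", "tool_call_id": call["id"], "content": str(output)}
        )
    return responses


def get_weather(location: str, units: str = "celsius") -> str:
    return f"Weather in {location}: 22°{units}"


calls = [
    {"id": "call_1", "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'}}
]
replies = run_tool_calls(calls, {"get_weather": get_weather})
print(replies[0]["content"])  # Weather in Paris: 22°celsius
```

Keyword expansion of the decoded arguments is why each function signature should match the arguments in the tool call.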
lexilux.chat.tool_helpers.create_conversation_history(original_messages, tool_result, tool_outputs)[source]

Create complete conversation history including tool calls and responses.

Takes the original messages, the assistant result with tool calls, and the tool execution outputs, and combines them into a complete conversation history ready for the next API request.

Parameters:
  • original_messages (list[dict[str, Any]]) – The messages that led to the tool calls.

  • tool_result (ChatResult) – The ChatResult containing tool calls.

  • tool_outputs (list[dict[str, Any]]) – Tool response messages from execute_tool_calls().

Returns:

Complete conversation history including:
  • Original messages

  • Assistant message with tool_calls

  • Tool response messages

Examples

>>> messages = [{"role": "user", "content": "What's the weather?"}]
>>> result = chat(messages, tools=[get_weather_tool])
>>> if result.has_tool_calls:
...     tool_outputs = execute_tool_calls(result, {"get_weather": get_weather})
...     history = create_conversation_history(messages, result, tool_outputs)
...     final_result = chat(history, tools=[get_weather_tool])
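
The ordering of the combined history (original messages, then the assistant message carrying tool_calls, then the tool responses) can be sketched as below. build_history is a hypothetical helper, and the exact fields of the assistant and tool messages are assumptions modelled on OpenAI-style payloads.

```python
from __future__ import annotations

from typing import Any


def build_history(
    original_messages: list[dict[str, Any]],
    assistant_text: str,
    tool_calls: list[dict[str, Any]],
    tool_outputs: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return a new list; the input lists are not modified."""
    assistant_message = {
        "role": "assistant",
        "content": assistant_text,
        "tool_calls": tool_calls,  # assumption: calls are echoed back unchanged
    }
    return [*original_messages, assistant_message, *tool_outputs]


messages = [{"role": "user", "content": "What's the weather?"}]
tool_calls = [
    {"id": "call_1", "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'}}
]
tool_outputs = [{"role": "tool", "tool_call_id": "call_1", "content": "22°C"}]

history = build_history(messages, "", tool_calls, tool_outputs)
print([m["role"] for m in history])  # ['user', 'assistant', 'tool']
```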

Content Blocks (Multimodal)

lexilux.chat.content_blocks.ContentBlock Union[TextContentBlock, ImageContentBlock]

A content block for multimodal messages.

lexilux.chat.content_blocks.TextContentBlock TypedDict

Text content block with type=”text” and text field.

lexilux.chat.content_blocks.ImageContentBlock TypedDict

Image content block with type=”image_url” and image_url field.

lexilux.chat.content_blocks.ImageUrlDetail TypedDict

Image URL detail configuration with url and optional detail field.
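
As an illustration of how these block types compose into a multimodal message, the sketch below declares equivalent TypedDicts locally instead of importing them from lexilux. The Sketch-prefixed names, the placeholder URL, and the "low" value for detail are assumptions.

```python
from typing import Literal, TypedDict, Union


class _ImageUrlRequired(TypedDict):
    url: str


class SketchImageUrlDetail(_ImageUrlRequired, total=False):
    detail: str  # optional field


class SketchTextBlock(TypedDict):
    type: Literal["text"]
    text: str


class SketchImageBlock(TypedDict):
    type: Literal["image_url"]
    image_url: SketchImageUrlDetail


SketchContentBlock = Union[SketchTextBlock, SketchImageBlock]

text_block: SketchTextBlock = {"type": "text", "text": "What's in this image?"}
image_block: SketchImageBlock = {
    "type": "image_url",
    "image_url": {"url": "https://example.com/cat.png", "detail": "low"},
}

# A multimodal message carries a list of blocks as its content.
message = {"role": "user", "content": [text_block, image_block]}
print([block["type"] for block in message["content"]])  # ['text', 'image_url']
```

TypedDicts are plain dicts at runtime, so such messages serialize to JSON without any conversion step.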

Utility Functions

lexilux.chat.utils.normalize_messages(messages, system=None)[source]

Normalize messages input to a list of message dictionaries.

Supports multiple input formats with backward compatibility:
  • str: Converted to [{"role": "user", "content": str}]

  • List[Dict[str, str]]: Used as-is (legacy format, content is a string)

  • List[Dict[str, Any]]: Used as-is (supports multimodal content as a list)

  • List[str]: Converted to [{"role": "user", "content": str}, ...]

Multimodal content is supported by passing content as a list of blocks: [{"type": "text", "text": "..."}, {"type": "image_url", "image_url": {...}}]

Parameters:
Returns:

Normalized list of message dictionaries.

Return type:

list[dict[str, Any]]

Examples

>>> # Simple string
>>> normalize_messages("hi")
[{'role': 'user', 'content': 'hi'}]
>>> # Legacy format (content as string)
>>> normalize_messages([{"role": "user", "content": "hi"}])
[{'role': 'user', 'content': 'hi'}]
>>> # Multimodal format
>>> normalize_messages([{
...     "role": "user",
...     "content": [
...         {"type": "text", "text": "What's in this image?"},
...         {"type": "image_url", "image_url": {"url": "https://..."}}
...     ]
... }])
>>> # With system message
>>> normalize_messages("hi", system="You are helpful")
[{'role': 'system', 'content': 'You are helpful'}, {'role': 'user', 'content': 'hi'}]
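
The documented conversions can be approximated by the standalone sketch below. normalize is a hypothetical re-implementation for illustration only: it copies each dict rather than reusing it, and always prepending the system message is an assumption about the library's behaviour.

```python
from __future__ import annotations

from typing import Any, Sequence, Union


def normalize(
    messages: Union[str, Sequence[Union[str, dict[str, Any]]]],
    system: str | None = None,
) -> list[dict[str, Any]]:
    """Hypothetical sketch of the documented input conversions."""
    if isinstance(messages, str):
        normalized: list[dict[str, Any]] = [{"role": "user", "content": messages}]
    else:
        normalized = []
        for item in messages:
            if isinstance(item, str):
                # A bare string in a list becomes a user message.
                normalized.append({"role": "user", "content": item})
            elif isinstance(item, dict):
                # Content may be a string or a list of multimodal blocks.
                normalized.append(dict(item))
            else:
                raise TypeError(f"unsupported message type: {type(item).__name__}")
    if system is not None:
        # Assumption: the system message is always placed first.
        normalized.insert(0, {"role": "system", "content": system})
    return normalized


print(normalize("hi"))
print(normalize(["first", "second"]))
print(normalize("hi", system="You are helpful"))
```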

Type Aliases

lexilux.chat.models.Role Literal["system", "user", "assistant", "tool"]

Valid role types for chat messages.

lexilux.chat.models.MessageLike Union[str, dict[str, str]]

A single message in various formats.

lexilux.chat.models.MessagesLike Union[str, Sequence[MessageLike]]

Messages in various formats (string, list of strings, list of dicts).

See Also