Add op module for transaction#21
Conversation
…ks, queue creation, validation, and message handling

- Replaced direct SQL execution with calls to PGMQOperation methods for checking the pgmq and pg_partman extensions.
- Consolidated queue creation, validation, and message-sending methods to use PGMQOperation for both sync and async operations.
- Updated method signatures to include optional session and commit parameters for better session management.
- Removed redundant sync and async methods, streamlining the codebase and improving maintainability.
Summary of Changes

Hello @jason810496, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant architectural enhancement by creating a dedicated operation module.
Code Review
This pull request refactors the PGMQ database operations into a new pgmq_sqlalchemy.operation module. This is a great architectural improvement that centralizes database logic and improves separation of concerns. The PGMQueue class is updated to use this new module, and its methods now accept an optional session argument, giving users more control over transactions.
My review focuses on a few critical issues in the new operation module, including a SQL injection vulnerability and a bug that would cause a TypeError at runtime. I've also included a suggestion to reduce code duplication in the queue module. Addressing these points will significantly improve the security and robustness of the library.
```python
def _get_send_statement(queue_name: str, message: dict, delay: int) -> str:
    """Get statement for send (no params, using f-string)."""
    encoded_message = encode_dict_to_psql(message)
    return f"select * from pgmq.send('{queue_name}',{encoded_message},{delay});"

@staticmethod
def _get_send_batch_statement(
    queue_name: str, messages: List[dict], delay: int
) -> str:
    """Get statement for send_batch (no params, using f-string)."""
    encoded_messages = encode_list_to_psql(messages)
    return (
        f"select * from pgmq.send_batch('{queue_name}',{encoded_messages},{delay});"
    )
```
The methods _get_send_statement and _get_send_batch_statement use f-strings to construct SQL queries. This is a security risk as it can lead to SQL injection vulnerabilities through the queue_name parameter. You should use parameterized queries for all user-provided inputs to prevent this.
This change will also require updating the send, send_async, send_batch, and send_batch_async methods to handle parameters instead of a raw SQL string.
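To see why the f-string version is dangerous, consider what happens when `queue_name` contains a single quote. A minimal sketch (function and table names here are illustrative, not the library's actual code):

```python
# Mirrors the vulnerable pattern from the diff above: interpolating the
# queue name directly into the SQL text.
def build_send_statement(queue_name: str, payload: str, delay: int) -> str:
    return f"select * from pgmq.send('{queue_name}','{payload}',{delay});"

# A crafted queue name closes the string literal and appends arbitrary SQL.
malicious = "q'); drop table pgmq.meta; --"
stmt = build_send_statement(malicious, '{"k": 1}', 0)
print(stmt)
# The attacker-controlled DROP TABLE is now part of the statement text.
```

With parameterized queries the driver sends the queue name as data, never as SQL, so this class of input is harmless.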
```python
def _get_send_statement(
    queue_name: str, message: dict, delay: int
) -> Tuple[str, Dict[str, Any]]:
    """Get statement and params for send."""
    return (
        "select * from pgmq.send(:queue_name, :message::jsonb, :delay);",
        {
            "queue_name": queue_name,
            "message": json.dumps(message),  # assumes `import json` at module top
            "delay": delay,
        },
    )

@staticmethod
def _get_send_batch_statement(
    queue_name: str, messages: List[dict], delay: int
) -> Tuple[str, Dict[str, Any]]:
    """Get statement and params for send_batch."""
    return (
        "select * from pgmq.send_batch(:queue_name, :messages, :delay);",
        {
            "queue_name": queue_name,
            "messages": [json.dumps(m) for m in messages],
            "delay": delay,
        },
    )
```

```python
return QueueMetrics(
    queue_name=row[0],
    queue_length=row[1],
    newest_msg_age_sec=row[2],
    oldest_msg_age_sec=row[3],
    total_messages=row[4],
    scrape_time=row[5],
)
```
The QueueMetrics data class is being initialized with a scrape_time argument, but the class definition in pgmq_sqlalchemy/schema.py does not include this field. This will cause a TypeError at runtime. The same issue is present in the metrics_async, metrics_all, and metrics_all_async methods. You should remove the scrape_time argument from the QueueMetrics constructor in all of them.
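The failure mode is easy to reproduce in isolation; a sketch with an illustrative stand-in for the schema class (not the library's actual definition):

```python
from dataclasses import dataclass

# Stand-in for the QueueMetrics schema class: it defines no scrape_time
# field, so passing one raises TypeError at construction time.
@dataclass
class QueueMetrics:
    queue_name: str
    queue_length: int

try:
    QueueMetrics(queue_name="q", queue_length=0, scrape_time=123)
except TypeError as exc:
    # dataclass __init__ rejects the unexpected keyword argument
    print(f"TypeError: {exc}")
```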
```python
return QueueMetrics(
    queue_name=row[0],
    queue_length=row[1],
    newest_msg_age_sec=row[2],
    oldest_msg_age_sec=row[3],
    total_messages=row[4],
)
```

```python
if self.is_async:
    return self.loop.run_until_complete(
        self._create_queue_async(queue_name, unlogged)
    )
return self._create_queue_sync(queue_name, unlogged)

def _create_partitioned_queue_sync(
    self,
    queue_name: str,
    partition_interval: str,
    retention_interval: str,
) -> None:
    """Create a new partitioned queue."""
    with self.session_maker() as session:
        session.execute(
            text(
                "select pgmq.create_partitioned(:queue_name, :partition_interval, :retention_interval);"
            ),
            {
                "queue_name": queue_name,
                "partition_interval": partition_interval,
                "retention_interval": retention_interval,
            },
        )
        session.commit()

async def _create():
    async with self.session_maker() as s:
        await PGMQOperation.create_queue_async(
            queue_name, unlogged, session=s, commit=commit
        )

async def _create_partitioned_queue_async(
    self,
    queue_name: str,
    partition_interval: str,
    retention_interval: str,
) -> None:
    """Create a new partitioned queue."""
    async with self.session_maker() as session:
        await session.execute(
            text(
                "select pgmq.create_partitioned(:queue_name, :partition_interval, :retention_interval);"
            ),
            {
                "queue_name": queue_name,
                "partition_interval": partition_interval,
                "retention_interval": retention_interval,
            },
        )
        await session.commit()

return self.loop.run_until_complete(_create())
return self.loop.run_until_complete(
    PGMQOperation.create_queue_async(
        queue_name, unlogged, session=session, commit=commit
    )
)

if session is None:
    with self.session_maker() as s:
        return PGMQOperation.create_queue(
            queue_name, unlogged, session=s, commit=commit
        )
return PGMQOperation.create_queue(
    queue_name, unlogged, session=session, commit=commit
)
```
There is a significant amount of boilerplate code for handling synchronous and asynchronous execution, as well as session management, duplicated across almost all methods in this class. This makes the code harder to read and maintain. Consider refactoring this logic into a private helper method or a decorator to reduce duplication.
For example, you could have a helper method like this:
```python
def _execute(self, op_sync, op_async, session, commit, *args, **kwargs):
    if self.is_async:
        if session is None:
            async def _run():
                async with self.session_maker() as s:
                    return await op_async(*args, **kwargs, session=s, commit=commit)
            return self.loop.run_until_complete(_run())
        return self.loop.run_until_complete(
            op_async(*args, **kwargs, session=session, commit=commit)
        )
    if session is None:
        with self.session_maker() as s:
            return op_sync(*args, **kwargs, session=s, commit=commit)
    return op_sync(*args, **kwargs, session=session, commit=commit)
```

And then create_queue would become:

```python
def create_queue(
    self,
    queue_name: str,
    unlogged: bool = False,
    *,
    session: Optional[SESSION_TYPE] = None,
    commit: bool = True,
) -> None:
    self._execute(
        PGMQOperation.create_queue,
        PGMQOperation.create_queue_async,
        session,
        commit,
        queue_name,
        unlogged=unlogged,
    )
```
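The dispatch logic in the suggested helper can be exercised without a database. A simplified, self-contained sketch (class and operation names are illustrative, and the async branch here skips the session-creation case for brevity):

```python
import asyncio
from contextlib import contextmanager

class Dispatcher:
    """Toy stand-in for PGMQueue's sync/async dispatch (illustrative only)."""

    def __init__(self, is_async: bool):
        self.is_async = is_async
        self.loop = asyncio.new_event_loop() if is_async else None

    @contextmanager
    def session_maker(self):
        yield "managed-session"  # placeholder for a real Session

    def _execute(self, op_sync, op_async, session, commit, *args, **kwargs):
        if self.is_async:
            async def _run():
                return await op_async(*args, **kwargs, session=session, commit=commit)
            return self.loop.run_until_complete(_run())
        if session is None:
            # No caller-provided session: open and manage one ourselves
            with self.session_maker() as s:
                return op_sync(*args, **kwargs, session=s, commit=commit)
        return op_sync(*args, **kwargs, session=session, commit=commit)

def create_queue(name, *, session, commit):
    return ("sync", name, session, commit)

async def create_queue_async(name, *, session, commit):
    return ("async", name, session, commit)

print(Dispatcher(False)._execute(create_queue, create_queue_async, None, True, "q1"))
# ('sync', 'q1', 'managed-session', True)
print(Dispatcher(True)._execute(create_queue, create_queue_async, None, True, "q1"))
# ('async', 'q1', None, True)
```

The caller-facing methods stay thin: each one just forwards its sync/async operation pair plus the session and commit flags.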
Codecov Report

❌ Patch coverage is
jason810496
left a comment

- resolve all the code review comments
- finish tests/test_operation.py
@jason810496 I've opened a new pull request, #22, to work on those changes. Once the pull request is ready, I'll request review from you.
…duce boilerplate (#22)

* Initial plan
* Fix SQL injection vulnerability and QueueMetrics bug in operation.py
* Refactor queue.py to reduce boilerplate with _execute_operation helper
* Add comprehensive tests for PGMQOperation class
* Fix json import and send_batch parameter handling
* Fix test exception handling to be more specific

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
jason810496
left a comment

@copilot improve the test coverage:
https://app.codecov.io/gh/jason810496/pgmq-sqlalchemy/pull/21/blob/pgmq_sqlalchemy/queue.py
@jason810496 I've opened a new pull request, #23, to work on those changes. Once the pull request is ready, I'll request review from you.
…on bugs (#23)

* Initial plan
* Add comprehensive test coverage for PGMQOperation methods
  - Added tests for delete_batch (sync and async)
  - Added tests for archive_batch (sync and async)
  - Added tests for purge (sync and async)
  - Added tests for read_with_poll (sync and async)
  - Added tests for drop_queue (sync and async)
  - Added tests for check_pg_partman_ext (sync and async)
  - Added tests for create_partitioned_queue (sync and async)
  - Added tests for time-based partitioned queues
  Note: tests currently fail due to a SQL parameterization bug in operation.py where mixed parameter styles (:param and %(param)s) cause syntax errors. This was introduced in commit 849a8f1.
* Fix SQL parameterization bugs and improve partition interval validation
  - Fixed SQL injection fix by using CAST(:param AS type) instead of :param::type
  - Fixed delete_batch and archive_batch to return message IDs instead of a boolean
  - Improved partition interval validation to accept numeric strings
  - Added validation to sync create_partitioned_queue for consistency
  - Fixed queue name length in partitioned queue tests
  - All new tests now pass successfully
* Fix send_batch array formatting and validate_queue_name test
  - Fixed send_batch to properly escape JSON in PostgreSQL array literal format
  - Fixed test_validate_queue_name to handle both ProgrammingError and InternalError
  - All 103 operation tests now pass successfully
* Address code review feedback
  - Move imports to top of file (time, InternalError)
  - Add documentation comment explaining array literal escaping approach
* Improve import ordering and documentation
  - Fix import grouping (stdlib imports together)
  - Enhance documentation for array literal escaping with security notes

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
The current CI doesn't reflect the failure; I checked the logs and it turns out we got a lot of failures.

@jason810496 I've opened a new pull request, #24, to work on those changes. Once the pull request is ready, I'll request review from you.
* Initial plan
* Fix CI configuration to properly report test failures
* Refactor test fixtures to remove lazy fixtures and use dynamic parametrization

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
@copilot Fix the jsonb handling with:

```python
from sqlalchemy import create_engine, text, bindparam
from sqlalchemy.dialects.postgresql import JSONB
import json

# Replace with your actual database URL
DATABASE_URL = "postgresql://user:password@localhost/mydatabase"
engine = create_engine(DATABASE_URL)

# The Python dictionary you want to insert or pass as a parameter
json_data = {"key": "value", "nested": [1, 2, 3]}

# Define the parameters with explicit type information for the JSONB field
params = {
    "p1": 1,
    "p2": json_data,
}

# Construct the text query and bind the parameters
# Example: calling a function that accepts an integer and a jsonb
# (adjust the SQL to your specific use case, e.g., INSERT or UPDATE)
sql_query = text("SELECT my_func(:p1, :p2)").bindparams(
    bindparam("p2", type_=JSONB)
)

# Execute the query
with engine.connect() as connection:
    result = connection.execute(sql_query, params)
    # Process results if necessary
    # print(result.scalar())
```

Since we encounter the following error for
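When an explicit JSONB bindparam is not used, the other workable route (the one a later fix in this thread adopts) is to serialize the dict yourself and cast in SQL, so the driver only ever sees a plain string. A stdlib-only sketch; the function and parameter names are illustrative:

```python
import json

def jsonb_params(queue_name: str, message: dict, delay: int):
    """Build a parameterized send statement with the payload pre-serialized."""
    # CAST(:param AS jsonb) is used instead of :param::jsonb because the
    # ::-style cast can clash with named-parameter syntax in some drivers.
    sql = "select * from pgmq.send(:queue_name, CAST(:message AS jsonb), :delay);"
    params = {
        "queue_name": queue_name,
        "message": json.dumps(message),  # dict -> JSON text the driver can adapt
        "delay": delay,
    }
    return sql, params

sql, params = jsonb_params("my_queue", {"key": "value", "nested": [1, 2, 3]}, 0)
print(params["message"])
```

Either approach avoids handing the driver a raw dict, which is what triggers the adaptation error being discussed here.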
@jason810496 I've opened a new pull request, #25, to work on those changes. Once the pull request is ready, I'll request review from you.
#25)

* Initial plan
* Fix JSONB handling with bindparams for send and send_batch operations

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
@copilot Fix test error:

@jason810496 I've opened a new pull request, #26, to work on those changes. Once the pull request is ready, I'll request review from you.
* Initial plan
* Add read_archive methods to PGMQOperation and PGMQueue
* Fix partition interval validation for negative numeric strings
* Apply ruff formatting fixes
* Remove read_archive methods and tests per spec clarification

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
resolve ambiguous error for
Example error:
Spec

@jason810496 I've opened a new pull request, #27, to work on those changes. Once the pull request is ready, I'll request review from you.
…27)

* Initial plan
* Fix ambiguous column references in delete_batch and archive_batch
* Add aliases to delete and archive statements to avoid ambiguity

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
Fix type hint
force-pushed from df20827 to acfae2f (Compare)
Fix type hint
force-pushed from 981bf2b to 6b2b845 (Compare)
jason810496
left a comment

Fix fixture 'pgmq_all_variants' not found error in test_queue

@jason810496 I've opened a new pull request, #28, to work on those changes. Once the pull request is ready, I'll request review from you.
* Initial plan
* Fix pgmq_all_variants fixture import in test_queue.py

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
jason810496
left a comment

@copilot improve test coverage for:

- The following part of _execute_operation:

```python
def _execute_operation(
    # ...
    return self.loop.run_until_complete(
        op_async(*args, session=session, commit=commit, **kwargs)
    )
    # ...
    return op_sync(*args, session=session, commit=commit, **kwargs)
```

- The case when vt is not given for read_with_poll
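The second bullet is about the default-argument path; a hypothetical sketch of the behavior such a test should pin down (the function name mirrors the API under discussion, but DEFAULT_VT and the other defaults here are assumptions, not the library's actual values):

```python
# Assumed default visibility timeout, in seconds (illustrative value)
DEFAULT_VT = 30

def build_read_with_poll_params(queue_name, vt=None, qty=1,
                                max_poll_seconds=5, poll_interval_ms=100):
    """Apply a default vt when the caller omits it (hypothetical helper)."""
    return {
        "queue_name": queue_name,
        "vt": DEFAULT_VT if vt is None else vt,
        "qty": qty,
        "max_poll_seconds": max_poll_seconds,
        "poll_interval_ms": poll_interval_ms,
    }

print(build_read_with_poll_params("q")["vt"])         # 30: default applies
print(build_read_with_poll_params("q", vt=10)["vt"])  # 10: explicit value wins
```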
@jason810496 I've opened a new pull request, #29, to work on those changes. Once the pull request is ready, I'll request review from you.
* Initial plan
* Add tests for _execute_operation with provided session and read_with_poll without vt
* Add test for async _execute_operation path when session is None
* Remove fragile line number references from test comments
* Clarify comment in test_read_with_poll_without_vt
* Address final code review feedback: improve test structure and add cleanup verification

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: jason810496 <[email protected]>
No description provided.