feat!: complete deprecation and cleanup of multimodal blob APIs #16618

Open
shuoweil wants to merge 11 commits into main from shuowei-deprecate-blob-api

Contributor

shuoweil commented Apr 10, 2026

This PR completes the deprecation and cleanup of the public multimodal blob APIs in BigFrames. These APIs were not intended for general public use and are being internalized or removed to clean up the public API surface.

Key Changes:

  • API Internalization:

    • Renamed BlobAccessor to _BlobAccessor and internalized the .blob accessor on Series and DataFrame to ._blob.
    • Internalized Series.str.to_blob to Series.str._to_blob.
  • Deprecations:

    • Deprecated Session.from_glob_path. It now emits a warning suggesting that users call read_gbq with a reference column instead.

  • Cleanup:

    • Removed bigframes/blob/_functions.py and associated large system tests.
    • Removed small system tests for blob operations in tests/system/small/blob/ (since they relied on the public .blob accessor).
    • Removed BlobAccessor from the API documentation reference in toc.yml.
    • Annotated cells in the Kaggle notebook vector-search-with-bigframes-over-national-jukebox.ipynb to note the deprecation of str.to_blob and removal of .audio_transcribe.
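
For illustration, a deprecation warning of the kind described under Deprecations can be sketched with the standard `warnings` module. This is a minimal sketch, not the code in this PR: the function body, the `FutureWarning` category, the message text, and the placeholder return value are all assumptions.

```python
import warnings


def from_glob_path(path: str, name: str = "blob_col") -> str:
    """Hypothetical stand-in for the deprecated method (not the BigFrames code)."""
    warnings.warn(
        "from_glob_path is deprecated; use read_gbq with a reference column instead.",
        category=FutureWarning,  # assumed category; the PR does not state which one is used
        stacklevel=2,  # attribute the warning to the caller's line, not this function
    )
    # Placeholder result so the sketch is runnable; the real method returns a DataFrame.
    return f"{name}:{path}"
```

With `stacklevel=2`, the warning is reported at the user's call site, which tends to make the migration hint easier to act on.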

Fixes #<478952827> 🦕

shuoweil requested a review from GarrettWu on April 10, 2026 23:02
shuoweil self-assigned this on Apr 10, 2026
shuoweil requested review from a team as code owners on April 10, 2026 23:02
Contributor

gemini-code-assist bot left a comment

Code Review

This pull request transitions the blob accessor to an internal implementation (_blob) and removes or deprecates several public blob-related functions, including image and PDF processing utilities, in favor of bigframes.bigquery.obj functions. It also includes reformatting of various overloaded methods across the codebase. Feedback was provided to improve an error message in strings.py by making it dynamically reflect the method name to avoid confusion between public and internal methods.
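
One way an error message can reflect the method name dynamically is to build it from the wrapped method's `__name__`. The sketch below is only an assumption about the general shape of such a fix: the decorator `requires_string_dtype`, the simplified `StringMethods` class, the dtype check, and the message wording are hypothetical and are not taken from strings.py.

```python
import functools


def requires_string_dtype(method):
    """Hypothetical decorator: reject non-string data and name the calling method."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self.dtype != "string":
            # method.__name__ is "to_blob" or "_to_blob" depending on what was called,
            # so the public and internal names are never mixed up in the message.
            raise NotImplementedError(
                f"{method.__name__} is only supported for string dtype, got {self.dtype!r}."
            )
        return method(self, *args, **kwargs)

    return wrapper


class StringMethods:
    """Simplified stand-in for the accessor class; not the BigFrames implementation."""

    def __init__(self, dtype: str) -> None:
        self.dtype = dtype

    @requires_string_dtype
    def _to_blob(self) -> str:
        return "ok"

    @requires_string_dtype
    def to_blob(self) -> str:
        return self._to_blob()
```

Because the name is read from the function object, renaming a method from `to_blob` to `_to_blob` would update the message without touching the string.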

Comment thread packages/bigframes/bigframes/operations/strings.py Outdated
Contributor

Is there any reason to keep this file?

Contributor Author

I have removed this file since TransformFunction was not used outside of it.

Contributor

We may want to remove this function too.

Contributor Author

Done. I've removed from_glob_path from Session, pandas/io/api.py, and pandas/__init__.py.

connection = session._create_bq_connection(connection=connection)
return self._data._apply_binary_op(connection, ops.obj_make_ref_op)

def to_blob(self, connection: Optional[str] = None) -> T:
Contributor

Wanna remove too.

Contributor Author

Done. I've removed to_blob and _to_blob from StringMethods.

Contributor

Wanna remove too.

Contributor Author

Done. I've removed read_gbq_object_table from Session, pandas/io/api.py, and pandas/__init__.py.

Comment thread packages/bigframes/.python-version Outdated
@@ -0,0 +1 @@
3.14.2
Contributor

Where does this come from?

Contributor Author

I'm not sure where it came from. It was likely added by mistake by some tool, as it contained an invalid version (3.14.2). I've removed it.

@@ -1 +1 @@
# Copyright 2025 Google LLC
Contributor

I don't quite get the file name and the test names in this file. Looks like they don't match.

We should rewrite those.

Contributor Author

I have rewritten the tests in test_blob_ops.py to use the blob prefix (e.g., test_blob_get_access_url) to match the file name. I also updated them to use the new bigframes.bigquery.obj.make_ref API instead of the deprecated str._to_blob, and updated the snapshots accordingly.

@GarrettWu
Contributor

The title should be feat! instead of refactor!, as it is a user-facing change.

shuoweil changed the title from "refactor!: complete deprecation and cleanup of multimodal blob APIs" to "feat!: complete deprecation and cleanup of multimodal blob APIs" on Apr 15, 2026