Support for Jupyter Notebooks

The jupyter module contains functions to support the use of Text Extensions for Pandas in Jupyter notebooks.

class text_extensions_for_pandas.jupyter.DataFrameWidget(dataframe: pandas.core.frame.DataFrame, metadata_column: Optional[pandas.core.series.Series] = None, interactive_columns: Optional[list] = None)

display() → ipywidgets.widgets.widget.Widget: Displays the widget. Returns a reference to the root output widget.

property selected: pandas.core.series.Series: A boolean series marking which rows are currently selected in the table visualization.

set_interactive_columns(columns: list)

Sets the columns to appear as interactive within the displayed widget.

Parameters: columns (list) – A list of column names to appear as interactive

to_dataframe() → pandas.core.frame.DataFrame

Returns a copy of the DataFrame that backs the widget's internal state.

Returns: A copy of the backing dataframe.
Return type: pandas.DataFrame
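
The following is a minimal sketch of how the class above might be used, based only on the signatures listed here. The demo data, the `build_demo_frame` and `make_widget` helper names, and the choice of interactive column are hypothetical; the library import is deferred into the helper so the rest of the sketch runs without it installed.

```python
import pandas as pd


def build_demo_frame() -> pd.DataFrame:
    """Build a small, made-up DataFrame to show in the widget."""
    return pd.DataFrame(
        {
            "token": ["Text", "Extensions", "for", "Pandas"],
            "label": ["B-ORG", "I-ORG", "I-ORG", "I-ORG"],
        }
    )


def make_widget(df: pd.DataFrame):
    """Wrap df in a DataFrameWidget (intended to be called from a notebook)."""
    # Imported lazily so the rest of this sketch runs without the library.
    from text_extensions_for_pandas.jupyter import DataFrameWidget

    # Assumption: "label" is the column we want to appear as interactive.
    return DataFrameWidget(df, interactive_columns=["label"])


demo_df = build_demo_frame()

# In a notebook cell, usage might look like this:
#   widget = make_widget(demo_df)
#   widget.display()                                   # root output widget
#   widget.set_interactive_columns(["token", "label"])
#   edited_df = widget.to_dataframe()                  # copy of backing data
#   selected_rows = widget.selected                    # boolean Series
```

Because to_dataframe() is documented as returning a copy, changes made to `edited_df` afterwards should not affect what the widget shows.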

text_extensions_for_pandas.jupyter.pretty_print_html(column: Union[SpanArray, TokenSpanArray], show_offsets: bool) → str

HTML pretty-printing of a series of spans for Jupyter notebooks.

Parameters

column – Span column (either character or token spans).
show_offsets – True to generate a table of span offsets in addition to the marked-up text
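
A hedged sketch of calling this function on a small set of character spans follows. The sample text, the offsets, and the `spans_as_html` helper are hypothetical. The sketch also assumes that `SpanArray` can be constructed from a target text plus begin and end offsets, with end offsets exclusive; check the SpanArray documentation before relying on that.

```python
def spans_as_html(text, begins, ends, show_offsets=True):
    """Render character spans over text as an HTML string (needs the library)."""
    # Imported lazily so the offset check below runs without the library.
    import text_extensions_for_pandas as tp
    from text_extensions_for_pandas.jupyter import pretty_print_html

    # Assumption: SpanArray takes (text, begins, ends), ends being exclusive.
    spans = tp.SpanArray(text, begins, ends)
    return pretty_print_html(spans, show_offsets)


text = "Jupyter renders HTML inline."
# Hypothetical spans intended to cover "Jupyter" and "HTML".
begins = [0, 16]
ends = [7, 20]

# Sanity-check the offsets with plain string slicing before building spans.
covered = [text[b:e] for b, e in zip(begins, ends)]

# In a notebook cell, the returned string can be rendered with IPython:
#   from IPython.display import HTML
#   HTML(spans_as_html(text, begins, ends, show_offsets=True))
```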

text_extensions_for_pandas.jupyter.run_with_progress_bar(num_items: int, fn: Callable, item_type: str = 'doc') → List[pandas.core.frame.DataFrame]

Displays a progress bar while calling fn once per item and collecting the resulting dataframes into a list.

Parameters

num_items – Number of items to iterate over
fn – A function that accepts a single integer argument i, performs the processing for document i, and returns a pd.DataFrame of results
item_type – Human-readable name for the items that the calling code is iterating over
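
The callback contract described above can be sketched as follows. The `docs` list, `process_doc`, and `run_plain` are hypothetical names introduced for illustration. `run_plain` is a stand-in without a progress bar, and it assumes the items are visited in order from 0 to num_items - 1; it is not the library's implementation.

```python
from typing import Callable, List

import pandas as pd

# Made-up documents standing in for a real corpus.
docs = ["first document", "second one", "a third, longer document"]


def process_doc(i: int) -> pd.DataFrame:
    """Callback with the shape fn expects: item index in, DataFrame out."""
    words = docs[i].split()
    return pd.DataFrame({"doc_id": i, "word": words})


def run_plain(
    num_items: int, fn: Callable[[int], pd.DataFrame]
) -> List[pd.DataFrame]:
    """Stand-in without a progress bar: calls fn(0) through fn(num_items - 1)."""
    return [fn(i) for i in range(num_items)]


results = run_plain(len(docs), process_doc)

# One DataFrame per document; concatenate them if a single table is wanted.
combined = pd.concat(results, ignore_index=True)

# In a notebook, the same callback can be passed to the library function:
#   from text_extensions_for_pandas.jupyter import run_with_progress_bar
#   results = run_with_progress_bar(len(docs), process_doc, item_type="doc")
```

Keeping the per-document work inside a function of the index alone lets the same callback be used with or without the progress bar.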