Support for Jupyter Notebooks

The jupyter module contains functions to support the use of Text Extensions for Pandas in Jupyter notebooks.

class text_extensions_for_pandas.jupyter.DataFrameWidget(dataframe: pandas.core.frame.DataFrame, metadata_column: Optional[pandas.core.series.Series] = None, interactive_columns: Optional[list] = None)
display() → ipywidgets.widgets.widget.Widget

Displays the widget. Returns a reference to the root output widget.

property selected: pandas.core.series.Series

A boolean series indicating which rows are currently selected in the table visualization.
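
Because selected is a boolean series, it can serve as a row mask. The following is a minimal sketch using plain pandas only: the series named selected is a hand-built stand-in for the property's value, and the sketch assumes (this is not stated above) that the property's series shares the index of the dataframe being filtered. The column names are illustrative.

```python
import pandas as pd

# Illustrative data; the column names are made up for this sketch.
df = pd.DataFrame({
    "token": ["IBM", "is", "hiring"],
    "ent_type": ["ORG", None, None],
})

# Hand-built stand-in for the value of DataFrameWidget.selected.
# Assumption: the real series is aligned with the dataframe's index.
selected = pd.Series([True, False, True], index=df.index)

# Boolean indexing keeps only the rows flagged True.
selected_rows = df[selected]
print(selected_rows)
```

With a live widget, the same indexing would presumably apply to the dataframe returned by to_dataframe(), provided the two share an index.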

set_interactive_columns(columns: list)

Sets the columns to appear as interactive within the displayed widget.

Parameters

columns (list) – A list of column names to appear as interactive

to_dataframe() → pandas.core.frame.DataFrame

Returns a copy of the DataFrame that backs the widget's internal state.

Returns

A copy of the backing dataframe.

Return type

pandas.DataFrame

text_extensions_for_pandas.jupyter.pretty_print_html(column: Union[SpanArray, TokenSpanArray], show_offsets: bool) → str

HTML pretty-printing of a series of spans for Jupyter notebooks.

Parameters
  • column – Span column (either character or token spans).

  • show_offsets – True to generate a table of span offsets in addition to the marked-up text

text_extensions_for_pandas.jupyter.run_with_progress_bar(num_items: int, fn: Callable, item_type: str = 'doc') → List[pandas.core.frame.DataFrame]

Displays a progress bar while calling fn once for each item, and returns the list of resulting dataframes.

Parameters
  • num_items – Number of items to iterate over

  • fn – A function that accepts a single integer argument i, performs processing for document i, and returns a pd.DataFrame of results

  • item_type – Human-readable name for the items that the calling code is iterating over
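
The calling contract described by these parameters can be sketched with the standard library alone. The function run_with_text_progress below is a hypothetical stand-in, not the library's implementation: it prints a plain text counter instead of a notebook progress bar, the pluralized label is an assumption, and the placeholder callback returns dictionaries rather than dataframes so that the sketch has no dependencies.

```python
import sys
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_text_progress(num_items: int, fn: Callable[[int], T],
                           item_type: str = "doc") -> List[T]:
    """Hypothetical stand-in: call fn(i) for i in 0..num_items-1 and
    collect the results, reporting progress on stderr."""
    results = []
    for i in range(num_items):
        results.append(fn(i))
        # Overwrite the same line with the current count.
        sys.stderr.write(f"\rProcessed {i + 1} of {num_items} {item_type}s")
        sys.stderr.flush()
    if num_items > 0:
        sys.stderr.write("\n")
    return results


def process_doc(i: int) -> dict:
    # Placeholder for per-document processing; with the real function,
    # this callback would return a pd.DataFrame of results.
    return {"doc_index": i, "num_chars": len(f"document {i}")}


results = run_with_text_progress(3, process_doc)
```

The key point is that fn receives only the item index, so the callback is responsible for looking up document i itself, for example by closing over a list of documents.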