Python package
pangram.text_classifier module
- class pangram.text_classifier.PangramText(api_key: str | None = None)
Bases: object
- __init__(api_key: str | None = None) None
A classifier for text inputs using the Pangram Labs API.
- Parameters:
api_key (str, optional) – Your Pangram Labs API key. If not provided, the environment variable PANGRAM_API_KEY is used.
- Raises:
ValueError – If the API key is not provided and not set in the environment.
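The key lookup described above can be mirrored in calling code, for example to fail early in a startup check. The sketch below only illustrates the documented behavior and is not the library's implementation; `resolve_api_key` and its `env` parameter are hypothetical names, and treating an empty string as "not provided" is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the explicit key, else PANGRAM_API_KEY from the environment.

    Hypothetical helper mirroring the documented constructor behavior;
    it is not part of the pangram package.
    """
    env = os.environ if env is None else env
    # Assumption: an empty string counts as "not provided".
    key = api_key or env.get("PANGRAM_API_KEY")
    if not key:
        raise ValueError("API key not provided and PANGRAM_API_KEY is not set")
    return key
```

Passing `env` explicitly keeps the check easy to exercise without touching the real process environment.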
- submit_bulk(text: List[str] | None = None, items: List[Dict[str, str]] | None = None) Dict
Submit a Bulk API job for asynchronous AI detection.
Provide either text as a list of input strings or items as a list of dictionaries with text and an optional customer-defined id. The response includes a bulk_id for polling and immediate per-item validation failures, if any.
- Parameters:
text (List[str], optional) – A list of input texts to analyze.
items (List[Dict[str, str]], optional) – A list of item dictionaries. Each item must include text and may include id.
- Returns:
Bulk submission response containing bulk_id, status, total_items, accepted_items, and failed_items.
- Return type:
Dict
- Raises:
ValueError – If both payload shapes are provided or neither is, or if the API returns an error.
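The items payload shape can be assembled from parallel lists of texts and customer-defined IDs. This is a minimal sketch: `build_bulk_items` is a hypothetical helper, not part of the package, and it only produces the dictionary shape described above.

```python
from typing import Dict, List, Optional, Sequence


def build_bulk_items(texts: Sequence[str],
                     ids: Optional[Sequence[str]] = None) -> List[Dict[str, str]]:
    """Build the `items` payload: each dict has `text` and, optionally, `id`."""
    if ids is not None and len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    items = []
    for index, text in enumerate(texts):
        item = {"text": text}
        if ids is not None:
            item["id"] = ids[index]
        items.append(item)
    return items
```

The result could then be passed as `client.submit_bulk(items=...)`, where `client` is a PangramText instance. When no IDs are needed, passing the plain list as `text=` is the simpler option.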
- get_bulk_status(bulk_id: str) Dict
Fetch the current status for a Bulk API job.
- Parameters:
bulk_id (str) – The bulk job ID returned by submit_bulk().
- Returns:
Bulk status response containing counters and timestamps.
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or an invalid response.
- get_bulk_items(bulk_id: str, offset: int = 0, limit: int = 100) Dict
Fetch paginated item metadata for a Bulk API job.
- Parameters:
bulk_id (str) – The bulk job ID returned by submit_bulk().
offset (int) – Zero-based item offset. Defaults to 0.
limit (int) – Maximum number of items to return. The API allows up to 1000.
- Returns:
Paginated bulk item metadata.
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or an invalid response.
- get_bulk_results_page(bulk_id: str, offset: int = 0, limit: int = 100) Dict
Fetch one page of results for a Bulk API job.
Completed successful items include a result field with the same response shape returned by predict(). Items that are still running have result set to None. Failed items are returned separately in failed_items.
- Parameters:
bulk_id (str) – The bulk job ID returned by submit_bulk().
offset (int) – Zero-based item offset. Defaults to 0.
limit (int) – Maximum number of items to return. The API allows up to 1000.
- Returns:
Paginated bulk result response.
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or an invalid response.
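Walking the pages manually amounts to advancing offset in steps of limit. The sketch below assumes each page response carries a total_items count; the page schema is not spelled out here, so confirm that field against a real response. `iter_result_pages` is a hypothetical helper, and `client` is any object exposing get_bulk_results_page() as documented.

```python
from typing import Any, Dict, Iterator


def iter_result_pages(client: Any, bulk_id: str, limit: int = 100) -> Iterator[Dict]:
    """Yield result pages by advancing `offset` in steps of `limit`.

    Assumption: every page carries a `total_items` count. Check the field
    name against a real response before relying on it.
    """
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    offset = 0
    while True:
        page = client.get_bulk_results_page(bulk_id, offset=offset, limit=limit)
        yield page
        offset += limit
        # Stop once the next offset would be past the last submitted item.
        if offset >= page.get("total_items", 0):
            break
```

For most uses get_bulk_results() already performs this loop; a manual iterator is mainly useful when pages should be processed as they arrive.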
- get_bulk_results(bulk_id: str, page_size: int = 1000) Dict
Fetch all available results for a Bulk API job.
This helper follows the paginated /bulk/{bulk_id}/results endpoint until every submitted item index has been covered. Failed items are returned separately in failed_items. If the job is still running, unfinished accepted items are included in items with result set to None.
- Parameters:
bulk_id (str) – The bulk job ID returned by submit_bulk().
page_size (int) – Number of submitted item slots to request per API call. The API allows up to 1000.
- Returns:
Aggregated bulk result response containing bulk_id, total_items, items, and failed_items.
- Return type:
Dict
- Raises:
ValueError – If page_size is invalid, or if the API returns an error or invalid response.
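Because unfinished items appear in items with result set to None, callers usually need to separate finished, pending, and failed entries. A minimal sketch, relying only on the items, result, and failed_items fields described above; `split_bulk_results` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def split_bulk_results(response: Dict) -> Tuple[List[Dict], List[Dict], List[Dict]]:
    """Return (finished, pending, failed) lists from an aggregated response."""
    items = response.get("items", [])
    # Finished items carry a populated `result`; pending ones have None.
    finished = [item for item in items if item.get("result") is not None]
    pending = [item for item in items if item.get("result") is None]
    failed = list(response.get("failed_items", []))
    return finished, pending, failed
```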
- wait_for_bulk(bulk_id: str, timeout: float = 3600, poll_interval: float = 0.5) Dict
Poll a Bulk API job until it reaches a terminal status.
Terminal statuses are succeeded, failed, and partial. Completion time depends on the number and length of submitted items and current system load.
- Parameters:
bulk_id (str) – The bulk job ID returned by submit_bulk().
timeout (float) – Maximum seconds to wait for terminal completion.
poll_interval (float) – Seconds to wait between polling attempts. Values below 0.1 are clamped to 0.1.
- Returns:
Terminal bulk status response.
- Return type:
Dict
- Raises:
ValueError – If timeout or poll interval values are invalid, or if the API returns an error.
TimeoutError – If the bulk job does not complete before timeout.
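The polling behavior described above (clamped interval, terminal statuses, timeout) can be sketched as a generic loop. This is an illustration under stated assumptions, not the library's code: `poll_until_terminal` is a hypothetical name, the status response is assumed to expose a status field, and the validation rule (non-positive timeout or negative interval is invalid) is a guess at what "invalid" means.

```python
import time
from typing import Callable, Dict

TERMINAL_STATUSES = {"succeeded", "failed", "partial"}


def poll_until_terminal(get_status: Callable[[], Dict],
                        timeout: float = 3600,
                        poll_interval: float = 0.5,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> Dict:
    """Call `get_status` until its `status` field is terminal or time runs out."""
    # Assumption: these are the values treated as invalid.
    if timeout <= 0 or poll_interval < 0:
        raise ValueError("timeout must be positive and poll_interval non-negative")
    poll_interval = max(poll_interval, 0.1)  # clamp, as documented
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError("bulk job did not finish before the timeout")
        sleep(poll_interval)
```

With a real client, `get_status` could be `lambda: client.get_bulk_status(bulk_id)`. The injectable `sleep` and `clock` arguments exist only to make the loop testable without waiting.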
- predict(text: str, public_dashboard_link: bool = False, timeout: float = 300, poll_interval: float = 0.5) Dict
Classify text as AI-written, AI-assisted, or human-written.
Submits the text to Pangram’s async inference endpoint, waits for completion, and returns analysis with windowed results.
- Parameters:
text (str) – The text to be classified.
public_dashboard_link (bool) – Whether to include a public dashboard link in the completed response. Defaults to False.
timeout (float) – Maximum seconds to wait for the async task to complete. Defaults to 300.
poll_interval (float) – Seconds to wait between polling attempts. Values below 0.1 are clamped to 0.1. Defaults to 0.5.
- Returns:
Pangram analysis with AI-assistance detection as a dict with the following fields:
stage (str): The terminal async task stage, normally “STAGE_SUCCESS”.
text (str): The input text.
version (str): The API version identifier (e.g., “3.0”).
headline (str): Classification headline summarizing the result.
prediction (str): Long-form prediction string describing the classification.
prediction_short (str): Short-form prediction string (“AI”, “AI-Assisted”, “Human”, “Mixed”).
fraction_ai (float): Fraction of text classified as AI-written (0.0-1.0).
fraction_ai_assisted (float): Fraction of text classified as AI-assisted (0.0-1.0).
fraction_human (float): Fraction of text classified as human-written (0.0-1.0).
num_ai_segments (int): Number of text segments classified as AI.
num_ai_assisted_segments (int): Number of text segments classified as AI-assisted.
num_human_segments (int): Number of text segments classified as human.
dashboard_link (str): A link to the dashboard page containing the full classification result, if requested.
- windows (list): List of text windows and their classifications. Each window contains:
text (str): The window text.
label (str): Descriptive classification label (e.g., “AI-Generated”, “Moderately AI-Assisted”).
ai_assistance_score (float): Score detailing the level of AI assistance within the window (0.0-1.0), where 0 means no AI assistance and 1.0 means AI-generated.
confidence (str): Confidence level for the classification (“High”, “Medium”, “Low”).
start_index (int): Starting character index in the original text.
end_index (int): Ending character index in the original text.
word_count (int): Number of words in the window.
token_length (int): Token length of the window.
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or if the response is invalid
TimeoutError – If the async task does not complete before timeout
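The window-level fields make it possible to locate the spans that drove a classification. The sketch below reads only the documented windows, ai_assistance_score, start_index, end_index, and label fields. `flag_windows` is a hypothetical helper, and the 0.5 default threshold is an arbitrary illustration, not a recommended cutoff.

```python
from typing import Dict, List, Tuple


def flag_windows(result: Dict, threshold: float = 0.5) -> List[Tuple[int, int, str]]:
    """Return (start_index, end_index, label) for windows at or above `threshold`."""
    flagged = []
    for window in result.get("windows", []):
        if window["ai_assistance_score"] >= threshold:
            flagged.append(
                (window["start_index"], window["end_index"], window["label"])
            )
    return flagged
```

Given `result = client.predict(text)`, each returned pair of indices can be used to slice the original text, for example `text[start:end]`, assuming end_index is exclusive.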
- predict_files(file_paths: List[str | PathLike], public_dashboard_link: bool = False, timeout: float = 300) List[Dict]
Upload one or more files for AI detection.
Files are submitted to Pangram’s file upload endpoint as multipart form data with one files field per uploaded .docx, .pdf, or .rtf file. Each returned result includes the extracted text, prediction fields, window-level analysis, and the uploaded filename. When public_dashboard_link is true, each result also includes a dashboard_link URL.
- Parameters:
file_paths (List[Union[str, os.PathLike]]) – Paths to files to upload and analyze.
public_dashboard_link (bool) – Whether to create public dashboard links for the uploaded files. Defaults to False.
timeout (float) – Maximum seconds to wait for the upload request to complete. Defaults to 300.
- Returns:
A list of per-file result dictionaries returned by the API.
- Return type:
List[Dict]
- Raises:
ValueError – If no files are provided, if timeout is invalid, if the API returns an error, or if the response is invalid.
requests.RequestException – If the upload request itself fails. File open errors (for example, OSError) are raised by Python before the request is sent.
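Since only .docx, .pdf, and .rtf uploads are described, it can help to filter a mixed list of paths before calling predict_files(). A minimal sketch: `select_uploadable` is a hypothetical helper, and matching extensions case-insensitively is an assumption about what the endpoint accepts.

```python
from pathlib import Path
from typing import Iterable, List

SUPPORTED_SUFFIXES = {".docx", ".pdf", ".rtf"}


def select_uploadable(paths: Iterable) -> List[Path]:
    """Keep only paths whose extension is one of the documented upload types."""
    # Assumption: extension matching is case-insensitive.
    return [Path(p) for p in paths if Path(p).suffix.lower() in SUPPORTED_SUFFIXES]
```

The filtered list can be passed straight to `client.predict_files(...)`, which accepts str or os.PathLike entries.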
- predict_file(file_path: str | PathLike, public_dashboard_link: bool = False, timeout: float = 300) Dict
Upload a single file for AI detection.
This convenience method calls predict_files() with one path and returns the first per-file result.
- Parameters:
file_path (Union[str, os.PathLike]) – Path to the file to upload and analyze.
public_dashboard_link (bool) – Whether to create a public dashboard link for the uploaded file. Defaults to False.
timeout (float) – Maximum seconds to wait for the upload request to complete. Defaults to 300.
- Returns:
The per-file result dictionary returned by the API.
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or an invalid response.
- predict_short(text: str) Dict
Classify text using the main async prediction endpoint.
Deprecated: This compatibility alias forwards to predict(). Use predict() directly for Pangram’s current response schema. This method may be removed on August 1, 2026.
- Parameters:
text (str) – The text to be classified.
- Returns:
The same classification result returned by predict().
- Return type:
Dict
- batch_predict(text_batch: List[str]) List[Dict]
Classify a batch of texts as AI-written, AI-assisted, or human-written.
This method iterates through the batch and calls predict() for each text.
Deprecated: This compatibility method forwards to predict() once per input text. Use submit_bulk() for asynchronous bulk jobs or predict() for one-off calls. This method may be removed on August 1, 2026.
- Parameters:
text_batch (List[str]) – A list of strings to be classified.
- Returns:
A list of classification results from the API for each text in the batch. Each result is a dict with the same fields as returned by predict().
- Return type:
List[Dict]
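Moving off batch_predict() to the Bulk API means chaining submit_bulk(), wait_for_bulk(), and get_bulk_results(). The sketch below uses only those documented methods and the bulk_id field; `classify_many` is a hypothetical wrapper, and `client` is expected to be a PangramText instance or any object with the same methods.

```python
from typing import Any, Dict, List


def classify_many(client: Any, texts: List[str], timeout: float = 3600) -> Dict:
    """Submit texts as one bulk job, wait for a terminal status, fetch results."""
    submission = client.submit_bulk(text=texts)
    bulk_id = submission["bulk_id"]
    # Blocks until the job is succeeded, failed, or partial, or raises TimeoutError.
    client.wait_for_bulk(bulk_id, timeout=timeout)
    return client.get_bulk_results(bulk_id)
```

Unlike batch_predict(), which returns a list of result dicts, this returns the aggregated bulk response, so per-text results are found under items and rejected inputs under failed_items.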
- predict_with_dashboard_link(text: str, timeout: float = 300, poll_interval: float = 0.5) Dict
Classify text as AI-written, AI-assisted, or human-written.
Submits the text to Pangram’s async inference endpoint, waits for completion, and returns analysis with a public dashboard link.
- Parameters:
text (str) – The text to be classified.
timeout (float) – Maximum seconds to wait for the async task to complete. Defaults to 300.
poll_interval (float) – Seconds to wait between polling attempts. Values below 0.1 are clamped to 0.1. Defaults to 0.5.
- Returns:
The classification result from the API, as a dict with the following fields:
text (str): The classified text.
dashboard_link (str): A link to a dashboard page containing the classification result.
stage (str): The terminal async task stage, normally “STAGE_SUCCESS”.
prediction (str): Long-form prediction string describing the classification.
prediction_short (str): Short-form prediction string.
fraction_ai (float): Fraction of text classified as AI-written (0.0-1.0).
fraction_ai_assisted (float): Fraction of text classified as AI-assisted (0.0-1.0).
fraction_human (float): Fraction of text classified as human-written (0.0-1.0).
windows (list): List of text windows and their classifications.
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or if the response is invalid
TimeoutError – If the async task does not complete before timeout
- check_plagiarism(text: str) Dict
Check text for potential plagiarism by comparing it against a database of online content.
- Parameters:
text (str) – The text to check for plagiarism.
- Returns:
A dictionary containing the plagiarism check results, including:
text (str): The input text.
plagiarism_detected (bool): Whether plagiarism was detected
plagiarized_content (List): List of detected plagiarized content with sources
total_sentences (int): Total number of sentences checked
plagiarized_sentences (List): List of sentences detected as plagiarized
percent_plagiarized (float): Percentage of text detected as plagiarized
- Return type:
Dict
- Raises:
ValueError – If the API returns an error or if the response is invalid
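The result fields listed above are enough to build a short human-readable summary. A minimal sketch using only those documented keys: `summarize_plagiarism` is a hypothetical helper, and because the scale of percent_plagiarized is not stated here, the value is echoed as reported rather than reformatted.

```python
from typing import Dict


def summarize_plagiarism(result: Dict) -> str:
    """Build a one-line summary from the documented result fields."""
    if not result["plagiarism_detected"]:
        return "No plagiarism detected in {} sentences.".format(
            result["total_sentences"]
        )
    return "{} of {} sentences flagged ({} reported as percent_plagiarized).".format(
        len(result["plagiarized_sentences"]),
        result["total_sentences"],
        result["percent_plagiarized"],
    )
```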