cdp_scrapers.legistar_utils.LegistarScraper

class cdp_scrapers.legistar_utils.LegistarScraper(client: str, timezone: str, ignore_minutes_item_patterns: list[str] | None = None, vote_approve_pattern: str = 'approve|favor|yes', vote_abstain_pattern: str = 'abstain|refuse|refrain', vote_reject_pattern: str = 'reject|oppose|no', vote_absent_pattern: str = 'absent', vote_nonvoting_pattern: str = 'nv|(?:non.*voting)', matter_adopted_pattern: str = 'approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)', matter_in_progress_pattern: str = 'heard|read|filed|held|(?:in.*com+it+ee)', matter_rejected_pattern: str = 'rejected|dropped', minutes_item_decision_passed_pattern: str = 'pass', minutes_item_decision_failed_pattern: str = 'not|fail', static_data: ScraperStaticData | None = None, person_aliases: dict[str, set[str]] | None = None, role_replacements: dict[str, str] | None = None)

Base class for transforming Legistar API data to CDP IngestionModel.

If get_events() fails and raises an error when this class is used as is, the installation must define a derived class and implement the get_content_uris() function.
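As a rough illustration of what such a derived class might look like, the sketch below uses stand-in definitions so that it runs without the library or network access. ScraperStandIn, ContentURIsStandIn, ExampleCityScraper, the use of the "EventVideoPath" key, and the example URL are all assumptions made for this sketch; a real installation would derive from LegistarScraper and return whatever type get_content_uris() is declared to return there.

```python
from typing import List, NamedTuple, Optional


class ContentURIsStandIn(NamedTuple):
    """Hypothetical stand-in for the library's content-URI container."""

    video_uri: str
    caption_uri: Optional[str] = None


class ScraperStandIn:
    """Hypothetical stand-in for LegistarScraper so the sketch runs offline."""

    def __init__(self, client: str, timezone: str) -> None:
        self.client = client
        self.timezone = timezone

    def get_content_uris(self, legistar_ev: dict) -> List[ContentURIsStandIn]:
        raise NotImplementedError("Must implement in a derived class.")


class ExampleCityScraper(ScraperStandIn):
    """Installation-specific scraper that knows where this site keeps its video links."""

    def get_content_uris(self, legistar_ev: dict) -> List[ContentURIsStandIn]:
        # Assumption: this installation stores a direct video link on the event record.
        video_uri = legistar_ev.get("EventVideoPath")
        if not video_uri:
            # No video available for this event.
            return []
        return [ContentURIsStandIn(video_uri=video_uri)]


scraper = ExampleCityScraper(client="examplecity", timezone="America/Los_Angeles")
uris = scraper.get_content_uris({"EventVideoPath": "https://example.com/meeting.mp4"})
```

Returning an empty list when no video link is present is a choice made for this sketch; how events without content should be handled depends on the installation.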

Parameters:
client: str

Legistar client name, e.g. “seattle” for Seattle, “kingcounty” for King County.

timezone: str

The timezone for the target client, e.g. “America/Los_Angeles” or “America/New_York”. See https://en.wikipedia.org/wiki/List_of_tz_database_time_zones for canonical timezones.

ignore_minutes_item_patterns: Optional[List[str]]

A list of string patterns or substrings that act as a minutes item filter. Each item in the list is compiled as a regex, and any minutes item that contains a compiled pattern is filtered out of the produced CDP minutes item list. Default: None (do not filter any minutes items)
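The filtering described above can be sketched with the standard re module. This is only an illustration of the idea, not the library's implementation: the helper keep_minutes_item, the sample patterns, the sample item names, and the case-insensitive matching are all assumptions.

```python
import re

# Hypothetical patterns an installation might pass as ignore_minutes_item_patterns.
ignore_minutes_item_patterns = ["roll call", r"^public comment"]

# Assumption: patterns are matched case-insensitively anywhere in the item name.
compiled_patterns = [re.compile(p, re.IGNORECASE) for p in ignore_minutes_item_patterns]


def keep_minutes_item(name: str) -> bool:
    """Return False when the name contains any of the ignore patterns."""
    return not any(pattern.search(name) for pattern in compiled_patterns)


names = ["Roll Call", "Ordinance relating to parks", "Public Comment"]
kept = [name for name in names if keep_minutes_item(name)]
```

Here kept holds only "Ordinance relating to parks"; the other two names each contain one of the ignore patterns.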

vote_approve_pattern: str

Regex pattern used to convert a Legistar instance’s approval vote value to the CDP constant value. Default: “approve|favor|yes”

vote_abstain_pattern: str

Regex pattern used to convert a Legistar instance’s abstention value to the CDP constant value. Note, this is a pure abstention, not an “approval by abstention” or “rejection by abstention” value. Those should be placed in vote_approve_pattern and vote_reject_pattern respectively. Default: “abstain|refuse|refrain”

vote_reject_pattern: str

Regex pattern used to convert a Legistar instance’s rejection vote value to the CDP constant value. Default: “reject|oppose|no”

vote_absent_pattern: str

Regex pattern used to convert a Legistar instance’s excused absence value to the CDP constant value. Default: “absent”

vote_nonvoting_pattern: str

Regex pattern used to convert Legistar instance’s non-voting value to CDP constant value. Default: “nv|(?:non.*voting)”
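To show how the five vote patterns might be applied to a raw vote value, here is a minimal sketch using the default patterns from the class signature. The string labels stand in for the CDP constants, and the function classify_vote and the order of the checks are assumptions made for this example. The order matters in this sketch because the default reject pattern contains “no”, which also matches values such as “Non-voting”.

```python
import re
from typing import Optional

# Default patterns from the class signature, paired with stand-in labels.
# Assumption: non-voting is checked before reject, since "no" matches "non-voting".
VOTE_PATTERNS = [
    ("NON_VOTING", r"nv|(?:non.*voting)"),
    ("ABSENT", r"absent"),
    ("APPROVE", r"approve|favor|yes"),
    ("ABSTAIN", r"abstain|refuse|refrain"),
    ("REJECT", r"reject|oppose|no"),
]


def classify_vote(value: str) -> Optional[str]:
    """Return the first stand-in label whose pattern is found in value, else None."""
    for label, pattern in VOTE_PATTERNS:
        if re.search(pattern, value, re.IGNORECASE):
            return label
    return None


decisions = {v: classify_vote(v) for v in ["In Favor", "No", "Non-voting", "Abstain", "Excused"]}
```

With the defaults, “Excused” matches none of the patterns, which is the kind of case where an installation would pass its own vote_absent_pattern.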

matter_adopted_pattern: str

Regex pattern used to convert a Legistar instance’s “matter was adopted” status to the CDP constant value. Default: “approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)”

matter_in_progress_pattern: str

Regex pattern used to convert a Legistar instance’s “matter is in progress” status to the CDP constant value. Default: “heard|read|filed|held|(?:in.*com+it+ee)”

matter_rejected_pattern: str

Regex pattern used to convert a Legistar instance’s “matter was rejected” status to the CDP constant value. Default: “rejected|dropped”
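The three matter status patterns can be exercised the same way. The sketch below uses the defaults from the class signature; the string labels, the function classify_matter_status, and the order of the checks are assumptions, not the library's behavior. It also shows that the “com+it+ee” fragment tolerates common misspellings of “committee”.

```python
import re
from typing import Optional

# Default patterns from the class signature, paired with stand-in labels.
MATTER_PATTERNS = [
    ("ADOPTED", r"approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)"),
    ("REJECTED", r"rejected|dropped"),
    ("IN_PROGRESS", r"heard|read|filed|held|(?:in.*com+it+ee)"),
]


def classify_matter_status(status: str) -> Optional[str]:
    """Return the first stand-in label whose pattern is found in status, else None."""
    for label, pattern in MATTER_PATTERNS:
        if re.search(pattern, status, re.IGNORECASE):
            return label
    return None


statuses = {
    s: classify_matter_status(s)
    for s in ["Adopted as amended", "Voted out of comittee", "Held in Committee", "Dropped"]
}
```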

minutes_item_decision_passed_pattern: str

Regex pattern used to convert Legistar instance’s minutes item passage to CDP constant value. Default: “pass”

minutes_item_decision_failed_pattern: str

Regex pattern used to convert Legistar instance’s minutes item failure to CDP constant value. Default: “not|fail”

static_data: Optional[ScraperStaticData]

Predefined Seats, Bodies and Persons used to provide more accurate Person.seat. Default: None

person_aliases: Optional[Dict[str, Set[str]]]

Dictionary used to catch name aliases and resolve Persons that were incorrectly treated as distinct to the one correct Person. Default: None

role_replacements: Optional[Dict[str, str]]

Dictionary used to replace role titles with CDP standard role titles. The keys should be titles you want to replace and the values should be a CDP standard role. Default: None
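The two dictionaries might be shaped as in the sketch below. The names and titles are made up, and the helpers resolve_name and replace_role are hypothetical stand-ins for illustration only. The sketch also assumes that each person_aliases key is the reference name and its value is the set of aliases, and that “Councilmember” is a CDP standard role title.

```python
from typing import Dict, Set

# Assumption: key is the reference name, value is the set of known aliases.
person_aliases: Dict[str, Set[str]] = {
    "Jane Q. Doe": {"Jane Doe", "J. Doe"},
}

# Keys are titles to replace; values are the (assumed) CDP standard role titles.
role_replacements: Dict[str, str] = {
    "Council Member": "Councilmember",
}


def resolve_name(name: str) -> str:
    """Return the reference name if name is a known alias, else name unchanged."""
    for reference, aliases in person_aliases.items():
        if name in aliases:
            return reference
    return name


def replace_role(title: str) -> str:
    """Return the replacement title if one is configured, else title unchanged."""
    return role_replacements.get(title, title)


resolved = resolve_name("J. Doe")
role = replace_role("Council Member")
```

Names and titles that do not appear in either dictionary pass through unchanged in this sketch.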

__init__(client: str, timezone: str, ignore_minutes_item_patterns: list[str] | None = None, vote_approve_pattern: str = 'approve|favor|yes', vote_abstain_pattern: str = 'abstain|refuse|refrain', vote_reject_pattern: str = 'reject|oppose|no', vote_absent_pattern: str = 'absent', vote_nonvoting_pattern: str = 'nv|(?:non.*voting)', matter_adopted_pattern: str = 'approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)', matter_in_progress_pattern: str = 'heard|read|filed|held|(?:in.*com+it+ee)', matter_rejected_pattern: str = 'rejected|dropped', minutes_item_decision_passed_pattern: str = 'pass', minutes_item_decision_failed_pattern: str = 'not|fail', static_data: ScraperStaticData | None = None, person_aliases: dict[str, set[str]] | None = None, role_replacements: dict[str, str] | None = None)

Methods

__init__(client, timezone[, ...])

check_for_cdp_min_ingestion([check_days])

Test whether at least one minimally defined EventIngestionModel can be obtained.

date_and_time_to_datetime(ev_date, ev_time)

Return datetime from ev_date and ev_time.

filter_event_minutes(ev_minutes_item)

Return None if minutes_item.name contains unimportant text that we want to ignore.

find_time_zone()

Return name for a US time zone matching UTC offset calculated from OS clock.

fix_event_minutes(ev_minutes_item, ...)

Inspect the MinutesItem and Matter in ev_minutes_item.

get_body(legistar_body)

Return CDP Body for Legistar body.

get_content_uris(legistar_ev)

Must implement in class derived from LegistarScraper.

get_event_minutes(legistar_ev_items)

Return List[EventMinutesItem] for Legistar API EventItems.

get_event_supporting_files(...)

Return List[SupportingFile] for Legistar API MatterAttachments.

get_events([begin, end])

Call get_legistar_events_for_timespan to retrieve Legistar API data and return it as List[EventIngestionModel].

get_matter(legistar_ev)

Return Matter from Legistar API EventItem.

get_matter_status(legistar_matter_status)

Return appropriate MatterStatusDecision constant from EventItemMatterStatus.

get_minutes_item(legistar_ev_item)

Return MinutesItem from parts of Legistar API EventItem.

get_minutes_item_decision(...)

Return appropriate EventMinutesItemDecision constant from EventItemPassedFlagName.

get_none_if_empty(model)

Check required keys in model, return None if any such key has no value.

get_person(legistar_person)

Return CDP Person for Legistar Person.

get_required_attrs(model)

Return list of keys required in model as specified in IngestionModel class definition.

get_roles(legistar_office_records)

Return list of CDP Role from list of Legistar OfficeRecord.

get_sponsors(legistar_sponsors)

Get legislation sponsors.

get_vote_decision(legistar_vote)

Return appropriate VoteDecision constant based on Legistar Vote.

get_votes(legistar_votes)

Return List[Vote] for Legistar API Votes.

handle_old_new_council(old_names, new_names)

Override to handle old and new councilmember information.

inject_known_data(events)

Augment with long-term static data that changes very infrequently.

inject_known_person(person)

Inject information if person exists in static_data.persons.

localize_datetime(local_time)

Return input datetime with time zone information.

post_process_ingestion_models(events)

Called at the end of get_events() for fully custom site-specific processing.

resolve_person_alias(person)

If input person is in fact an alias of a reference known person, return the reference person instead.

use_or_replace_role(role_title)

Look up whether the provided role title should be replaced with a CDP standard value.

Attributes

is_legistar_compatible

Check that Legistar API recognizes client name.