cdp_scrapers.legistar_utils.LegistarScraper
- class cdp_scrapers.legistar_utils.LegistarScraper(client: str, timezone: str, ignore_minutes_item_patterns: list[str] | None = None, vote_approve_pattern: str = 'approve|favor|yes', vote_abstain_pattern: str = 'abstain|refuse|refrain', vote_reject_pattern: str = 'reject|oppose|no', vote_absent_pattern: str = 'absent', vote_nonvoting_pattern: str = 'nv|(?:non.*voting)', matter_adopted_pattern: str = 'approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)', matter_in_progress_pattern: str = 'heard|read|filed|held|(?:in.*com+it+ee)', matter_rejected_pattern: str = 'rejected|dropped', minutes_item_decision_passed_pattern: str = 'pass', minutes_item_decision_failed_pattern: str = 'not|fail', static_data: ScraperStaticData | None = None, person_aliases: dict[str, set[str]] | None = None, role_replacements: dict[str, str] | None = None)
Base class for transforming Legistar API data to CDP IngestionModel.
If the default get_events() fails and raises an error for a given installation, define a derived class for that installation and implement its get_content_uris() function.
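The subclassing pattern described above can be sketched as follows. This is a minimal illustration only: LegistarScraperStandIn is a hypothetical stand-in so the block runs without cdp_scrapers installed, the "EventVideoPath" key is assumed to be where the installation exposes its video link, and the plain-dict return shape is an assumption rather than the library's actual return type.

```python
from typing import Any


class LegistarScraperStandIn:
    """Hypothetical stand-in for LegistarScraper; the real class does far more."""

    def __init__(self, client: str, timezone: str) -> None:
        self.client_name = client
        self.timezone = timezone

    def get_content_uris(self, legistar_ev: dict[str, Any]) -> list[dict[str, str]]:
        # Mirrors the documented contract: derived classes must implement this.
        raise NotImplementedError("Must implement in a derived class.")


class ExampleCityScraper(LegistarScraperStandIn):
    """Hypothetical installation-specific scraper."""

    def get_content_uris(self, legistar_ev: dict[str, Any]) -> list[dict[str, str]]:
        # Assumption: the event dict carries a direct video link under this key.
        video_uri = legistar_ev.get("EventVideoPath")
        if not video_uri:
            # No video found for this event, so there is nothing to ingest.
            return []
        return [{"video_uri": video_uri}]


scraper = ExampleCityScraper(client="examplecity", timezone="America/Los_Angeles")
uris = scraper.get_content_uris({"EventVideoPath": "https://example.com/meeting.mp4"})
```

In a real installation the derived class would inherit from LegistarScraper itself and return whatever type get_content_uris() is documented to return in the installed version.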
- Parameters:
- client: str
Legistar client name, e.g. “seattle” for Seattle, “kingcounty” for King County.
- timezone: str
The timezone for the target client, e.g. “America/Los_Angeles” or “America/New_York”. See https://en.wikipedia.org/wiki/List_of_tz_database_time_zones for canonical timezones.
- ignore_minutes_item_patterns: List[str]
A list of string patterns or substrings that act as a minutes item filter. Each item in the list is compiled as a regex, and any minutes item that matches a compiled pattern is filtered out of the produced CDP minutes item list. Default: [] (do not filter any minutes items)
- vote_approve_pattern: str
Regex pattern used to convert a Legistar instance’s approval vote value to the CDP constant value. Default: “approve|favor|yes”
- vote_abstain_pattern: str
Regex pattern used to convert a Legistar instance’s abstention value to the CDP constant value. Note that this is a pure abstention, not an “approval by abstention” or “rejection by abstention” value. Those should be placed in vote_approve_pattern and vote_reject_pattern respectively. Default: “abstain|refuse|refrain”
- vote_reject_pattern: str
Regex pattern used to convert a Legistar instance’s rejection vote value to the CDP constant value. Default: “reject|oppose|no”
- vote_absent_pattern: str
Regex pattern used to convert a Legistar instance’s excused absence value to the CDP constant value. Default: “absent”
- vote_nonvoting_pattern: str
Regex pattern used to convert Legistar instance’s non-voting value to CDP constant value. Default: “nv|(?:non.*voting)”
- matter_adopted_pattern: str
Regex pattern used to convert a Legistar instance’s matter-was-adopted status to the CDP constant value. Default: “approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)”
- matter_in_progress_pattern: str
Regex pattern used to convert a Legistar instance’s matter-is-in-progress status to the CDP constant value. Default: “heard|read|filed|held|(?:in.*com+it+ee)”
- matter_rejected_pattern: str
Regex pattern used to convert Legistar instance’s matter was rejected to CDP constant value. Default: “rejected|dropped”
- minutes_item_decision_passed_pattern: str
Regex pattern used to convert Legistar instance’s minutes item passage to CDP constant value. Default: “pass”
- minutes_item_decision_failed_pattern: str
Regex pattern used to convert Legistar instance’s minutes item failure to CDP constant value. Default: “not|fail”
- static_data: Optional[ScraperStaticData]
Predefined Seats, Bodies and Persons used to provide a more accurate Person.seat. Default: None
- person_aliases: Optional[Dict[str, Set[str]]]
Dictionary used to catch name aliases and resolve improperly unique Persons to the one correct Person. Default: None
- role_replacements: Optional[Dict[str, str]]
Dictionary used to replace role titles with CDP standard role titles. The keys should be titles you want to replace and the values should be a CDP standard role. Default: None
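To make the vote pattern parameters concrete, here is a minimal sketch of how such regexes could map raw Legistar vote text to a decision label. The pattern strings are the defaults from the signature above; the label names, the case-insensitive search, and the order in which patterns are tried are assumptions for illustration, not the library's actual implementation.

```python
import re
from typing import Optional

# Default patterns copied from the class signature. The check order is an
# assumption: non-voting is tried first because "no" in the reject pattern
# would otherwise also match text such as "Non-voting".
VOTE_PATTERNS = {
    "nonvoting": r"nv|(?:non.*voting)",
    "absent": r"absent",
    "abstain": r"abstain|refuse|refrain",
    "approve": r"approve|favor|yes",
    "reject": r"reject|oppose|no",
}


def classify_vote(value: str) -> Optional[str]:
    """Return the first label whose pattern is found in value, else None."""
    for label, pattern in VOTE_PATTERNS.items():
        # Assumption: matching ignores case.
        if re.search(pattern, value, re.IGNORECASE):
            return label
    return None


labels = [classify_vote(v) for v in ["In Favor", "Non-voting", "No", "Absent", "Aye"]]
```

A value such as "Aye" matches none of the defaults, which is the situation where an installation would pass its own vote_approve_pattern.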
See also
- __init__(client: str, timezone: str, ignore_minutes_item_patterns: list[str] | None = None, vote_approve_pattern: str = 'approve|favor|yes', vote_abstain_pattern: str = 'abstain|refuse|refrain', vote_reject_pattern: str = 'reject|oppose|no', vote_absent_pattern: str = 'absent', vote_nonvoting_pattern: str = 'nv|(?:non.*voting)', matter_adopted_pattern: str = 'approved|confirmed|passed|adopted|consent|(?:voted.*com+it+ee)', matter_in_progress_pattern: str = 'heard|read|filed|held|(?:in.*com+it+ee)', matter_rejected_pattern: str = 'rejected|dropped', minutes_item_decision_passed_pattern: str = 'pass', minutes_item_decision_failed_pattern: str = 'not|fail', static_data: ScraperStaticData | None = None, person_aliases: dict[str, set[str]] | None = None, role_replacements: dict[str, str] | None = None)
Methods
- __init__(client, timezone[, ...])
- check_for_cdp_min_ingestion([check_days])
Test whether at least one minimally defined EventIngestionModel can be obtained.
- date_and_time_to_datetime(ev_date, ev_time)
Return datetime from ev_date and ev_time.
- filter_event_minutes(ev_minutes_item)
Return None if minutes_item.name contains unimportant text that we want to ignore.
- find_time_zone()
Return name for a US time zone matching UTC offset calculated from OS clock.
- fix_event_minutes(ev_minutes_item, ...)
Inspect the MinutesItem and Matter in ev_minutes_item.
- get_body(legistar_body)
Return CDP Body for Legistar body.
- get_content_uris(legistar_ev)
Must implement in class derived from LegistarScraper.
- get_event_minutes(legistar_ev_items)
Return List[EventMinutesItem] for Legistar API EventItems.
Return List[SupportingFile] for Legistar API MatterAttachments.
- get_events([begin, end])
Call get_legistar_events_for_timespan to retrieve Legistar API data and return it as List[EventIngestionModel].
- get_matter(legistar_ev)
Return Matter from Legistar API EventItem.
- get_matter_status(legistar_matter_status)
Return appropriate MatterStatusDecision constant from EventItemMatterStatus.
- get_minutes_item(legistar_ev_item)
Return MinutesItem from parts of Legistar API EventItem.
Return appropriate EventMinutesItemDecision constant from EventItemPassedFlagName.
- get_none_if_empty(model)
Check required keys in model, return None if any such key has no value.
- get_person(legistar_person)
Return CDP Person for Legistar Person.
- get_required_attrs(model)
Return list of keys required in model as specified in IngestionModel class definition.
- get_roles(legistar_office_records)
Return list of CDP Role from list of Legistar OfficeRecord.
- get_sponsors(legistar_sponsors)
Get legislation sponsors.
- get_vote_decision(legistar_vote)
Return appropriate VoteDecision constant based on Legistar Vote.
- get_votes(legistar_votes)
Return List[Vote] for Legistar API Votes.
- handle_old_new_council(old_names, new_names)
Override to handle old and new councilmember information.
- inject_known_data(events)
Augment with long-term static data that changes very infrequently.
- inject_known_person(person)
Inject information if person exists in static_data.persons.
- localize_datetime(local_time)
Return input datetime with time zone information.
- post_process_ingestion_models(events)
Called at the end of get_events() for fully custom site-specific processing.
- resolve_person_alias(person)
If input person is in fact an alias of a reference known person, return the reference person instead.
- use_or_replace_role(role_title)
Look up whether the provided role title should be replaced with a CDP standard value.
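The filtering behavior that filter_event_minutes and ignore_minutes_item_patterns describe can be sketched with the standard library alone. The helper name make_minutes_filter, the sample item names, and the case-insensitive matching are assumptions for illustration; the library's own method operates on minutes item objects rather than bare strings.

```python
import re
from typing import Callable


def make_minutes_filter(ignore_patterns: list[str]) -> Callable[[str], bool]:
    """Build a predicate that is True for minutes item names worth keeping."""
    # Each entry is compiled as a regex, so plain substrings work as well.
    compiled = [re.compile(p, re.IGNORECASE) for p in ignore_patterns]

    def keep(name: str) -> bool:
        # Drop the item if any pattern is found anywhere in its name.
        return not any(c.search(name) for c in compiled)

    return keep


keep = make_minutes_filter(["roll call", r"^adjourn"])
names = [
    "Call to Order",
    "Roll Call",
    "CB 120001: An ordinance relating to parks",
    "Adjournment",
]
kept = [n for n in names if keep(n)]
```

With an empty pattern list the predicate keeps every item, which matches the documented default of not filtering any minutes items.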
Attributes
Check that Legistar API recognizes client name.