The geospatial toolkit for redistricting data.

Overview

maup


maup is the geospatial toolkit for redistricting data. The package streamlines the basic workflows that arise when working with blocks, precincts, and districts, such as assigning precincts to districts, aggregating block data up to precincts, disaggregating data from precincts down to blocks, prorating data between units that do not nest neatly, and fixing topological issues like gaps and overlaps.

The project's priorities are to be efficient by using spatial indices whenever possible and to integrate well with the existing ecosystem around pandas, geopandas and shapely. The package is distributed under the MIT License.

Installation

We recommend installing maup from conda-forge using conda:

conda install -c conda-forge maup

You can get conda by installing Miniconda, a free Python distribution made especially for data science and scientific computing. You might also consider Anaconda, which includes many data science packages that you might find useful.

To install maup from PyPI, run pip install maup from your terminal.

Examples

Here are some basic situations where you might find maup helpful. For these examples, we use test data from Providence, Rhode Island, which you can find in our Rhode Island shapefiles repo, or in the examples folder of this repo.

>>> import geopandas
>>> import pandas
>>>
>>> blocks = geopandas.read_file("zip://./examples/blocks.zip")
>>> precincts = geopandas.read_file("zip://./examples/precincts.zip")
>>> districts = geopandas.read_file("zip://./examples/districts.zip")

Assigning precincts to districts

The assign function in maup takes two sets of geometries called sources and targets and returns a pandas Series. The Series maps each geometry in sources to the geometry in targets that covers it. (Here, geometry A covers geometry B if every point of B and its boundary lies in A or its boundary.) If a source geometry is not covered by one single target geometry, it is assigned to the target geometry that covers the largest portion of its area.

>>> import maup
>>>
>>> assignment = maup.assign(precincts, districts)
>>> # Add the assigned districts as a column of the `precincts` GeoDataFrame:
>>> precincts["DISTRICT"] = assignment
>>> assignment.head()
0     7
1     5
2    13
3     6
4     1
dtype: int64

As an aside, you can use that assignment object to create a gerrychain Partition representing this districting plan.
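If you want to try that, here is a minimal sketch, assuming gerrychain is installed (Graph.from_geodataframe and Partition are gerrychain's standard entry points; the "DISTRICT" column we added above becomes a node attribute of the graph):

>>> from gerrychain import Graph, Partition
>>>
>>> # Build the precinct adjacency graph; the columns of `precincts`,
>>> # including "DISTRICT", are attached to the nodes as attributes.
>>> graph = Graph.from_geodataframe(precincts)
>>> # Use the assigned districts as the initial districting plan:
>>> partition = Partition(graph, assignment="DISTRICT")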

Aggregating block data to precincts

Precinct shapefiles usually come with election data, but not demographic data. In order to study their demographics, we need to aggregate demographic data from census blocks up to the precinct level. We can do this by assigning blocks to precincts and then aggregating the data with a pandas groupby operation:

>>> variables = ["TOTPOP", "NH_BLACK", "NH_WHITE"]
>>>
>>> assignment = maup.assign(blocks, precincts)
>>> precincts[variables] = blocks[variables].groupby(assignment).sum()
>>> precincts[variables].head()
   TOTPOP  NH_BLACK  NH_WHITE
0    5907       886       380
1    5636       924      1301
2    6549       584      4699
3    6009       435      1053
4    4962       156      3713

If you want to move data from one set of geometries to another but your source and target geometries do not nest neatly (i.e. have overlaps), see Prorating data when units do not nest neatly.

Disaggregating data from precincts down to blocks

It's common to have data at a coarser scale that you want to attach to finer-scaled geometries. Usually this happens when vote totals for a certain election are only reported at the county level, and we want to attach that data to precinct geometries.

Let's say we want to prorate the vote totals in the columns "PRES16D", "PRES16R" from our precincts GeoDataFrame down to our blocks GeoDataFrame. The first crucial step is to decide how we want to distribute a precinct's data to the blocks within it. Since we're prorating election data, it makes sense to use a block's total population or voting-age population. Here's how we might prorate by population ("TOTPOP"):

>>> election_columns = ["PRES16D", "PRES16R"]
>>> assignment = maup.assign(blocks, precincts)
>>>
>>> # We prorate the vote totals according to each block's share of the overall
>>> # precinct population:
>>> weights = blocks.TOTPOP / assignment.map(precincts.TOTPOP)
>>> prorated = maup.prorate(assignment, precincts[election_columns], weights)
>>>
>>> # Add the prorated vote totals as columns on the `blocks` GeoDataFrame:
>>> blocks[election_columns] = prorated
>>> # We'll call .round(2) to round the values for display purposes.
>>> blocks[election_columns].round(2).head()
   PRES16D  PRES16R
0     0.00     0.00
1    12.26     1.70
2    15.20     2.62
3    15.50     2.67
4     3.28     0.45
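After prorating, it's worth checking that no votes were lost along the way. A quick sanity check (a sketch; the totals can disagree if the precincts' reported TOTPOP is inconsistent with the block-level population sums, since the weights then fail to sum to 1 within each precinct):

>>> import numpy
>>>
>>> # Compare the prorated block-level totals to the original precinct totals:
>>> numpy.allclose(blocks[election_columns].sum(),
...                precincts[election_columns].sum())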

Warning about areal interpolation

We strongly urge you not to prorate by area! The area of a census block is not a good predictor of its population. In fact, the correlation goes in the other direction: larger census blocks are less populous than smaller ones.
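You can check this on your own data by correlating block area with population. A sketch, assuming a suitable planar projection (EPSG:32130 is the Rhode Island State Plane CRS; areas computed in an unprojected, geographic CRS are not meaningful):

>>> # Project before measuring area, then correlate area with population.
>>> # A negative correlation supports the warning above.
>>> projected = blocks.to_crs(epsg=32130)
>>> projected.geometry.area.corr(projected["TOTPOP"])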

Prorating data when units do not nest neatly

Suppose you have a shapefile of precincts with some election results data and you want to join that data onto a different, more recent precincts shapefile. The two sets of precincts will have overlaps, and will not nest neatly like the blocks and precincts did in the above examples. (Not that blocks and precincts always nest neatly...)

We can use maup.intersections to break the two sets of precincts into pieces that nest neatly into both sets. Then we can disaggregate from the old precincts onto these pieces, and aggregate up from the pieces to the new precincts. This move is a bit complicated, so maup provides a function called prorate that does just that.

We'll use our same blocks GeoDataFrame to estimate the populations of the pieces for the purposes of proration.

For our "new precincts" shapefile, we'll use the VTD shapefile for Rhode Island that the U.S. Census Bureau produced as part of their 2018 test run of for the 2020 Census.

>>> old_precincts = precincts
>>> new_precincts = geopandas.read_file("zip://./examples/new_precincts.zip")
>>>
>>> columns = ["SEN18D", "SEN18R"]
>>>
>>> # Include area_cutoff=0 to ignore any intersections with no area,
>>> # like boundary intersections, which we do not want to include in
>>> # our proration.
>>> pieces = maup.intersections(old_precincts, new_precincts, area_cutoff=0)
>>>
>>> # Weight by prorated population from blocks
>>> weights = blocks["TOTPOP"].groupby(maup.assign(blocks, pieces)).sum()
>>> # Normalize the weights so that votes are allocated according to their
>>> # share of population in the old_precincts
>>> weights = maup.normalize(weights, level=0)
>>>
>>> # Use blocks to estimate population of each piece
>>> new_precincts[columns] = maup.prorate(
...     pieces,
...     old_precincts[columns],
...     weights=weights
... )
>>> new_precincts[columns].head()
   SEN18D  SEN18R
0   752.0    51.0
1   370.0    21.0
2    97.0    17.0
3   585.0    74.0
4   246.0    20.0
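For reference, the pieces returned by maup.intersections are indexed by (old precinct, new precinct) pairs, and maup.normalize(weights, level=0) rescales the weights so they sum to 1 within each old precinct. Conceptually (a sketch of the behavior, not maup's actual implementation), it is equivalent to:

>>> # Divide each piece's weight by its old precinct's total weight, so each
>>> # old precinct's votes are split in proportion to the pieces' populations:
>>> normalized = weights / weights.groupby(level=0).sum()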

Progress bars

For long-running operations, the user might want to see a progress bar to estimate how much longer a task will take (and whether to abandon it altogether).

maup provides an optional progress bar for this purpose. To temporarily activate a progress bar for a certain operation, wrap the operation in a with maup.progress(): block:

>>> with maup.progress():
...     assignment = maup.assign(precincts, districts)
...

To turn on progress bars for all applicable operations (e.g. for an entire script), set maup.progress.enabled = True:

>>> maup.progress.enabled = True
>>> # Now a progress bar will display while this function runs:
>>> assignment = maup.assign(precincts, districts)
>>> # And this one too:
>>> pieces = maup.intersections(old_precincts, new_precincts, area_cutoff=0)

Fixing topological issues, overlaps, and gaps

Precinct shapefiles are often created by stitching together collections of precinct geometries sourced from different counties or different years. As a result, the shapefile often has gaps or overlaps between precincts where the different sources disagree about the boundaries. These gaps and overlaps pose problems when you are interested in working with the adjacency graph of the precincts, and not just in mapping the precincts. This adjacency information is especially important when studying redistricting, because districts are almost always expected to be contiguous.

maup provides functions for closing gaps and resolving overlaps in a collection of geometries. As an example, we'll apply both functions to these geometries, which have both an overlap and a gap:

[Figure: four polygons with a gap and some overlaps]

Usually the gaps and overlaps in real shapefiles are tiny and easy to miss, but this exaggerated example will help illustrate the functionality.

First, we'll use shapely to create the polygons from scratch:

from shapely.geometry import Polygon
geometries = geopandas.GeoSeries([
    Polygon([(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]),
    Polygon([(2, 0), (4, 0), (4, 2), (2, 2)]),
    Polygon([(0, 2), (2, 2), (2, 4), (0, 4)]),
    Polygon([(2, 1), (4, 1), (4, 4), (2, 4)]),
])

Now we'll close the gap:

without_gaps = maup.close_gaps(geometries, relative_threshold=None)

The without_gaps geometries look like this:

[Figure: four polygons with the gap closed; two still overlap]

And then resolve the overlaps:

without_overlaps_or_gaps = maup.resolve_overlaps(without_gaps, relative_threshold=None)

The without_overlaps_or_gaps geometries look like this:

[Figure: four squares with no gaps or overlaps]

Alternatively, maup provides a convenience function, maup.autorepair(), that attempts to resolve topological issues as well as close gaps and resolve overlaps:

without_overlaps_or_gaps = maup.autorepair(geometries)

The functions resolve_overlaps, close_gaps, and autorepair accept a relative_threshold argument. This threshold controls how large of a gap or overlap the function will attempt to fix. The default value of relative_threshold is 0.1, which means that the functions will leave alone any gap/overlap whose area is more than 10% of the area of the geometries that might absorb that gap/overlap. In the above example, we set relative_threshold=None to ensure that no gaps or overlaps were ignored.
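For example (a sketch using the toy geometries from above):

# Keep the default threshold: gaps/overlaps larger than 10% of the
# geometries that would absorb them are left alone.
repaired = maup.autorepair(geometries)

# Attempt to fix every gap and overlap, regardless of size:
fully_repaired = maup.autorepair(geometries, relative_threshold=None)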

Modifiable areal unit problem

The name of this package comes from the modifiable areal unit problem (MAUP): the same spatial data will look different depending on how you divide up the space. Since maup is all about changing the way your data is aggregated and partitioned, we have named it after the MAUP to encourage users to use the toolkit thoughtfully and responsibly.

Comments
  • Resolve_overlaps complains about differing CRSes despite setting them manually (and other problems)


Thanks for the library! This seems to solve exactly the kind of issues I have. Unfortunately, I'm running into problems. Could you give some pointers on how to use resolve_overlaps?

I have a shapefile whose overlaps I am trying to resolve. When I call resolve_overlaps, it complains that the source and target geometries must have the same CRS. I did manually set the CRS of the GeoSeries to epsg:28992, however, and I don't know how to set it for the target.

Since the error mentions the target and source CRSes being None and EPSG:28992, I also tried manually setting the CRS to None. That resolves the previous issue, but the call now fails with "'NoneType' object has no attribute '_geom'".

    In summary:

    • Calling resolve_overlaps complains about differing src and target CRSes
    • Setting src crs to None 'fixes' that problem, but introduces a new one

    I included code and errors below.

    from maup import resolve_overlaps, close_gaps
    import geopandas as gpd
    
    gdf = gpd.read_file("/home/workworkwork/Downloads/for_simplification/segmentation_weerribben_largetest_vegetatietypen_redef_sm3_mf5.shp")
    
    print("Resolving self-intersections and removing empty polygons")
polygons = gpd.GeoSeries([pp.buffer(0) for pp in gdf.geometry])
    polygons = polygons[~polygons.is_empty]
    polygons.crs = {'init': 'epsg:28992'} # or set to None
    print("Resolving overlaps")
    polygons = resolve_overlaps(polygons, relative_threshold=None)
    
    # Crashes
    

    Wrong CRSes:

    Traceback (most recent call last):
      File "resolve.py", line 17, in <module>
        polygons = resolve_overlaps(polygons, relative_threshold=None)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 98, in resolve_overlaps
        overlaps, with_overlaps_removed, relative_threshold=None
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 11, in wrapped
        geoms1.crs, geoms2.crs
    TypeError: the source and target geometries must have the same CRS. None {'init': 'epsg:28992'}
    

    NoneType has no _geom:

    Traceback (most recent call last):
      File "resolve.py", line 17, in <module>
        polygons = resolve_overlaps(polygons, relative_threshold=None)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 98, in resolve_overlaps
        overlaps, with_overlaps_removed, relative_threshold=None
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
        return f(*args, **kwargs)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/repair.py", line 117, in absorb_by_shared_perimeter
        assignment = assign_to_max(intersections(sources, targets, area_cutoff=None).length)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
        return f(*args, **kwargs)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/intersections.py", line 33, in intersections
        reindexed_targets
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/intersections.py", line 31, in <listcomp>
        (sources.index[j], targets.index[i], geometry)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 45, in enumerate_intersections
        for j, intersection in self.intersections(target).items():
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 24, in intersections
        relevant_geometries = self.query(geometry)
      File "/home/workworkwork/.local/lib/python3.7/site-packages/maup/indexed_geometries.py", line 19, in query
        relevant_indices = [geom.index for geom in self.spatial_index.query(geometry)]
      File "/usr/lib64/python3.7/site-packages/shapely/strtree.py", line 60, in query
        lgeos.GEOSSTRtree_query(self._tree_handle, geom._geom, lgeos.GEOSQueryCallback(callback), None)
    AttributeError: 'NoneType' object has no attribute '_geom'
    
    opened by anieuwland 10
  • AttributeError: `Polygon` object has no attribute 'index'


    I'm getting the above Attribute Error when trying to aggregate VAP data from blocks up to precincts in Arizona. I think I remember running into this with a different shapefile and can't remember how it was fixed, so putting this issue up here. Shapefiles can be found here.

    Running the following code...

    import maup
    import geopandas as gpd
    
    blocks = gpd.read_file("AZ_blocks_VAP/")
    precincts = gpd.read_file("AZ_precincts_data/")
    
    variables = ["VAP", "AMINVAP", "AMIN*VAP"]
    
    assignment = maup.assign(blocks, precincts)
    precincts[variables] = blocks[variables].groupby(assignment).sum()
    

    gives this error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-11-ee818f0bf79d> in <module>
          1 variables = ["VAP", "AMINVAP", "AMIN*VAP"]
          2 
    ----> 3 assignment = maup.assign(blocks, precincts)
          4 precincts[variables] = blocks[variables].groupby(assignment).sum()
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/crs.py in wrapped(*args, **kwargs)
         12                 )
         13             )
    ---> 14         return f(*args, **kwargs)
         15 
         16     return wrapped
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/assign.py in assign(sources, targets)
         10     target that covers the most of its area.
         11     """
    ---> 12     assignment = assign_by_covering(sources, targets)
         13     unassigned = sources[assignment.isna()]
         14     assignments_by_area = assign_by_area(unassigned, targets)
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/assign.py in assign_by_covering(sources, targets)
         20 def assign_by_covering(sources, targets):
         21     indexed_sources = IndexedGeometries(sources)
    ---> 22     return indexed_sources.assign(targets)
         23 
         24 
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/indexed_geometries.py in assign(self, targets)
         40     def assign(self, targets):
         41         target_geometries = get_geometries(targets)
    ---> 42         groups = [
         43             self.covered_by(container).apply(lambda x: container_index)
         44             for container_index, container in progress(
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/indexed_geometries.py in <listcomp>(.0)
         41         target_geometries = get_geometries(targets)
         42         groups = [
    ---> 43             self.covered_by(container).apply(lambda x: container_index)
         44             for container_index, container in progress(
         45                 target_geometries.items(), len(target_geometries)
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/indexed_geometries.py in covered_by(self, container)
         29 
         30     def covered_by(self, container):
    ---> 31         relevant_geometries = self.query(container)
         32         prepared_container = prep(container)
         33 
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/indexed_geometries.py in query(self, geometry)
         19 
         20     def query(self, geometry):
    ---> 21         relevant_indices = [geom.index for geom in self.spatial_index.query(geometry)]
         22         relevant_geometries = self.geometries.loc[relevant_indices]
         23         return relevant_geometries
    
    ~/miniconda3/envs/maup/lib/python3.9/site-packages/maup/indexed_geometries.py in <listcomp>(.0)
         19 
         20     def query(self, geometry):
    ---> 21         relevant_indices = [geom.index for geom in self.spatial_index.query(geometry)]
         22         relevant_geometries = self.geometries.loc[relevant_indices]
         23         return relevant_geometries
    
    AttributeError: 'Polygon' object has no attribute 'index'
Attachment: AZ_precincts_data.zip (https://github.com/mggg/maup/files/6932307/AZ_precincts_data.zip)
    
    
    opened by gabeschoenbach 8
  • Dependency versions


    setup.py doesn't specify any versions for the dependencies. I'm running into a problem that seems to be related to Shapely==1.7.1, in particular with the spatial index (STRtree):

    AttributeError: 'MultiPolygon' object has no attribute 'index'
    

    and:

    AttributeError: 'Polygon' object has no attribute 'index'
    

    from:

    maup/indexed_geometries.py", line 20, in <listcomp>
        relevant_indices = [geom.index for geom in self.spatial_index.query(geometry)]
    

    What version of Shapely should I be using?

    Thanks

    opened by frnsys 7
  • maup / install new version


Hello, I'm trying to use maup.autorepair to fix problems in shapefiles for TX that I downloaded from mggg-states [TX_mggg.shp], but despite having reinstalled maup [using conda install -c conda-forge maup],

when I call maup.autorepair() I get the error AttributeError: module 'maup' has no attribute 'autorepair'.

Checking the conda environment, it says maup is version 0.7, which may not be up to date or include the latest features. How do I upgrade to the most recent version? On OS X.

My hope is then that maup.assign(blocks, TX_mggg), suitably fixed, won't end up with the dreaded 'cannot reindex from duplicate axis' error.

    Any clues welcome...

    opened by dinosg 3
  • Allow higher versions of Geopandas


The current pyproject.toml has the geopandas dependency set as geopandas = "^0.9.0", which does not allow any version 0.10.0 or newer. Using this older version raises FutureWarnings and prevents me from using some of geopandas' newer features, so it would be nice to be able to install the newer versions.

    opened by calebclimatecabinet 2
  • TypeError raised in `maup.assign` when no targets cover an entire source


    When calling maup.assign(sources, targets) where no sources are completely covered by a target, we get:

    TypeError: Input must be valid geometry objects: 0
    

    Reproducible example:

    import geopandas as gpd
    from shapely.geometry import Polygon
    from shapely.affinity import translate
    import maup
    
    # Make a simple grid of 4 1x1 blocks
    s1 = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
    s2 = Polygon([(1, 0), (1, 1), (2, 1), (2, 0)])
    s3 = Polygon([(0, 1), (0, 2), (1, 2), (1, 1)])
    s4 = Polygon([(1, 1), (1, 2), (2, 2), (2, 1)])
    sources = gpd.GeoSeries([s1, s2, s3, s4])
    
    # Make 4 matching targets that overlap
    targets = sources.apply(lambda x: translate(x, xoff=0.1))
    
    # Raises error
    maup.assign(sources, targets)
    

    I would expect that the above would return the Series: pd.Series([0, 1, 2, 3])

    opened by calebclimatecabinet 2
  • ValueError raised in maup.assign when a source geometry is fully covered by more than one target


This line: https://github.com/mggg/maup/blob/933eb92d75e0b5ff7796d2b3bd067542a1d7dabd/maup/indexed_geometries.py#L48 causes ValueError: cannot reindex from a duplicate axis to be raised when a source geometry is fully covered by more than one target, as it assumes that every source geometry is mapped to at most one target geometry. The solution is to remove overlaps. This is annoying to debug, as the error message is very vague.

    opened by InnovativeInventor 2
  • Example in README loses votes and contains non-explicit assumptions


    import geopandas as gpd
    import geopandas
    import maup
    
    blocks = geopandas.read_file("zip://./examples/blocks.zip")
    precincts = geopandas.read_file("zip://./examples/precincts.zip")
    districts = geopandas.read_file("zip://./examples/districts.zip")
    
    election_columns = ["PRES16D", "PRES16R"]
    
    assignment = maup.assign(blocks, precincts)
    weights = blocks.TOTPOP / assignment.map(precincts.TOTPOP)
    prorated = maup.prorate(assignment, precincts[election_columns], weights)
    blocks[election_columns] = prorated
    
    print(blocks[election_columns].sum())
    print(precincts[election_columns].sum())
    

Forcing precinct TOTPOP to equal block TOTPOP doesn't resolve the issue:

    import geopandas as gpd
    import geopandas
    import maup
    
    blocks = geopandas.read_file("zip://./examples/blocks.zip")
    precincts = geopandas.read_file("zip://./examples/precincts.zip")
    districts = geopandas.read_file("zip://./examples/districts.zip")
    precincts["TOTPOP"] *= blocks.TOTPOP.sum()/precincts.TOTPOP.sum()
    
    assert precincts["TOTPOP"].sum() == blocks["TOTPOP"].sum()
    election_columns = ["PRES16D", "PRES16R"]
    
    assignment = maup.assign(blocks, precincts)
    weights = blocks.TOTPOP / assignment.map(precincts.TOTPOP)
    prorated = maup.prorate(assignment, precincts[election_columns], weights)
    blocks[election_columns] = prorated
    
    print(blocks[election_columns].sum())
    print(precincts[election_columns].sum())
    
    bug 
    opened by InnovativeInventor 2
  • Fix IndexedGeometries for Shapely==1.7.1, see #29


    When using Shapely 1.7.1 IndexedGeometries.query fails because the assigned geom.index values don't persist.

    Shapely documentation suggests building your own index:

        To get the original indices of the returned objects, create an
        auxiliary dictionary. But use the geometry *ids* as keys since
        the shapely geometry objects themselves are not hashable.
    
        >>> index_by_id = dict((id(pt), i) for i, pt in enumerate(points))
        >>> [(index_by_id[id(pt)], pt.wkt) for pt in tree.query(Point(2,2).buffer(1.0))]
        [(1, 'POINT (1 1)'), (2, 'POINT (2 2)'), (3, 'POINT (3 3)')]
    

The problem with this particular approach is that using id is unreliable: multiple objects may end up with the same id over the lifecycle of a program.

    Instead I'm using a kind of ugly way to generate a hash for a given geometry.

    opened by frnsys 2
  • Shapely 1.8a (alpha) breaks test_intersections_correct_when_all_overlapping test


    Hopefully this gets fixed before Shapely releases 1.8. Steps to reproduce:

    pip uninstall shapely
    pip install shapely==1.8a1
    pytest
    

    This bug is on their master branch, so we should probably file a bug report to make them aware of the issue.

    opened by InnovativeInventor 2
  • Fix #15 and add real-world tests for close_gaps() and resolve_overlaps()


    I'm still waiting on the new tests to pass locally (the Utah shapefile takes forever!), but it works on MI and a few other shapefiles that had issues previously. All (previously written) tests pass! Essentially, line 95 in repair.py was the culprit.

    Also -- as a note, the setup for the Travis CI continuous testing no longer works, so you have to run tests locally for now.

    opened by InnovativeInventor 2
  • Bump certifi from 2021.5.30 to 2022.12.7


    Bumps certifi from 2021.5.30 to 2022.12.7.


    dependencies 
    opened by dependabot[bot] 0
  • Make the weights in the README have explicit assumptions


    Fixes #34, finally . . .

    Now, the assumption that the precinct.TOTPOP and the blocks.TOTPOP are consistent with the assignment that maup generates is explicit in the README, rather than implicit.

    opened by InnovativeInventor 0
  • Port maup to Shapely 2.0


    Shapely 2.0 will introduce breaking changes. This PR addresses those changes and is similar to the changes made in https://github.com/mggg/GerryChain/pull/405.

    opened by InnovativeInventor 0
  • ValueError: cannot reindex from a duplicate axis


maup.assign just crashes after spending a while working through the assignments. Example:

In [10]: assign1 = maup.assign(blocks20, vtds10)
100%|██████████| 8941/8941 [11:36<00:00, 12.85it/s]
Traceback (most recent call last):
  File "", line 1, in <module>
    assign1 = maup.assign(blocks20, vtds10)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/crs.py", line 14, in wrapped
    return f(*args, **kwargs)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/assign.py", line 12, in assign
    assignment = assign_by_covering(sources, targets)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/assign.py", line 22, in assign_by_covering
    return indexed_sources.assign(targets)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/maup/indexed_geometries.py", line 42, in assign
    assignment = pandas.concat(groups).reindex(self.index)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/series.py", line 4579, in reindex
    return super().reindex(index=index, **kwargs)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 4810, in reindex
    axes, level, limit, tolerance, method, fill_value, copy
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 4834, in _reindex_axes
    allow_dups=False,
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 4880, in _reindex_with_indexers
    copy=copy,
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 663, in reindex_indexer
    self.axes[axis]._validate_can_reindex(indexer)
  File "/Users/dpg/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3785, in _validate_can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis

    opened by dinosg 9
  • Higher level `maup` API functions


    We should expose higher-level API functions to maup from small to big, big to small, and same granularity to same granularity in maup to prevent user error.

    opened by InnovativeInventor 0
Releases(v1.0.8)
  • v1.0.8(Jun 12, 2022)

    What's Changed

    • Bump numpy from 1.20.3 to 1.21.0 by @dependabot in https://github.com/mggg/maup/pull/68
    • Bump numpy from 1.20.3 to 1.21.0 in /docs by @dependabot in https://github.com/mggg/maup/pull/67
    • Allow higher versions of Geopandas by @calebclimatecabinet in https://github.com/mggg/maup/pull/70

    New Contributors

    • @dependabot made their first contribution in https://github.com/mggg/maup/pull/68
    • @calebclimatecabinet made their first contribution in https://github.com/mggg/maup/pull/70

    Full Changelog: https://github.com/mggg/maup/compare/v1.0.7...v1.0.8

  • v1.0.7(May 16, 2022)

    This release contains mostly bug fixes, etc. Note that only the PyPI package has been updated; conda-forge has not.

    What's Changed

    • Add expand_to function by @InnovativeInventor in https://github.com/mggg/maup/pull/46
    • Fix maup.doctor typo with target_union by @InnovativeInventor in https://github.com/mggg/maup/pull/45
    • Fix AttributeError when relevant_geometries is empty by @InnovativeInventor in https://github.com/mggg/maup/pull/44
    • Fix TypeError when nothing is assigned by covering by @InnovativeInventor in https://github.com/mggg/maup/pull/48

    Full Changelog: https://github.com/mggg/maup/compare/v1.0...v1.0.7

  • v1.0(Jun 17, 2021)

Owner
Metric Geometry and Gerrymandering Group
A nonpartisan research organization studying applications of geometry and computing to U.S. redistricting. See also @mggg-states for data and @gerrymandr.