How can I look up a polygon that contains a point fast, given all polygons are a partition?
up vote
0
down vote
favorite
I have a function
get_polygon(polygon_collection, point):
for polygon in polygon_collection:
if polygon.intersects(point):
return polygon
return None
This approach works, but it is in O(n) * O(single polygon check). This can for sure be reduced to O(log(n)) * O(single polygon check) if one builds a tree data structure.
Does shapely support that directly?
Structure of polyogon collection
- The collection contains n MultiPolygons - hence one object can consist of many polygons.
- Each MultiPolygon can have enclaves / exclaves
- Typically (> 95%), they are simple polygons without holes
- Typically (> 50%), the shape is rather simple (e.g. squares)
Use case example
The list of polygons could be postal code areas in Germany. That would be several thousands. Then I have GPS positions of me and some friends, that would be several thousands as well. I want to say in which area where we got most data points.
python shapely
add a comment |
up vote
0
down vote
favorite
I have a function
get_polygon(polygon_collection, point):
for polygon in polygon_collection:
if polygon.intersects(point):
return polygon
return None
This approach works, but it is in O(n) * O(single polygon check). This can for sure be reduced to O(log(n)) * O(single polygon check) if one builds a tree data structure.
Does shapely support that directly?
Structure of polyogon collection
- The collection contains n MultiPolygons - hence one object can consist of many polygons.
- Each MultiPolygon can have enclaves / exclaves
- Typically (> 95%), they are simple polygons without holes
- Typically (> 50%), the shape is rather simple (e.g. squares)
Use case example
The list of polygons could be postal code areas in Germany. That would be several thousands. Then I have GPS positions of me and some friends, that would be several thousands as well. I want to say in which area where we got most data points.
python shapely
I don't think you're tackling the problem correctly (for your use case). Postal codes in Germany have a specific order, you should never check all postal codes there. Start with checking out the first digit so you can rule out ~9/10 already and never have to check those. Then narrow it down from there. Look at the postal codes as a tree. You would never iterate through all leaves if you can exclude a whole branch.
– DonQuiKong
yesterday
For general polygons, pre-compute a sorted list of bounding boxes for the gps-achses and then figure out which few polygons the gps coordinate can be in for example. If your x-coordinate (or whatever the gps equivalent is) is 5 and your polygons are distributed from 0 to 100, only check those where the bounding box starts left from 5 and ends right from 5. That's iterating partly through 2 sorted lists. Same for y.
– DonQuiKong
yesterday
@DonQuiKong The use case I described is way simpler than my actual use case. I already suggested that building a tree data structure is what I would like to do (in fact, I already wrote custom software for it). But I wonder if shapely has built-in support for building that tree and using it.
– Martin Thoma
yesterday
Oh, that's what you mean. Yeah, never mind.
– DonQuiKong
yesterday
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I have a function
get_polygon(polygon_collection, point):
for polygon in polygon_collection:
if polygon.intersects(point):
return polygon
return None
This approach works, but it is in O(n) * O(single polygon check). This can for sure be reduced to O(log(n)) * O(single polygon check) if one builds a tree data structure.
Does shapely support that directly?
Structure of polyogon collection
- The collection contains n MultiPolygons - hence one object can consist of many polygons.
- Each MultiPolygon can have enclaves / exclaves
- Typically (> 95%), they are simple polygons without holes
- Typically (> 50%), the shape is rather simple (e.g. squares)
Use case example
The list of polygons could be postal code areas in Germany. That would be several thousands. Then I have GPS positions of me and some friends, that would be several thousands as well. I want to say in which area where we got most data points.
python shapely
I have a function
get_polygon(polygon_collection, point):
for polygon in polygon_collection:
if polygon.intersects(point):
return polygon
return None
This approach works, but it is in O(n) * O(single polygon check). This can for sure be reduced to O(log(n)) * O(single polygon check) if one builds a tree data structure.
Does shapely support that directly?
Structure of polyogon collection
- The collection contains n MultiPolygons - hence one object can consist of many polygons.
- Each MultiPolygon can have enclaves / exclaves
- Typically (> 95%), they are simple polygons without holes
- Typically (> 50%), the shape is rather simple (e.g. squares)
Use case example
The list of polygons could be postal code areas in Germany. That would be several thousands. Then I have GPS positions of me and some friends, that would be several thousands as well. I want to say in which area where we got most data points.
python shapely
python shapely
edited yesterday
asked yesterday
Martin Thoma
38.6k50278493
38.6k50278493
I don't think you're tackling the problem correctly (for your use case). Postal codes in Germany have a specific order, you should never check all postal codes there. Start with checking out the first digit so you can rule out ~9/10 already and never have to check those. Then narrow it down from there. Look at the postal codes as a tree. You would never iterate through all leaves if you can exclude a whole branch.
– DonQuiKong
yesterday
For general polygons, pre-compute a sorted list of bounding boxes for the gps-achses and then figure out which few polygons the gps coordinate can be in for example. If your x-coordinate (or whatever the gps equivalent is) is 5 and your polygons are distributed from 0 to 100, only check those where the bounding box starts left from 5 and ends right from 5. That's iterating partly through 2 sorted lists. Same for y.
– DonQuiKong
yesterday
@DonQuiKong The use case I described is way simpler than my actual use case. I already suggested that building a tree data structure is what I would like to do (in fact, I already wrote custom software for it). But I wonder if shapely has built-in support for building that tree and using it.
– Martin Thoma
yesterday
Oh, that's what you mean. Yeah, never mind.
– DonQuiKong
yesterday
add a comment |
I don't think you're tackling the problem correctly (for your use case). Postal codes in Germany have a specific order, you should never check all postal codes there. Start with checking out the first digit so you can rule out ~9/10 already and never have to check those. Then narrow it down from there. Look at the postal codes as a tree. You would never iterate through all leaves if you can exclude a whole branch.
– DonQuiKong
yesterday
For general polygons, pre-compute a sorted list of bounding boxes for the gps-achses and then figure out which few polygons the gps coordinate can be in for example. If your x-coordinate (or whatever the gps equivalent is) is 5 and your polygons are distributed from 0 to 100, only check those where the bounding box starts left from 5 and ends right from 5. That's iterating partly through 2 sorted lists. Same for y.
– DonQuiKong
yesterday
@DonQuiKong The use case I described is way simpler than my actual use case. I already suggested that building a tree data structure is what I would like to do (in fact, I already wrote custom software for it). But I wonder if shapely has built-in support for building that tree and using it.
– Martin Thoma
yesterday
Oh, that's what you mean. Yeah, never mind.
– DonQuiKong
yesterday
I don't think you're tackling the problem correctly (for your use case). Postal codes in Germany have a specific order, you should never check all postal codes there. Start with checking out the first digit so you can rule out ~9/10 already and never have to check those. Then narrow it down from there. Look at the postal codes as a tree. You would never iterate through all leaves if you can exclude a whole branch.
– DonQuiKong
yesterday
I don't think you're tackling the problem correctly (for your use case). Postal codes in Germany have a specific order, you should never check all postal codes there. Start with checking out the first digit so you can rule out ~9/10 already and never have to check those. Then narrow it down from there. Look at the postal codes as a tree. You would never iterate through all leaves if you can exclude a whole branch.
– DonQuiKong
yesterday
For general polygons, pre-compute a sorted list of bounding boxes for the gps-achses and then figure out which few polygons the gps coordinate can be in for example. If your x-coordinate (or whatever the gps equivalent is) is 5 and your polygons are distributed from 0 to 100, only check those where the bounding box starts left from 5 and ends right from 5. That's iterating partly through 2 sorted lists. Same for y.
– DonQuiKong
yesterday
For general polygons, pre-compute a sorted list of bounding boxes for the gps-achses and then figure out which few polygons the gps coordinate can be in for example. If your x-coordinate (or whatever the gps equivalent is) is 5 and your polygons are distributed from 0 to 100, only check those where the bounding box starts left from 5 and ends right from 5. That's iterating partly through 2 sorted lists. Same for y.
– DonQuiKong
yesterday
@DonQuiKong The use case I described is way simpler than my actual use case. I already suggested that building a tree data structure is what I would like to do (in fact, I already wrote custom software for it). But I wonder if shapely has built-in support for building that tree and using it.
– Martin Thoma
yesterday
@DonQuiKong The use case I described is way simpler than my actual use case. I already suggested that building a tree data structure is what I would like to do (in fact, I already wrote custom software for it). But I wonder if shapely has built-in support for building that tree and using it.
– Martin Thoma
yesterday
Oh, that's what you mean. Yeah, never mind.
– DonQuiKong
yesterday
Oh, that's what you mean. Yeah, never mind.
– DonQuiKong
yesterday
add a comment |
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53224490%2fhow-can-i-look-up-a-polygon-that-contains-a-point-fast-given-all-polygons-are-a%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
I don't think you're tackling the problem correctly (for your use case). Postal codes in Germany have a specific order, you should never check all postal codes there. Start with checking out the first digit so you can rule out ~9/10 already and never have to check those. Then narrow it down from there. Look at the postal codes as a tree. You would never iterate through all leaves if you can exclude a whole branch.
– DonQuiKong
yesterday
For general polygons, pre-compute a sorted list of bounding boxes for the gps-achses and then figure out which few polygons the gps coordinate can be in for example. If your x-coordinate (or whatever the gps equivalent is) is 5 and your polygons are distributed from 0 to 100, only check those where the bounding box starts left from 5 and ends right from 5. That's iterating partly through 2 sorted lists. Same for y.
– DonQuiKong
yesterday
@DonQuiKong The use case I described is way simpler than my actual use case. I already suggested that building a tree data structure is what I would like to do (in fact, I already wrote custom software for it). But I wonder if shapely has built-in support for building that tree and using it.
– Martin Thoma
yesterday
Oh, that's what you mean. Yeah, never mind.
– DonQuiKong
yesterday