We have moved many of our geographic analyses to QGIS from ESRI’s ArcGIS but we have yet to find an adequate replacement for ESRI’s Network Analyst extension. Network Analyst, itself, can be clunky and tricky to use (not to mention expensive!), but it has enabled us to compute network buffers and minimum network distance fairly efficiently.
One alternative with a lot of potential is the pgRouting extension to the PostGIS/PostgreSQL database that provides routing functionality. We have a lot of routing to do as part of an upcoming project so we thought this would be the perfect time to evaluate pgRouting. Below we share the steps involved in set-up and simple routing between two points.
Our goal was to evaluate the potential of using pgRouting in place of Network Analyst on a specific project. The evaluation was done quickly and is not meant as a side-by-side comparison between the two technologies.
pgRouting Set-up
1. Install the pgRouting functionality
We’re running PostgreSQL 9.3 with PostGIS 2.1. Luckily, PostGIS 2.1 actually comes with pgRouting out of the box so there was no need to install anything additional.
2. Add pgRouting functions to database
Following the instructions in the documentation you create a new database (ours is called routing
) and then add the PostGIS functions and add the core pgRouting functions:
-- add PostGIS functions
CREATE EXTENSION postgis;
-- add pgRouting core functions
CREATE EXTENSION pgrouting;
3. Get and format your network data
There are a number of ways to do this. We followed the instructions provided by Boston GIS and first installed osm2po (version 4.8.8). We decided to follow along further with the Boston GIS example and use the Boston data but amazingly this website by Michal Migurski provides pre-extracted OpenStreetMap data for major world cities. In order to prepare the OSM data for routing, you can follow Boston GIS’s instructions, but basically we edited the demo.bat
file pointing to the Boston osm.pbf data and then right clicked and chose ‘Run as Administrator’. Here is the code (directly from Boston GIS):
java -Xmx512m -jar osm2po-core-4.7.7-signed.jar prefix=hh tileSize=x http://osm-extracted-metros.s3.amazonaws.com/boston.osm.pbf
4. Load network data to PostgreSQL
Once you’ve completed the step above you can load the data into PostgreSQL. You will again need to create a shell script, this time one that runs an SQL file that was created when you ran the Java program in the last step. Again, copying code directly from Boston GIS the shell script would include the following code:
set PSQL="C:\Program Files\PostgreSQL\9.2\bin\psql"
set PGPORT=5432
set PGHOST=localhost
set PGPASSWORD=something
cd hh
%PSQL% -U postgres -d routing -q -f "hh_2po_4pgr.sql"
pause
And in the image below you can see that we’ve created a new table called hh_2po_4pgr
which is the network dataset.
5. Do some routing
Here is where some of the code diverges from the Boston GIS site. The functions they use (e.g., astar_sp_delta
) have apparently been deprecated. As as result, I leaned on Anita Graser’s useful instructions. And it is in this section where my ignorance about pgRouting will become clear.
To use the shortest path function (using the Dijkstra algorithm) the pgr_dijkstra
function requires an SQL query, a start point, an end point and then, two booleans that I have not looked into yet. The starting points refer to nodes in the network and their IDs are the source and target variables in the table we created. Here is what the table looks like:
And here is the code we use:
SELECT seq, id1 AS node, id2 AS edge, cost
FROM pgr_dijkstra(
'SELECT id, source, target, st_length(geom_way) as cost FROM hh_2po_4pgr',
100, 200, false, false
)
This gets us the cost for traversing each link in the network from source to target (to get the total you would sum the cost).
Normally, of course, you do not want to have to choose a source and target road segment, you’d prefer to use a lat/long or an address. I could not find a great way to do this, unfortunately. Instead, I identified the nearest source node to my starting location and then the nearest target node from my target latitude (borrowing code from here).
This code looks like:
select source from hh_2po_4pgr order by st_distance(geom_way,
st_setsrid(st_makepoint(-71.120328, 42.327462), 4326)) limit 1;
select target from hh_2po_4pgr order by st_distance(geom_way,
st_setsrid(st_makepoint(-71.060934, 42.358421), 4326)) limit 1;
This code returns the nodes 33649, 8759 which I can then plug into the function above. The final touch is that I’d like to link back to the network file so that I can put the route on a map so the full code looks like:
create table myroute as SELECT seq, id1 AS node, id2 AS edge, di.cost, geom_way
FROM pgr_dijkstra(
'SELECT id, source, target, st_length(geom_way) as cost FROM hh_2po_4pgr',
33649, 8759, false, false
) as di
JOIN hh_2po_4pgr
ON di.id2 = hh_2po_4pgr.id
The total cost is 0.071487 degrees (we are working with an unprojected network). Given that one degree is approximately 69 miles we get a total cost of 4.9 miles.
Here is the same route in Google:
And then using pgRouting:
6. Conclusion
This small experiment with pgRouting does not make me qualified to summarize the pros and cons. The software seems to be a promising option for open source routing but my personal experience with the process is that it is challenging to work with, there is a lot of room for improvement and may not yet permit me to drop using Network Analyst.
Hello,
Thanks allot for writing a great article.
I am stuck at an issue and have spent allot of time but couldn’t get out of it. I have been carefully following each and every step that you have written in the article. After the execution of step 3, I do not see “hh_2po_4pgr.sql” being created in “hh” folder. Because of it I am just stuck and can’t go any further to perform routing queries etc.
I have also tried several versions of osm2po, such as, 4.7.7, 4.8.8 and 5.1.0. I have also downloaded the file from geofabrik and wrote the hard-coded path (of my machine) to .osm.pbf file. The .bat file processes the data by showing node being processed, ways, etc., but at the end I do not see “hh_2po_4pgr.sql” in the hh folder. However, it creates “hh_2po.txt” (the log file) and few other “2po” format files. But it doesn’t write .sql file in hh directory.
What can be the possible reason? How should I proceed to make it work?
Thank you very much for your time and support.
Kind regards,
Amir
Unfortunately, I’m not going to be able to help you on this one. The post is two-years old and in the end I decided that PostGIS routing wasn’t going to work out for me (yet!) so I haven’t been using it. I’m guessing that several things have changed since I wrote this post! Perhaps put a question up on stackoverflow?
Hi. It worked for me. I carefully read the log file and found that there is “java.lang.OutOfMemoryError: Java heap space” error. Therefore, I just had to increase the Java heap size by changing the parameter value to -Xmx4096m in the .bat file. It worked like a charm! 🙂 Hope it helps others.