MSQL Spatial Exploration

| 2 Comments

Today I decided to do some playing with MYSQL's Spatial Extentions.

I started by creating a simple table of "Points of Interest"

CREATE TABLE `poi_spatial` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
`lon` varchar(20) default NULL,
`lat` varchar(20) default NULL,
`loc` point NOT NULL,
PRIMARY KEY (`id`),
KEY `name` (`name`),
SPATIAL KEY `loc` (`loc`)
) ENGINE=MyISAM

The only really interesting column in this table is the loc column. It is declared as type "point".

To play with this, I then inserted some rows in the table. To make that easier, I created a function "ToPoint" that would accept the Latitude and Longtitude and return a POINT:
DELIMITER $$

DROP FUNCTION IF EXISTS `ToPoint`$$

CREATE FUNCTION `ToPoint` (lat VARCHAR(11), lon VARCHAR(11) )
RETURNS POINT DETERMINISTIC
BEGIN
DECLARE result POINT;
DECLARE loc VARCHAR(50);
SELECT CONCAT('POINT( ', lon , ' ' , lat , ')' )INTO loc;
SELECT GeomFromText(loc) INTO result;
RETURN RESULT;
END$$

So now I can insert with a simple call:

INSERT INTO poi_spatial ( name, lat, lon, loc ) VALUES ( "name", "lat", "lon", ToPoint( "lat", "lon" ))

To get some good data, I downloaded some GNIS data from the USGS. I read the data with a simple perl script and added data to my table.

Now, I wanted to find the entry in the poi_spatial table that I was closest to. The version of MYSQL that I'm running does not have a built-in DISTANCE() function, so I (with the help of the internet) wrote one:
DELIMITER $$

DROP FUNCTION IF EXISTS `Distance`$$

CREATE FUNCTION `Distance` (a POINT, b POINT)
RETURNS DOUBLE DETERMINISTIC
BEGIN
DECLARE dist DOUBLE;
SELECT round(glength(linestringfromwkb(linestring(asbinary(a),asbinary(b)))) * 100, 1) INTO dist;
RETURN dist;
END$$

Now, I simply query the database with something like this:
SELECT name, distance( loc, ToPoint( "myLat", "myLon" )) as distance FROM poi_spatial ORDER BY distance.

This is not the most optimal solution, but it seems to work pretty well.

More links for reference:
MySQL Reference Manual
Doing the same thing without using MySQL spatial extensions
Some help from the MySQL Forums
A really helpful presentation

2 Comments

The query shown above still needs to scan through all of the rows in the table, and calculate the distance to all of the positions in the table, and then sort it to find the closest one. It's pretty slow, especially if you load in global data from GeoNames or such.

You can speed it up a lot by specifying a smaller box around your position and looking up points within the box, and then calculating the distances. This can be done using an additional condition in the WHERE part of the query:

FROM poi_spatial
WHERE CONTAINS(GeomFromText('Polygon((...))'), loc)
ORDER BY distance LIMIT 10

Happy SQL hacking...

Thanks for the pointer Hessu! Adding the bounding box does significantly improve the performance. I'm using myLat/myLon +/- 0.2 to create the bounding box. Seems to provide plenty of miles and I should find a city within that most of the time.

Leave a comment