Services that allow users to identify nearby points of interest continue to grow in popularity. I'm sure we're all familiar with social websites that let you search for the profiles of people near a postal code, or mobile applications that use geolocation to identify Thai restaurants within walking distance. It's surprisingly simple to implement such functionality, and in this post I will discuss how to do so.
The first step is to obtain the latitude and longitude coordinates of any locations you want to make searchable. In the restaurant scenario, you'd want the latitude and longitude of each eatery. In the social website scenario, you'd want to obtain a list of postal codes with their centroid latitude and longitude.
In general, postal code-based geolocation is a bad idea; their boundaries rarely form simple polygons, the area they cover vary in size, and are subject to change based on the whims of the postal service. But many times we find ourselves stuck on a client project that calls for things we know better than to do, so I'll spare you my rant and instead ask that you keep the following in mind:
The second step is to develop an appropriate strategy to search your data. You don't want to calculate the distance to all target locations' coordinates from a given origin; this naive, brute-force search is rarely desirable, even if the dataset is small and your application/site traffic is low. My approach is to calculate the latitude and longitude coordinates for a search perimeter's north, east, south, and west bounds (bearings of 0°, 90°, 180°, and -90° or 270° respectively). I can then search the subset of records that have locations which falls within this bounding-box.
I use the following formulas to calculate target coordinates and restrict my dataset search:
where:
Implemented in code, the formulas look like this:
which a simple wrapper function can then call repeatedly with different bearings to determine the search bounds:
Records that are located within this search area still need to be filtered, and I use the following formula to calculate the distance between two coordinates:
Implemented in code, the formula look like this:
A search using these functions would look like this (assume $latitude and $longitude are given and provide the origin coordinates, and $distance and $units provide the desired search radius):
A deep understanding of the mathematics used in navigation is nice to have, but is not truly necessary. We're lucky to live in an age where finding suitable formulas is just a quick Internet search away, and discussions on the trade-offs between their simplicity and accuracy are easy to find. The formulas I used are based on the Haversine Formula which have been accurate enough for my needs. Others such as Vincenty's Formulae take into account the ellipsoidal nature of the Earth to produce more accurate results, but I'm helping people locate near-by friends or businesses... not programming cruise missiles.
The final step is obtaining origin coordinates. In the postal code example from the beginning you can look up the code's centroid latitude and longitude and use those as your search origin. If your target audience is receptive, you may want to consider taking advantage of the JavaScript Geolocation API defined by the W3C. The API exposes the navigator.geolocation object which can obtain the user's current location. A user will rarely know his or her exact coordinates, but many Internet-enabled devices and smartphones now ship with GPS capabilities which the browser can access. navigator.geolocation.getCurrentPosition() causes the brower to request permission from the user to report his or her location, and if permission is granted will invoke a specified callback function.
Hopefully you now seen how relatively easy it is to support searching nearby points of interest in your web-application. First and foremost it is important to maintain a good dataset of target coordinates, since though users will understand the search results to be approximations, you can never return good results with bad data. Implementing algorithms necessary to search the set efficiently is relatively straightforward and I have shown you the approach I use; other algorithms and formulas can be found and implemented if you need more accurate results. And modern APIs like navigator.geolocation make it possible for you to offer new ways to your users to interact with your data. Feel free to share your thoughts and experiences in the comments section below.
Special thanks to Matthew Turland and Davey Shafik who both helped me out immensely when I first got my feet wet with geolocation.
Esperanto translation is available at zaemis.tumblr.com/post/38447430414/geo-loka-sercxado
The first step is to obtain the latitude and longitude coordinates of any locations you want to make searchable. In the restaurant scenario, you'd want the latitude and longitude of each eatery. In the social website scenario, you'd want to obtain a list of postal codes with their centroid latitude and longitude.
In general, postal code-based geolocation is a bad idea; their boundaries rarely form simple polygons, the area they cover vary in size, and are subject to change based on the whims of the postal service. But many times we find ourselves stuck on a client project that calls for things we know better than to do, so I'll spare you my rant and instead ask that you keep the following in mind:
- ZCTAs are not ZIP codes so do not use them as such. Some people recommend using the US Census Bureau's 2000 ZCTA dataset as freely available ZIP code latitude/longitude information for the United States, but the Bureau's own website warns ZCTA codes are designed to label census blocks and do not always correspond to ZIP codes assigned to the same area by the US Postal Service. Unpopulated areas such as lakes and forests may have their own ZCTA code, while smaller ZIP codes may not even "appear in the ZCTA universe."
- Postal codes are volatile so make sure you understand your dataset provider's update policies. CD Light, LLC offers a line of commercial datasets which they license from the US Postal Service and Canada Post Corporation and then augment and maintain using a variety of additional sources. New codes are added and obsolete codes are removed from their datasets. Quarterly and monthly updates are made available through reasonably priced subscription plans. federalgovernmentzipcodes.us on the other hand offers a free US ZIP code dataset that is irregularly-maintained using a smaller set of additional sources. Obsolete ZIP codes are included but marked "decommissioned." It's not clear however which company or individual is behind the offering.
The second step is to develop an appropriate strategy to search your data. You don't want to calculate the distance to all target locations' coordinates from a given origin; this naive, brute-force search is rarely desirable, even if the dataset is small and your application/site traffic is low. My approach is to calculate the latitude and longitude coordinates for a search perimeter's north, east, south, and west bounds (bearings of 0°, 90°, 180°, and -90° or 270° respectively). I can then search the subset of records that have locations which falls within this bounding-box.
I use the following formulas to calculate target coordinates and restrict my dataset search:
where:
- d is distance
- R is the Earth's radius (WGS84 model semi-major axis is 3,963.19 miles)
- ᵩ is latitude
- λ is longitude
- θ is bearing
Implemented in code, the formulas look like this:
// calculate destination lat/lon given a starting point, bearing, and distance function destination($lat,$lon, $bearing, $distance,$units="mi") { $radius = strcasecmp($units, "km") ? 3963.19 : 6378.137; $rLat = deg2rad($lat); $rLon = deg2rad($lon); $rBearing = deg2rad($bearing); $rAngDist = $distance / $radius; $rLatB = asin(sin($rLat) * cos($rAngDist) + cos($rLat) * sin($rAngDist) * cos($rBearing)); $rLonB = $rLon + atan2(sin($rBearing) * sin($rAngDist) * cos($rLat), cos($rAngDist) - sin($rLat) * sin($rLatB)); return array("lat" => rad2deg($rLatB), "lon" => rad2deg($rLonB)); }
which a simple wrapper function can then call repeatedly with different bearings to determine the search bounds:
// calculate bounding box function bound($lat,$lon, $distance,$units="mi") { return array("N" => destination($lat,$lon, 0, $distance,$units), "E" => destination($lat,$lon, 90, $distance,$units), "S" => destination($lat,$lon, 180, $distance,$units), "W" => destination($lat,$lon, 270, $distance,$units)); }
Records that are located within this search area still need to be filtered, and I use the following formula to calculate the distance between two coordinates:
Implemented in code, the formula look like this:
// calculate distance between two lat/lon coordinates function distance($latA,$lonA, $latB,$lonB, $units="mi") { $radius = strcasecmp($units, "km") ? 3963.19 : 6378.137; $rLatA = deg2rad($latA); $rLatB = deg2rad($latB); $rHalfDeltaLat = deg2rad(($latB - $latA) / 2); $rHalfDeltaLon = deg2rad(($lonB - $lonA) / 2); return 2 * $radius * asin(sqrt(pow(sin($rHalfDeltaLat), 2) + cos($rLatA) * cos($rLatB) * pow(sin($rHalfDeltaLon), 2))); }
A search using these functions would look like this (assume $latitude and $longitude are given and provide the origin coordinates, and $distance and $units provide the desired search radius):
$b = bound($latitude,$longitude, $distance,$units); $query = "SELECT location_name, address, city, state, zip_code, latitude, longitude FROM locations WHERE latitude BETWEEN {$b["S"]["lat"]} AND {$b["N"]["lat"]} AND longitude BETWEEN {$b["W"]["lon"]} AND {$b["E"]["lon"]}"; $result = $db->query($query); $locations = array(); while ($row = $result->fetch(PDO::FETCH_ASSOC)) { $dist = distance($latitude,$lonitude, $row["latitude"],$row["longitude"], $units); if ($dist <= $distance) { $locations[] = array("name" => $row["location_name"], "address" => $row["address"], "city" => $row["city"], "state" => $row["state"], "zip_code" => $row["zip_code"], "distance" => round($dist, 2)); } }
A deep understanding of the mathematics used in navigation is nice to have, but is not truly necessary. We're lucky to live in an age where finding suitable formulas is just a quick Internet search away, and discussions on the trade-offs between their simplicity and accuracy are easy to find. The formulas I used are based on the Haversine Formula which have been accurate enough for my needs. Others such as Vincenty's Formulae take into account the ellipsoidal nature of the Earth to produce more accurate results, but I'm helping people locate near-by friends or businesses... not programming cruise missiles.
The final step is obtaining origin coordinates. In the postal code example from the beginning you can look up the code's centroid latitude and longitude and use those as your search origin. If your target audience is receptive, you may want to consider taking advantage of the JavaScript Geolocation API defined by the W3C. The API exposes the navigator.geolocation object which can obtain the user's current location. A user will rarely know his or her exact coordinates, but many Internet-enabled devices and smartphones now ship with GPS capabilities which the browser can access. navigator.geolocation.getCurrentPosition() causes the brower to request permission from the user to report his or her location, and if permission is granted will invoke a specified callback function.
<form> Search <input id="txtDist" maxlength="3" name="distance" size="3" type="text"> <select name="units"> <option value="mi">mi</option> <option value="km">km</option> </select> from<br> <input checked="checked" name="searchBy" type="radio" value="zip">ZIP code <input maxlength="5" name="zipCode" size="5" type="text"><br> <input id="coords" name="searchBy" type="radio" value="coords">coordinates <input id="txtLat" maxlength="10" name="latitude" size="6" type="text"> Lat. <input id="txtLon" maxlength="10" name="longitude" size="6" type="text"> Lon.<br> <input type="submit" value="Search"> </form> <script type="text/javascript"> (function () { function setCoords(pos) { document.getElementById("coords").checked = true; document.getElementById("txtLat").value = pos.coordinates.latitude; document.getElementById("txtLon").value = pos.coordinates.longitude; } if (navigator.geolocation) { navigator.geolocation.getCurrentPosition(setCoords); } })(); </script>
Hopefully you now seen how relatively easy it is to support searching nearby points of interest in your web-application. First and foremost it is important to maintain a good dataset of target coordinates, since though users will understand the search results to be approximations, you can never return good results with bad data. Implementing algorithms necessary to search the set efficiently is relatively straightforward and I have shown you the approach I use; other algorithms and formulas can be found and implemented if you need more accurate results. And modern APIs like navigator.geolocation make it possible for you to offer new ways to your users to interact with your data. Feel free to share your thoughts and experiences in the comments section below.
Special thanks to Matthew Turland and Davey Shafik who both helped me out immensely when I first got my feet wet with geolocation.
Esperanto translation is available at zaemis.tumblr.com/post/38447430414/geo-loka-sercxado
Genius!
ReplyDeleteYour strategy for distance searching is exactly what I had in mind. Thank you for providing all the formulas and explanation.
ReplyDeleteThis is awesome...
ReplyDeleteHuge thanks for creating an easy to use set of functions!!
ReplyDeleteThanks.. just awesome
ReplyDeleteThanks, awasome, keep it up ,mind blowing
ReplyDeleteExcellent...!!!
ReplyDeleteIs the Distance in M or KM
ReplyDelete