Calculate the Distance Between Two Latitude/Longitude Points using Plain SQL

13 06 2011

June 13, 2011 (Modified June 14, 2011)

A question recently appeared on the comp.databases.oracle.server Usenet group that oddly made me recall a lesson from a mathematics class that I taught in the early 1990s.  A bit strange how a question related to Oracle Database would trigger such a memory, but it happened.  The question posed in the thread initially asked how to calculate the distance between the points:

 item  latitude  longitude
---- ---------  ---------
P1   50.716667  -1.883333
P2   51.023332  -1.872231

Let’s see, the earth is flat, so…

= ((51.023332 - 50.716667)^2 + (1.883333 - 1.872231)^2)^(1/2) * 3960 * 3.141592/180 miles
= (0.094043422225 + 0.000123254404)^(1/2) * 3960 * 3.141592/180 miles
= 0.30686589355775594241245361976998 * 3960 * 3.141592/180 miles
= 1215.18894 * 3.141592/180 miles
= 21.209 miles

So, if the earth is flat, the Pythagorean Theorem seems to give a sensible answer… maybe.  My recollection of the formulas used for the correct calculation is a bit fuzzy.  Good news, I found the three page lesson plan that I put together in the early 1990s to solve a problem similar to the one posed by the OP in the Usenet thread.  The lesson plan begins by stating that part of the point of the lesson was to destroy Euclid’s Elements, which were introduced 23 centuries ago… a bit ambitious, looking back.  Even though the lesson plan is not too terribly clear given the amount of mathematics knowledge that I have forgotten, I managed to find the following formula in the lesson plan:

c0 = cos^(-1)(cos(90-alpha1)*cos(90-alpha2) + sin(90-alpha1)*sin(90-alpha2)*cos(theta1-theta2))
distance = c0 * (2pi/360) * 3960 miles

That looks complicated, even without the help of a computer… what is alpha, and what is theta – it’s Greek to me?

A Google search found another helpful page that describes how to calculate distances between pairs of latitude and longitude coordinates.  Included in that page is the following formula that is compatible with Microsoft Excel:

=ACOS(SIN(lat1)*SIN(lat2)+COS(lat1)*COS(lat2)*COS(lon2-lon1))*6371 KM 

Other than outputting kilometers rather than miles, and using SIN where my formula uses COS, the two formulas are close to being identical.  Based on that observation, alpha must be the latitude coordinate and theta must be the longitude coordinate in the formula that I found in my lesson plan.

The lesson plan included a couple of sample questions and answers:

Prague (14 degrees 26 minutes east, 50 degrees 5 minutes north)
Rio de Janeiro (43 degrees 12 minutes west, 22 degrees 57 minutes south)
Distance = 6152 miles

(119 19 degrees 48 minutes west, 36 degrees 44 minutes north)
(88 degrees 30 minutes west, 42 degrees south)
Distance = 5789.38 4832.09 miles

Will either of the two formulas, when plugged into an Oracle Database, output the distances that I wrote down roughly two decades ago?  That sounds like a challenge.

The setup (make certain that you do not swap the longitude and latitude numbers, the results will be slightly different):

DROP TABLE T1 PURGE;

CREATE TABLE T1 (
  LOCATION VARCHAR2(40),
  LAT_RAD NUMBER,
  LON_RAD NUMBER,
  LAT_DEG NUMBER,
  LON_DEG NUMBER,
  PRIMARY KEY(LOCATION));

INSERT INTO T1 VALUES (
  'Prague',
  ROUND((-(50 + 5/60))*3.141592654/180,6),
  ROUND((14 + 26/60)*3.141592654/180,6),
  ROUND(-(50 + 5/60),6),
  ROUND(14 + 26/60,6)
  );

INSERT INTO T1 VALUES (
  'Rio de Janeiro',
  ROUND((22 + 57/60)*3.141592654/180,6),
  ROUND((-(43 + 12/60))*3.141592654/180,6),
  ROUND(22 + 57/60,6),
  ROUND(-(43 + 12/60),6)
  );

COMMIT; 

Note in the above that I had to convert the degrees and minutes notation to decimal numbers and then also to equivalent radian values.  It would probably be best just to store the radian values, because those values can be used directly in the calculations.  Let’s take a look at the contents of this table:

COLUMN LAT_RAD FORMAT 99.999999
COLUMN LON_RAD FORMAT 99.999999
COLUMN LAT_DEG FORMAT 999.999999
COLUMN LON_DEG FORMAT 999.999999
COLUMN LOCATION FORMAT A20
SET LINESIZE 140
SET TRIMSPOOL ON
SET PAGESIZE 1000

SELECT
  *
FROM
  T1;

LOCATION                LAT_RAD    LON_RAD     LAT_DEG     LON_DEG
-------------------- ---------- ---------- ----------- -----------
Prague                 -.874119    .251909  -50.083333   14.433333
Rio de Janeiro          .400553   -.753982   22.950000  -43.200000 

As seen in the above, west and north directional values are assigned negative numbers.

Let’s try converting the Excel version of the formula to Oracle Database SQL:

COLUMN DISTANCE FORMAT 999990.00

SELECT
  ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='Prague') LOC1,
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='Rio de Janeiro') LOC2;

  DISTANCE
----------
   6152.02 

That number seems to match the number that I wrote into the lesson plan.  Let’s try the formula from the lesson plan:

SELECT
  ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='Prague') LOC1,
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='Rio de Janeiro') LOC2;

  DISTANCE
----------
   6152.02 

If only I had access to an Oracle Database’s SQL interpretter back then rather than a fancy scientific calculator… :-)

Let’s try to calculate the distance for the latitude and longitude values provided by the OP (we need to convert the degree values to radians):

SELECT
  ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
  (SELECT
    50.716667*3.141592/180 LAT_RAD,
    -1.883333*3.141592/180 LON_RAD
  FROM
    DUAL) LOC1,
  (SELECT
    51.023332*3.141592/180 LAT_RAD,
    -1.872231*3.141592/180 LON_RAD
  FROM
    DUAL) LOC2;

  DISTANCE
----------
21.20 

You may notice that for the short distance between the latitude/longitude pairs that the “earth is flat” method produced the same answer that we see with the more complicated formula shown above.

For a double-check, let’s put the second set of demonstration points from my lesson plan into the T1 table:

INSERT INTO T1 VALUES (
  'TEST P1',
  NULL,
  NULL,
  19 + 48/60,     ---- 119 + 48/60,
  -(36 + 44/60)
  );

INSERT INTO T1 VALUES (
  'TEST P2',
  NULL,
  NULL,
  88 + 30/60,
  42
  );

UPDATE
  T1
SET
  LAT_RAD=ROUND(LAT_DEG/180*3.141592654,6),
  LON_RAD=ROUND(LON_DEG/180*3.141592654,6),
  LAT_DEG=ROUND(LAT_DEG,6),
  LON_DEG=ROUND(LON_DEG,6)
WHERE
  LAT_RAD IS NULL;

COMMIT;

Checking the distance with the two formulas:

SELECT
  ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P2') LOC2;

  DISTANCE
----------
   4832.09  ---5789.07--- 

 —

 SELECT
  ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P2') LOC2;

  DISTANCE
----------
   4832.09  ---5789.07---

The values returned are identical to the value I wrote into the lesson plan, so the formulas are probably correct.

The OP’s second question is to identify all locations that are within a given radius of a specified location.  This might be easy, or it might not.  Someone in the Usenet thread suggested using Oracle Spatial, which is an additional cost licensed item that may be added to the Enterprise Edition for a list price of $17,500 per CPU plus $3,850 (USD) annual maintenance.  As I am not familiar with the product, I do not know if Oracle Spatial could answer this question (does someone know for certain?). 

We need some more test data to see if we can find a solution for the OP.  We will use the SIN and COS functions just to insert some “random” yet repeatable data:

INSERT INTO
  T1
SELECT
  'POINT '||TO_CHAR(ROWNUM),
  NULL,
  NULL,
  COS(ROWNUM/40)*180,
  SIN(ROWNUM/33)*180
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

UPDATE
  T1
SET
  LAT_RAD=ROUND(LAT_DEG/180*3.141592654,6),
  LON_RAD=ROUND(LON_DEG/180*3.141592654,6),
  LAT_DEG=ROUND(LAT_DEG,6),
  LON_DEG=ROUND(LON_DEG,6)
WHERE
  LAT_RAD IS NULL;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER, TABNAME=>'T1', CASCADE=> TRUE) 

Let’s give the first SQL statement a try, looking for those areas that are less than 500 miles away from “TEST P1″:

SELECT
  LOC2.LOCATION,
  ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LOCATION,
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION<>'TEST P1') LOC2
WHERE
  (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
  LOC2.LOCATION; 

LOCATION               DISTANCE
-------------------- ----------
POINT 1561               331.91
POINT 1562                46.45
POINT 1563               426.22
POINT 200                439.78
POINT 2066               318.73
POINT 2067               387.62
POINT 4970               432.93
POINT 7989               210.35
POINT 7990               268.25
POINT 8494               443.05
POINT 9855               258.38
POINT 9856               120.22
Rio de Janeiro           469.61

13 rows selected. 

LOCATION               DISTANCE -------------------- ---------- POINT 1358               428.75 POINT 1359               487.10 POINT 402                452.35 POINT 403                124.05 POINT 404                247.85 POINT 4675               452.49 POINT 4676               168.24 POINT 4677               306.37 POINT 4847               405.92 POINT 5206               449.08 POINT 5207               160.17 POINT 5208               367.93 POINT 5378               496.50 POINT 5379               341.63 POINT 5380               453.77 POINT 6714               282.04 POINT 6715                96.74 POINT 6716               484.03 POINT 8696               386.01 POINT 8697                77.56 POINT 8698               312.19 POINT 9652               420.05 22 rows selected. 

And the second SQL statement:

SELECT
  LOC2.LOCATION,
  ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LOCATION,
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION<>'TEST P1') LOC2
WHERE
  (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
  LOC2.LOCATION;

LOCATION               DISTANCE
-------------------- ----------
POINT 1561               331.91
POINT 1562                46.45
POINT 1563               426.22
POINT 200                439.78
POINT 2066               318.73
POINT 2067               387.62
POINT 4970               432.93
POINT 7989               210.35
POINT 7990               268.25
POINT 8494               443.05
POINT 9855               258.38
POINT 9856               120.22
Rio de Janeiro           469.61
 
13 rows selected.

LOCATION               DISTANCE -------------------- ---------- POINT 1358               428.75 POINT 1359               487.10 POINT 402                452.35 POINT 403                124.05 POINT 404                247.85 POINT 4675               452.49 POINT 4676               168.24 POINT 4677               306.37 POINT 4847               405.92 POINT 5206               449.08 POINT 5207               160.17 POINT 5208               367.93 POINT 5378               496.50 POINT 5379               341.63 POINT 5380               453.77 POINT 6714               282.04 POINT 6715                96.74 POINT 6716               484.03 POINT 8696               386.01 POINT 8697                77.56 POINT 8698               312.19 POINT 9652               420.05 22 rows selected. 

If you tried either of the above two SQL statements, you probably found that it took a couple of seconds to make it through the 10,004 or so rows in the table.  All of the calculations are likely hammering the server’s CPU (go parallel and hammer more than one CPU?).

Maybe we should try submitting bind variables in place of some of the calculations that we are requesting for the server to perform?  Let’s set up the bind variables to prepare another test:

SELECT
  SIN(LAT_RAD),
  COS(LAT_RAD),
  SIN(LON_RAD),
  COS(LON_RAD)
FROM
  T1
WHERE
  LOCATION='TEST P1'; 

SIN(LAT_RAD) COS(LAT_RAD) SIN(LON_RAD) COS(LON_RAD)
------------ ------------ ------------ ------------
   .33873774   .940880834   -.59809181   .801427592

SIN(LAT_RAD) COS(LAT_RAD) SIN(LON_RAD) COS(LON_RAD) ------------ ------------ ------------ ------------   -.59809181   .801427592   .867765674   -.49697358 VAR SIN1_LAT_RAD NUMBER

VAR COS1_LAT_RAD NUMBER
VAR SIN1_LON_RAD NUMBER
VAR COS1_LON_RAD NUMBER

EXEC :SIN1_LAT_RAD := .33873774
EXEC :COS1_LAT_RAD := .940880834
EXEC :SIN1_LON_RAD := -.59809181
EXEC :COS1_LON_RAD := .801427592

EXEC :SIN1_LAT_RAD := -.59809181 EXEC :COS1_LAT_RAD := .801427592 EXEC :SIN1_LON_RAD := .867765674 EXEC :COS1_LON_RAD := -.49697358 

To test the performance, execute the following scipt (more than once to eliminate the hard parse):

SET ARRAYSIZE 100
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10046';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

SELECT
  LOC2.LOCATION,
  ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LOCATION,
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION<>'TEST P1') LOC2
WHERE
  (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
  LOC2.LOCATION;

SELECT
  LOC2.LOCATION,
  ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LOCATION,
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION<>'TEST P1') LOC2
WHERE
  (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
  LOC2.LOCATION;

SELECT
  LOC2.LOCATION,
  ABS(ACOS(:SIN1_LAT_RAD*SIN(LOC2.LAT_RAD)+:COS1_LAT_RAD*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
  (SELECT
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LOCATION,
    LAT_RAD,
    LON_RAD
  FROM
    T1
  WHERE
    LOCATION<>'TEST P1') LOC2
WHERE
  (ABS(ACOS(:SIN1_LAT_RAD*SIN(LOC2.LAT_RAD)+:COS1_LAT_RAD*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
  LOC2.LOCATION;

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF'; 

Trace file output for the first query (1.794 CPU seconds):

FETCH #406890544:c=1794011,e=1797194,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500817545497
WAIT #406890544: nam='SQL*Net message from client' ela= 357 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817545983
WAIT #406890544: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817546020
FETCH #406890544:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500817546058
STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1797194 us cost=13 size=25000 card=500)'
STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS  (cr=94 pr=0 pw=0 time=125707 us cost=12 size=25000 card=500)'
STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=23 us cost=2 size=25 card=1)'
STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=17 us cost=1 size=0 card=1)'
STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=125683 us cost=10 size=12500 card=500)'
WAIT #406890544: nam='SQL*Net message from client' ela= 4359 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817550490
CLOSE #406890544:c=0,e=14,dep=0,type=0,tim=2500817550613 

Trace file output for the second query (1.529 CPU seconds):

FETCH #406890544:c=1528810,e=1519062,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500819083108
WAIT #406890544: nam='SQL*Net message from client' ela= 320 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819083487
WAIT #406890544: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819083524
FETCH #406890544:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500819083560
STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1519059 us cost=13 size=25000 card=500)'
STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS  (cr=94 pr=0 pw=0 time=50452 us cost=12 size=25000 card=500)'
STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=15 us cost=2 size=25 card=1)'
STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=10 us cost=1 size=0 card=1)'
STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=50437 us cost=10 size=12500 card=500)'
WAIT #406890544: nam='SQL*Net message from client' ela= 3294 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819086974
CLOSE #406890544:c=0,e=13,dep=0,type=0,tim=2500819087016 

Trace file output for the third query (1.388 CPU seconds):

FETCH #406890544:c=1388409,e=1380995,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500820478110
WAIT #406890544: nam='SQL*Net message from client' ela= 278 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820478441
WAIT #406890544: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820478480
FETCH #406890544:c=0,e=46,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500820478519
STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1380991 us cost=13 size=21500 card=500)'
STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS  (cr=94 pr=0 pw=0 time=45079 us cost=12 size=21500 card=500)'
STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=12 us cost=2 size=18 card=1)'
STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=9 us cost=1 size=0 card=1)'
STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=45046 us cost=10 size=12500 card=500)'
WAIT #406890544: nam='SQL*Net message from client' ela= 2713 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820481326
CLOSE #406890544:c=0,e=15,dep=0,type=0,tim=2500820481372 

Maybe we should just assume that the world is flat to save time (note that I am not converting the degree values to radians)?

SELECT
  LOC2.LOCATION,
  SQRT(POWER((LOC1.LAT_DEG - LOC2.LAT_DEG),2) + POWER((LOC1.LON_DEG - LOC2.LON_DEG),2)) * 3960 * 3.141592654/180 DISTANCE
FROM
  (SELECT
    LAT_DEG,
    LON_DEG
  FROM
    T1
  WHERE
    LOCATION='TEST P1') LOC1,
  (SELECT
    LOCATION,
    LAT_DEG,
    LON_DEG
  FROM
    T1
  WHERE
    LOCATION<>'TEST P1') LOC2
WHERE
  (SQRT(POWER((LOC1.LAT_DEG - LOC2.LAT_DEG),2) + POWER((LOC1.LON_DEG - LOC2.LON_DEG),2)) * 3960 * 3.141592654/180) < 500
ORDER BY
  LOC2.LOCATION;

LOCATION               DISTANCE
-------------------- ----------
POINT 1561               351.10
POINT 1562                49.17
POINT 1563               446.41
POINT 200                447.03
POINT 2066               327.96
POINT 2067               389.31
POINT 4970               436.30
POINT 7989               222.68
POINT 7990               274.47
POINT 8494               446.20
POINT 9855               272.91
POINT 9856               126.70
Rio de Janeiro           497.15

13 rows selected.
LOCATION               DISTANCE -------------------- ---------- POINT 4847           487.345998 POINT 5207           194.867475 POINT 5208           371.803354 POINT 6714           306.023772 POINT 6715            110.18648  5 rows selected.

Still 13 rows returned, but the distances are a bit off due to the curvature of the earth. What happened to the other 17 rows… did I make a mistake, or is the world not flat?

Note that the original version of this article made a critical mistake – latitude values cannot exceed 90 degrees.  While trying to implement my suggestion in a comment to use +-0.0002525254004999719 radians to speed up the query (note, still need to take into account the wrap-around for longitude values) I discovered inconsistent results.  This article is corrected to fix that mistake.

Added June 14, 2011:

The speed up version of the query:

SET ARRAYSIZE 100 ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10046-FAST'; 
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';     
SELECT 
  LOC2.LOCATION, 
  ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE 
FROM 
  (SELECT 
     LAT_RAD, 
     LON_RAD 
   FROM 
     T1 
   WHERE 
     LOCATION='TEST P1') LOC1, 
  (SELECT 
     LOCATION, 
     LAT_RAD, 
     LON_RAD 
   FROM 
     T1 
   WHERE 
     LOCATION<>'TEST P1') LOC2 
WHERE 
  LOC2.LAT_RAD BETWEEN LOC1.LAT_RAD-(0.0002525254004999719 * 500) AND LOC1.LAT_RAD+(0.0002525254004999719 * 500) 
  AND LOC2.LON_RAD BETWEEN LOC1.LON_RAD-(0.0002525254004999719 * 500) AND LOC1.LON_RAD+(0.0002525254004999719 * 500) 
  AND (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 
ORDER BY 
  LOC2.LOCATION; 

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF'; 

The fetch required almost no CPU time, from the 10046 trace file:

PARSE #406878256:c=0,e=3157,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1311020123,tim=2586341558538 
EXEC #406878256:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1311020123,tim=2586341558744 
WAIT #406878256: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341558793 
FETCH #406878256:c=15600,e=7387,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=1311020123,tim=2586341566207 
WAIT #406878256: nam='SQL*Net message from client' ela= 11647 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341577955 
WAIT #406878256: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341578017 
FETCH #406878256:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=12,dep=0,og=1,plh=1311020123,tim=2586341578052 
STAT #406878256 id=1 cnt=13 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=7387 us cost=13 size=1950 card=39)' 
STAT #406878256 id=2 cnt=13 pid=1 pos=1 obj=0 op='NESTED LOOPS  (cr=94 pr=0 pw=0 time=3920 us cost=12 size=1950 card=39)' 
STAT #406878256 id=3 cnt=1 pid=2 pos=1 obj=90785 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=23 us cost=2 size=25 card=1)' 
STAT #406878256 id=4 cnt=1 pid=3 pos=1 obj=90786 op='INDEX UNIQUE SCAN SYS_C0038321 (cr=2 pr=0 pw=0 time=15 us cost=1 size=0 card=1)' 
STAT #406878256 id=5 cnt=13 pid=2 pos=2 obj=90785 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=3894 us cost=10 size=975 card=39)' 
WAIT #406878256: nam='SQL*Net message from client' ela= 7604 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341585780 
CLOSE #406878256:c=0,e=13,dep=0,type=0,tim=2586341585850 

The above is not a finished product, it is still necessary to take care of wrap-around values for the longitude coordinates.


Actions

Information

9 responses

13 06 2011
Niall Litchfield

Charles

I *believe* but don’t know for sure that oracle locator (which is included in SE) will deal with this. For certain you can just call the web service directly via xdb. :)

13 06 2011
Charles Hooper

Niall,

Thank you for the information.

I just had another thought regarding how to further decrease the CPU requirements for this SQL statement, and thus to speed up the SQL statement… the mathematics logic is starting to come back to me. The latitude values must be in the range of +- the radian value that is associated with the distance along the surface of the earth. The same is true for the longitude values.

With that logic in mind, it would be possible to reduce the number of rows leaving the inline view, and thus entering the WHERE clause of the main SQL statement. If the radius of the earth is roughly 3960 miles, the circumference is 3960 * 2 * 3.14159275 = 24881 miles. 1 degree = 69.115 miles, 1 mile = 0.0144686 degrees, 1 mile = 2.525254004999719e-4 radians. Thus, for each mile of the radius, the search range beyond the central latitude/longitude pair would be extended by +-2.525254004999719e-4 radians to pre-filter the majority of the rows that would exit the inline view and enter the WHERE clause of the main portion of the query.

14 06 2011
Louis Ywema

Hi,
using Oracle Locator might speed it up and : The Earth isn’t flat anymore and Spatial knows it! ;-)

select sdo_geom.sdo_distance(sdo_geometry(2001
,
, null
, mdsys.SDO_ELEM_INFO_ARRAY(1, 1, 1)
, mdsys.SDO_ORDINATE_ARRAY(, )
)
,sdo_geometry(2001
,
, null
, mdsys.SDO_ELEM_INFO_ARRAY(1, 1, 1)
, mdsys.SDO_ORDINATE_ARRAY(, )
)
,0.1
,’unit=km’) distance
from dual;

14 06 2011
Gary

sdo_distance is one of the SDO_GEOM permitted under Locator (or as I prefer to call it sub-spatial). [Why they can't put the permitted procedures in a separate package from the non-permitted ones is a question only the Auditors can answer.]

http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28400/sdo_locator.htm#i632018

I don’t recall comparing the long equation mechanism with Locator/Spatial, but I did switch to the ‘flat earth’ calculation because it was a lot less cpu-intensive that SDO_GEOM and the error wasn’t significant for my purposes.

14 06 2011
Charles Hooper

Louis and Gary,

Thank you for the ideas. In an OTN thread (http://forums.oracle.com/forums/thread.jspa?threadID=1020948 ) I found an example of using SDO_GEOM.SDO_DISTANCE. The result? An error, why?

http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14255/sdo_locator.htm

“The installation of Locator depends on the successful and proper installation of Oracle interMedia. interMedia is installed and configured with Oracle Database 10g…”

http://download.oracle.com/docs/cd/E18283_01/appdev.112/e11830/sdo_locator.htm

“The installation of Locator depends on the successful and proper installation of Oracle Multimedia. Oracle Multimedia is installed and configured with Oracle Database 11g, although you can…”

If someone (like me) did not see much point in installing Oracle Multimedia (formerly called Oracle interMedia), it is quite likely that the same error will be received.

With Oracle Multimedia installed:

SELECT SDO_GEOM.SDO_DISTANCE(
SDO_GEOMETRY(2001,8307,SDO_POINT_TYPE((14 + 26/60),-(50 + 5/60),NULL),NULL,NULL),
SDO_GEOMETRY(2001,8307,SDO_POINT_TYPE(-(43 + 12/60),(22 + 57/60),NULL),NULL,NULL),
0.0001,'unit=km') AS DISTANCE_BETWEEN_POINTS 
FROM DUAL ;
 
DISTANCE_BETWEEN_POINTS
-----------------------
             9876.67951

According to Google, 9,876.67951 kilometers = 6,137.08412 miles – very close to the 6,152 miles returned by the formula I posted above.

SELECT SDO_GEOM.SDO_DISTANCE(
SDO_GEOMETRY(2001,8307,SDO_POINT_TYPE(-(36 + 44/60),(19 + 48/60),NULL),NULL,NULL),
SDO_GEOMETRY(2001,8307,SDO_POINT_TYPE((42),(88 + 30/60),NULL),NULL,NULL),
0.0001,'unit=km') AS DISTANCE_BETWEEN_POINTS 
FROM DUAL ;
 
DISTANCE_BETWEEN_POINTS
-----------------------
             7779.77983

According to Google, 7,779.77983 kilometers = 4,834.13107 miles – very close to the 4,832.09 miles returned by the formula I posted above.

I have not compared the performance of the SDO_GEOM.SDO_DISTANCE method with the performance of the formula that I posted above.

15 06 2011
Mark W. Farnham

Interesting analyses and optimizations, including the care folks stated about results being within requirements. If you need to be extremely precise though, remember that the earth is neither flat nor an ideal sphere.

15 06 2011
Charles Hooper

Mark,

Excellent points.

It probably would be a good idea to point out that the calculations do not determine driving distances between locations, nor do the calculations exactly determine flying distances (if a plane is a couple miles up in the air, that effectively makes the earth’s radius a couple of miles larger from the perspective of the plane. I now have a little more respect for the GPS manufacturers whose products attempt to determine the final destination arrival time.

4 09 2011
Tom Heller

Hi, I just stumbled upon this, in an effort to figure distances between radio towers for which I have the Lat/Lon coordinates.

I typed your first ‘flat Earth’ formula into Excel and it returned a figure very close to what I was expecting (tho’ a bit larger than the county’s GIS mapping system’s ‘digitize distance’ function provides). I’m not worried at this point about absolute precision.

BUT….then I typed into Excel the kilometer formula that started with acos — and I got blown out of the water. The 2.9 miles of the first equation swelled to -I forget, some couple thousand km’s. So, is there a decimal point missing somewhere in there?

I probably won’t return to learn your answer, but wanted to point out the ‘mis-spelling’. (I checked the formula I had typed and it perfectly reflected yours.)

Any way, thanks for the interesting post! I also came out with lots of respect for the challenges GPS software (and users) must cope!

4 09 2011
Charles Hooper

Tom,

I am not sure what might have happened with your formula in Excel. In this article I converted the Excel formula to work in a SQL statement, and that formula appeared to be producing the expected results. A couple of things to keep in mind:
* North and West positions should be entered as negative values
* Two negatives make a positive ( -p1 – -p2 ) = ( -p1 + p2)
* Accidentally reversing the longitude and latitude values will cause the calculations to be significantly off
* Excel, SQL, and various programming languages expect SIN and COS values in radians, so you need to convert the longitude and latitude values to their radian equivalents before plugging the values into the formula in Excel. This article describes how to perform the conversion:

=SIN(degree / 180 * 3.141592)

Once the conversion to radians is performed, you should be able to obtain the correct answer. The following is from the SQL statement:

ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 141 other followers

%d bloggers like this: