June 13, 2011 (Modified June 14, 2011)
A question recently appeared on the comp.databases.oracle.server Usenet group that oddly made me recall a lesson from a mathematics class that I taught in the early 1990s. A bit strange how a question related to Oracle Database would trigger such a memory, but it happened. The question posed in the thread initially asked how to calculate the distance between the points:
item latitude longitude ---- --------- --------- P1 50.716667 -1.883333 P2 51.023332 -1.872231
Let’s see, the earth is flat, so…
= ((51.023332 - 50.716667)^2 + (1.883333 - 1.872231)^2)^(1/2) * 3960 * 3.141592/180 miles = (0.094043422225 + 0.000123254404)^(1/2) * 3960 * 3.141592/180 miles = 0.30686589355775594241245361976998 * 3960 * 3.141592/180 miles = 1215.18894 * 3.141592/180 miles = 21.209 miles
So, if the earth is flat, the Pythagorean Theorem seems to give a sensible answer… maybe. My recollection of the formulas used for the correct calculation is a bit fuzzy. Good news, I found the three page lesson plan that I put together in the early 1990s to solve a problem similar to the one posed by the OP in the Usenet thread. The lesson plan begins by stating that part of the point of the lesson was to destroy Euclid’s Elements, which were introduced 23 centuries ago… a bit ambitious, looking back. Even though the lesson plan is not too terribly clear given the amount of mathematics knowledge that I have forgotten, I managed to find the following formula in the lesson plan:
c0 = cos^(-1)(cos(90-alpha1)*cos(90-alpha2) + sin(90-alpha1)*sin(90-alpha2)*cos(theta1-theta2)) distance = c0 * (2pi/360) * 3960 miles
That looks complicated, even without the help of a computer… what is alpha, and what is theta – it’s Greek to me?
A Google search found another helpful page that describes how to calculate distances between pairs of latitude and longitude coordinates. Included in that page is the following formula that is compatible with Microsoft Excel:
=ACOS(SIN(lat1)*SIN(lat2)+COS(lat1)*COS(lat2)*COS(lon2-lon1))*6371 KM
Other than outputting kilometers rather than miles, and using SIN where my formula uses COS, the two formulas are close to being identical. Based on that observation, alpha must be the latitude coordinate and theta must be the longitude coordinate in the formula that I found in my lesson plan.
The lesson plan included a couple of sample questions and answers:
Prague (14 degrees 26 minutes east, 50 degrees 5 minutes north) Rio de Janeiro (43 degrees 12 minutes west, 22 degrees 57 minutes south) Distance = 6152 miles
—
(11919 degrees 48 minutes west, 36 degrees 44 minutes north) (88 degrees 30 minutes west, 42 degrees south) Distance =5789.384832.09 miles
Will either of the two formulas, when plugged into an Oracle Database, output the distances that I wrote down roughly two decades ago? That sounds like a challenge.
The setup (make certain that you do not swap the longitude and latitude numbers, the results will be slightly different):
DROP TABLE T1 PURGE; CREATE TABLE T1 ( LOCATION VARCHAR2(40), LAT_RAD NUMBER, LON_RAD NUMBER, LAT_DEG NUMBER, LON_DEG NUMBER, PRIMARY KEY(LOCATION)); INSERT INTO T1 VALUES ( 'Prague', ROUND((-(50 + 5/60))*3.141592654/180,6), ROUND((14 + 26/60)*3.141592654/180,6), ROUND(-(50 + 5/60),6), ROUND(14 + 26/60,6) ); INSERT INTO T1 VALUES ( 'Rio de Janeiro', ROUND((22 + 57/60)*3.141592654/180,6), ROUND((-(43 + 12/60))*3.141592654/180,6), ROUND(22 + 57/60,6), ROUND(-(43 + 12/60),6) ); COMMIT;
Note in the above that I had to convert the degrees and minutes notation to decimal numbers and then also to equivalent radian values. It would probably be best just to store the radian values, because those values can be used directly in the calculations. Let’s take a look at the contents of this table:
COLUMN LAT_RAD FORMAT 99.999999 COLUMN LON_RAD FORMAT 99.999999 COLUMN LAT_DEG FORMAT 999.999999 COLUMN LON_DEG FORMAT 999.999999 COLUMN LOCATION FORMAT A20 SET LINESIZE 140 SET TRIMSPOOL ON SET PAGESIZE 1000 SELECT * FROM T1; LOCATION LAT_RAD LON_RAD LAT_DEG LON_DEG -------------------- ---------- ---------- ----------- ----------- Prague -.874119 .251909 -50.083333 14.433333 Rio de Janeiro .400553 -.753982 22.950000 -43.200000
As seen in the above, west and north directional values are assigned negative numbers.
Let’s try converting the Excel version of the formula to Oracle Database SQL:
COLUMN DISTANCE FORMAT 999990.00 SELECT ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='Prague') LOC1, (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='Rio de Janeiro') LOC2; DISTANCE ---------- 6152.02
That number seems to match the number that I wrote into the lesson plan. Let’s try the formula from the lesson plan:
SELECT ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='Prague') LOC1, (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='Rio de Janeiro') LOC2; DISTANCE ---------- 6152.02
If only I had access to an Oracle Database’s SQL interpretter back then rather than a fancy scientific calculator… 🙂
Let’s try to calculate the distance for the latitude and longitude values provided by the OP (we need to convert the degree values to radians):
SELECT ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE FROM (SELECT 50.716667*3.141592/180 LAT_RAD, -1.883333*3.141592/180 LON_RAD FROM DUAL) LOC1, (SELECT 51.023332*3.141592/180 LAT_RAD, -1.872231*3.141592/180 LON_RAD FROM DUAL) LOC2; DISTANCE ---------- 21.20
You may notice that for the short distance between the latitude/longitude pairs that the “earth is flat” method produced the same answer that we see with the more complicated formula shown above.
For a double-check, let’s put the second set of demonstration points from my lesson plan into the T1 table:
INSERT INTO T1 VALUES ( 'TEST P1', NULL, NULL, 19 + 48/60,---- 119 + 48/60,-(36 + 44/60) ); INSERT INTO T1 VALUES ( 'TEST P2', NULL, NULL, 88 + 30/60, 42 ); UPDATE T1 SET LAT_RAD=ROUND(LAT_DEG/180*3.141592654,6), LON_RAD=ROUND(LON_DEG/180*3.141592654,6), LAT_DEG=ROUND(LAT_DEG,6), LON_DEG=ROUND(LON_DEG,6) WHERE LAT_RAD IS NULL; COMMIT;
Checking the distance with the two formulas:
SELECT ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P2') LOC2; DISTANCE ---------- 4832.09---5789.07---
—
SELECT ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P2') LOC2; DISTANCE ---------- 4832.09---5789.07---
The values returned are identical to the value I wrote into the lesson plan, so the formulas are probably correct.
The OP’s second question is to identify all locations that are within a given radius of a specified location. This might be easy, or it might not. Someone in the Usenet thread suggested using Oracle Spatial, which is an additional cost licensed item that may be added to the Enterprise Edition for a list price of $17,500 per CPU plus $3,850 (USD) annual maintenance. As I am not familiar with the product, I do not know if Oracle Spatial could answer this question (does someone know for certain?).
We need some more test data to see if we can find a solution for the OP. We will use the SIN and COS functions just to insert some “random” yet repeatable data:
INSERT INTO T1 SELECT 'POINT '||TO_CHAR(ROWNUM), NULL, NULL, COS(ROWNUM/40)*180, SIN(ROWNUM/33)*180 FROM DUAL CONNECT BY LEVEL<=10000; UPDATE T1 SET LAT_RAD=ROUND(LAT_DEG/180*3.141592654,6), LON_RAD=ROUND(LON_DEG/180*3.141592654,6), LAT_DEG=ROUND(LAT_DEG,6), LON_DEG=ROUND(LON_DEG,6) WHERE LAT_RAD IS NULL; COMMIT; EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER, TABNAME=>'T1', CASCADE=> TRUE)
Let’s give the first SQL statement a try, looking for those areas that are less than 500 miles away from “TEST P1”:
SELECT LOC2.LOCATION, ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_RAD, LON_RAD FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 ORDER BY LOC2.LOCATION; LOCATION DISTANCE -------------------- ---------- POINT 1561 331.91 POINT 1562 46.45 POINT 1563 426.22 POINT 200 439.78 POINT 2066 318.73 POINT 2067 387.62 POINT 4970 432.93 POINT 7989 210.35 POINT 7990 268.25 POINT 8494 443.05 POINT 9855 258.38 POINT 9856 120.22 Rio de Janeiro 469.61 13 rows selected.LOCATION DISTANCE -------------------- ---------- POINT 1358 428.75 POINT 1359 487.10 POINT 402 452.35 POINT 403 124.05 POINT 404 247.85 POINT 4675 452.49 POINT 4676 168.24 POINT 4677 306.37 POINT 4847 405.92 POINT 5206 449.08 POINT 5207 160.17 POINT 5208 367.93 POINT 5378 496.50 POINT 5379 341.63 POINT 5380 453.77 POINT 6714 282.04 POINT 6715 96.74 POINT 6716 484.03 POINT 8696 386.01 POINT 8697 77.56 POINT 8698 312.19 POINT 9652 420.05 22 rows selected.
And the second SQL statement:
SELECT LOC2.LOCATION, ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_RAD, LON_RAD FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 ORDER BY LOC2.LOCATION; LOCATION DISTANCE -------------------- ---------- POINT 1561 331.91 POINT 1562 46.45 POINT 1563 426.22 POINT 200 439.78 POINT 2066 318.73 POINT 2067 387.62 POINT 4970 432.93 POINT 7989 210.35 POINT 7990 268.25 POINT 8494 443.05 POINT 9855 258.38 POINT 9856 120.22 Rio de Janeiro 469.61 13 rows selected.LOCATION DISTANCE -------------------- ---------- POINT 1358 428.75 POINT 1359 487.10 POINT 402 452.35 POINT 403 124.05 POINT 404 247.85 POINT 4675 452.49 POINT 4676 168.24 POINT 4677 306.37 POINT 4847 405.92 POINT 5206 449.08 POINT 5207 160.17 POINT 5208 367.93 POINT 5378 496.50 POINT 5379 341.63 POINT 5380 453.77 POINT 6714 282.04 POINT 6715 96.74 POINT 6716 484.03 POINT 8696 386.01 POINT 8697 77.56 POINT 8698 312.19 POINT 9652 420.05 22 rows selected.
If you tried either of the above two SQL statements, you probably found that it took a couple of seconds to make it through the 10,004 or so rows in the table. All of the calculations are likely hammering the server’s CPU (go parallel and hammer more than one CPU?).
Maybe we should try submitting bind variables in place of some of the calculations that we are requesting for the server to perform? Let’s set up the bind variables to prepare another test:
SELECT SIN(LAT_RAD), COS(LAT_RAD), SIN(LON_RAD), COS(LON_RAD) FROM T1 WHERE LOCATION='TEST P1'; SIN(LAT_RAD) COS(LAT_RAD) SIN(LON_RAD) COS(LON_RAD) ------------ ------------ ------------ ------------ .33873774 .940880834 -.59809181 .801427592SIN(LAT_RAD) COS(LAT_RAD) SIN(LON_RAD) COS(LON_RAD) ------------ ------------ ------------ ------------ -.59809181 .801427592 .867765674 -.49697358VAR SIN1_LAT_RAD NUMBER VAR COS1_LAT_RAD NUMBER VAR SIN1_LON_RAD NUMBER VAR COS1_LON_RAD NUMBER EXEC :SIN1_LAT_RAD := .33873774 EXEC :COS1_LAT_RAD := .940880834 EXEC :SIN1_LON_RAD := -.59809181 EXEC :COS1_LON_RAD := .801427592EXEC :SIN1_LAT_RAD := -.59809181 EXEC :COS1_LAT_RAD := .801427592 EXEC :SIN1_LON_RAD := .867765674 EXEC :COS1_LON_RAD := -.49697358
To test the performance, execute the following scipt (more than once to eliminate the hard parse):
SET ARRAYSIZE 100 ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10046'; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8'; SELECT LOC2.LOCATION, ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_RAD, LON_RAD FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 ORDER BY LOC2.LOCATION; SELECT LOC2.LOCATION, ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_RAD, LON_RAD FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 ORDER BY LOC2.LOCATION; SELECT LOC2.LOCATION, ABS(ACOS(:SIN1_LAT_RAD*SIN(LOC2.LAT_RAD)+:COS1_LAT_RAD*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_RAD, LON_RAD FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE (ABS(ACOS(:SIN1_LAT_RAD*SIN(LOC2.LAT_RAD)+:COS1_LAT_RAD*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 ORDER BY LOC2.LOCATION; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
Trace file output for the first query (1.794 CPU seconds):
FETCH #406890544:c=1794011,e=1797194,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500817545497 WAIT #406890544: nam='SQL*Net message from client' ela= 357 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817545983 WAIT #406890544: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817546020 FETCH #406890544:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500817546058 STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1797194 us cost=13 size=25000 card=500)' STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=125707 us cost=12 size=25000 card=500)' STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=23 us cost=2 size=25 card=1)' STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=17 us cost=1 size=0 card=1)' STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=125683 us cost=10 size=12500 card=500)' WAIT #406890544: nam='SQL*Net message from client' ela= 4359 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817550490 CLOSE #406890544:c=0,e=14,dep=0,type=0,tim=2500817550613
Trace file output for the second query (1.529 CPU seconds):
FETCH #406890544:c=1528810,e=1519062,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500819083108 WAIT #406890544: nam='SQL*Net message from client' ela= 320 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819083487 WAIT #406890544: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819083524 FETCH #406890544:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500819083560 STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1519059 us cost=13 size=25000 card=500)' STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=50452 us cost=12 size=25000 card=500)' STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=15 us cost=2 size=25 card=1)' STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=10 us cost=1 size=0 card=1)' STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=50437 us cost=10 size=12500 card=500)' WAIT #406890544: nam='SQL*Net message from client' ela= 3294 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819086974 CLOSE #406890544:c=0,e=13,dep=0,type=0,tim=2500819087016
Trace file output for the third query (1.388 CPU seconds):
FETCH #406890544:c=1388409,e=1380995,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500820478110 WAIT #406890544: nam='SQL*Net message from client' ela= 278 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820478441 WAIT #406890544: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820478480 FETCH #406890544:c=0,e=46,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500820478519 STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1380991 us cost=13 size=21500 card=500)' STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=45079 us cost=12 size=21500 card=500)' STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=12 us cost=2 size=18 card=1)' STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=9 us cost=1 size=0 card=1)' STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=45046 us cost=10 size=12500 card=500)' WAIT #406890544: nam='SQL*Net message from client' ela= 2713 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820481326 CLOSE #406890544:c=0,e=15,dep=0,type=0,tim=2500820481372
Maybe we should just assume that the world is flat to save time (note that I am not converting the degree values to radians)?
SELECT LOC2.LOCATION, SQRT(POWER((LOC1.LAT_DEG - LOC2.LAT_DEG),2) + POWER((LOC1.LON_DEG - LOC2.LON_DEG),2)) * 3960 * 3.141592654/180 DISTANCE FROM (SELECT LAT_DEG, LON_DEG FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_DEG, LON_DEG FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE (SQRT(POWER((LOC1.LAT_DEG - LOC2.LAT_DEG),2) + POWER((LOC1.LON_DEG - LOC2.LON_DEG),2)) * 3960 * 3.141592654/180) < 500 ORDER BY LOC2.LOCATION; LOCATION DISTANCE -------------------- ---------- POINT 1561 351.10 POINT 1562 49.17 POINT 1563 446.41 POINT 200 447.03 POINT 2066 327.96 POINT 2067 389.31 POINT 4970 436.30 POINT 7989 222.68 POINT 7990 274.47 POINT 8494 446.20 POINT 9855 272.91 POINT 9856 126.70 Rio de Janeiro 497.15 13 rows selected.LOCATION DISTANCE -------------------- ---------- POINT 4847 487.345998 POINT 5207 194.867475 POINT 5208 371.803354 POINT 6714 306.023772 POINT 6715 110.18648 5 rows selected.
Still 13 rows returned, but the distances are a bit off due to the curvature of the earth. What happened to the other 17 rows… did I make a mistake, or is the world not flat?
Note that the original version of this article made a critical mistake – latitude values cannot exceed 90 degrees. While trying to implement my suggestion in a comment to use +-0.0002525254004999719 radians to speed up the query (note, still need to take into account the wrap-around for longitude values) I discovered inconsistent results. This article is corrected to fix that mistake.
Added June 14, 2011:
The speed up version of the query:
SET ARRAYSIZE 100 ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10046-FAST'; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8'; SELECT LOC2.LOCATION, ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE FROM (SELECT LAT_RAD, LON_RAD FROM T1 WHERE LOCATION='TEST P1') LOC1, (SELECT LOCATION, LAT_RAD, LON_RAD FROM T1 WHERE LOCATION<>'TEST P1') LOC2 WHERE LOC2.LAT_RAD BETWEEN LOC1.LAT_RAD-(0.0002525254004999719 * 500) AND LOC1.LAT_RAD+(0.0002525254004999719 * 500) AND LOC2.LON_RAD BETWEEN LOC1.LON_RAD-(0.0002525254004999719 * 500) AND LOC1.LON_RAD+(0.0002525254004999719 * 500) AND (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500 ORDER BY LOC2.LOCATION; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
The fetch required almost no CPU time, from the 10046 trace file:
PARSE #406878256:c=0,e=3157,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1311020123,tim=2586341558538 EXEC #406878256:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1311020123,tim=2586341558744 WAIT #406878256: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341558793 FETCH #406878256:c=15600,e=7387,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=1311020123,tim=2586341566207 WAIT #406878256: nam='SQL*Net message from client' ela= 11647 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341577955 WAIT #406878256: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341578017 FETCH #406878256:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=12,dep=0,og=1,plh=1311020123,tim=2586341578052 STAT #406878256 id=1 cnt=13 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=7387 us cost=13 size=1950 card=39)' STAT #406878256 id=2 cnt=13 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=3920 us cost=12 size=1950 card=39)' STAT #406878256 id=3 cnt=1 pid=2 pos=1 obj=90785 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=23 us cost=2 size=25 card=1)' STAT #406878256 id=4 cnt=1 pid=3 pos=1 obj=90786 op='INDEX UNIQUE SCAN SYS_C0038321 (cr=2 pr=0 pw=0 time=15 us cost=1 size=0 card=1)' STAT #406878256 id=5 cnt=13 pid=2 pos=2 obj=90785 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=3894 us cost=10 size=975 card=39)' WAIT #406878256: nam='SQL*Net message from client' ela= 7604 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341585780 CLOSE #406878256:c=0,e=13,dep=0,type=0,tim=2586341585850
The above is not a finished product, it is still necessary to take care of wrap-around values for the longitude coordinates.
Charles
I *believe* but don’t know for sure that oracle locator (which is included in SE) will deal with this. For certain you can just call the web service directly via xdb. 🙂
Niall,
Thank you for the information.
I just had another thought regarding how to further decrease the CPU requirements for this SQL statement, and thus to speed up the SQL statement… the mathematics logic is starting to come back to me. The latitude values must be in the range of +- the radian value that is associated with the distance along the surface of the earth. The same is true for the longitude values.
With that logic in mind, it would be possible to reduce the number of rows leaving the inline view, and thus entering the WHERE clause of the main SQL statement. If the radius of the earth is roughly 3960 miles, the circumference is 3960 * 2 * 3.14159275 = 24881 miles. 1 degree = 69.115 miles, 1 mile = 0.0144686 degrees, 1 mile = 2.525254004999719e-4 radians. Thus, for each mile of the radius, the search range beyond the central latitude/longitude pair would be extended by +-2.525254004999719e-4 radians to pre-filter the majority of the rows that would exit the inline view and enter the WHERE clause of the main portion of the query.
Hi,
using Oracle Locator might speed it up and : The Earth isn’t flat anymore and Spatial knows it! 😉
select sdo_geom.sdo_distance(sdo_geometry(2001
,
, null
, mdsys.SDO_ELEM_INFO_ARRAY(1, 1, 1)
, mdsys.SDO_ORDINATE_ARRAY(, )
)
,sdo_geometry(2001
,
, null
, mdsys.SDO_ELEM_INFO_ARRAY(1, 1, 1)
, mdsys.SDO_ORDINATE_ARRAY(, )
)
,0.1
,’unit=km’) distance
from dual;
sdo_distance is one of the SDO_GEOM permitted under Locator (or as I prefer to call it sub-spatial). [Why they can’t put the permitted procedures in a separate package from the non-permitted ones is a question only the Auditors can answer.]
http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28400/sdo_locator.htm#i632018
I don’t recall comparing the long equation mechanism with Locator/Spatial, but I did switch to the ‘flat earth’ calculation because it was a lot less cpu-intensive that SDO_GEOM and the error wasn’t significant for my purposes.
Louis and Gary,
Thank you for the ideas. In an OTN thread (http://forums.oracle.com/forums/thread.jspa?threadID=1020948 ) I found an example of using SDO_GEOM.SDO_DISTANCE. The result? An error, why?
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14255/sdo_locator.htm
“The installation of Locator depends on the successful and proper installation of Oracle interMedia. interMedia is installed and configured with Oracle Database 10g…”
http://download.oracle.com/docs/cd/E18283_01/appdev.112/e11830/sdo_locator.htm
“The installation of Locator depends on the successful and proper installation of Oracle Multimedia. Oracle Multimedia is installed and configured with Oracle Database 11g, although you can…”
If someone (like me) did not see much point in installing Oracle Multimedia (formerly called Oracle interMedia), it is quite likely that the same error will be received.
With Oracle Multimedia installed:
According to Google, 9,876.67951 kilometers = 6,137.08412 miles – very close to the 6,152 miles returned by the formula I posted above.
—
According to Google, 7,779.77983 kilometers = 4,834.13107 miles – very close to the 4,832.09 miles returned by the formula I posted above.
I have not compared the performance of the SDO_GEOM.SDO_DISTANCE method with the performance of the formula that I posted above.
Interesting analyses and optimizations, including the care folks stated about results being within requirements. If you need to be extremely precise though, remember that the earth is neither flat nor an ideal sphere.
Mark,
Excellent points.
It probably would be a good idea to point out that the calculations do not determine driving distances between locations, nor do the calculations exactly determine flying distances (if a plane is a couple miles up in the air, that effectively makes the earth’s radius a couple of miles larger from the perspective of the plane. I now have a little more respect for the GPS manufacturers whose products attempt to determine the final destination arrival time.
Hi, I just stumbled upon this, in an effort to figure distances between radio towers for which I have the Lat/Lon coordinates.
I typed your first ‘flat Earth’ formula into Excel and it returned a figure very close to what I was expecting (tho’ a bit larger than the county’s GIS mapping system’s ‘digitize distance’ function provides). I’m not worried at this point about absolute precision.
BUT….then I typed into Excel the kilometer formula that started with acos — and I got blown out of the water. The 2.9 miles of the first equation swelled to -I forget, some couple thousand km’s. So, is there a decimal point missing somewhere in there?
I probably won’t return to learn your answer, but wanted to point out the ‘mis-spelling’. (I checked the formula I had typed and it perfectly reflected yours.)
Any way, thanks for the interesting post! I also came out with lots of respect for the challenges GPS software (and users) must cope!
Tom,
I am not sure what might have happened with your formula in Excel. In this article I converted the Excel formula to work in a SQL statement, and that formula appeared to be producing the expected results. A couple of things to keep in mind:
* North and West positions should be entered as negative values
* Two negatives make a positive ( -p1 – -p2 ) = ( -p1 + p2)
* Accidentally reversing the longitude and latitude values will cause the calculations to be significantly off
* Excel, SQL, and various programming languages expect SIN and COS values in radians, so you need to convert the longitude and latitude values to their radian equivalents before plugging the values into the formula in Excel. This article describes how to perform the conversion:
Once the conversion to radians is performed, you should be able to obtain the correct answer. The following is from the SQL statement:
Hi @Charles Hooper
(DROP TABLE T1 PURGE;
CREATE TABLE T1 (
LOCATION VARCHAR2(40),
LAT_RAD NUMBER,
LON_RAD NUMBER,
LAT_DEG NUMBER,
LON_DEG NUMBER,
PRIMARY KEY(LOCATION));
INSERT INTO T1 VALUES (
‘Prague’,
ROUND((-(50 + 5/60))*3.141592654/180,6),
ROUND((14 + 26/60)*3.141592654/180,6),
ROUND(-(50 + 5/60),6),
ROUND(14 + 26/60,6)
);
INSERT INTO T1 VALUES (
‘Rio de Janeiro’,
ROUND((22 + 57/60)*3.141592654/180,6),
ROUND((-(43 + 12/60))*3.141592654/180,6),
ROUND(22 + 57/60,6),
ROUND(-(43 + 12/60),6)
);
COMMIT; )
The above query is working fine but with the above query i can only get Latitude&Longitude(Degrees,Radius) but i want to catch the seconds also in my use case can you help me in resolving that issue.
Thank you,
P Venkat Prasad Reddy.