June 13, 2011 (Modified June 14, 2011)
A question recently appeared on the comp.databases.oracle.server Usenet group that oddly made me recall a lesson from a mathematics class that I taught in the early 1990s. A bit strange how a question related to Oracle Database would trigger such a memory, but it happened. The question posed in the thread initially asked how to calculate the distance between the points:
item latitude longitude
---- --------- ---------
P1 50.716667 -1.883333
P2 51.023332 -1.872231
Let’s see, the earth is flat, so…
= ((51.023332 - 50.716667)^2 + (1.883333 - 1.872231)^2)^(1/2) * 3960 * 3.141592/180 miles
= (0.094043422225 + 0.000123254404)^(1/2) * 3960 * 3.141592/180 miles
= 0.30686589355775594241245361976998 * 3960 * 3.141592/180 miles
= 1215.18894 * 3.141592/180 miles
= 21.209 miles
So, if the earth is flat, the Pythagorean Theorem seems to give a sensible answer… maybe. My recollection of the formulas used for the correct calculation is a bit fuzzy. Good news, I found the three page lesson plan that I put together in the early 1990s to solve a problem similar to the one posed by the OP in the Usenet thread. The lesson plan begins by stating that part of the point of the lesson was to destroy Euclid’s Elements, which were introduced 23 centuries ago… a bit ambitious, looking back. Even though the lesson plan is not too terribly clear given the amount of mathematics knowledge that I have forgotten, I managed to find the following formula in the lesson plan:
c0 = cos^(-1)(cos(90-alpha1)*cos(90-alpha2) + sin(90-alpha1)*sin(90-alpha2)*cos(theta1-theta2))
distance = c0 * (2pi/360) * 3960 miles
That looks complicated, even without the help of a computer… what is alpha, and what is theta – it’s Greek to me?
A Google search found another helpful page that describes how to calculate distances between pairs of latitude and longitude coordinates. Included in that page is the following formula that is compatible with Microsoft Excel:
=ACOS(SIN(lat1)*SIN(lat2)+COS(lat1)*COS(lat2)*COS(lon2-lon1))*6371 KM
Other than outputting kilometers rather than miles, and using SIN where my formula uses COS, the two formulas are close to being identical. Based on that observation, alpha must be the latitude coordinate and theta must be the longitude coordinate in the formula that I found in my lesson plan.
The lesson plan included a couple of sample questions and answers:
Prague (14 degrees 26 minutes east, 50 degrees 5 minutes north)
Rio de Janeiro (43 degrees 12 minutes west, 22 degrees 57 minutes south)
Distance = 6152 miles
—
(119 19 degrees 48 minutes west, 36 degrees 44 minutes north)
(88 degrees 30 minutes west, 42 degrees south)
Distance = 5789.38 4832.09 miles
Will either of the two formulas, when plugged into an Oracle Database, output the distances that I wrote down roughly two decades ago? That sounds like a challenge.
The setup (make certain that you do not swap the longitude and latitude numbers, the results will be slightly different):
DROP TABLE T1 PURGE;
CREATE TABLE T1 (
LOCATION VARCHAR2(40),
LAT_RAD NUMBER,
LON_RAD NUMBER,
LAT_DEG NUMBER,
LON_DEG NUMBER,
PRIMARY KEY(LOCATION));
INSERT INTO T1 VALUES (
'Prague',
ROUND((-(50 + 5/60))*3.141592654/180,6),
ROUND((14 + 26/60)*3.141592654/180,6),
ROUND(-(50 + 5/60),6),
ROUND(14 + 26/60,6)
);
INSERT INTO T1 VALUES (
'Rio de Janeiro',
ROUND((22 + 57/60)*3.141592654/180,6),
ROUND((-(43 + 12/60))*3.141592654/180,6),
ROUND(22 + 57/60,6),
ROUND(-(43 + 12/60),6)
);
COMMIT;
Note in the above that I had to convert the degrees and minutes notation to decimal numbers and then also to equivalent radian values. It would probably be best just to store the radian values, because those values can be used directly in the calculations. Let’s take a look at the contents of this table:
COLUMN LAT_RAD FORMAT 99.999999
COLUMN LON_RAD FORMAT 99.999999
COLUMN LAT_DEG FORMAT 999.999999
COLUMN LON_DEG FORMAT 999.999999
COLUMN LOCATION FORMAT A20
SET LINESIZE 140
SET TRIMSPOOL ON
SET PAGESIZE 1000
SELECT
*
FROM
T1;
LOCATION LAT_RAD LON_RAD LAT_DEG LON_DEG
-------------------- ---------- ---------- ----------- -----------
Prague -.874119 .251909 -50.083333 14.433333
Rio de Janeiro .400553 -.753982 22.950000 -43.200000
As seen in the above, west and north directional values are assigned negative numbers.
Let’s try converting the Excel version of the formula to Oracle Database SQL:
COLUMN DISTANCE FORMAT 999990.00
SELECT
ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='Prague') LOC1,
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='Rio de Janeiro') LOC2;
DISTANCE
----------
6152.02
That number seems to match the number that I wrote into the lesson plan. Let’s try the formula from the lesson plan:
SELECT
ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='Prague') LOC1,
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='Rio de Janeiro') LOC2;
DISTANCE
----------
6152.02
If only I had access to an Oracle Database’s SQL interpretter back then rather than a fancy scientific calculator… 🙂
Let’s try to calculate the distance for the latitude and longitude values provided by the OP (we need to convert the degree values to radians):
SELECT
ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
(SELECT
50.716667*3.141592/180 LAT_RAD,
-1.883333*3.141592/180 LON_RAD
FROM
DUAL) LOC1,
(SELECT
51.023332*3.141592/180 LAT_RAD,
-1.872231*3.141592/180 LON_RAD
FROM
DUAL) LOC2;
DISTANCE
----------
21.20
You may notice that for the short distance between the latitude/longitude pairs that the “earth is flat” method produced the same answer that we see with the more complicated formula shown above.
For a double-check, let’s put the second set of demonstration points from my lesson plan into the T1 table:
INSERT INTO T1 VALUES (
'TEST P1',
NULL,
NULL,
19 + 48/60, ---- 119 + 48/60,
-(36 + 44/60)
);
INSERT INTO T1 VALUES (
'TEST P2',
NULL,
NULL,
88 + 30/60,
42
);
UPDATE
T1
SET
LAT_RAD=ROUND(LAT_DEG/180*3.141592654,6),
LON_RAD=ROUND(LON_DEG/180*3.141592654,6),
LAT_DEG=ROUND(LAT_DEG,6),
LON_DEG=ROUND(LON_DEG,6)
WHERE
LAT_RAD IS NULL;
COMMIT;
Checking the distance with the two formulas:
SELECT
ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P2') LOC2;
DISTANCE
----------
4832.09 ---5789.07---
—
SELECT
ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P2') LOC2;
DISTANCE
----------
4832.09 ---5789.07---
The values returned are identical to the value I wrote into the lesson plan, so the formulas are probably correct.
The OP’s second question is to identify all locations that are within a given radius of a specified location. This might be easy, or it might not. Someone in the Usenet thread suggested using Oracle Spatial, which is an additional cost licensed item that may be added to the Enterprise Edition for a list price of $17,500 per CPU plus $3,850 (USD) annual maintenance. As I am not familiar with the product, I do not know if Oracle Spatial could answer this question (does someone know for certain?).
We need some more test data to see if we can find a solution for the OP. We will use the SIN and COS functions just to insert some “random” yet repeatable data:
INSERT INTO
T1
SELECT
'POINT '||TO_CHAR(ROWNUM),
NULL,
NULL,
COS(ROWNUM/40)*180,
SIN(ROWNUM/33)*180
FROM
DUAL
CONNECT BY
LEVEL<=10000;
UPDATE
T1
SET
LAT_RAD=ROUND(LAT_DEG/180*3.141592654,6),
LON_RAD=ROUND(LON_DEG/180*3.141592654,6),
LAT_DEG=ROUND(LAT_DEG,6),
LON_DEG=ROUND(LON_DEG,6)
WHERE
LAT_RAD IS NULL;
COMMIT;
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER, TABNAME=>'T1', CASCADE=> TRUE)
Let’s give the first SQL statement a try, looking for those areas that are less than 500 miles away from “TEST P1”:
SELECT
LOC2.LOCATION,
ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
(ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
LOC2.LOCATION;
LOCATION DISTANCE
-------------------- ----------
POINT 1561 331.91
POINT 1562 46.45
POINT 1563 426.22
POINT 200 439.78
POINT 2066 318.73
POINT 2067 387.62
POINT 4970 432.93
POINT 7989 210.35
POINT 7990 268.25
POINT 8494 443.05
POINT 9855 258.38
POINT 9856 120.22
Rio de Janeiro 469.61
13 rows selected.
LOCATION DISTANCE -------------------- ---------- POINT 1358 428.75 POINT 1359 487.10 POINT 402 452.35 POINT 403 124.05 POINT 404 247.85 POINT 4675 452.49 POINT 4676 168.24 POINT 4677 306.37 POINT 4847 405.92 POINT 5206 449.08 POINT 5207 160.17 POINT 5208 367.93 POINT 5378 496.50 POINT 5379 341.63 POINT 5380 453.77 POINT 6714 282.04 POINT 6715 96.74 POINT 6716 484.03 POINT 8696 386.01 POINT 8697 77.56 POINT 8698 312.19 POINT 9652 420.05 22 rows selected.
And the second SQL statement:
SELECT
LOC2.LOCATION,
ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
(ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
LOC2.LOCATION;
LOCATION DISTANCE
-------------------- ----------
POINT 1561 331.91
POINT 1562 46.45
POINT 1563 426.22
POINT 200 439.78
POINT 2066 318.73
POINT 2067 387.62
POINT 4970 432.93
POINT 7989 210.35
POINT 7990 268.25
POINT 8494 443.05
POINT 9855 258.38
POINT 9856 120.22
Rio de Janeiro 469.61
13 rows selected.
LOCATION DISTANCE -------------------- ---------- POINT 1358 428.75 POINT 1359 487.10 POINT 402 452.35 POINT 403 124.05 POINT 404 247.85 POINT 4675 452.49 POINT 4676 168.24 POINT 4677 306.37 POINT 4847 405.92 POINT 5206 449.08 POINT 5207 160.17 POINT 5208 367.93 POINT 5378 496.50 POINT 5379 341.63 POINT 5380 453.77 POINT 6714 282.04 POINT 6715 96.74 POINT 6716 484.03 POINT 8696 386.01 POINT 8697 77.56 POINT 8698 312.19 POINT 9652 420.05 22 rows selected.
If you tried either of the above two SQL statements, you probably found that it took a couple of seconds to make it through the 10,004 or so rows in the table. All of the calculations are likely hammering the server’s CPU (go parallel and hammer more than one CPU?).
Maybe we should try submitting bind variables in place of some of the calculations that we are requesting for the server to perform? Let’s set up the bind variables to prepare another test:
SELECT
SIN(LAT_RAD),
COS(LAT_RAD),
SIN(LON_RAD),
COS(LON_RAD)
FROM
T1
WHERE
LOCATION='TEST P1';
SIN(LAT_RAD) COS(LAT_RAD) SIN(LON_RAD) COS(LON_RAD)
------------ ------------ ------------ ------------
.33873774 .940880834 -.59809181 .801427592
SIN(LAT_RAD) COS(LAT_RAD) SIN(LON_RAD) COS(LON_RAD) ------------ ------------ ------------ ------------ -.59809181 .801427592 .867765674 -.49697358 VAR SIN1_LAT_RAD NUMBER
VAR COS1_LAT_RAD NUMBER
VAR SIN1_LON_RAD NUMBER
VAR COS1_LON_RAD NUMBER
EXEC :SIN1_LAT_RAD := .33873774
EXEC :COS1_LAT_RAD := .940880834
EXEC :SIN1_LON_RAD := -.59809181
EXEC :COS1_LON_RAD := .801427592
EXEC :SIN1_LAT_RAD := -.59809181 EXEC :COS1_LAT_RAD := .801427592 EXEC :SIN1_LON_RAD := .867765674 EXEC :COS1_LON_RAD := -.49697358
To test the performance, execute the following scipt (more than once to eliminate the hard parse):
SET ARRAYSIZE 100
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10046';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
SELECT
LOC2.LOCATION,
ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
(ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
LOC2.LOCATION;
SELECT
LOC2.LOCATION,
ACOS(COS(1.570796327-LOC1.LAT_RAD)*COS(1.570796327-LOC2.LAT_RAD) + SIN(1.570796327-LOC1.LAT_RAD)*SIN(1.570796327-LOC2.LAT_RAD)*COS(LOC1.LON_RAD-LOC2.LON_RAD)) * 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
(ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
LOC2.LOCATION;
SELECT
LOC2.LOCATION,
ABS(ACOS(:SIN1_LAT_RAD*SIN(LOC2.LAT_RAD)+:COS1_LAT_RAD*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
(ABS(ACOS(:SIN1_LAT_RAD*SIN(LOC2.LAT_RAD)+:COS1_LAT_RAD*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
LOC2.LOCATION;
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
Trace file output for the first query (1.794 CPU seconds):
FETCH #406890544:c=1794011,e=1797194,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500817545497
WAIT #406890544: nam='SQL*Net message from client' ela= 357 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817545983
WAIT #406890544: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817546020
FETCH #406890544:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500817546058
STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1797194 us cost=13 size=25000 card=500)'
STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=125707 us cost=12 size=25000 card=500)'
STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=23 us cost=2 size=25 card=1)'
STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=17 us cost=1 size=0 card=1)'
STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=125683 us cost=10 size=12500 card=500)'
WAIT #406890544: nam='SQL*Net message from client' ela= 4359 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500817550490
CLOSE #406890544:c=0,e=14,dep=0,type=0,tim=2500817550613
Trace file output for the second query (1.529 CPU seconds):
FETCH #406890544:c=1528810,e=1519062,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500819083108
WAIT #406890544: nam='SQL*Net message from client' ela= 320 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819083487
WAIT #406890544: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819083524
FETCH #406890544:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500819083560
STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1519059 us cost=13 size=25000 card=500)'
STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=50452 us cost=12 size=25000 card=500)'
STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=15 us cost=2 size=25 card=1)'
STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=10 us cost=1 size=0 card=1)'
STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=50437 us cost=10 size=12500 card=500)'
WAIT #406890544: nam='SQL*Net message from client' ela= 3294 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500819086974
CLOSE #406890544:c=0,e=13,dep=0,type=0,tim=2500819087016
Trace file output for the third query (1.388 CPU seconds):
FETCH #406890544:c=1388409,e=1380995,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=2797247905,tim=2500820478110
WAIT #406890544: nam='SQL*Net message from client' ela= 278 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820478441
WAIT #406890544: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820478480
FETCH #406890544:c=0,e=46,p=0,cr=0,cu=0,mis=0,r=21,dep=0,og=1,plh=2797247905,tim=2500820478519
STAT #406890544 id=1 cnt=22 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=1380991 us cost=13 size=21500 card=500)'
STAT #406890544 id=2 cnt=22 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=45079 us cost=12 size=21500 card=500)'
STAT #406890544 id=3 cnt=1 pid=2 pos=1 obj=90695 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=12 us cost=2 size=18 card=1)'
STAT #406890544 id=4 cnt=1 pid=3 pos=1 obj=90696 op='INDEX UNIQUE SCAN SYS_C0038308 (cr=2 pr=0 pw=0 time=9 us cost=1 size=0 card=1)'
STAT #406890544 id=5 cnt=22 pid=2 pos=2 obj=90695 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=45046 us cost=10 size=12500 card=500)'
WAIT #406890544: nam='SQL*Net message from client' ela= 2713 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2500820481326
CLOSE #406890544:c=0,e=15,dep=0,type=0,tim=2500820481372
Maybe we should just assume that the world is flat to save time (note that I am not converting the degree values to radians)?
SELECT
LOC2.LOCATION,
SQRT(POWER((LOC1.LAT_DEG - LOC2.LAT_DEG),2) + POWER((LOC1.LON_DEG - LOC2.LON_DEG),2)) * 3960 * 3.141592654/180 DISTANCE
FROM
(SELECT
LAT_DEG,
LON_DEG
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_DEG,
LON_DEG
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
(SQRT(POWER((LOC1.LAT_DEG - LOC2.LAT_DEG),2) + POWER((LOC1.LON_DEG - LOC2.LON_DEG),2)) * 3960 * 3.141592654/180) < 500
ORDER BY
LOC2.LOCATION;
LOCATION DISTANCE
-------------------- ----------
POINT 1561 351.10
POINT 1562 49.17
POINT 1563 446.41
POINT 200 447.03
POINT 2066 327.96
POINT 2067 389.31
POINT 4970 436.30
POINT 7989 222.68
POINT 7990 274.47
POINT 8494 446.20
POINT 9855 272.91
POINT 9856 126.70
Rio de Janeiro 497.15
13 rows selected.
LOCATION DISTANCE -------------------- ---------- POINT 4847 487.345998 POINT 5207 194.867475 POINT 5208 371.803354 POINT 6714 306.023772 POINT 6715 110.18648 5 rows selected.
Still 13 rows returned, but the distances are a bit off due to the curvature of the earth. What happened to the other 17 rows… did I make a mistake, or is the world not flat?
Note that the original version of this article made a critical mistake – latitude values cannot exceed 90 degrees. While trying to implement my suggestion in a comment to use +-0.0002525254004999719 radians to speed up the query (note, still need to take into account the wrap-around for longitude values) I discovered inconsistent results. This article is corrected to fix that mistake.
Added June 14, 2011:
The speed up version of the query:
SET ARRAYSIZE 100 ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_10046-FAST';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
SELECT
LOC2.LOCATION,
ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960 DISTANCE
FROM
(SELECT
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION='TEST P1') LOC1,
(SELECT
LOCATION,
LAT_RAD,
LON_RAD
FROM
T1
WHERE
LOCATION<>'TEST P1') LOC2
WHERE
LOC2.LAT_RAD BETWEEN LOC1.LAT_RAD-(0.0002525254004999719 * 500) AND LOC1.LAT_RAD+(0.0002525254004999719 * 500)
AND LOC2.LON_RAD BETWEEN LOC1.LON_RAD-(0.0002525254004999719 * 500) AND LOC1.LON_RAD+(0.0002525254004999719 * 500)
AND (ABS(ACOS(SIN(LOC1.LAT_RAD)*SIN(LOC2.LAT_RAD)+COS(LOC1.LAT_RAD)*COS(LOC2.LAT_RAD)*COS(LOC2.LON_RAD-LOC1.LON_RAD)))* 3960) < 500
ORDER BY
LOC2.LOCATION;
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
The fetch required almost no CPU time, from the 10046 trace file:
PARSE #406878256:c=0,e=3157,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1311020123,tim=2586341558538
EXEC #406878256:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1311020123,tim=2586341558744
WAIT #406878256: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341558793
FETCH #406878256:c=15600,e=7387,p=0,cr=94,cu=0,mis=0,r=1,dep=0,og=1,plh=1311020123,tim=2586341566207
WAIT #406878256: nam='SQL*Net message from client' ela= 11647 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341577955
WAIT #406878256: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341578017
FETCH #406878256:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=12,dep=0,og=1,plh=1311020123,tim=2586341578052
STAT #406878256 id=1 cnt=13 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=94 pr=0 pw=0 time=7387 us cost=13 size=1950 card=39)'
STAT #406878256 id=2 cnt=13 pid=1 pos=1 obj=0 op='NESTED LOOPS (cr=94 pr=0 pw=0 time=3920 us cost=12 size=1950 card=39)'
STAT #406878256 id=3 cnt=1 pid=2 pos=1 obj=90785 op='TABLE ACCESS BY INDEX ROWID T1 (cr=3 pr=0 pw=0 time=23 us cost=2 size=25 card=1)'
STAT #406878256 id=4 cnt=1 pid=3 pos=1 obj=90786 op='INDEX UNIQUE SCAN SYS_C0038321 (cr=2 pr=0 pw=0 time=15 us cost=1 size=0 card=1)'
STAT #406878256 id=5 cnt=13 pid=2 pos=2 obj=90785 op='TABLE ACCESS FULL T1 (cr=91 pr=0 pw=0 time=3894 us cost=10 size=975 card=39)'
WAIT #406878256: nam='SQL*Net message from client' ela= 7604 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2586341585780
CLOSE #406878256:c=0,e=13,dep=0,type=0,tim=2586341585850
The above is not a finished product, it is still necessary to take care of wrap-around values for the longitude coordinates.
Recent Comments