MapInfo Pro is fast. It uses spatial indexing
as well as tabular indexing to speed up tabular (non-map) and
map queries. It also has a truly superior ability to align
coincident maps on screen on-the-fly, even if the map layers
have dramatically different map projection and coordinates.
At the same time, no one's ever satisfied that any software
is "fast enough", particularly when using tables of 100,000's or
1,000,000's of records. For some users, e.g.,
wireless telecommunications users, the total number of records
is not great, but coverage region or other spatial objects are
extremely complex, having 100,000's of nodes.
MapInfo Pro's built-in database engine can
manage record sets of many millions of records, exceeding the
capacity of Access, for example, by far. Its map rendering
and map object analysis operations allow us to work with
regions, for example, composed of 1,000's of individual
polygons, including deeply nested and overlapping polygons.
A single region object may be comprised of more than 1,000,000
nodes.
This discussion will outline some ideas for
how to improve our performance, especially when using large data
sets. It is a discussion based on user experience and has
not been reviewed or approved by MapInfo Corporation.
Map and non-map queries are both CPU and
hard disk intensive. This is your chance to justify a
faster PC.
Faster CPU speed is good.
Faster CPU and fast hard disk is better.
A 10,000 or 15,000 RPM disk will be noticeably faster than a 5400 RPM.
A caching controller is best but rare on a desktop system.
Laptops have notoriously slow disk access
times. Ie., the rotation speed may be good, but the data
access may still be slow. Use a desktop if possible.
Is your network or file server slow?
Copy the data to your local disk.
Work in MapInfo TAB/DAT
format
It's convenient to be able to open and map
tables opened in native XLS, Text, Shapefile, Access, RDBMS or
other formats. But processing is significantly faster in
MapInfo's native DAT format.
Use File > Save As ... to convert to TAB/DAT
format. Remember to open the newly saved file (as it does
not re-open automatically). XLS and plain Text format data
files are exceptionally slow.
Many
deleted records in your table? Pack the table before
executing your queries
Deleted records are still part of your
table, although grey-ed out. Your queries will process
faster without them. Packing the table removes these
records permanently. Warning: Beware of Pack-ing if you
have Workspaces referencing this table *and* the Workspaces include
custom text labels (labels you have manually moved, changed
fonts, deleted or otherwise adjusted). For more
info on Labels (MapInfo Pro 7.5 and earlier)...
Use Table > Maintenance > Table Structure
... to set tabular indexes for columns you'll query against.
If your table is in Access or DBF format,
it is not enough to create Access or DBF indices.
MapInfo won't use them.
Indexing helps most when columns hold
highly-varied data values: e.g., a unique ID column
vs. a "Customer/Non-customer" column.
Note: SGSI has found
cases where the effect of indexing columns is minimal.
Optimize
your query syntax: put the indexed column reference last
in your 'where' clause
For example, if column 'Sales' in table 'Customer_Points'
is Indexed:
Syntax option #1: (Slower)
Select * from Customer_Points, Cov_Regions where
Sales > 10000 and
Customer_Points.Obj within Cov_Regions.Obj
into
qSales10000_Within1
Syntax option #1: (Faster)
Select * from Customer_Points, Cov_Regions where
Customer_Points.Obj within Cov_Regions.Obj and
and Customer_Points.Sales > 10000
into
qSales10000_Within2
Use 'within' or 'contains'
instead of 'partly within', 'entirely within', or 'intersects'
The 'within' and 'contains' operators are
somewhat less precise than 'intersect'. etc. but *much* faster.
This is because they consider a region 'within' if its centroid
is within. Do you really need the extra accuracy of
'partly within' or 'entirely within'. On average, the
simple 'within' syntax produce very very similar results.
Run comparative tests with your data to see if the speed
benefits are worth the small reduction in accuracy.
If regions have very complex shapes, remove unneeded
detail
Telecom coverage maps commonly have hundreds of thousands of nodes and 10,000+ polygons in a
SINGLE region. Each extra node that defines the region edge
adds to the processing drain. Do you need them all?
You can use MapInfo's Object processing menu to, for example,
remove coincident nodes or generalize the region edges.
Warning:
'Thinning' can create data problems if you have
adjacent regions that are generalized differently.
Thinning can change your query results, because it
changes the literal shape of the edge. Hence a point on
the very edge may be *outside* a thinned region but *inside*
the un-thinned region (or v. versa).
Split
complex regions, even if you don't simplify them on the edges
Before running your queries, manually "split" complex region
shapes by creating a "cookie cutter" grid or other object
(cf the "Grid Maker" MBX in Tools > Tool Manager) then
use "Object > Split. You can automate this via a simple MBX that tests the number of nodes in a region, then splits it if
and only if the number of nodes
exceeds a threshold you set. SGSI customers (only) should
contact us for details and sample code.
Splitting does not change the shape of the
region overall, so your point-in-polygon operations will return
the same results. However split regions process much much faster
in practice. This is probably
because the point-in-polygon code will operate against much simpler (ie, split out pieces) regions.
Use the
'region overlay' feature of Vertical Mapper 3 instead of a
Mapinfo Pro spatial query
This feature can produce results up to 10x
faster than MapInfo's built-in vector overlay process. It
is particularly valuable when computing 'area proportion' values
based on the overlap of two complex region map layers. E.g,
if computing the population falling within a complex telecom
coverage region.
Vertical Mapper currently lists for
$1195/user and has many other features including 3D
visualization, line-of-sight/intervisibility, DEM and other grid
analysis, contouring.
Optimize
your query syntax: e.g., alter the order of table
references
For example: All these queries (below)
create the same results, but query syntax #3 is 50% slower.
Your results may be different, so experiment.
Syntax option #1: (Fast)
Select * from Cov_Regions, Customer_Points where
Cov_Regions.Obj Contains Customer_Points.Obj into
qRegionPoint_Contains
Syntax option #2: (As fast)
Select * from Cov_Regions, Customer_Points where
Customer_Points.Obj Within Cov_Regions.Obj into
qRegionPoint_Within
Syntax option #3: (Much slower)
Select * from Customer_Points, Cov_Regions where
Customer_Points.Obj within Cov_Regions.Obj into
qPointRegion_Within
Process
records in sub-sets
For example, let's say you have 2,000,000+
points that need to be assigned a County name, based on
location. The following queries will process at least 2x
as fast if you process the points in chunks of 200,000 records
at a time.
Select * from CountyRegions,
MillionsOfPoints where CountyRegions.Obj Contains
MillionsOfPoints.Obj
Using
"Table > Update Column..." when tables have many columns?
Two-table queries often join tables for
purposes of updating one table with values from the other.
E.g.,
Select * from Table1,
Table2 where Table1.ID=Table2.ID into
qMyJoinedTables
Update qMyJoinedTables Set Table1.Name = Table2.Name
This "Table Update" process can be
dramatically slower if Table1 and Table2 have many columns.
To improve performance:
Narrow overly-wide columns;
Permanently remove unneeded columns.
Adding
records or updating values one record at a time? Drop
indexes.
Drop the indexes, add/edit all records,
then re-build the indexes. Why? It's more
efficient for MapInfo to index an entire table once than to
repeatedly index an large number of individual records.
Adding or
editing records? Set FastEdit ON (Advanced users only)
When table records are edited or added, MapInfo
builds a separate transaction file holding the edit. This
file disappears when you explicitly Save. By turning
FastEdit ON, you command MapInfo to skip the transaction file
and make all changes directly to the master table.
Use the MapBasic window to issue this
command:
Set Table [tablename] FASTEDIT ON
Note:
1. FASTEDIT setting apply to a single table. Issue
separate commands for faster editing with multiple tables.
2. Once ON, FASTEDIT remains ON until the table is
closed or you explicitly reverse things via the following
command:
Set Table [tablename]
FASTEDIT OFF
3. Without a
transaction file, you lose the option to Revert your edits.
This is a risk.
The speed at which the screen refreshes
certainly depends in part on the system hardware. Here,
though, we will mention issues that are more
user-controllable.
Map
layers can cover huge areas without hurting performance
MapInfo's spatial indexing is good, so it
takes approximately the same time to re-draw a neighborhood
street map whether the street map layer is state-wide or only
county-wide. MapInfo Pro does not spend any energy
drawing street lines or other layer features outside the
screen view.
Complex
map features can hurt performance
MapInfo spends energy drawing an entire map
region or other feature even if only one small part of it is in
view. This is particularly noticeable with MapX and
MapXtreme, when drawing highly-complex region shapes with 10's
of thousands of nodes. For best re-draw speed, split these
regions into separate records and limit the maximum number of
nodes.
Map
features with a display style of "None" still take work to
draw
Sometimes you need a map layer to be
"Visible" only so that the text labels are drawn. You can
make the map lines or regions invisible, but still see the text
labels, by setting the layer Display Style to "None".
However, the map lines and regions are still manipulated
internally in memory by the code, whether visible or not.
For best performance, be sure the zoom layering for the map
shapes is narrowed to just the zoom ranges set for the text
labels.
Two other options:
(Warning: Test only on a backup copy)
1. Update the map features, converting lines, regions, etc to
points: e.g., in the MapBasic window,
Update [tablename]
Set Obj =
CreatePoint(CentroidX(obj), CentroidY(obj))
2. Use the Labeler tool in
Tool Manager to create and save a permanent "text-only" layer,
dropping the shapes entirely.
Put records in ascending zip code order.
Mapmarker's address dictionary is apparently in zip code order.
Thus, if your input address table is also in zip order, the
processing will be *much* quicker.
Don't
create MapInfo point objects
Fill lat-lon columns initially. Afterwards,
use MapInfo's Table > Create Point menu option to create the
point symbols.
Relaxing ZIP Codes slows performance
When ZIP Code matching is
relaxed, MapMarker searches a wider area for possible matches so must
consider more records in the address data dictionary.
Relaxing
City Name slows MapMarker only slightly
With city names relaxed,
MapMarker evaluates more records in its data dictionary.
It evaluates street addresses with the specified ZIP Code for
all cities, not just the the city name in the input address.
Most users will relax city name because city names used by the
USPS often are different from the city names used 'on the
street' or the legal incorporated city name.
Relaxing Street Name slows down MapMarker
If street name is
'relaxed', MapMarker considers all street names that sound alike
(soundex).
Relaxing
House Number doesn't hurt performance much
Avoid
matching to street Intersections
If you opt to Match to Street Intersections,
MapMarker must compare two street names for every record, not
one. Choose this option only if your table contains a large
number of street intersections.
If you must geocode to intersections, do this
as a second pass. Ie., geocode all possible records
without it, then turn on intersection matching specifying 'only
unmatched records'.
Note: Remember that Street1 and Street2 must be contained
in the same field and be separated by an "&&". If you use
a single ampersand, expect strange results.
Don't even
think about leaving Mapmarker address files on the MapMarker
install CDs
No explanation needed.
Don't ask
MapMarker to fill in Output columns
Also, don't request MapMarker to blank out columns if address
in not geocodable. As you'd expect, the less writing to disc,
the faster.
Delete MapInfo indexes if
geocoding DBF-format tables Even if the indexed column is *not* relevant to geocoding,
the mere presence of a single indexed column in the DBF table
slows geocoding by a factor of 10x+.
There are always those steps we try which
we think may help, but which don't. Here are some things
not worth trying.
Don't
combine/merge adjacent regions
You might think that reducing the number of
regions would help. E.g., if your telecom coverage data
defines antennae sectors as separate regions, you could use
Object > Combine to merge them. Don't. It is the
complexity of each region that makes the most difference.
Merging sectors will slow down processing because it results in
even more complex regions.
Don't
bother disaggregating complex regions
A region can be composed of one polygon or
thousands of them. The Object > Disaggregate command lets
us convert a highly complex region into a set of simpler regions
with one polygon each. Don't bother. It doesn't
speed things up.