XML, GIS and You
by Adena Schutzberg




XML is here! And yet it may be nearly invisible to most of us. XML is a behind-the-scenes language used to move data between applications. In fact, it's very likely the transaction you made at the bank or the book you purchased online caused some XML to be written and passed between applications.

XML Basics

In the world of GIS we are still a ways off from extensive adoption of XML, as researchers and vendors explore and define how XML will be used. There are two subtle but keenly important differences between HTML (what web pages are written in) and XML (a language that holds data). The first is purpose. A particularly clear explanation comes from Software AG (makers of Tamino, an XML-based database). The company explains the distinction this way: The motto of HTML is: "I know how it looks," whereas the motto of XML is: "I know what it means, and you tell me how it should look." Said another way, HTML is about making pretty presentations (bold text, neatly formatted tables, etc.) while an XML document is a semi-structured document that holds "content."

What type of content? In GIS, XML might be used to define a query. A query in XML would not, in substance, be different from what we ask of a GIS today: "Take the area I describe below and please buffer it. Then, send me the answer." The response might also be delivered in XML. The content of the reply, again, might not be so different from a reply we see today: "Here's where to look for the bitmap showing the answer." The difference with XML is that the data (the geometry in question) and the request (in the example here, buffer) are included in a single XML document. The reply, basically a pointer to the bitmap, is also an XML document. Further, these documents are plain, readable text and the format is defined.
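The buffer request and its reply described above can be sketched in a few lines of Python. This is only an illustration: the tag names (Request, Buffer, Polygon, Reply, Image), the helper functions and the example.com address are all made up for this sketch and do not come from any vendor's format.

```python
import xml.etree.ElementTree as ET

def build_buffer_request(coords, distance):
    """Build one XML document holding both the geometry and the operation.

    All tag and attribute names here are hypothetical.
    """
    request = ET.Element("Request")
    operation = ET.SubElement(request, "Buffer", distance=str(distance))
    polygon = ET.SubElement(operation, "Polygon")
    polygon.text = " ".join(f"{x},{y}" for x, y in coords)
    return ET.tostring(request, encoding="unicode")

def read_reply(reply_xml):
    """Return the pointer to the result bitmap from a hypothetical reply."""
    reply = ET.fromstring(reply_xml)
    return reply.find("Image").get("href")

# The request carries the area to buffer and the buffer distance together.
request_xml = build_buffer_request([(0, 0), (10, 0), (10, 10), (0, 0)], 25)

# A made-up reply: nothing more than a pointer to the bitmap.
reply_xml = '<Reply><Image href="http://example.com/maps/buffer123.png"/></Reply>'
image_url = read_reply(reply_xml)
```

Both documents are ordinary text, so they can be inspected in any editor before or after they travel between applications.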

ESRI chose XML to store metadata in ArcInfo 8. The data is in a raw format, and depending on your needs can be displayed in different ways. This highlights the idea that XML is about content; display is a separate issue.



The second difference between XML and HTML revolves around "extensibility"; this is the source of the "X" in XML. Extensible means that unlike HTML, which has a limited number of "tags" (that describe bold and blue and table and so on), XML supports the addition of custom "tags." So, if the GIS data that you want to describe in XML includes an attribute called "pH," you can create a tag in XML to hold that attribute information.
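As a rough illustration of a custom tag, the Python sketch below writes a record with a "pH" element and reads it back. The SoilSample and Location tags, the id value and the numbers are invented for the example; only the idea of a user-defined "pH" tag comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical soil-sample record; none of these tag names come from a standard.
sample = ET.Element("SoilSample", id="S-17")
ET.SubElement(sample, "Location").text = "42.36,-71.06"
ET.SubElement(sample, "pH").text = "6.8"  # the custom tag holding the attribute

sample_xml = ET.tostring(sample, encoding="unicode")

# Any XML parser can read the custom tag back without special support.
ph_value = float(ET.fromstring(sample_xml).findtext("pH"))
```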

With GIS use distributed among so many industries, it is easy to imagine a whole host of different XMLs for GIS. And that might, in turn, create a new "Tower of Babel." XML has a few ways to provide "documentation" or "metadata" describing what's inside the XML document and where to find specific entries of interest (DTD and RDF are two). Because of this "documentation," an XML document can be fed into many XML-savvy applications and be used.

The real question is: can a document written in my flavor of XML be sent along to a second application and be understood? The answer is yes, so long as there is agreement on some basic things about the XML document. Certain fields may be required for the document to be understood and acted upon by the target application. Other "fields" holding information the application does not need or does not understand will be ignored.
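The behavior just described, where required fields must be present and unfamiliar ones are ignored, might look like the following Python sketch. The Feature document, its field names and the read_feature helper are hypothetical; a real application would take its required fields from whatever agreement the parties have made.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields the target application needs in order to act.
REQUIRED = ("Name", "Location")

def read_feature(xml_text):
    """Collect the required fields; anything else in the document is ignored."""
    root = ET.fromstring(xml_text)
    record = {}
    for tag in REQUIRED:
        value = root.findtext(tag)
        if value is None:
            raise ValueError(f"missing required field: {tag}")
        record[tag] = value
    return record

# The sender included a custom <pH> field this application does not know about.
doc = "<Feature><Name>Well 4</Name><Location>42.36,-71.06</Location><pH>6.8</pH></Feature>"
record = read_feature(doc)
```

The unknown pH field does no harm here: the reader simply never asks for it.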

XML and Standards

How does the GIS community develop these standards? The OpenGIS Consortium (OGC) has been working on a basic set of XML tags for GIS. The effort proceeded in parallel with the Web Mapping Testbed initiative. The result was a recommendation to use GML (Geography Markup Language) as a standard. GML has yet to be adopted as a specification because some of the more advanced parts are not fully mature. Still, this first pass covers simple features (OGC speak for basic geometry) and provides a well-defined starting point.
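To give a feel for what a simple feature looks like in practice, here is a Python sketch that reads a point written in the style of GML. The element names and namespace address follow the GML 2 convention and are an assumption here; the draft discussed above may differ in its details, and the coordinates are made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespace, following the GML 2 convention.
GML_NS = {"gml": "http://www.opengis.net/gml"}

# A made-up point; the coordinate values are for illustration only.
gml_text = (
    '<gml:Point xmlns:gml="http://www.opengis.net/gml">'
    "<gml:coordinates>-71.06,42.36</gml:coordinates>"
    "</gml:Point>"
)

def point_from_gml(text):
    """Return (x, y) from a GML-style Point element."""
    root = ET.fromstring(text)
    coords = root.findtext("gml:coordinates", namespaces=GML_NS)
    x, y = (float(value) for value in coords.split(","))
    return x, y

point = point_from_gml(gml_text)
```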

Galdos Systems, Inc. is one of GML's advocates. Ron Lake of Galdos points out that GML is special because you can read it: it's text-based, rather than proprietary. Further, he notes, there are all sorts of other things that become available when we use XML to move information: "We have XML technologies for drawing and visualization (like SVG, VML and X3D), we have XML technologies for data transformations (XSLT), we have XML technologies for schema expression (XML 1.0 (DTD), XML Schema, RDF/S), we have XML technologies for expressing queries (XPointer, XQL) and we have technologies for building large linked data sets (XLL)." And, with work being done to ensure XML can be run on many platforms, like PDAs, XML opens many doors.

As GML grows, more tags will be added. Lake likes to see GML as "something that is comprehensive, but such that [a user] need only support what you need for your own problem space." Vendors are able to add to it for their own use, as many have already begun to do. This nicely parallels OGC's commitment to keeping its specifications separate: vendors may choose to implement only the ones they feel are important for their users; they need not implement them all.

Viewing Maps in XML

The other half of the story for XML in GIS involves the display of XML graphics. Since XML is about text, to show vector graphics, like those typical of GIS, we need something more. Several groups have been working on XML-based solutions for vector graphics. Perhaps the most talked about was VML, Vector Markup Language. Microsoft, to date, has done the most with VML. Office 2000 will create graphics in VML from Word and PowerPoint. On the viewing side, Internet Explorer 5.0 is the only program capable of viewing the graphics. Autodesk has recently announced an extension to AutoCAD Map that will output VML graphics.

In August, SVG, Scalable Vector Graphics, reached candidate recommendation status in the W3C--meaning it's available for comment and implementation. SVG is well on the way to wide use with a host of tools to create, edit and view it already available.

What's Happening with GIS Vendors

Dany Bouchard, of DBx Geomatics in Quebec, is an early adopter of SVG for GIS. He's written soon-to-be-released software to export MapInfo data into SVG. Says Bouchard, "As XML is being deployed on the internet, we should start to see many programs reading and storing spatial data in GML. SVG could be used to present GML data to the user by on the fly conversions using XML transformations (XML Schema, XSL Transformations (XSLT), etc.)." The best part of SVG, he argues, is that it is "light years ahead of conventional bitmaps!" as a display mechanism.
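The kind of on-the-fly conversion Bouchard describes could be sketched as follows. Note that this example uses plain Python rather than XSLT, and that the points_to_svg helper, the canvas size and the sample points are assumptions made for illustration. The svg and circle elements and their attributes are standard SVG.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def points_to_svg(points, width=200, height=200):
    """Draw each (x, y) point as a small circle in an SVG document."""
    svg = ET.Element("svg", xmlns=SVG_NS, width=str(width), height=str(height))
    for x, y in points:
        # SVG's y axis runs downward, so flip y to keep north at the top.
        ET.SubElement(svg, "circle", cx=str(x), cy=str(height - y), r="3")
    return ET.tostring(svg, encoding="unicode")

# Two made-up points in the same units as the canvas.
svg_text = points_to_svg([(10, 20), (50, 50)])
```

Because the result is vector markup rather than a bitmap, the viewer can zoom in without the map turning into blocks.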

ESRI invested heavily in XML while developing ArcIMS 3.0. Their version, ArcXML or AXL, "has been in development for three years and includes more than 20 person years of effort," according to David Maguire, Director of Product Planning at ESRI. AXL will eventually be a superset of GML.

AXL provides the glue to link the parts of ArcIMS. For those using ArcIMS who are not programmers, there is no need to "learn" AXL; it is written automatically. AXL stores configurations and display information for web applications as well as playing a part in the movement of data from the server to the client. The response from a service might be AXL describing candidate geocoding points along with their x,y coordinates, a pointer to a zip file of shape files or the location of raster map images.

Part of the reason that XML seems a bit hidden from the casual user is that to programmers it's pretty much old news. They've seen things like it, are used to HTML and have known about its potential far longer than the user community.

Sandra Johnson, who's active in the OpenGIS Consortium, notes that GML is the most concerted effort to put XML into GIS. It provides all of the "basics" one might need to describe points, lines and polygons. But, she notes, if you want to go further, you will need to add on. "For example, if you want to build or use topology," she explains, "you'll need to extend GML." For now, however, GML is a good starting point, a common ground for those embarking on an XML-based solution.

Whether the client side of an internet mapping solution is thin, like a web browser, or thick, like a web-enabled desktop product such as MapInfo Professional, XML and its associated technologies can be used to transfer geographic data. The size of the client will determine the method used to display the map to the end user: Windows graphic calls, Java graphic calls, or possibly an SVG plug-in.

GIS data storage in XML didn't seem a likely scenario to either Johnson or Steve Talbot of MapInfo. They suggest that there's no reason to fix something (proprietary storage) that's not broken. However, it's a straightforward process on either the client or server side to create XML to share with others. Their picture of the future, then, is that GIS data will be stored in its current form, and be transformed to and from XML as needed. Talbot makes it clear that XML can put GIS in the same realm as traditional information technology. MapInfo, like ESRI, is more interested in XML as a way of presenting interfaces to high-level GIS services than as a data distribution method. XML can allow a "find the nearest" or "give me directions" service to be presented on a retailer's site, according to Talbot, while the backend grunt work is done elsewhere. This is what XML really brings to the GIS table. He summarizes it this way: "It should be as easy to add a GIS service to a website as it is to add a stock quote." And it's XML that will make this happen.

Conclusions

XML is already sneaking into GIS, especially behind the scenes in Web applications. GIS end users, and even those setting up Web mapping solutions, may not come into close contact with XML or know about the work to standardize GML. They will, however, start to see that moving data between applications becomes much easier.

As adoption continues, the apparently quick answer to a Web surfer's geospatial query will not tell the tales of a trip to several servers to produce the "answer." Unbeknownst to the surfer, nearly invisible XML may have made the response possible.

About the Author:

Adena Schutzberg is an independent consultant and Mapping/GIS Editor at TenLinks.com. She is a frequent contributor to CAD, GIS and surveying publications. She can be reached at adena@abs-cg.com.

Images courtesy of ESRI