Archive for March, 2008

Alfa Test Drive Implementation Details

With the Alfa Test Drive officially underway, there were a few details leading up to this event that are worth talking about.

First of all, in all of the zoom levels I produced over the last couple of days, I didn’t actually have one that showed all areas when you fired up Google Earth.  So I produced one, and called it, elegantly enough, Z.kml, which showed a statistically representative number of wells, at a scale slightly smaller than Z0.kml.  All I had to do was modify doc.kml to link it in.

Speaking of which, doc.kml and Z.kml had gotten fairly large (three megabytes and one megabyte, respectively), so I re-ignited the KML vs. KMZ debate.  Lesson learned (I think): do not name the KMZ doc.kmz.  Anything but doc.  There seems to be no end of confusion stemming from this (human and otherwise), so I eventually relented and simply called these files BC.kml and BC.kmz, respectively, and all seemed to be well after that.

Believe it or not, I ran into all sorts of problems trying to transfer these files to the web server with FileZilla.  Eventually, I just deleted the entire destination directory, and re-uploaded all of the data again.  The weird thing was that the doc.kmz file would be transferred to the web server, but it would not have ownership or privilege information shown, and it would subsequently disappear.  The delete/reload seemed to cure it, but it was an odd one, for sure.
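For anyone weighing up the same KML vs. KMZ question: a KMZ file is simply a zip archive with a KML file inside it.  The following is a minimal sketch of how such a file could be packaged, not a record of how BC.kmz was actually built.  The function name kml_to_kmz is made up for illustration, and storing the KML under its own file name inside the archive is an assumption.

```python
import zipfile
from pathlib import Path


def kml_to_kmz(kml_path, kmz_path):
    """Compress one KML file into a KMZ (zip) archive and return the KMZ path.

    Hypothetical helper: the name and the choice to keep the original
    KML file name inside the archive are assumptions for illustration.
    """
    kml_path = Path(kml_path)
    kmz_path = Path(kmz_path)
    with zipfile.ZipFile(kmz_path, "w", compression=zipfile.ZIP_DEFLATED) as kmz:
        # Store the KML without any directory components.
        kmz.write(kml_path, arcname=kml_path.name)
    return kmz_path


# Example call (file names taken from the post):
# kml_to_kmz("BC.kml", "BC.kmz")
```

Since KML is verbose XML, deflate compression would be expected to shrink a multi-megabyte file considerably, though the actual ratio depends on the data.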

Once the Test Drive post was out, I seeded the test by forwarding link information to each of the four jurisdictions from which I’m trying to obtain data, as well as a select list of others that I think will be interested in this project (the list is being kept in Intellog - Alfa Test Drive distribution list).

Posted on 20th March 2008
Under: Developers' Journal | No Comments »

Alfa* Test Drive

You can now take a look at what we’ve been up to recently.  It’s not even a beta, let alone a test, but rather an early opportunity to see what we’ve been working on to this point; hence we’re calling it the Alfa* Test Drive.

If you haven’t done so already, you will need to install Google Earth (GE) — complete directions for downloading and installing can be found here, and it’s entirely free.  You only have to install GE once, before you look at the data the first time.  When you have successfully installed GE, click here to start the Alfa Test Drive.  This is also entirely free.

GE will launch, and the globe will orient so northeastern BC is closest to you.  This area will be coloured in with small red, green and white dots.  You can zoom into this area to reveal more detail.  The first time you zoom in, you might notice a slight delay as more detail is loaded.  If you need help with GE itself, the User Guide is very helpful, along with the video tutorials.

The Test Drive data is for BC only, and it’s development data, so neither accuracy nor completeness is guaranteed.  It will, however, give you an overall feel for how access to the data will work.  Of course, we are currently working with the respective agencies from Alberta, Saskatchewan and Manitoba in order to get wells in those jurisdictions on Intellog.

If there are others that you feel would be interested in taking a look, by all means, send them the link to this post, or if you prefer, you can just send them the following link (but make sure they know they’ll need to install GE first):

http://www.intellog.com/E/data/well/v0.0/kml/BC.kmz

You are encouraged to leave your comments in response to this post, and we appreciate them.  You can be sure that any issues or concerns you raise will be promptly followed up.  Thank you very much for your interest!

*In case you’re wondering, this is not a misspelling, but rather, a tribute to one of the last Grand Marques, Alfa Romeo.  I just thought it fit in with the whole ‘test drive’ theme.

Posted on 20th March 2008
Under: Business Development | 8 Comments »

Site Taxonomy

In pursuit of a preview release, it’s now important to think a little about the taxonomy of the Intellog website.  The logic I’ve come up with, at least for the time being, is described below;

  • www.intellog.com/E  Intellog will serve multiple industries, so the top level of the taxonomy defines an industrial domain; E being energy, with others to be defined in the future.  By convention, this will be capitalized, and limited to one or two letters.  Data and applications that are common between industrial domains will be found in the folder base, which, by convention, is lower case.  The use of an upper case, tight abbreviation clearly distinguishes these folders from others that can be found at this level (e.g. intellog.com/blog).
  • www.intellog.com/E/data  Each industrial domain will be served by either data or applications (or both), so the taxonomy basically takes one of two paths at this point.  The one being discussed here is data, but in the future, there will be an apps folder as well.  The only other possibility may be a folder called docs, that will contain documentation.
  • www.intellog.com/E/data/well  It’s assumed that each industrial domain will have different subject area subdomains addressed in the database, so the major subject area is the next level of the taxonomy.
  • www.intellog.com/E/data/well/v0.0  Version number is found below the subdomain, as it is entirely possible there will be different version numbering sequences for each subdomain.  For example, well data may be at v0.5, and weather may be at v0.4.  The v prefix, by convention, is lower case.
  • www.intellog.com/E/data/well/v0.0/kml  Finally, below the version folder, will be a folder named for the format of the data found within that folder.  kml, for example, stands for Keyhole Markup Language, which is the format consumed by Google Earth.   It should be assumed that all data within a version folder will be in synch, regardless of format.
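
The levels described above can be sketched as a small path builder.  This is only an illustration of the convention, assuming a hypothetical function name (data_url) and hypothetical argument names; the checks simply encode the rules stated in the list (one or two capital letters for the domain, lower case v prefix for the version).

```python
def data_url(domain, subject, version, fmt, filename=None, host="www.intellog.com"):
    """Build a data URL following the taxonomy described in the post.

    Hypothetical helper: names and validation rules are assumptions
    drawn from the conventions listed above.
    """
    # Industrial domain: one or two capital letters, e.g. "E" for energy.
    if not (1 <= len(domain) <= 2 and domain.isalpha() and domain.isupper()):
        raise ValueError("domain must be one or two upper case letters")
    # Version folder: lower case "v" prefix, e.g. "v0.0".
    if not version.startswith("v"):
        raise ValueError("version must start with a lower case 'v'")
    parts = [host, domain, "data", subject, version, fmt]
    if filename:
        parts.append(filename)
    return "http://" + "/".join(parts)


print(data_url("E", "well", "v0.0", "kml", "BC.kmz"))
```

With the arguments shown, the function reproduces the Test Drive link, http://www.intellog.com/E/data/well/v0.0/kml/BC.kmz.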

I could spend a ton more time thinking about this, but I suspect that will introduce complexity that is neither wanted nor needed, so let’s leave it at that, for now.

Posted on 20th March 2008
Under: Developers' Journal | 2 Comments »

Improving the Performance of E.WellDetail(INT)

I did an initial test on the full Z1 layer by generating Folder / NetworkLink combos using Untitled3.sql, and then editing them by hand into doc.kml.  Performance appears to be very good so far.  So good, in fact, that I think it’s fair to say ‘job done’ on paging, at least for the time being.  I’m satisfied now that a data set of virtually limitless size can be served directly off the website, and performance will be just fine.

One substantial activity still remaining is to complete a detailed pop-up when the user clicks on a symbol at the tightest zoom.  In pursuit of this, I discovered one minor gap in the data, which was the lack of a verbose well name in E.WellDetail(INT).  I tried to rectify this by including a reference to E.IdentityCurrent, but performance absolutely fell off a cliff, from 20-30 seconds to over six minutes.  So a rewrite/rethink of E.WellDetail(INT) was in order.  Mostly, this function simply makes reference to E.WellStatus, which turned out to be the real villain.

The resolution to this issue came from a most unexpected place; namely, scalar-valued functions.  E.WellStatus was really just a series of subqueries to the Event table structure, facilitated by the E.WellEventCurrent view.  To get around this problem, a number of scalar-valued functions were created: E.getWellEventNm(INT, UNIQUEIDENTIFIER), E.getWellEventUdt(INT, UNIQUEIDENTIFIER), E.getWellIdentityTxt(INT, UNIQUEIDENTIFIER) and E.getWellSymbolUrlTxt(UNIQUEIDENTIFIER).  Using E.getWellEventNm as the example, it returns the current name for the set of event names defined by id_TypeCollection, for a given well, identified with the UNIQUEIDENTIFIER parameter.  All of these functions work in a similar fashion.

E.WellDetail(INT) was then rebuilt using these scalar-valued functions, and performance was dramatically faster.  Generation of the KML pages, which is heavily based on E.WellDetail, went from a couple per minute to one every couple of seconds.  And in the end, the original objective was achieved, which was to get the verbose well name into this table-valued function.

Scalar-valued functions are not the whole story, however.  I added indexes to E.identifies on uid_Well and id_Type (wellIdentifiesIdx and typeIdentifiesIdx, respectively), which had a significant impact on the performance of the scalar-valued functions described above.

On a related note, I ran into the first gotcha with TOAD…pretty much all of the index management capabilities are bundled into the ‘commercial’ version.  Hmm.  It’s still a pretty effective tool, but like most good things in life, it really isn’t free.

With the generation of all the KML pages complete, I transferred all of the page files to the server, and gave the full BC data set a test drive — seems to work pretty well.  Certainly well enough to allow a test drive by a small community of alpha testers.

Posted on 20th March 2008
Under: Developers' Journal | No Comments »

Logo Merchandise Update

The much-anticipated Intellog logo merchandise is making steady progress through the development phase.  The sample logos came today from the manufacturer, and they look pretty good, although I’m going to ask if they can make the logo a little bit darker still.  But overall, the quality looks absolutely great.  In case you’re wondering, the first product will be an organic cotton, made in Canada, beanless hat.   Hard to say exactly when they will be ready, but keep an eye on The Intellog Blog for updates. 

And my apologies, in advance, for the crappy picture quality.

Posted on 19th March 2008
Under: Business Development | No Comments »

Regenerating All Zoom Levels with Modified E.getWellGroupKml

There are wells in the database that had ‘1’ as the first character of the label, and yet, these were NTS wells.  This should never occur, at least not to the best of my knowledge, so it’s obviously some sort of problem with my load logic.  I decided to purge these before I ran the scripts to generate all the KML pages.  If I’m going to turn end users loose on the ‘alpha’ data set, this bad data might just confuse them.

With this out of the way, I created a utility (tentatively named Untitled2.sql) which generates the sqlcmd executable code.  You have to run it once for each zoom level, but that seems like a fairly minor compromise at this early stage.  I ran the resulting code into (and over) the night, and by early today, I had nearly all of the pages created.  There are a couple of hundred Z0 pages to go, but they will be finished in the next couple of hours.

For those that think 10-12 hours of processing time will be onerous (after all, don’t the pages have to be rerun every time the data is updated?), it’s not, actually.  That’s because the underlying database makes rigorous use of date and time stamps, so when data is updated from source, you will only have to regenerate those pages that are impacted by the update.  The other interesting thing is that it seems you can have multiple scripts running at the same time, in separate windows, and overall throughput is higher than running everything serially through one window.

While those scripts were running, I started work on the code (tentatively called Untitled3.sql) which will generate the Region/NetworkLink combos to go into doc.kml, and control the calls to the KML pages discussed above.  At 2000+ page files, just for the BC data, it’s vital that doc.kml be generated automatically.  Interestingly enough, it’s just a reformatting of the data coming out of dbo.temp1, and in turn, base.ZoomLevel, which is also used to generate the sqlcmd scripts.  I’ve got to think of better names for all of these things.

Yesterday, I mentioned I finally relented from my all Microsoft stance (more-or-less) and basically dumped SQL Server Management Studio in favour of TOAD.  My first experience has been excellent.  Everything is slightly different than Management Studio, but you pretty quickly forget the differences, and just focus on the task at hand.  Frankly, if it eliminates the BSODs, that’s worth the price of admission alone.  To date, I’ve only found one thing which still seems to be implemented better in Management Studio, which was modification of table structures.  But I expect it’s my lack of familiarity with TOAD.  There was also a slight problem with modifying table-valued functions, but again, I suspect that’s my lack of experience with the tool.
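
To make the time stamp idea concrete, here is a minimal sketch of how stale pages might be picked out.  It assumes each page has a recorded generation time and each updated well can be mapped to the page it appears on; the function and argument names (pages_to_regenerate, page_generated_at, well_updates) are made up for illustration and do not reflect the actual database structure.

```python
from datetime import datetime


def pages_to_regenerate(page_generated_at, well_updates):
    """Return the sorted ids of pages that are out of date.

    page_generated_at: dict of page id -> datetime the page was last generated
    well_updates: iterable of (page id, datetime the well was last updated)
    Both structures are hypothetical stand-ins for the real tables.
    """
    stale = set()
    for page_id, updated_at in well_updates:
        generated_at = page_generated_at.get(page_id)
        # A page is stale if it was never generated, or if any well on it
        # changed after the page was last written.
        if generated_at is None or updated_at > generated_at:
            stale.add(page_id)
    return sorted(stale)


generated = {"p1": datetime(2008, 3, 19, 2, 0), "p2": datetime(2008, 3, 19, 3, 0)}
updates = [
    ("p1", datetime(2008, 3, 18, 23, 0)),  # older than the page: still current
    ("p2", datetime(2008, 3, 19, 9, 0)),   # newer than the page: stale
    ("p3", datetime(2008, 3, 19, 9, 0)),   # page never generated: stale
]
print(pages_to_regenerate(generated, updates))
```

Only the pages touched by an update come back, so a small data refresh would mean a small regeneration run rather than the full 10-12 hours.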
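
As a rough illustration of what generating the Region/NetworkLink combos involves, the sketch below builds the KML with Python’s standard library rather than SQL.  The element names (NetworkLink, Region, LatLonAltBox, Lod, minLodPixels, maxLodPixels, Link, href, viewRefreshMode) are standard KML; the function names, the page file name and the default pixel values of 800 and 1600 (borrowed from the zoom testing entry) are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def network_link(name, href, south, north, west, east,
                 min_lod_pixels=800, max_lod_pixels=1600):
    """Build one NetworkLink element whose Region controls when it loads."""
    link = ET.Element("NetworkLink")
    ET.SubElement(link, "name").text = name
    region = ET.SubElement(link, "Region")
    box = ET.SubElement(region, "LatLonAltBox")
    for tag, value in (("north", north), ("south", south),
                       ("east", east), ("west", west)):
        ET.SubElement(box, tag).text = str(value)
    lod = ET.SubElement(region, "Lod")
    ET.SubElement(lod, "minLodPixels").text = str(min_lod_pixels)
    ET.SubElement(lod, "maxLodPixels").text = str(max_lod_pixels)
    target = ET.SubElement(link, "Link")
    ET.SubElement(target, "href").text = href
    # Load the linked page only when its Region becomes active.
    ET.SubElement(target, "viewRefreshMode").text = "onRegion"
    return link


def build_doc(pages):
    """Wrap one NetworkLink per page in a KML Document and return it as text."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    document = ET.SubElement(kml, "Document")
    for page in pages:
        document.append(network_link(**page))
    return ET.tostring(kml, encoding="unicode")


# Hypothetical page entry; in practice these rows would come from the database.
pages = [dict(name="page-56-120", href="page-56-120.kml",
              south=56.0, north=56.0625, west=-120.125, east=-120.0)]
print(build_doc(pages))
```

With a couple of thousand pages, the list passed to build_doc would simply have a couple of thousand entries, which is the whole argument for generating the file rather than editing it by hand.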

Posted on 19th March 2008
Under: Developers' Journal | 1 Comment »

Zoom Testing and Adjustment

I decided to be a bit more methodical, starting with the tightest zoom, which defined an area of roughly 50 square kilometres, and adjusted the level of detail (LOD) so it occupied about 80% of the display area.  This came to a minLodPixels of 800.  In this exercise, the 50 square kilometres covered latitudes 56.0000 through 56.0625, and longitudes -120.000 through -120.125.  Once I had this working, I doubled the dimensions of the area being displayed (latitudes 56.000 through 56.125 and longitudes -120.00 through -120.25), and made sure these new dimensions occupied the entire screen.  A lower level of zoom can then be turned off once it has doubled in size on screen; that is, minLodPixels of 800 and maxLodPixels of 1600.

I repeated this process until I reached the largest area I wanted on one KML page, for a total of seven zoom levels.  These zoom levels and dimensions were all recorded in the table base.ZoomLevel, and organized into a group using base.ZoomLevelCollection.  Once I had the dimensions set as described above, I went back through base.ZoomLevel, and statistically reduced the number of symbols being displayed in the larger areas.  For example, at the lowest (loosest) zoom, only one in ten symbols was being displayed.  Similarly, label scales and icon scales were adjusted to give the illusion of symbols scaling up as zoom is tightened.  I then generated one sample page at each zoom level using the parameters defined in base.ZoomLevel, and put the whole thing together with doc.kml.  This results in smooth scaling of symbols from loosest to tightest zoom.  Also, the KML file at the tightest zoom is very tiny, meaning lots of additional detail can be packed into it in the future.
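
The doubling exercise is easy to tabulate.  The sketch below starts from the tightest block described above (0.0625 degrees of latitude by 0.125 degrees of longitude) and doubles both dimensions for seven levels.  The function name, the dictionary keys and the numbering (0 for the tightest level) are assumptions for illustration, not the actual contents of base.ZoomLevel.

```python
def zoom_levels(base_lat=0.0625, base_lon=0.125, count=7):
    """Return block dimensions in degrees for each zoom level.

    Level 0 is the tightest zoom; each subsequent level doubles
    both the latitude and the longitude increment.
    """
    levels = []
    lat, lon = base_lat, base_lon
    for number in range(count):
        levels.append({
            "level": number,
            "latitude_increment": lat,
            "longitude_increment": lon,
        })
        lat *= 2
        lon *= 2
    return levels


for level in zoom_levels():
    print(level)
```

Under these assumptions, seven levels take the block size from 0.0625 by 0.125 degrees up to 4 by 8 degrees.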

I figured I had probably named the zoom levels incorrectly — backwards in effect.  The tightest zoom that displays all the available detail should really be Z0 (zoom zero) because there should never be anything beyond that.  Lower zoom levels should then be numbered sequentially upwards.

Finally, more BSODs, so I have officially cried "uncle!" with SQL Server Management Studio and changed to TOAD, at least for table editing.  Seems to have the functionality I need, and no issue with BSODs.  I’m not quite at the point of de-installing Management Studio, but I’m close.

Next up, minor correction of some obvious problems with well labelling, and then scripts to generate the calls to sqlcmd.

Posted on 18th March 2008
Under: Developers' Journal | No Comments »

Modifications to E.getWellGroupKml, and a New Approach to Tight Zooms?

In addition to the small, highly detailed page files (there are 1,378 of those), there are going to be a fair number of page files for lower levels of zoom.  So it makes sense to be able to store the characteristics of each layer so that the sqlcmd input parameters can be generated automatically.  This store of information should also, in theory, form the basis for the automatic generation of doc.kml.  It may not be necessary to automate the entire process at this point, but I might as well go down the road a ways, if possible.

To this end, I created a new database diagram (Intellog - base - ZoomLevel) which was used to create two new tables: base.ZoomLevel and base.ZoomLevelCollection.  These were created under the base schema because they would seem to have applications outside of the energy industry, which is represented by the schema E.  These tables will be used to generate sqlcmd-executable calls to E.getWellGroupKml.  To E.getWellGroupKml, I added the new parameters latitudeAmt and incrementLatitudeAmt, which provide an alternative to supplying the lbl parameter.  In the case of these new parameters, you can identify a specific latitude (let’s say 57), and then the increment from that latitude (let’s say 0.25), which will return all latitudes from 57.0 to 57.25.  An equivalent concept was implemented for longitude.  So, in other words, you can specify blocks of real estate of varying size simply by specifying the target latitude and longitude, and the increments from that target.

A multi-statement table-valued function was created, called dbo.temp1 on an interim basis, which takes a uid for base.ZoomLevel, and creates a table of parameters suitable for input to the E.getWellGroupKml.  In theory, this will allow for the regeneration of all the pages for a given zoom level.  This function is smart enough to know which pages will actually contain symbols, and only generates parameters for those pages.
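
The “only pages that contain symbols” logic can be sketched outside the database as follows.  This is a hedged illustration and not the body of dbo.temp1: the function name page_parameters is made up, wells are reduced to plain (latitude, longitude) pairs, and each block is assumed to be anchored at its south-west corner.

```python
import math


def page_parameters(wells, lat_increment, lon_increment):
    """Return one parameter tuple per block that contains at least one well.

    wells: iterable of (latitude, longitude) pairs
    Each tuple is (latitude, latitude increment, longitude, longitude increment),
    where latitude and longitude mark the south-west corner of the block.
    """
    blocks = set()
    for lat, lon in wells:
        # Snap each well down to the corner of the block it falls in.
        block_lat = math.floor(lat / lat_increment) * lat_increment
        block_lon = math.floor(lon / lon_increment) * lon_increment
        blocks.add((block_lat, block_lon))
    # Empty blocks never enter the set, so no parameters are generated for them.
    return [(lat, lat_increment, lon, lon_increment)
            for lat, lon in sorted(blocks)]


wells = [(56.01, -120.05), (56.02, -120.06), (57.3, -121.9)]
for params in page_parameters(wells, 0.25, 0.25):
    print(params)
```

Three wells produce only two parameter rows here, because the first two wells share a block.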

Of course, it now occurs to me that the work done earlier based on lbl may not have been necessary.  If you specify a small enough increment for latitude and longitude, you get the same, relatively small number of symbols.  The advantage, of course, is that all blocks are the same size, saving you the trouble of having to figure out the level of detail for each one.

Unfortunately, I was fighting some hardware problems.  Seemingly, there is an incompatibility between SQL Server Management Studio and my laptop video driver.  The whole process was brought to a screeching halt with two BSODs.  The problem seems to be isolated to the editing of data in tabular format.  If this is done rarely, and when done, carefully, the problem can be avoided, for the most part.

Posted on 18th March 2008
Under: Developers' Journal | No Comments »

Back to Lower (Looser) Zoom Levels

Having successfully cracked the problem of how to page in maximum levels of detail at high (tight) zooms, it’s time to return to the lower (looser) levels of zoom to ensure that they work correctly, as well.  I have a feeling that, in addition to the lowest zoom levels, where only a statistically significant number of symbols (wells) are shown as opposed to every symbol, there is likely something in between where selective groups of symbols are displayed, based on some arbitrary lat and long boundaries.

A query indicates that if you group by integer lat/long, the number of symbols hits a high water mark of around 5000 in one group.  By the time you get down to about the 15th group, it’s a couple of hundred.  So integer lat/longs would seem to be a fairly good method of organizing the mid-level zooms.  Some testing revealed, however, that 5000 symbols still produced ‘notchy’ performance; that is, there is a noticeable delay as it pages in the 5000 symbols to display.
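
The grouping query can be mimicked in a few lines.  This is a sketch under assumptions: wells are plain (latitude, longitude) pairs, the function name is made up, and “integer lat/long” is taken to mean rounding down with floor (a cast that truncates toward zero would place negative longitudes in different cells).

```python
import math
from collections import Counter


def symbols_per_cell(wells):
    """Count wells per integer lat/long cell, largest group first."""
    counts = Counter((math.floor(lat), math.floor(lon)) for lat, lon in wells)
    return counts.most_common()


wells = [(56.1, -120.2), (56.9, -120.8), (57.0, -121.5)]
for cell, count in symbols_per_cell(wells):
    print(cell, count)
```

Sorting the groups by size is what exposes the high water mark: the first row is the busiest cell, and the counts fall away from there.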

Finally settled on the idea that there will be several layers of zoom, each layer scaling the icons up (or down, depending on which way you’re going), and adding symbol density until such time that maximum (tightest) zoom is reached, and all data is displayed.  On an interim, test basis, I set up five levels of zoom, and it produced reasonably acceptable performance when zooming in and out.

Therefore, modified E.getWellGroupKml so that it would accept a latitude and longitude value, as well as a latitude and longitude increment.  For example, if you specify 56 for latitude, and an increment of 0.5, it would return all latitudes from 56.0 to 56.5.  If you specify an increment of 0.25 for the same value, it would return latitudes 56.0 through 56.25.  In the case of the latter, if you specify a latitude of 56.25, it would return all latitudes 56.25 through 56.5.  Same thing for longitude.  This makes E.getWellGroupKml suitable for generation of any level of detail at any zoom level.
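
The latitude/increment selection can be illustrated with a simple filter.  This is a sketch and not the logic inside E.getWellGroupKml: the function name is hypothetical, and it assumes the block is anchored at its south-west corner and that the upper boundary is excluded, so a well sitting exactly on 56.25 lands in only one of two adjacent blocks.

```python
def wells_in_block(wells, latitude, latitude_increment,
                   longitude, longitude_increment):
    """Return the wells that fall inside one latitude/longitude block.

    wells: iterable of (latitude, longitude) pairs
    The block covers [latitude, latitude + increment) and
    [longitude, longitude + increment); both upper edges are excluded.
    """
    north = latitude + latitude_increment
    east = longitude + longitude_increment
    return [(lat, lon) for lat, lon in wells
            if latitude <= lat < north and longitude <= lon < east]


wells = [(56.0, -120.0), (56.24, -119.9), (56.25, -119.9), (56.6, -119.8)]
print(wells_in_block(wells, 56, 0.25, -120, 0.5))
print(wells_in_block(wells, 56.25, 0.25, -120, 0.5))
```

Shrinking or growing the two increments is all it takes to move between zoom levels, which is what makes one function usable for any level of detail.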

Similar to the tight zoom, there will be a need to automatically generate the individual files that contain the groups of wells organized by lat/long.

Posted on 17th March 2008
Under: Developers' Journal | No Comments »

Screen Shots Submitted to Data Providers

The raw data that is being used to populate the Intellog database is being sourced from the four government agencies in the four western Canadian provinces.  An interim objective is to clearly establish in their minds what precisely the nature of our work is going to be.  To that end, the screen shots that were produced the day before yesterday have been submitted to each of those four agencies for their review and subsequent discussion.

Posted on 17th March 2008
Under: Business Development | No Comments »