Archive for July, 2008

Minister Knight’s Response to Follow-Up (2008-07-29)

As described previously on this blog, there has been an ongoing dialogue with Alberta Energy Minister Mel Knight in pursuit of making petroleum well identification data freely available to the general public.  Initiated with an email on 2008-04-18, the Minister provided an initial response in the form of a letter dated 2008-05-29.  Frankly, the Minister’s response raised more questions than it answered, so a follow-up was written to him on 2008-06-10.  Minister Knight responded in writing on 2008-07-29, and a copy of the letter can be found here, or by clicking the PDF icon at the bottom right.

It would seem the goal of free and efficient access to this data is still a ways off.  What should be much more troubling for the citizens of this province is government policy which appears to be set and maintained by mid-level, unelected bureaucrats in powerful, quasi-judicial* agencies such as the ERCB.   Make no mistake about it, the ERCB cannot be faulted for fulfilling the mandate provided for it under provincial legislation.  However, it is up to our elected officials to sense when change is necessary, and instruct the agencies under their purview to respond accordingly.  As it stands today, the government is not providing the leadership we should all expect of them.

Your comments and questions are welcome and encouraged, and thank you very much for your support.  Every voice counts.

*This term is taken directly from the ERCB’s website page entitled About the ERCB.Click here for PDF version.

Posted on 31st July 2008
Under: Business Development, Data Sources, ERCB | 1 Comment »

Splitting base.encloses Into Industry-Agnostic and Industry-Specific Tables, and Enclosure Conventions

Turns out there was a need to split base.encloses into two separate tables; base.encloses itself is used where a given base.Tile encloses other instances of base.Tile, which is an industry-agnostic concept.  E.encloses was created, and is used when a base.Tile encloses a well location.  This is an energy-industry related concept, and therefore belongs to the to the E (energy-related) schema.  To populate base.encloses and E.encloses, the stored procedures base.putEncloses* and E.putEncloses were created.  base.putEncloses creates a instance of base.encloses when a tile is completely enclosed within another tile.  In this case base.encloses.uid_Tile identifies the tile which does the containing, whereas base.encloses.uid_Tile_child identifies the base.Tile which is enclosed.  E.putEncloses creates an instance in E.encloses for each well that falls within the boundaries of a tile, subject to the convention described immediately below.  E.encloses.uid_Tile in this table identifies the tile doing the containing, and E.encloses.uid_Well identifying the well which is being enclosed.

By convention, if a well lands absolutely and precisely on a boundary running north/south, the well belongs to the tile to the immediate west.  Similarly, if a well lands absolutely and precisely on a boundary running east/west, the well belongs to the tile to the immediate north.  Tiles, however, are considered to enclose other tiles even if they share one or more boundaries.   Take the simple example where a tile is defined by the southeast (SE) of -101, 49 and northwest (NW) and -102, 50.  If this tile is divided into four tiles, the southeast tile will be defined by SE of -101, 49 and NW of of -101.5, 49.5.  Note the SE corner is identical, and the south and east boundary of both tiles overlap.  Yet, the larger is still considered to ‘contain’ the smaller tile, even if that may not be geometrically correct.

An additional note with respect to E.putEncloses; by necessity, it works its way through all of the instances of ZoomLevel for a given ZoomLevelCollection, and does it from loosest (ZoomLevel.seq = 0) to tightest zoom (ZoomLevel.seq = MAX(ZoomLevel.seq)).  This is because each well is only ever associated with precisely one tile, regardless of the level to which the tile belongs.  Starting at the top, for example, the ZoomLevel.amt column determines the ‘density’ of wells to be represented at that level.  Assuming a value of 500, for example, this means one in 500 wells will appear, leaving 499 remaining to be displayed at some point when the zoom is tightened.  In the second and subsequent levels, the specific wells associated with a given tile are taken from the population of wells not yet associated with a tile. In other words, one or more of the remaining 499.  When the KML is eventually generated off of this structure, it will be based on the notion of tighter zoom levels providing ‘fill in’ wells for the previous level, while the wells from the previous level will continue to be displayed.

Code Shavings  base.getFloatTxt() was written in response to an apparent bug in the way CAST works converting FLOAT values to NVARCHARs.  If you cast a simple number with more than three decimal places, the resulting NVARCHAR faithfully reproduces the fourth and subsequent digits of precision.  However, when you calculate a value within the CAST, it either rounds or truncates to three digits to the right of the decimal.  base.getFloatTxt() takes a FLOAT as an argument, and faithfully (if not slowly?) reproduces all digits of precision.

*Those following the naming standard for stored procedures may be puzzled by the use of base.putEncloses as opposed to base.putencloses, given the verb (put, in this case) should be followed by the table name (encloses).  However, the use of camel case trumps strict adherence to case conventions, as the resulting construct is much more readable.  putencloses really looks like some sort of pasta dish ("I’ll have the Pasta Putencloses, thanks!") or perhaps the German word for a latch on a gate.

Posted on 30th July 2008
Under: Developers' Journal | No Comments »

The base.putTile Stored Procedure

There is now a structure for the storage of tile information, as well as a method of determining the dimensions of all the potential tiles within a given geographic area.  The base.putTile stored procedure — in a nutshell — correlates the population of base.VirtualTile with the wells found in E._WellDetail.  The stored procedure determines if a base.Tile instance should be created, or not, depending on whether one or more wells fall within the boundaries of a given tile.   base.putTile takes @uid_ZoomLevel as an input parameter, and therefore relies on the systematic population of base.ZoomLevel with the dimensions of the tile for each zoom level. 

It was decided to start in the middle of the zoom levels, with tiles consisting of one degree of longitude, and 0.6 degrees of latitude, which made the tiles roughly square.  When correlated with the Saskatchewan currently data residing in E._WellDetail, this produced 93 tiles populated with at least one well.  Working ‘out’ to lower zoom levels, the increments were doubled until just a single tile resulted; in this case, incrementLatitudeAmt was 9.6 and incrementLongitudeAmt was 16.  Then, working ‘in’ to tighter zooms, the increments were halved until the ratio of tiles to wells approximated one.  In this latter case, incrementLatitudeAmt and incrementLongitudeAmt were 0.0046875 and 0.0078125 respectively, resulting in 62,769 tiles with at least one well.

What’s important to understand is the number of tiles resulting from a given set of tile dimensions will be dependent on the location and population of wells being represented.   Or think of it this way; the well locations are overlayed on top of a theoretical matrix of tiles, and the tiles which do not contain wells drop out of the matrix; the theoretical matrix is ‘thinned’ by the presence or absence of data.  When well data from other jurisdictions is present in E._WellDetail, running base.putTile will result in a different number of base.Tile instances.   To that end, base.putTile was constructed so the first thing it does is purge the base.Tile instances associated with a given ZoomLevel prior to the correlation of the current population of E._WellDetail with base.VirtualTile and recreating the base.Tile instances.

The stored procedure name of base.putTile implies, of course, it will populate the base.Tile table, but it actually does quite a bit more than that.  It also creates instances in base.Point, base.delimits, base.Locus and base.locates.  It takes advantage of the structure in already in place for the storage of well location information.  By convention, the northwest (NW) and southeast (SE) corners of the tile were used to locate each base.Tile, so every base.Tile maps through to the these two points using the Point-delimits-Locus-locates-Tile table combination.  base.putTile rationalizes the data as it populates these tables; a single instance (record) of base.Point, for example, could be either a NW or SE corner of a tile, and a given base.Point could define many tiles of many different dimensions if they share the same NW or SE corner.

The strength of the structure described immediately above is it’s flexibility; it’s major weakness, however, is the five table JOIN required to relate Tile information to the Points which define it.  The VIEW base.TileDetail encapsulates this JOIN so it can be ‘assumed away’.

Code Shavings   The base.comment stored procedure was developed to provide a simple debugging tool — it takes @txt and @verboseFlg as input parameters.  If @verboseFlg is set to true, the content of @txt is echoed to the console, otherwise it’s silent.  ♦  The weirdo sequencing of the parameters in base.VirtualTile is based on the  convention of parameters are arranged alphabetically, by name.  It was determined a long time ago that this very arbitrary order was at least predictable, rather than trying to guess at the logic as to why parameters are presented in a given sequence.  ♦  The first attempts run base.putTile resulted in base.Point instances which were minutely out, which appeared to be as a result of floating point math issues.  To control this, ROUND was added to the base.putTile logic.

 



				

Posted on 29th July 2008
Under: Developers' Journal | 1 Comment »

Tiles, and now Virtual Tiles

In the previous post, the implementation of the tile concept (and its implementation in base.Tile) was described at some length.  Now attention turns to the practical matter of creating all the instances of base.Tile.  Starting out with a seed of an idea presented in Dr. Tom’s Workshop, a table-valued, user-defined function called base.VirtualTile was created.  This function rigorously grinds out a nm, eastlLongitudeAmt, northLatitudeAmt, southLatitudeAmt and westLongitudeAmt for prospective tiles covering a defined area.  For example, if you;

SELECT
	*
FROM
	base.VirtualTile(-101, 0.6, 1, 60, 49, -111)

the function will return 190 instances (rows), each representing one VirtualTile.  Based on the example shown above, the tiles completely cover the area from 49 through 60 degrees of latitude, -101 through -111 degree of longitude, and in increments of 0.6* degrees of latitude, and one degree of longitude, as shown in the snippet of the result set, below (click table for larger image).

Click table for larger image.The boundaries of these tiles are shared; the western boundary of one tile is precisely the same as the eastern boundary of the adjacent tile to the west, for example.  The decision to use a table-valued function in this situation permits it to be the object of a JOIN clause when correlating virtual tiles with objects they enclose.

Also, the use of the term ‘virtual’ in this context is based on the idea the tile really doesn’t exist until it contains something.  Or, explained another way, base.VirtualTile produces the list of possible tiles so they can be correlated with the things they potentially enclose.  If a tile encloses something, it then becomes a real tile (and hence exists in the base.Tile table), otherwise it simply vaporizes, as if it had never existed in the first place.

In a subsequent post, base.VirtualTile will be used to grind out a matrix of tiles for each of the zoom levels.  These virtual tiles will then be used to determine which wells fall within a given tile for a given zoom level, and also, which tiles fall within other tiles at various zoom levels.

*The oddball choice of an increment is based on producing a tile that is roughly square.  In this area, 0.6 degrees of latitude is approximately the same distance as one degree of longitude.

Posted on 29th July 2008
Under: Developers' Journal | No Comments »

The Tile Concept as Implemented in the Intellog Database

In one of yesterday’s posts, there was talk of how the frequently-used page terminology had been migrated to the use of the word tile.  It wasn’t an idle exercise; in addition to being in synch with the lingua franca of the Google Earth world, the tile concept is becoming more complex, necessitated by the hierarchy of pointers required to efficiently download the individual well KMLs.  To that end, the table base.Tile was added to the Intellog database, and a relationship drawn between it and base.ZoomLevel.

In the diagram to the immediate left (click on the diagram for a full-size view), base.Tile can be seen as having a many-to-one relationship with base.ZoomLevel – a given ZoomLevel will have zero or more Tiles associated with it.  On a side note, the table Tile is owned by the base schema* because it thought to be a concept that crosses industrial boundaries. 

Tiles contain things — that is, enclose them — and also have dimensions.  It may be possible, in some cases, to derive one from the other, but there are enough situations where they are independent  to warrant this seemingly redundant structure.  For example, a given Tile may enclose a single Well — a single point on a Cartesian plane.  As we all remember from high school geometry (right?) such points have no dimension.  Therefore, it’s not possible for the single point to define the dimensions of the Tile.  A Locus, therefore, locates the Tile while the Tile concurrently encloses one or more Wells.  In turn, there will be two Points associated to the Locus through the base.delimits table.  These will be the SE corner of the (id_Type = 135), and the NW corner (id_Type = 136) of a ‘rectangular geographic area’.

Code Shavings  It was also necessary to add base.locates.  This was  in addition to E.locates which was created some time ago.  This is because it associates instances between two tables which belong to the the base schema.  ♦  Tile.nm must be unique for a given ZoomLevel, and it’s enforced with a unique index.   ♦  It may be desirable to enforce the two-points-per-Locus rule described immediately above through some sort of trigger or check constraint.

*The schema name to which a given table belongs is shown in brackets after the table name.

Posted on 24th July 2008
Under: Developers' Journal | 2 Comments »

Saskatchewan Well Index and Well KMLs Available for Preview

In the spirit of trying a lot of different things to see which are the most useful, an index of all the Saskatchewan well data has been uploaded and can be found here or at the link below.  It currently contains 100,042 well listings, and the file is about 15 MB in size.  It’s nothing fancy, but rather the very simplest version of an index; a single, static HTML page containing the well identifiers and the associated well names, alphabetized by the latter.

It’s possible to simply scroll through the well index, but it’s also possible to use the find function of your browser to quickly scan the list to look for a particular well of interest.  The find function on most popular browsers can be launched with Ctrl-F. 

The well identifier is set up as a hyperlink, which if clicked, will automatically download the associated KML.  If you have it installed, the download will trigger the launch of Google Earth, and zoom it to the location of the well selected from the index.  Once the zoom is complete, clicking on the well will show some additional information about the well.  As usual, please do not hesitate to leave comments below, and thank you for your interest and support.

Saskatchewan Well Index

Posted on 24th July 2008
Under: Developers' Journal | No Comments »

Yet Another Scheme for Managing Large Amounts of KML Data

A new ZoomLevelCollection was established for Saskatchewan data, to provide yet another opportunity to tweak the paging mechanism to see if performance can be improved for these very large data sets.  The script to generate and modify the instances of base.ZoomLevel associated with this collection was stored under ..\Saskatechewan\ZoomLevelDefinition.sql.

To summarize up to this point, wells have been grouped geographically using progressively smaller tiles* as the user zooms in.  It makes sense that as the tile decreases, the likelihood of a single well occupying the tile increases.  Given this, why not come at the problem from the other direction?  In other words, assume exactly one well per KML in all cases, and then use KML’s NetworkLink construct as a mechanism for determining whether a given well is displayed, or not.  Also, assuming this approach is adopted, each KML will have its own unique URL, and therefore is potentially visible to search engines.  That is, if the Sitemap.xml thing can be made to work as advertised. 

To accommodate the two types of zoom level, the id_Type column was added to base.ZoomLevel, and two types of zoom level established;

  • Tile  This is the current type of zoom level, and is based on the division of the viewing area into equally-size tiles containing one or more wells.  All artifacts with the tile are either displayed, or not.  It will typically be used for lower (looser) zoom levels to convey distribution patterns using a statistically significant number of wells without having to display all wells.  The name of the resulting KML reflects the co-ordinates of one of the corners of the page.  [id_Type = 133]
  • Point  This new type is based on establishing a zone around a single well, and having the well appear based on this zone being active (or not).  Typically used for the tightest (highest) zoom level to convey all relevant data associated with the well.  There is a maximum of one well per KML, and the name of the resulting KML is based on Well.uid.  In the case of this type, the amt column became optional, as the number of wells is always one.  The check constraint CK_ZoomLevel was added to enforce this business rule.  [id_Type = 134]  

Generating the Point-type KMLs turned out to be quite a different job than generating Tile-type KMLs.  Rather than further butcher the logic of E.getWellGroupKml, a new stored procedure, E.getWellKml, was developed more-or-less from scratch.  It’s able to crank out Point-type KMLs in the blink of an eye — good thing, because there are potentially so many of them — for the 100,042 wells found in Saskatchewan, there will need to be 100,042 associated KMLs.  

Next, E.getWellRegenerateTxt was coded to generate the .BAT text which contains all of the calls to E.getWellKml, which, in turn, generates the individual KML files.  A file called regenerate.bat was created to contain this text, and let loose to generate all of the KMLs.  It was finished just minutes later, or so it seemed, for a total of 100,042 KMLs, one for each well in Saskatchewan.

The potentially vast number of KMLs is not a problem, in and of itself, except the physical handling of this large number of files tends to time-consuming.  Uploading them to the Intellog server was the biggest challenge.  Of all the different approaches tried (CDs and the like), the best one still seems to be good ol’ FileZilla.  100,042 files were queued for upload, and while it took some time to do it, FileZilla eventually uploaded all 100,042 files.  It didn’t take that long, either, given the excellent Internet connection that’s available.  About the biggest problem was FileZilla reporting  421 Too many connections (8) from this IP every couple of hundred files.  The latter problem eventually seemed to clear itself in each case, and the uploads continued on.  After 36,766 uploads, for instance, 88 had failed for this reason and had to be requeued.  The problem was sent over to the ISP to see if the number of IP connections could be bumped up.

Finally, code was hacked together (and stored under Untitled3.sql, for the time being) which contains the logic to generate all of the NetworkLinks.  The first attempt to use the resulting file — with it 100,042 NetworkLinks in it — seemed to indicate it was activating all of the NetworkLinks, but some more testing is required in this regard.

One certain problem encountered with this approach was the seeming inability of a KML containing the NetworkLink to cascade style information into the linked file.  The KML Reference clearly indicates that styleUrl is a legitimate child of NetworkLink.  In other words, if you want the Placemark in the linked file to be scaled at 0.80 with one NetworkLink, and scaled to 0.25 with another NetworkLink in the same file you should be able to do it.  But, alas, it does not seem to be possible.  About the closest thing is to have a styleUrl in the linked file make reference back to style information in the file containing the NetworkLinks themselves.  This approach was as succinctly described by TJ1 on KML Developer Support.

Code Shavings  The user-defined function base.getFilenameTxt() was incorporated into E.WellSymbolDetail VIEW.  It seems to me, though, that it should really be named base.getFileNm()?  ♦  Although it was done some time ago, the column id_Type_extendedData was added to the base.ZoomLevel table.  There are currently two codes associated with it; Full and AbbreviatedFull is the entire set of extended data associated with the Google Earth (GE) Placemark which generates the well symbol, and Abbreviated is a slimmed-down version which was implemented to make the KMLs a little smaller.  ♦  E.getRegenerateTxt was modified to incorporate a temporary table to store results prior to returning them.  It’s so the style generation and the page generation calls to sqlcmd are returned as one set of records, rather than two.  ♦  The column styleLbl to E.WellSymbolDetail.  It’s basically the same as the nm column, except no extension.

*In previous posts, these were referred to as pages.  It would seem as though the common parlance is ’tile’, however.  Therefore the term ’tile’ will be used in this and future posts.

Posted on 23rd July 2008
Under: Developers' Journal | 3 Comments »

Sitemap.xml Revisited

In a post a couple of weeks ago, the rhetorical question was asked as to whether Sitemap.xml can be used to expedite access to the KML files containing Intellog well data.  It would appear as though the submission process worked perfectly, and the Google Webmaster Tools (GWT) indicated a regular crawl of the two Sitemap.xmls was occurring.  However, in the intervening period, none of the KMLs referenced in the Sitemap.xmls showed up in the Google index, as expected.

To attempt to resolve this problem, the existing Sitemap.xml references were deleted from Google, also using GWT.  A new Sitemap.xml was then created and submitted to Google containing just a single URL, and the Sitemap.xml file itself was placed in the root directory of the Intellog web site.  If this one KML gets indexed, then additional KMLs will be added to Sitemap.xml.  Some research on the physical location of Sitemap.xml hints it should be high up in the hierarchy.  Previously, they had been buried well down, alongside the KMLs themselves.

Although it may be coincidence, the new Sitemap.xml was crawled within a minute or two of it being uploaded.  The previous submissions of Sitemap.xmls took at least a couple of hours to be crawled for the first time.  But regardless, the KML to which the Sitemap.xml points is still a no show, to the desired result is still not yet achieved.   More news in this regard as and when it happens.

Posted on 21st July 2008
Under: Developers' Journal | No Comments »

Saskatchewan Data — Viva la Difference

In pursuit of a preview version of Saskatchewan data on Google Earth as quickly as possible, and continuing from the previous work in this area, the stored procedure SER.put_WellDetail was created.  ERCB.put_WellDetail was taken as a starting point, with the intention of using an abbreviated methodology such as described in the last paragraph of Inspiration.  Furthermore, populating E._WellDetail from SER._Well with SER.put_WellDetail was done without a great deal of concern for how it will be updated when the next version of the file is received from SER — that will be worked out next time ’round.

SER’s EM05WMMS has provision for 10 horizons (WM-HORIZON(1) through WM-HORIZON(10)), each of which is associated with a distinct substance, in the column called WM-REC-TYPE.  Quite arbitrarily, the substance provided for the first horizon will be used to identify the one substance displayed on Google Earth.  In a future version of the data, the various substances represented by multiple horizons will have be depicted in some way.

The biggest difference between the Saskatchewan data, and the Alberta data loaded previously, is the association of multiple lbls with a single license, whereas in Alberta, a given license maps seems to map to a single producing zone.  In the case of Saskatchewan license 97C135, for example, there are six different wells identified;

111/02-20-006-10W2M/00
111/03-20-006-10W2M/00
111/05-16-006-10W2M/00
131/06-16-006-10W2M/00
131/15-17-006-10W2M/00
191/06-16-006-10W2M/00

In the current (June) data set, there are 96,729 unique licenseLbls, and 100,042 unique lbls, which means of the 96,729 uniquely-identified licenses, 3.4% contain references to multiple wells.

Also, in the documentation supplied by SER, EM05WMMS’s WM-GRD-ELEV (grade elevation) is shown as being an integer value, whereas and WM-KB-ELEV (kelly bushing elevation) is shown as one-decimal place.  But yet, when you compare the two values for a given well, they really have to be the same — both expressed with one decimal precision.  This was further corroborated by consulting the Wikipedia entry for beautiful Saskatoon, Saskatchewan, which shows the elevation of 481.5 metres.  This is pretty close to the values in WM-GRD-ELEV, if you assume one decimal place.

Taking all of the above into account, and a few other relatively minor things, the stored procedure SER.put_WellDetail was completed.  The biggest speed bump in the process was taking time out to create new symbols for the Well Symbol Library for the various combinations well status, substance and types.  The library now consists of 227 different unique symbols reflecting these combinations.

Code Shavings  Just for interest, MIN(elevationAmt) from SER._Well shows a value of 2.7 metres, for the well  EMPRESS NO. 1.  A check of nearby Empress, Alberta shows an elevation of 615 metres, which would tend to indicate this particular value in the SER data for this well is incorrect.  This is a very old well, though…it’s license number indicates activity started on it in 1915.  ♦   If you exclude the data which could well be incorrect as described immediately above, it’s interesting to compare MIN(elevationAmt) to MAX(elevationAmt).

Posted on 17th July 2008
Under: Developers' Journal | No Comments »

New Symbols Added to Well Symbol Library

Intellog announces the release of a significant update to the The Well Symbol Library (WSL), bringing the total number of freely downloadable* well symbols to 227.  First released in May of 2008,  This is the first major update since the beginning of beginning of June, and reflects all the permutations and combination of symbols required, to date, for the four Western Canadian provincial jurisdictions. 

For those who may think 227 unique symbols is overkill, it’s worth taking a few minutes to understand the methodology by which the symbols are created.  There are three essential elements;

Gas Well Base Symbol  This is the base symbol for the well location, which indicates an active, producing well, and also defines the substance being produced.  These symbols are based, in part, on the industry standard petroleum well symbols.  The WSL also contains symbols for oil, bitumen, coalbed methane and other substances.
Gas Injection Well Type Attribute  This attribute defines the precise nature of the well in the event it is something other than a standard production well.  For example, the Type Attribute shown to the left denotes an injection well.  There are also attributes for disposal, observation, test and other types of wells.
Licensed Gas Injection Well Lifecycle Attribute  This attribute provides information as to where the well is at with respect to its lifecycle.  In the example shown to the left, this indicates the well has been licensed.  There are also versions of this attribute for cancelled or amended licenses, drilling, cased and completed as well as other stages of the well lifecycle.

In addition to the above, washed out versions indicate the abandoned equivalent.  So, if you add it all up, there are lots of different combinations, and therefore lots (and lots) of symbols.  As always, readers are encouraged to provide comments below.  If there is a specific symbol you would like to see, let us know, and we’ll get it into the library as quickly as possible.

*Released under the Creative Commons license.

Posted on 17th July 2008
Under: Business Development | No Comments »