Archive for February, 2009

Slightly Rethinking the Retrieval of URLs

Well, fortunately, not everything.  However, as the PageIndex concept came together, it became obvious that keeping the URL in a separate file (Library.xml) made little or no sense.  wrap.sh was wrapping all of the metadata (e.g. document source and type, page sequence, etc.), but the URL was being rationalized out to Library.xml.  As described previously, when the results of a search were styled with searchResult.xsl, the uid was used as a key to reach out to Library.xml for the associated URL.

The arguments in favour of changing this approach are as follows: 1) the logic of data retrieval and styling is getting mixed up in searchResult.xsl, when they really are conceptually separate ideas; 2) the performance of the join logic will likely degrade as the number of documents increases and Library.xml gets very large; 3) Solr is doing a great job retrieving the rest of the metadata anyway, so it might as well haul back the URL as well; and finally 4) keeping Library.xml in sync with the index doubles up on the maintenance.  The main counter-arguments are 1) changing the location of the file referenced by the URL requires the entry to be reindexed in Solr*, and 2) it doesn’t ‘feel’ normalized, in a traditional RDBMS sense.  The former is really no big deal, as Solr handles reindexing elegantly and quickly, and the latter is esoteric, at best.  But old habits die hard.

Making the change started by adding some metadata (sizeAmt, sourceLbl, typeLbl and Udt) to the instances of base.Document referencing all of the ST1, ST49 and ST96 documents.  This metadata was all stored in Document.xml.  The procedure with the unwieldy-but-surprisingly-descriptive name ERCB.getReindexStatisticReportExecutableTxt was created to automate the generation of the content of wrapBatch.sh and index.sh for any instance of base.Document referencing any of the statistical ST reports.  The most significant capability of getReindexStatisticReportExecutableTxt is the inclusion of urlTxt, as well as all the other metadata, in the output.  wrapBatch.sh then feeds these parameter sets to wrap.sh, which creates a Solr-compatible XML file.  index.sh then feeds these XML files to Solr, thereby reindexing the entries.  Solr’s schema.xml was edited to add the urlTxt field.  A sample of 10 was tested, right through to ensuring urlTxt could be queried out of the Solr index.  It all tested out OK, so the reindexing of the ST reports is ready to go.  To this end, wrapBatch.sh was run against the entire set of ST reports, generating one XML file per report (8688 in total).
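
As a rough illustration of the kind of Solr-compatible XML described above, the following JavaScript sketch builds an update message in Solr's standard <add><doc><field> format, including urlTxt alongside the other metadata.  This is not wrap.sh itself (which is a shell script whose contents aren't shown here); the function names, the uid value and the sample URL are all hypothetical.

```javascript
// Hypothetical sketch: build a Solr XML update message for one document.
// This only illustrates the output format; it is not the real wrap.sh.

// Escape the five XML special characters so values are safe inside <field>.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Turn a plain object of metadata into an <add><doc>...</doc></add> message.
function buildSolrAddXml(metadata) {
  const fields = Object.entries(metadata)
    .map(([name, value]) =>
      `    <field name="${escapeXml(name)}">${escapeXml(value)}</field>`)
    .join("\n");
  return `<add>\n  <doc>\n${fields}\n  </doc>\n</add>`;
}

// All values below are made up for illustration.
const sampleXml = buildSolrAddXml({
  uid: "ST1-0001",
  sourceLbl: "ERCB",
  typeLbl: "ST1",
  sizeAmt: 12345,
  urlTxt: "http://data.E.intellog.com/example/ST1.txt?a=1&b=2",
});
console.log(sampleXml);
```

Because urlTxt travels with the rest of the fields, a search result can carry its own URL and no separate lookup against Library.xml is needed.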

Code Shavings  A new scalar-valued function, E.getTypeLbl, was developed, which generates a standardized typeLbl from the characteristics of urlTxt.  ♦  One of the bigger hassles was bringing in the file size of the statistical (ST) reports.  I ended up hacking together a spreadsheet (fileSizeHack.xls), which takes the output of dir ??.txt /s, and massages it so you’re left with the path name and the size of the file.  This was then imported into SQL Server as dbo.Filter$ (huh?), and then merged with base.Document.  ♦  As noted above, adding the urlTxt to Solr necessitates the reindexing of entries.  I figured while I was doing this, I might as well migrate the files from E.intellog.com/data to the new standard data.E.intellog.com, described a while back.  ♦  Interesting fun fact: using BucketExplorer to change the Access Control for the objects in data.E.intellog.com took longer than uploading the objects themselves!
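
The spreadsheet massaging described above could also be scripted.  The sketch below is a minimal JavaScript take on it, assuming the dir /s listing uses the US-style layout (a "Directory of ..." line followed by lines of date, time, size and file name); other locales format these lines differently.  The function name and the pathNm property are hypothetical, and this is not how fileSizeHack.xls actually works.

```javascript
// Hypothetical sketch: reduce a Windows "dir /s" listing to path + size pairs.
// Assumes US-style lines such as:
//   02/24/2009  10:15 AM            12,345 ab.txt
function parseDirListing(listingText) {
  const results = [];
  let currentDir = "";
  for (const rawLine of listingText.split(/\r?\n/)) {
    // Remember which directory the following file lines belong to.
    const dirMatch = rawLine.match(/^\s*Directory of (.+?)\s*$/);
    if (dirMatch) {
      currentDir = dirMatch[1];
      continue;
    }
    // File lines: date, time, optional AM/PM, size with commas, then name.
    // <DIR> entries and summary lines do not match and are skipped.
    const fileMatch = rawLine.match(
      /^\d{2}\/\d{2}\/\d{4}\s+\d{2}:\d{2}(?:\s[AP]M)?\s+([\d,]+)\s+(.+)$/
    );
    if (fileMatch) {
      results.push({
        pathNm: `${currentDir}\\${fileMatch[2]}`,
        sizeAmt: Number(fileMatch[1].replace(/,/g, "")),
      });
    }
  }
  return results;
}
```

Each resulting object pairs a full path with a numeric size, which is the shape needed before merging the sizes into base.Document.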

*Currently, Solr does not have the ability to update a single field in its index.  The entire entry has to be reindexed.  However, updating a single field is a capability which will likely find its way into the application in the future according to solr-user@lucene.apache.org.

Posted on 24th February 2009
Under: Developers' Journal | No Comments »

Testing Prior to Next Beta Release

It may seem these blog posts are coming less and less frequently.  It’s been over a week since the last one, and I wish I could point back to one, single major accomplishment over the period.  Instead, a series of small projects has been undertaken as a by-product of comprehensive integration testing of the next beta release.  Hopefully the frequency will pick up again once the beta release is out.

Intellog.php/Onramp.php Shared Code Libraries  PHP code which is shared between pages had been stored in [siteRoot]/php/Library.php and [siteRoot]/[appRoot]/php/Library.php.  The former contained PHP code that was shared across an entire suite of applications, whereas the latter contained code specific to a given application.  It was simply too confusing having two files with the same base name, so these were renamed to [siteRoot]/php/Intellog.php, and [siteRoot]/[appRoot]/php/[appName].php.

New Approach to Application Title  To fix a minor problem with the page title, I ended up reworking how titles were created and associated with the <TITLE> tag at the top of each page.  Previously, an IFRAME was populated with a call to getPageElement.php.  The title was then parsed out of the IFRAME contents, and then assigned to document.title with JavaScript.  That’s an obscure approach, at best.  The revised approach is illustrated in the diagram to the left, with the orange boxes illustrating the blocks of PHP logic, which are found in the shared libraries described immediately above.

Taxonomy Changes  The organizational structure for the application has evolved considerably since it was first described back in October.  The most significant change is the complete eradication of the [marketRoot] and [appSuiteRoot] concepts, in favour of subdomain prefixes.  Before: intellog.com/E/app/Onramp/php/inputSearchCriteria.php.  After: app.E.intellog.com/Onramp/php/inputSearchCriteria.php.  This change was precipitated by a change in hosting arrangements for the site.  At some point in the future, these changes will be documented more fully.
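
The before/after pair above suggests a simple mapping from the old path-based layout to the new subdomain-based one.  The JavaScript sketch below assumes, based only on that one example, that the first path segment is [marketRoot] and the second is [appSuiteRoot]; the function name is hypothetical.

```javascript
// Hypothetical sketch: rewrite an old-style address of the form
//   intellog.com/[marketRoot]/[appSuiteRoot]/rest-of-path
// into the new subdomain form
//   [appSuiteRoot].[marketRoot].intellog.com/rest-of-path
function toSubdomainUrl(oldUrl) {
  const match = oldUrl.match(/^intellog\.com\/([^/]+)\/([^/]+)\/(.+)$/);
  if (!match) {
    return oldUrl; // not in the old layout, so leave it untouched
  }
  const [, marketRoot, appSuiteRoot, rest] = match;
  return `${appSuiteRoot}.${marketRoot}.intellog.com/${rest}`;
}

console.log(toSubdomainUrl("intellog.com/E/app/Onramp/php/inputSearchCriteria.php"));
```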

Code Shavings  I ran into a vexing problem with the passing of external parameters to an XSL from PHP.  I was using the setParameter method of the XSLTProcessor class to pass pageLbl to the XSL.  applicationLbl was also passed, but only in the event pageLbl belonged to an application as opposed to the core Intellog system.  <xsl:if> logic in the XSL tests for the existence of this latter parameter.  But in cases where the test failed, that was the end of the road for the XSL: it processed no further instructions.  Only by adding <xsl:param> ‘declarations’ at the beginning of the XSL, for both parameters, did the logic perform as expected.  ♦  Found what seems like the definitive article on CSS layers.  Well, even if it’s not definitive, then it’s certainly pretty good!  Thanks to echoecho.com for this.

Posted on 19th February 2009
Under: Developers' Journal | No Comments »

Further Work on the Intellog Help System

As described in the previous post, there are currently two major elements* to the Intellog help system: the helpTxt elements embedded in ApplicationConfiguration.xml, and Glossary.xml, which provides a centralized repository of definitions of the Intellog vernacular.

Cross references within the glossary, and references from helpTxt to the glossary, are common.  Therefore, it made sense to implement a shorthand method of creating these links;  <DIV class='termNm'>some text in here</DIV> was chosen.  Any text surrounded by this tag is transformed into valid hyperlink syntax when ApplicationConfiguration.xml and/or Glossary.xml are transformed with outputHelp.xsl and/or outputGlossary.xsl (respectively).  If text tagged in this way does not exist in Glossary.xml, a short message is displayed to this effect.  This provides a visual reminder that either the surrounded text is misspelled, or a glossary definition has not yet been written.
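
To make the shorthand concrete, here is a minimal JavaScript sketch of the same idea.  The real transformation is done in XSLT (outputHelp.xsl and outputGlossary.xsl), which isn't reproduced here; the function name, the termNm query parameter, the missingTerm class and the wording of the message are all assumptions.

```javascript
// Hypothetical sketch: turn <DIV class='termNm'>...</DIV> shorthand into
// hyperlinks, or into a visible warning when the term has no glossary entry.
// `glossary` is a Map of term -> definition standing in for Glossary.xml.
function linkGlossaryTerms(helpHtml, glossary) {
  return helpHtml.replace(
    /<DIV class='termNm'>(.*?)<\/DIV>/g,
    (wholeMatch, term) => {
      if (glossary.has(term)) {
        const href = `outputGlossary.php?termNm=${encodeURIComponent(term)}`;
        return `<A href="${href}">${term}</A>`;
      }
      // Visual reminder: the term is misspelled or not yet defined.
      return `${term} <SPAN class="missingTerm">[no glossary entry]</SPAN>`;
    }
  );
}

const glossary = new Map([["footer bar", "The bar at the bottom of each page."]]);
console.log(
  linkGlossaryTerms(
    "Buttons are found in the <DIV class='termNm'>footer bar</DIV>.",
    glossary
  )
);
```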

A hang-up in the development process was a lack of clarity about how the cross reference system would work.  The entry point to the glossary will most often be a link from a helpTxt element.  But similar links must also exist within Glossary.xml itself.  At first, it seems the same XSLT transformation code could be used for both, but this is not the case.  With helpTxt, the link needs to be accompanied by logic to handle the creation of the glossary pop-up window, whereas with a link within Glossary.xml, it can be assumed the glossary pop-up is open already.  It must be, otherwise you wouldn’t be able to see the link to click.  Somehow, that obvious fact eluded me for a couple of hours.

After some debate, it was determined only one glossary pop-up window would be open at any one time.  If a glossary entry is clicked, the glossary pop-up window is created if it doesn’t already exist.  If it does exist, then it’s simply brought to focus, and brought to the top level.  The glossary pop-up window maintains its own page history, so it will be possible to implement forward, back and other functionality to take advantage of this history.
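
The single-window behaviour described above can be sketched in JavaScript roughly as follows.  This is an illustration under assumptions, not the actual Intellog code: the function name, the window name glossaryPopup and the window features are made up, and the opener argument (normally the browser's window object) is passed in only so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch: keep at most one glossary pop-up open at a time.
let glossaryWindow = null; // the single shared pop-up, if one exists

function showGlossary(url, opener) {
  if (glossaryWindow && !glossaryWindow.closed) {
    // Reuse the existing pop-up; navigating it adds to its own history.
    glossaryWindow.location.href = url;
  } else {
    // No pop-up yet (or the user closed it), so create one.
    glossaryWindow = opener.open(
      url,
      "glossaryPopup",
      "width=400,height=500,scrollbars=yes"
    );
  }
  // open() can return null if a pop-up blocker intervenes.
  if (glossaryWindow) {
    glossaryWindow.focus(); // bring the pop-up to the front
  }
  return glossaryWindow;
}

// In a browser this would be called as, for example:
//   showGlossary("outputGlossary.php", window);
```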

Code Shavings  The more I look at them, the more outputHelp.php and outputGlossary.php look like the same code.  They can likely be rationalized into one program (outputReference.php?) to which you pass parameters to determine what kind of reference information to display.  ♦  Further to the previous point, the outputHelp.css and outputGlossary.css files were merged into one, called outputReference.css.  ♦  And further to that point, the term reference is becoming the collective term used to describe the various elements of the help system.  ♦  Thanks to Gary’s House of Wacks for the excellent article on the coding of JavaScript pop-up windows.

*It’s likely a third element will be added: reference documents.  These are complete, verbose documents describing various aspects of the Intellog applications, which the user can read if they want the ultimate level of detail.

Posted on 11th February 2009
Under: Developers' Journal | 1 Comment »

The Intellog Help System

The time had finally come to put together some user documentation to support the release of the new and improved search application (Onramp).  There were three primary design objectives for the help system:

  1. Context Sensitive: Only information directly relevant to the user at any given time should be displayed.  It’s assumed long, verbose help won’t get read, at least not past the point where the user has enough information to continue.
  2. Maintainable: The repository for the help text should be in close physical proximity to the implementation code, so changes to the functionality can quickly be incorporated in the related documentation.
  3. Rich and Robust: The content should be rich and robust enough to be the sole source of user documentation, given there are no plans to produce a printed user manual.  If a given user feels they absolutely must see a printed version, they should only have to print the specific help information they need.

The content is physically located in ApplicationDefinition.xml, given this file already reflects the overall structure of the applications.  It was only necessary, therefore, to add the helpTxt element in each place where help information would be useful.  Specifically, in the case of the Onramp application, this meant adding the /ApplicationDefinition/Onramp/helpTxt element to describe the Onramp application overall, ../Onramp/phpPg/helpTxt to describe the specific pages, and ../phpPg/inputFrm/fld/helpTxt elements for each field on the form.  Although it may not be added in the initial release, there will also likely be ../phpPg/applicationActionBstp/btn/helpTxt elements as well, which will describe the individual application action buttons located on the footer bar at the bottom of each application page.

The user accesses the page-level help by clicking the Help application action button.  In addition to the description of the page found in ../Onramp/phpPg/helpTxt, all of the populated ../phpPg/inputFrm/fld/helpTxt and (eventually) ../phpPg/applicationActionBstp/btn/helpTxt elements found on the particular page are displayed.  In short, clicking the Help button provides a comprehensive display of all available help for a given page.  In the case of an input form, clicking the labels of the form will bring up the ../phpPg/inputFrm/fld/helpTxt related to that field on the form.  Eventually, shift-click (or similar) of the application action buttons will bring up the ../phpPg/applicationActionBstp/btn/helpTxt for that particular button.  helpTxt itself is intended to be valid XHTML, with the containing <HTML> and <BODY> tags being implied*.
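
The page-level versus field-level lookup can be sketched as follows.  The real system extracts helpTxt from ApplicationDefinition.xml with XSLT; this JavaScript version uses a plain object as a simplified stand-in for that XML, and the page name, field names, help text and function names are all hypothetical.

```javascript
// Hypothetical stand-in for ApplicationDefinition.xml, mirroring the
// Onramp/phpPg/inputFrm/fld/helpTxt nesting described above.
const applicationDefinition = {
  Onramp: {
    helpTxt: "<P>Onramp is the search application.</P>",
    phpPg: {
      inputSearchCriteria: {
        helpTxt: "<P>Enter the criteria for your search.</P>",
        inputFrm: {
          fld: {
            keywordTxt: { helpTxt: "<P>Words to search for.</P>" },
            dateRange: { helpTxt: "" }, // not populated, so never displayed
          },
        },
      },
    },
  },
};

// Help button: page description plus every populated field-level helpTxt.
function getPageHelp(def, applicationLbl, pageLbl) {
  const page = def[applicationLbl]?.phpPg?.[pageLbl];
  if (!page) return [];
  const sections = [];
  if (page.helpTxt) sections.push({ label: pageLbl, helpTxt: page.helpTxt });
  for (const [fieldLbl, fld] of Object.entries(page.inputFrm?.fld ?? {})) {
    if (fld.helpTxt) sections.push({ label: fieldLbl, helpTxt: fld.helpTxt });
  }
  return sections;
}

// Clicking a form label: only the helpTxt for that one field.
function getFieldHelp(def, applicationLbl, pageLbl, fieldLbl) {
  const fld = def[applicationLbl]?.phpPg?.[pageLbl]?.inputFrm?.fld?.[fieldLbl];
  return fld?.helpTxt || null;
}
```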

In order to streamline the helpTxts, verbiage repeated frequently is rationalized out into a glossary.  For example, when the documentation says something like "…buttons are found in the footer bar of the…", the words footer bar are hyperlinked to the glossary, which contains a more verbose definition.   Given the glossary doesn’t really follow the structure of the application, the  information is located in /ApplicationDefinition/xml/Glossary.xml.   If necessary, application specific glossaries will eventually be located at /ApplicationDefinition/[applicationLbl]/xml/Glossary.xml.

The implementation of the help system is accomplished with four related blocks of logic.  It starts with the JavaScript function outputHelp, located in Intellog.js.  This function sets up the characteristics of the pop-up window, and then hands off the flow of control, along with the URL parameters fieldLbl and pageLbl, to outputHelp.php.  outputHelp.php lays out the basic elements of the help page, and transforms ApplicationDefinition.xml with outputHelp.xsl, which extracts the helpTxt elements.  Finally, the physical appearance of the help dialogue box is styled with outputHelp.css.  A similar, parallel set of objects is used to output glossary information (e.g. outputGlossary.php etc).
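
The first of those four blocks might look something like the sketch below.  The actual outputHelp in Intellog.js isn't shown in this post, so the signature, the helper buildHelpUrl, the window name and the window features are assumptions; only the names outputHelp, outputHelp.php, pageLbl and fieldLbl come from the description above.  The opener argument stands in for the browser's window object.

```javascript
// Hypothetical helper: build the outputHelp.php address with its URL
// parameters. fieldLbl is optional, since page-level help has no field.
function buildHelpUrl(pageLbl, fieldLbl) {
  const params = new URLSearchParams({ pageLbl });
  if (fieldLbl) {
    params.set("fieldLbl", fieldLbl);
  }
  return `outputHelp.php?${params.toString()}`;
}

// Hypothetical sketch of outputHelp: set up the pop-up's characteristics,
// then hand control to outputHelp.php by opening its URL in that window.
function outputHelp(pageLbl, fieldLbl, opener) {
  const features = "width=450,height=600,scrollbars=yes,resizable=yes";
  return opener.open(buildHelpUrl(pageLbl, fieldLbl), "helpPopup", features);
}

// In a browser this would be called as, for example:
//   outputHelp("inputSearchCriteria", "keywordTxt", window);
```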

*In other words, the first element in helpTxt would be a <P> element.

Posted on 9th February 2009
Under: Developers' Journal | 1 Comment »