Archive for May, 2009

Letter to Alberta Premier Ed Stelmach (2009-05-28)

Back in July of 2008, a letter from Alberta Minister of Energy Mel Knight brought to a temporary conclusion an effort to free up well identification data from the ERCB so it is freely available to the general public.  After a short hiatus, the effort was continued by bumping the issue up to the Minister’s boss, which is to say Alberta Premier Ed Stelmach.  The letter can be found here, or by clicking the PDF icon at the bottom right of this post.  Any response from the Premier will be the subject of a future post. 

As always, your questions or comments are welcome and encouraged.

Posted on 28th May 2009
Under: Business Development, Data Sources, ERCB | 1 Comment »

Return to PHP and SimpleDB

It’s been a very long time (last August, to be precise) since Amazon’s SimpleDB was first mentioned.  At the time, it was being investigated as a potential method of implementing full text search.  A lot of time, effort and money have passed under the bridge since then, not to mention that search functionality was eventually implemented with Solr.  It’s now time to return to SimpleDB for a completely different application: storing the user profile and session information required to provide secured, session-based access to the Intellog website.  After re-reading all the SimpleDB-related blog posts, I now realize there wasn’t a lot of detail on server configuration for PHP/SimpleDB, so that will be fleshed out now.  The most relevant post was Software Mirepoix, and it’s worth taking another look to provide some context for the notes below.  The objective is to establish a more-or-less standard approach to deploying PHP/SimpleDB apps, or at least an approach which can be migrated to the production environment when the time comes.

But first…there was the question of which specific PHP/SimpleDB library to use.  Three options were identified: the PHP Library for Amazon SimpleDB, the php-sdb library by David Meyers, and finally the Zend Framework (ZF).  I couldn’t find a lot of relevant information on the use of ZF with SimpleDB, despite a post to the SimpleDB Discussion Forum.  David Meyers’ library looks great, but its strength (masking the underlying complexity of SimpleDB interactions) was actually seen as a barrier to a clear understanding of the SimpleDB interface, at least for now.  By default, therefore, the Amazon-supplied PHP library was the way to go.

There have been quite a few updates to Amazon’s PHP Library since it was last employed, so the latest version (amazon-simpledb-2009-04-15-php5-library.zip) was downloaded and unzipped into a folder of the same name (minus the .zip, of course).  Within this folder, there was another called src, and the ReadMe.html.  Within src, in turn, there was a folder called Amazon.  The latter folder was the one copied to C:\Program Files\PHP\include, and the include_path line in php.ini was modified by appending C:\Program Files\PHP\include to its existing definition.

.config.inc.php was put in C:\Program Files\PHP\include\Amazon\SimpleDB.  This is the same place as the library class files Client.php, Model.php etc., rather than in the folder(s) where the Roundabout application is located.  Because .config.inc.php contains the Amazon Web Services (AWS) access key and the secret identifier, it was felt it was better to keep it out of the web-accessible folder hierarchy under DocumentRoot.  Incidentally, .config.inc.php also contains the __autoload function which, according to the inline documentation, "is responsible for loading classes of the library on demand".  It’s not 100% clear what this means, but the net effect of the function is to make all the classes in the library available to the application code.

To provide an initial test of the configuration, one of the samples provided as part of the library, ListDomainsSample.php, was copied over to the app.E.intellog.com/var/www/html folder.  Just one line needed to be changed: include_once('.config.inc.php') had to be modified to read include_once('Amazon/SimpleDB/.config.inc.php').  Keep in mind, the include_path in php.ini, modified above, tells PHP which directories to search for a file named by a relative path, so there is no need to be more explicit with the include_once statement.
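The include_path behaviour described above can be tried in isolation.  What follows is only a minimal sketch: a temporary directory stands in for C:\Program Files\PHP\include, the stand-in .config.inc.php and its DEMO_CONFIG_LOADED constant are invented for illustration, and set_include_path() is used in place of editing php.ini.

```php
<?php
// Hypothetical stand-in for C:\Program Files\PHP\include: a temporary directory.
$includeRoot = sys_get_temp_dir() . '/php_include_demo_' . uniqid();
mkdir($includeRoot . '/Amazon/SimpleDB', 0777, true);

// Placeholder for the real .config.inc.php; the constant is invented for this demo.
file_put_contents(
    $includeRoot . '/Amazon/SimpleDB/.config.inc.php',
    "<?php define('DEMO_CONFIG_LOADED', true);\n"
);

// Same effect, for this script only, as appending the directory to include_path in php.ini.
set_include_path(get_include_path() . PATH_SEPARATOR . $includeRoot);

// The relative path is tried against each include_path entry in turn.
include_once 'Amazon/SimpleDB/.config.inc.php';

// Report which file the relative path actually resolved to.
$resolved = stream_resolve_include_path('Amazon/SimpleDB/.config.inc.php');
echo $resolved === false ? "not found\n" : "loaded from: $resolved\n";
```

Because the credentials file is reached through include_path rather than a path under DocumentRoot, the application code never needs to know where it physically lives.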

Oh yes, and it’s also necessary to define the $request variable in ListDomainsSample.php, but that’s a one-liner, as per $request = new Amazon_SimpleDB_Model_ListDomainsRequest();  With that done, the code lit right up and was able to produce a listing of the domains associated with the SimpleDB account.  If you have any questions or comments, please do not hesitate to contribute them below, and thanks for reading!

Code Shavings

Executing the sample application above initially resulted in an error message which, amongst other things, said "[u]nable to find the socket transport 'ssl' - did you forget to enable it when you configured PHP?"  Some Googling revealed this is due to the lack of the OpenSSL extension in the PHP configuration.  This problem was addressed by upgrading PHP in the development environment from the original, installed using php-5.2.6-win32-installer.msi, to a slightly more up-to-date version, installed with php-5.2.9-2-win32-installer.msi available from php.net.  The real trick, though, was to make sure that OpenSSL was selected at the step in the installation where extensions are chosen.  Somehow, I missed that the first time around.  The installation script even configures php.ini so it knows about the OpenSSL extension.

To this point, I wasn’t clear on precisely the way included files worked in PHP.  Turns out the file Amazon\SimpleDB\Client.php (for example) contains a single class called Amazon_SimpleDB_Client.  Note the name of the class mirrors the directory structure, except the slashes have been replaced with underscore characters.  This pattern appears to be adopted for all files in the Amazon library.

As with OpenSSL, the installation script for PHP is smart enough to know to add extension=php_xsl.dll to the php.ini file.
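The class-name-mirrors-directory pattern is what makes loading classes "on demand" possible.  The sketch below shows how such an autoloader might work; it is not the code from the Amazon library, and classNameToPath() and demoAutoload() are hypothetical names.  It also uses spl_autoload_register() rather than a bare __autoload function.

```php
<?php
// Map a class name such as Amazon_SimpleDB_Client to Amazon/SimpleDB/Client.php.
// Hypothetical helper, not part of the Amazon library.
function classNameToPath($className)
{
    return str_replace('_', '/', $className) . '.php';
}

// Called by PHP the first time an unknown class is referenced:
// search include_path for the matching file and load it if present.
function demoAutoload($className)
{
    $file = stream_resolve_include_path(classNameToPath($className));
    if ($file !== false) {
        include_once $file;
    }
}

spl_autoload_register('demoAutoload');

echo classNameToPath('Amazon_SimpleDB_Client'), "\n";
// prints "Amazon/SimpleDB/Client.php"
echo classNameToPath('Amazon_SimpleDB_Model_ListDomainsRequest'), "\n";
// prints "Amazon/SimpleDB/Model/ListDomainsRequest.php"
```

With a loader like this registered, new Amazon_SimpleDB_Model_ListDomainsRequest() needs no explicit include, provided the Amazon folder sits somewhere on include_path.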

Posted on 26th May 2009
Under: Developers' Journal | No Comments »

Collections of People in the Intellog Database

Collections of people are represented in the Intellog database by resolving a many-to-many relationship between base.Person and base.Team using the base.populates table.  This simple, three table structure is illustrated in the diagram to the immediate left.  It allows a given person to be a member of an unlimited number of teams, and for a given team to have an unlimited number of members.  All three of these tables already existed in the Intellog database, but a little cleanup was required.  In particular, the xml attribute in Person and Team had to be upgraded from the iSentence schema collection to iSentenceV1.  A migration of this type is still surprisingly awkward, unless I’m fundamentally missing something about casting between XMLs belonging to different schema collections.  It involved creating a second XML attribute validated with the iSentenceV1 schema collection, migrating the data from the old attribute to the new one, dropping the old one and finally renaming the new attribute to the same name as the old attribute.  Whew.

In addition to the cleanup exercise, it was considered necessary to create a base.putTeam stored procedure which is used to populate the base.Team table.  It implements a few assumptions: in addition to a verbose name, each team is identified with a unique four character label*.  It just seemed to be a handy middle-ground between the full uid and the verbose name of the team.  When base.putTeam is asked to create a new instance of base.Team, it queries to see if the lbl exists, and if it does, skips the creation of the new instance and displays the existing instance.  Surprisingly, neither the lbl nor the nm occupies its own attribute yet in base.Team; they are both found embedded in the xml attribute as an iSentence.  This may seem a little quirky to some.  However, while the database is in an evolutionary phase, it doesn’t make sense to be adding and deleting attributes (i.e. columns) all the time.  Also, base.putPerson has not been usable with base.Person for quite some time, so it was fixed up to make adding instances to base.Person a little easier.
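The check-then-create behaviour of base.putTeam can be sketched outside the database.  The example below is an illustration under several assumptions: an in-memory SQLite database (via PDO, which requires the pdo_sqlite extension) stands in for the real one, lbl and nm are given their own columns instead of living inside the xml attribute, and putTeam() here is a hypothetical PHP function rather than the actual stored procedure.

```php
<?php
// In-memory SQLite stands in for the real database; the column layout is an assumption.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE Team (uid INTEGER PRIMARY KEY, lbl TEXT UNIQUE, nm TEXT)');

// Hypothetical analogue of base.putTeam: return the existing row for a label,
// or create a new row if the label has not been seen before.
function putTeam(PDO $db, $lbl, $nm)
{
    $find = $db->prepare('SELECT uid, lbl, nm FROM Team WHERE lbl = ?');
    $find->execute(array($lbl));
    $existing = $find->fetch(PDO::FETCH_ASSOC);
    if ($existing !== false) {
        // Label already present: skip creation and hand back the existing instance.
        $existing['uid'] = (int) $existing['uid'];
        return $existing;
    }
    $insert = $db->prepare('INSERT INTO Team (lbl, nm) VALUES (?, ?)');
    $insert->execute(array($lbl, $nm));
    return array('uid' => (int) $db->lastInsertId(), 'lbl' => $lbl, 'nm' => $nm);
}

$first  = putTeam($db, 'ACME', 'Acme Drilling Ltd.');
$second = putTeam($db, 'ACME', 'A different verbose name');

// Both calls refer to the same row; the second verbose name is ignored.
echo $first['uid'] === $second['uid'] ? "same team\n" : "different teams\n";
```

The label acts as the natural key, so calling the routine twice with the same label is harmless.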

With this done, Person and Team were cleaned up and their population completed to support the testing of the Roundabout application, the development of which is ramping up again.  To make it easy to verify the accuracy of the population of all tables, the base.TeamMember VIEW was created which INNER JOINs base.populates to base.Person and base.Team and displays the verbose names of both.
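For readers who want to see the three table structure and the verification view in action, here is a rough sketch.  It assumes an in-memory SQLite database through PDO (pdo_sqlite must be enabled) in place of the real one, drops the base schema prefix, and invents the column names person_uid, team_uid and nm along with the sample rows.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Two entity tables plus a linking table that resolves the many-to-many relationship.
$db->exec('CREATE TABLE Person (uid INTEGER PRIMARY KEY, nm TEXT)');
$db->exec('CREATE TABLE Team (uid INTEGER PRIMARY KEY, nm TEXT)');
$db->exec('CREATE TABLE populates (
    person_uid INTEGER REFERENCES Person(uid),
    team_uid   INTEGER REFERENCES Team(uid),
    PRIMARY KEY (person_uid, team_uid))');

// Analogue of base.TeamMember: join the linking table to both sides
// and expose the verbose names only.
$db->exec('CREATE VIEW TeamMember AS
    SELECT t.nm AS team, p.nm AS person
    FROM populates AS m
    INNER JOIN Person AS p ON p.uid = m.person_uid
    INNER JOIN Team   AS t ON t.uid = m.team_uid');

// Invented sample data: Alice is on two teams, Drilling has two members.
$db->exec("INSERT INTO Person (uid, nm) VALUES (1, 'Alice'), (2, 'Bob')");
$db->exec("INSERT INTO Team (uid, nm) VALUES (1, 'Geology'), (2, 'Drilling')");
$db->exec('INSERT INTO populates (person_uid, team_uid) VALUES (1, 1), (1, 2), (2, 2)');

$rows = $db->query('SELECT team, person FROM TeamMember ORDER BY team, person')
           ->fetchAll(PDO::FETCH_ASSOC);
foreach ($rows as $row) {
    echo $row['team'], ': ', $row['person'], "\n";
}
```

Reading the view instead of the raw linking table makes population errors easy to spot, since uids are replaced with names a person can recognize.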

*This started life as the ERCB’s ST104A code, but was extended to cover off companies not in the ERCB database.

Posted on 15th May 2009
Under: Developers' Journal | No Comments »

The Basic Login Experience

The previous post described some of the objectives of Intellog customer* authentication, and this post provides some guidelines for the design and implementation of the login process.  This is a tad more than simply identifying a screen layout (see mock up, left), and describing functionality.  It also tries to capture some of the other aspects of the ‘experience’ which will have an impact on the customer’s interaction with Intellog.  Everything below assumes the customer profile has already been set up.  Also, the process for changing the customer profile is beyond the scope of this post.

The single most visible artifact of the login process is the string of characters (that is, the user ID) used to identify a given customer to the site.  Because it is top-of-mind, the customer’s email address is the user ID on Intellog.  Once the customer enters their email address, and clicks the Login button at the bottom right, the email address is used as a key to retrieve the customer’s profile information.  If no profile can be retrieved based on this key, then it’s assumed the email address was either typed incorrectly, or no profile exists for that customer.  The login screen is recycled with some error messages displayed.  No information from the profile is displayed to the user until the round trip to the identity provider** (IP), described immediately below, has been completed.

Amongst a variety of other information, the customer’s profile contains the IP to which they normally authenticate.  Which specific IP they are using will determine what appears next.  If they are authenticating to Google, for example, the Google-specific login pages will appear.  If they are authenticating to an Intellog-provided OpenID, Intellog-specific login pages will appear.  Noticeable by its absence is a field in which to enter a password.  Supplying a password or other authentication information is delegated to the login page(s) from the IP.  This also leaves the IP with sole responsibility for handling a ‘hot’ combination of user ID and password and/or other authentication information.
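The first leg of the login, looking up the profile by email address and then handing off to the customer’s IP, might be sketched as follows.  Everything named here is hypothetical: the in-memory $profiles array stands in for the real profile store, the idp.example URLs are placeholders, and normalizing the email address to lower case is an assumption rather than a stated design decision.

```php
<?php
// Hypothetical profile store keyed by email address.
$profiles = array(
    'pat@example.com' => array('ip' => 'google'),
    'sam@example.com' => array('ip' => 'intellog-openid'),
);

// Placeholder login endpoints for each identity provider.
$loginPages = array(
    'google'          => 'https://idp.example/google-login',
    'intellog-openid' => 'https://idp.example/intellog-login',
);

// Decide what the customer sees after clicking Login.
function nextLoginStep($email, array $profiles, array $loginPages)
{
    $key = strtolower(trim($email));   // assumption: addresses are compared case-insensitively
    if (!isset($profiles[$key])) {
        // Mistyped address or no profile: recycle the login screen with an error.
        return array('action' => 'retry', 'message' => 'Email address not recognized.');
    }
    $ip = $profiles[$key]['ip'];
    if (!isset($loginPages[$ip])) {
        return array('action' => 'retry', 'message' => 'Identity provider not supported.');
    }
    // No password is collected here; that is left entirely to the IP's own pages.
    return array('action' => 'redirect', 'url' => $loginPages[$ip]);
}

$step = nextLoginStep(' Pat@Example.com ', $profiles, $loginPages);
echo $step['action'], "\n";
```

Note the function returns only an action and a destination, never any profile details, in keeping with the rule that nothing from the profile is shown before the round trip to the IP.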

The Remember Me checkbox really isn’t doing much on this screen — if it has been checked, it’s assumed the value from the Email Address field entered previously (and stored in a cookie, most likely) is automatically used to retrieve the customer profile.  This is followed by automatic navigation to the IP associated with the email address.  If it’s the same customer using the client machine as last time, they will be able to provide credentials to the IP and be authenticated.  If it’s a different customer using the machine, they obviously won’t have a clue what credentials are required, and they’re dead in the water. 

Unless…unless…the IP also has its own Remember Me or equivalent facility, which allows credentials to be cached on the local machine, and supplied automatically.  If the customer elects to use the IP’s Remember Me facility, they have tied the security of their account to the physical security of their client machine.  That’s their choice, of course, but one which would be hard to recommend to anyone in their right mind.

It’s assumed the customer authenticates to their IP, and they have elected to send some of their IP-stored profile information back to an Intellog page.  The profile information returned by the IP is compared to what had previously been stored in the Intellog customer profile.  If there are any differences, the information from the IP is considered authoritative and overwrites the Intellog customer profile information.  This is aligned with a philosophy that  in the future, in a galaxy far, far away, security credentials will be managed in exactly one place, and they will propagate out to whichever services a given person uses.  Change your security credentials in this fabled ‘one place’, and they will automagically appear everywhere.
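The "IP is authoritative" rule amounts to a field-by-field overwrite.  A minimal sketch, assuming both profiles are flat associative arrays (the field names first, last, company and ip are invented) and using a hypothetical reconcileProfile() function:

```php
<?php
// Overwrite locally stored fields with whatever the identity provider returned.
// Fields the IP did not send are left untouched.
function reconcileProfile(array $stored, array $fromIp)
{
    $changed = array();
    foreach ($fromIp as $field => $value) {
        if (!array_key_exists($field, $stored) || $stored[$field] !== $value) {
            $changed[] = $field;
            $stored[$field] = $value;   // the IP's value wins
        }
    }
    return array($stored, $changed);
}

list($profile, $changed) = reconcileProfile(
    array('first' => 'Pat', 'last' => 'Smith', 'company' => 'Acme', 'ip' => 'google'),
    array('first' => 'Pat', 'last' => 'Jones', 'company' => 'Acme')
);

echo 'updated fields: ', implode(', ', $changed), "\n";
// prints "updated fields: last"
```

Returning the list of changed fields alongside the merged profile makes it easy to decide whether the stored Intellog profile needs to be written back at all.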

With all that done, the customer will come back to the first application page available to logged in users.  The most telling evidence of this fact will be the appearance of the user’s first name, last name and company affiliation in the top right of the standard Intellog header bar, and a Logout button on the footer bar.

*User had a very pejorative ring to it, so it is henceforth expunged in favour of the much jauntier sounding customer.

**See previous post for full discussion of the concept of identity provider.

Posted on 12th May 2009
Under: Developers' Journal | No Comments »

Intellog User Authentication Manifesto

After a round of marketing meetings related to the Onramp beta release, attention returns to development issues.  Next up, finalizing the details for the user authentication process previously discussed in the blog posts Roundabout OpenID Login Workflow and Implementing OpenID with PHP.  Much of what is contained below reflects some rapidly evolving trends in identity management, coupled with a more developed understanding of how users will interact with Intellog.

As much as I would like to say it’s going to be an OpenID world (and some day, I hope it is), there is clearly going to be a lot of competition for the user identity parapet.  The likes of Google, Microsoft, Yahoo, Twitter, AOL and others all see their native user identification as the centre of their customers’ universe.  They won’t encourage user migration to an open standard they do not control, because there is just too much potential value in knowing what other sites their customers are visiting and using.  In addition, the sheer number* of users with identities on these major sites means OpenID will have to co-exist with these other de facto identity providers for quite some time to come.  Therefore, the Intellog authentication system needs to reflect not only OpenID, but the identities provided by the other major sites.

It’s logical to assume, therefore, that users coming to Intellog will already have an online identity (of which OpenID is just one possibility) and may want to use it, instead of physically re-entering all their user information for an Intellog-issued identity and profile.  To the greatest possible degree, therefore, the details of an existing online identity should be transferable to the user’s Intellog identity and profile, subject to the user’s approval, of course.

At the same time, there will be users who have an online identity as noted above, but don’t care to use it anywhere other than the site which issued it to them.  It’s also possible a very small percentage of potential users will have no online identity at all, or at least none of which they are aware.  In either of these cases, it would be desirable for Intellog to issue a branded user identity of its own.  This is preferable to sending them off to a third party to get an identity and potentially losing them on the return trip to Intellog.  Identities issued by Intellog should be OpenIDs, to remain aligned with the trend to separate identity management concerns from content provision, which is likely to be of increasing importance in the future.

Subsequently, when an Intellog user logs in, they will identify their particular identity provider (IP), and will then work through the authentication process required by that particular IP.  Assuming they successfully authenticate themselves, they would then automatically come back to Intellog with some sort of identifier which could be used to retrieve their Intellog profile.  Following this, an Intellog session will be initiated to maintain their ‘logged in’ status and other session-specific information.  This session will exist until such time as they completely close their browser.

In summary, and to use OpenID parlance, Intellog must have the ability to act as a relying party for OpenID and all the other de facto identity providers.  It would also be desirable for Intellog to be an OpenID identity provider, which would enable potential users to create an OpenID which would then be used to log into the Intellog site.  In this latter case, it will be important to have the creation of the OpenID and subsequent login as ‘branded’ an experience as possible.  This is because there may be some users unfamiliar with OpenID and shared identity concepts, who may be put off by seemingly having their personal information shared between multiple parties.

Next up, some specific technologies which would appear to address the requirements outlined above.

*A quote attributed to Scott McNealy (Sun) was "a standard is anything shipping in numbers".

Posted on 11th May 2009
Under: Developers' Journal | 1 Comment »