@madpilot makes

Why web apps can get away with being in beta for a long time

The Wall Street Journal website recently posted an article entitled “WSJ.com – For Some Technology Companies, ‘Beta’ Becomes a Long-Term Label”, which asked how companies can get away with leaving software in a “beta” state for so long. Google is notorious for doing it – Gmail has been around for at least two years and is still tagged as beta. In fact, many people are using Gmail as their primary email address. The article offers many real-world analogies for why you wouldn’t do this elsewhere, so I won’t bother repeating them here.

It does bring up an interesting point though. How come web software can get away with, nay, thrive on, releasing beta software?

Traditionally, the main thrust of software was the underlying business logic. Only recently have user interfaces and user experience become important. It is often impossible to know exactly how users are going to react when they start to use an interface. Being in a constant state of beta allows designers to keep tweaking elements of the design. If things change drastically, users will put it down to the fact that the site is still in beta.

This is also relevant when adding features. As a developer, you may find a new “must have” feature that you are sure your web site requires. It is a more effective use of your time to build a version quickly so that you can gauge its popularity, and then concentrate on making it robust once you know it is worth your while. You couldn’t get away with this without being in a beta state.

Obviously, this isn’t a mechanism for everyone. I wouldn’t even think about using a security product that was in beta. But for web apps that aren’t doing anything mission critical, it should be fine.

Cookie-less sessions in PHP

There is a fairly good chance that all but the most trivial dynamic sites will use sessions to emulate a stateful environment. Before the creation of sessions, the developer would have to manually pass all of the variables the next page needed via hidden input fields or cookies. As you can imagine, this is a bit of a security risk. Not to mention that cookies are limited to 4k each – if you have a large number of variables, this can quickly run out.

PHP, like most server-side languages, can handle your sessions for you – but how does PHP keep track of sessions? There are two ways: using cookies, or passing the session id in URLs and hidden form fields. The former is preferred as it is easier and more secure (but not without its risks; check out the PHP session documentation for more info on the safe use of session variables). PHP writes a cookie which holds the PHPSESSID variable. It looks for this value when session_start() is called and, if it exists, matches it to the stored session. But what happens when cookies aren’t available or are turned off? PHP is smart enough to work this out and will re-write all links and forms on the fly to ensure that the PHPSESSID is passed on every action that will refresh the page.

Well, almost. When I code, I prefer to post back to the same page, do some processing (such as error validation and storing of required session variables) and then use header("Location: ...") to re-direct the user. This makes coding much easier because all of the business logic for that page is kept in one file. Unfortunately, PHP isn’t smart enough to re-write these redirects, so we will have to help it out a bit.

append_sid

The following function will look at a URL (whether it be relative or absolute) and append the SID if and only if PHP is in cookie-less mode. I couldn’t find a PHP function that will tell you whether cookies are working, but after a bit of thought it was quite easy. PHP has a super-global called $_COOKIE, which holds the PHPSESSID value whenever the browser has sent the session cookie back. By checking whether it exists (after a call to session_start()) we can work out whether we are in cookie or cookie-less mode:

function append_sid($link) {
    // session_id() returns an empty string (never NULL) when no session is active
    if(session_id() !== '' && !isset($_COOKIE['PHPSESSID'])) {
        if(strpos($link, "?") === FALSE) {
            return $link . "?PHPSESSID=" . session_id();
        } else {
            return $link . "&PHPSESSID=" . session_id();
        }
    } else {
        return $link;
    }
}

Basically, the function checks whether the session has been started – if it has, it then checks whether the cookie entry exists. If it does, it just returns the link unchanged; otherwise it appends the session id. The innermost if statement checks for a ‘?’. If one exists, we can assume that the link already has some GET parameters, so we append the PHPSESSID using an ‘&’. Otherwise, we just append it with a ‘?’.
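As a rough sketch of how this plugs into the post-back-and-redirect approach described earlier, the variant below takes the session id and the cookie state as arguments, so the logic can be tried out in isolation. The name append_sid_to, its parameters and thanks.php are placeholders of mine rather than part of the function above, and the urlencode() call is an extra precaution I have assumed.

```php
<?php
// Hypothetical helper: same decision as append_sid(), but with the session id
// and the "did the browser send the cookie?" flag passed in explicitly.
function append_sid_to($link, $sid, $cookieSeen) {
    // No active session, or cookies are working: leave the link alone
    if ($sid === '' || $cookieSeen) {
        return $link;
    }
    // Use '&' if the link already carries GET parameters, '?' otherwise
    $separator = (strpos($link, '?') === false) ? '?' : '&';
    return $link . $separator . 'PHPSESSID=' . urlencode($sid);
}

// Typical use after processing a post-back (commented out, since header()
// only makes sense inside a live request):
// session_start();
// header('Location: ' . append_sid_to('thanks.php', session_id(), isset($_COOKIE['PHPSESSID'])));
// exit;
```

Calling exit straight after the redirect header stops the rest of the page from running, which is usually what you want after a post-back.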

JavaScript

Don’t forget that if you are re-directing the client using JavaScript, or using sessions when making AJAX calls, you need to pass the session id as well. The best way to do that is client side, especially if you are generating parameters on the fly. The same theory applies, except we will use JavaScript this time:

function append_sid(link) {
    <?php if(session_id() !== '' && !isset($_COOKIE['PHPSESSID'])) { ?>
        var session_id = '<?php echo session_id(); ?>';
        if(link.indexOf('?') == -1) {
            return link + '?PHPSESSID=' + session_id;
        } else {
            return link + '&PHPSESSID=' + session_id;
        }
    <?php } else { ?>
        return link;
    <?php } ?>
}

The reason there is PHP code interleaved (I know. I hate it too) is because we need to work out whether the session has actually been started and we need the session_id. However, we don’t want to display the session id UNLESS there is a requirement for it as it can be a security risk.

A caveat – passing the session id by URL requires session.use_trans_sid to be set to 1. Most shared hosts turn this off, but you can usually override it by putting:

php_value session.use_trans_sid 1

in the .htaccess file. You can also try the function:

ini_set('session.use_trans_sid', '1');

on every PHP page, before session_start() is called.

Now, hopefully, you won’t be cutting off those users that don’t use cookies!

Developing for Accessibility

Accessibility, as the name suggests, is about making websites accessible to as many users as possible. Unfortunately, not everyone on the planet has 20/20 vision, or full use of their hands, which can make using traditional web sites difficult – why should they be disadvantaged by not being able to harness the plethora of information out in cyberspace? Even beyond that, there are many able-bodied web users who use non-standard browsers and software to access websites. PDAs, mobile phones and hand-held game machines (such as the Sony PSP) are examples of systems with restricted screen sizes and memory footprints, which makes packing in a full-blown web browser impossible. Another benefit of an accessible website is that it makes the search engine robots’ job much easier.

Different ways of using the web

The web is no longer confined to the walls of universities, or to geeky tech types – 14 million people in Australia have access to the internet [Nielsen/NetRatings]. Even if you ignored just 5% of that audience, that is potentially 700,000 people who cannot access your website. This is especially relevant to government websites, which theoretically should be targeting everyone.

There is no way of guaranteeing that all of these users are using standard desktop setups either. Some variations include:

  • Screen readers: These programs are used by sight-impaired users. They read in the HTML markup and read it out using a synthesised voice. These systems tend not to understand JavaScript or Flash.
  • Braille readers: Again used by sight-impaired users. Very much like a screen reader, except they output the page in Braille.
  • PDA/Mobile phone: Very handy for those who are out on the road a lot. These usually have a reduced screen size and limited memory. They also have different methods of inputting data, so things like JavaScript menus will probably not work.
  • Slow internet connections: People out in the bush (or even the suburbs) may not be able to get broadband, or reliable modem connections. So there may still be a number of people who try to reduce bandwidth by switching off images.
  • Text-only browsers: Not really sure why you would still use these other than because you can – although, they can provide a good indication of what your site looks like to a search engine spider.

Accessibility and Standards

The basics of accessibility can be achieved by following the World Wide Web Consortium (W3C) standards documents. These standards are about creating semantic, structured layout that can easily be interpreted by any system that understands the standards. The standards separate the structure layer from the presentation layer by using Cascading Style Sheets (CSS), which effectively gives the designer control over how an element looks without destroying the document structure. The names of HTML tags have been carefully selected to describe document elements. Unfortunately, in the old days, many designers abused the default display properties of some elements, misusing them to present information in a certain style.

For example, an old trick to indent text was to wrap it in <ul> tags. It is hard to imagine how a screen reader would handle this – it comes across something it expects to be an un-ordered list, only to find a block of text. The correct way of doing this is to wrap it in a block-level tag such as <div> or <p>, and use the CSS property padding-left to indent the whole block.

Before CSS, it was common to mix up the order of items in the HTML to fit the presentation of the page. This means that a user running a screen reader would have the contents read out in the wrong order… Not very usable at all.

Don’t rely on colour

Colour blindness is a condition that stops the sufferer from perceiving certain colours. This can create two potential problems for the web designer – foreground-background contrast and colour for notification. Any user with normal vision knows how hard it is to read dark blue text on a black background – because the contrast between the two colours is not enough. However, there are certain colour combinations that would be fine for non-colour-blind people, but would not work for those who have the deficiency.

To be continued… [I’ve to go to my exam now – this will be interesting]

Deployment Systems

I recently read about the deployment model that Flickr uses. After picking myself up from the floor, it made me think about my own deployment methods, and I have started implementing a new system, which I thought I would share.

First of all, let me describe the types of projects that I have to deploy:

  1. Internal MadPilot or Personal projects – i.e. stuff hosted on the server at my house (which also doubles as my development server)
  2. External projects that I can take a local copy of to work on
  3. External projects that I am forced to work on “off-site”, i.e. on the client’s server.

The deployment system I will describe here covers project type 1 and 2. Not much I can really do with 3, as I have no control over other people’s servers…

Most of the projects I work on, I’m the only developer, but the system described is designed to scale to multiple developers.

Step 1: Set up a subversion repository

I have my subversion repository structured as such: [svn_root]/MadPilot/Clients/[client_name]/[job_name]/

Because I do a lot of outsourced work, I use the job_name to denote different jobs for the same client. If the client is a one off (i.e. the client isn’t another web company) I will usually leave this off. The website code will be stored in the Website directory. This allows me to store documentation and other bits and bobs outside the development tree.

Step 2: Checkout the empty tree

Next I create development and test directories within my webserver, usually of the form [web_root]/clients/[client_name]/[job_name]/dev/[developer_name] and [web_root]/clients/[client_name]/[job_name]/test.

I then check out the empty subversion tree to all of the directories so that they come under source control.

Step 3: Setup the webserver

At this point I will set up a virtual host for the client of the form [developer].dev.[job_name].[client_name].clients.madpilot.com.au and test.[job_name].[client_name].clients.madpilot.com.au (Yes, this does end up with stupidly long URLS, but I prefer them to be descriptive. No one sees them other than the client anyway…)

Step 4: Import the initial tree

It is at this point that I download the existing site, or start afresh if it is a from-scratch job. I then add all the files to the repository.

Step 5: Setup any databases

I try to make sure that every developer has a separate database, as does the test system. These are usually named [client_name]_[job_name]_[developer_name].
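The naming conventions from steps 2, 3 and 5 could be collected in one small helper, sketched below. The function name project_names, the array keys, the example values and the "_test" suffix for the test database are all assumptions of mine; only the directory, host and database patterns come from the steps above.

```php
<?php
// Hypothetical helper that derives every name used by the deployment system
// from the client, job and developer names.
function project_names($webRoot, $client, $job, $developer) {
    $base   = rtrim($webRoot, '/') . "/clients/$client/$job";
    $domain = "$job.$client.clients.madpilot.com.au";

    return array(
        'dev_dir'   => "$base/dev/$developer",
        'test_dir'  => "$base/test",
        'dev_host'  => "$developer.dev.$domain",
        'test_host' => "test.$domain",
        'dev_db'    => "{$client}_{$job}_{$developer}",
        // Assumption: the test system's database follows the same pattern
        'test_db'   => "{$client}_{$job}_test",
    );
}

$names = project_names('/var/www/', 'acme', 'shop', 'myles');
```

Keeping the patterns in one place means a new developer or a new job only needs three names to get a full set of directories, hosts and databases.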

Using the system

When you start working on a project, make sure you log in to your working directory on the server (usually via SSH) and do a subversion check out. When you finish, check your work back in. When the system is ready for testing by the client, check everything out to the test site.

Using subversion means that you have a versioned backup of all the code, and you can manage multiple developers and a test system at the same time.

Going to production

This can be the tricky bit. For projects on my server, I usually just do an export into the production directory. If the system was tested well enough, it should just work. Gotchas include paths to local files and permissions, but this should be documented anyway.

On external systems, this can be a little more difficult. At the moment, I export the code into a temporary directory, and manually FTP the files up. I hope to be able to automate this in some way, even if it will just reduce my typing.

Next thing to do is to write a nice web front end to subversion so I can do “one-click” deployment… Ahhh :)

So that is my plan – what does everyone else do?

PHP 5 and MVC

A quick entry today, as I really should be working.

I came across a PHP MVC (Model/View/Controller) framework for PHP 5 – You can think of it as the Rails bit of Ruby on Rails. It is this sort of stuff that will continue to push PHP into the spotlight and allow it to compete with the big boys…

It is called Agavi PHP MVC Framework

Web 2.0 – The Ultimate Collaborative Development Environment?

Web 2.0 is about data exchange and classification. Taking a high-level look at the software design process, one of the areas that is good in theory but fails in practice is exactly that: data exchange and classification.

Think about the last software project you worked on – would a new programmer be able to pick up the documentation and work out what is going on? Would they be able to find the documentation? Was there documentation at all?

Documentation is all about data exchange – I bet if you went through your email or project mailing lists you could piece together a decent amount of usable documentation (Implementation decisions, solutions, bug reports etc) – Can Web 2.0 provide the glue to ease the burden of documentation?

Imagine being able to tag bugs and design decisions so searching for a bug returns a link that points to the design documents and comments entered at check-in. This gives the developer background information quickly – and these docs are living and more easily maintainable. As a developer, you don’t have to change to documentation mode – it is already part of your work flow (Other than remembering to enter check-in comments – but you already do that don’t you?)
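To make the idea a little more concrete, here is a minimal sketch of tag-based lookup over a mixed bag of project items. Everything in it (the items_tagged function, the array layout and the sample entries) is made up for illustration; a real system would sit on top of a database and a proper search index.

```php
<?php
// Hypothetical in-memory store: bugs, design notes and check-in comments
// all live in one list and are distinguished only by their tags.
$items = array(
    array('title' => 'Bug: login fails without cookies',          'tags' => array('bug', 'sessions')),
    array('title' => 'Design note: cookie-less session handling', 'tags' => array('design', 'sessions')),
    array('title' => 'Check-in comment: append SID to redirects', 'tags' => array('checkin', 'sessions')),
    array('title' => 'Meeting minutes',                           'tags' => array('minutes')),
);

// Return the titles of every item carrying the given tag, in stored order
function items_tagged($items, $tag) {
    $matches = array();
    foreach ($items as $item) {
        if (in_array($tag, $item['tags'], true)) {
            $matches[] = $item['title'];
        }
    }
    return $matches;
}
```

Searching for the "sessions" tag here would pull back the bug, the design note and the check-in comment together, which is the kind of background a developer wants when picking up that bug.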

We don’t necessarily have to limit this to documentation either. Project managers could use it to gauge where the project is at, and clients could actually report bugs without having to learn an obscure bug reporting interface or software terminology. How about timesheets? Add an entry stating you started work at a particular time, and then another entry saying you finished at a particular time – tag it with the project name, maybe what part of the project you were working on – Voila! Or meeting minutes – these often have action items interleaved – tag them and make them searchable. Tag them as completed when you are done.

But I think the best bit is that it wouldn’t require anything more than an email client or a blog-like web interface. EVERYTHING IN ONE PLACE! I know I personally hate having to log in to different apps for bugs, meeting minutes etc. We should be letting the software organise our data, and leveraging search.

I think I will be revisiting this one soon…

Design Patterns in PHP

I was going through the posts from my old, now-defunct blog, seeing if there was anything I could bring over here – it is amazing how much can change in a year. There was an article I wrote about the overuse of cool techniques. In that article, I made mention of some new-fangled technique called “Design Patterns”. At that point, I had no idea what they were and frankly couldn’t care less.

Well, after being forced to look at them more closely for a uni assignment, I’m kind of a little hooked.

Design patterns are abstract solutions to common problems. Huh? Yeah, that is what I said. Many programmers strive towards code re-use. Design Patterns encourage thought re-use. Why re-invent the wheel? And because they are abstract (i.e. no code), they can be “ported” to different languages easily.

The Gang of Four introduced 23 of the things. They put the challenge out for the discovery of more, but it hasn’t been too successful, as it is believed that almost all problems can be broken down to fit a composition of these patterns.

To implement many of these design patterns correctly, you really need OOP features such as Abstraction and Interfacing. As I have pointed out many times, PHP4 doesn’t have these. PHP5 does. However! if you think about it, you can still use design patterns in PHP4 — you just need to be a little bit careful.

I won’t go through every design pattern just now, but I’ll outline one, and maybe add to them in the future. [By the way – an awesome book to use for getting your head around design patterns is “Head First Design Patterns” – look it up on Amazon]

The Strategy Pattern

Programmers often get into the habit of extending classes to change functionality – maybe because it is the easiest OOP feature to understand. It can lead to problems though when you need to change implementations in a base class. I’ll use an example from a system I was writing the other day: the ubiquitous PHP email sender. The job I was doing required two types of email to be sent – a run-of-the-mill text email and a text email with an attachment.

Because I’m always trying to build my code libraries up, I decided to create a set of classes that will do the job. Now, pre-design patterns, something like this may have happened:

[Figure: Possible Email class design using extension]
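In code, an extension-based design along those lines might look roughly like the sketch below. This is my guess at the shape of such a hierarchy for illustration only: the class names PlainEmail and AttachmentEmail, the buildMessage() method and the placeholder attachment handling are all assumptions, and it uses PHP 5 syntax.

```php
<?php
// Hypothetical base class: a run-of-the-mill text email
class PlainEmail {
    protected $to;
    protected $from;
    protected $subject;
    protected $body;

    public function __construct($to, $from, $subject, $body) {
        $this->to      = $to;
        $this->from    = $from;
        $this->subject = $subject;
        $this->body    = $body;
    }

    public function buildMessage() {
        return $this->body;
    }
}

// Hypothetical subclass: changes behaviour by extending the base class
class AttachmentEmail extends PlainEmail {
    protected $attachments = array();

    public function addAttachment($filename) {
        $this->attachments[] = $filename;
    }

    public function buildMessage() {
        // Placeholder only: real MIME encoding of the files is left out
        return parent::buildMessage()
            . "\n[attachments: " . implode(', ', $this->attachments) . "]";
    }
}
```

Every new kind of email means another subclass hanging off this tree, and any change to the base class ripples through all of them.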

Nothing wrong so far, but what happens if we want to add an HTML email? We could probably add another class below the attachments, because the HTML component is an attachment I guess, so we add the HTML file as one of the attachments at design time. This isn’t a great solution though, because one of the necessary bits of an HTML email is the HTML content, and you could theoretically remove it. How about we add another variable specifically for the HTML copy? That would also work, but what if the client decided that they wanted to select between HTML and text emails dynamically at run time? You would end up with something like this code-wise:

switch($mailType) {
  case "text":
    $email->setBody($text);
    break;
  case "html":
    $email->setHTMLBody($htmlText);
    break;
}

(This is a bit of a contrived example, so run with me here)

Now what happens if you want to add another type of email? You have to modify the case statement and test the whole thing all over again. Everyone hates testing – especially code that you know used to work that you had to change. Two keywords: CHANGE and TESTING. Minimise both and we as programmers are happy.

Now my solution was to create a class that has everything an email needs – To, From, Subject and a repository for headers (plus a few other bits and pieces, which I won’t mention to reduce clutter). It also has a function called send(). Wow, ground breaking. Where the design gets a little strange is that there is another variable, called $emailType. This variable stores a reference to a class of type EmailType (well, it pretends to; PHP4 doesn’t support interfaces). So, any class that implements EmailType can be stored in that variable. One of the abstract methods (again, let’s pretend it is abstract, as PHP4 won’t understand) is the createMessage() function. This is where the magic occurs…

Each class that implements that interface knows exactly how its message needs to be constructed. The base email class doesn’t care – as long as it gets a string to tack on to the email it is happy. The creation of messages is de-coupled, meaning you can create a new email type without changing any existing code (as long as it implements the interface correctly).
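Here is a minimal sketch of that design, written in PHP 5 syntax so the interface can be declared for real rather than pretended. Only send(), $emailType, EmailType and createMessage() come from the description above; the TextEmail and HtmlEmail classes, setEmailType(), buildBody() and the constructor arguments are names I have made up for illustration, and headers beyond From are left out.

```php
<?php
// The strategy interface: each email type knows how to build its own message
interface EmailType {
    public function createMessage();
}

// Hypothetical strategy: plain text body
class TextEmail implements EmailType {
    private $text;
    public function __construct($text) { $this->text = $text; }
    public function createMessage() { return $this->text; }
}

// Hypothetical strategy: wraps the copy in a bare HTML document
class HtmlEmail implements EmailType {
    private $html;
    public function __construct($html) { $this->html = $html; }
    public function createMessage() {
        return '<html><body>' . $this->html . '</body></html>';
    }
}

// The base email class only cares that it gets a string back
class Email {
    public $to, $from, $subject;
    private $emailType;

    public function __construct($to, $from, $subject, EmailType $emailType) {
        $this->to        = $to;
        $this->from      = $from;
        $this->subject   = $subject;
        $this->emailType = $emailType;
    }

    // Swap the strategy at run time, with no switch statement in sight
    public function setEmailType(EmailType $emailType) {
        $this->emailType = $emailType;
    }

    public function buildBody() {
        return $this->emailType->createMessage();
    }

    public function send() {
        // A real HTML email would also need a Content-Type header here
        return mail($this->to, $this->subject, $this->buildBody(), 'From: ' . $this->from);
    }
}
```

Adding a new kind of email then means writing one new class that implements EmailType; the Email class and the existing types stay untouched, so there is nothing old to re-test.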

This is an implementation of the strategy pattern!

Formal Description [Gang of Four]: The strategy pattern defines a family of algorithms, encapsulates each one, and makes them interchangeable. Strategy lets the algorithm vary independently from clients that use it.

Which is what we just did… More on design patterns later.

Dead-man walking… bar the 11th hour reprieve.

It would be pretty safe to say that PHP is the language of choice for freelance developers and boutique designers. I suppose that this stems from the fact that it is freely available, easy to set up and easy to administer, which results in almost every web host (in Australia at least) running it. In fact, most hosts over here ONLY run PHP.

It would be interesting to see why ASP.NET hasn’t got more market penetration than it has – as much as I hate to say it, I’m putting my money on the fact it costs money.

CGI/PERL scripts are quickly fading away into obscurity – I remember when they were the only choice you had. It’s a security and maintenance thing. RIP mod_perl.

ColdFusion support over here in Oz is virtually non-existent. There are two hosts here in Perth, although it still seems popular in government and places that host their own servers.

New, funky languages such as Python and Ruby are still in the minority. From a configuration stand-point, Python is no better than PERL. It is still effectively a scripting language that has been extended to work with the web. At least PHP was specifically designed as such. Ruby is too new and support intensive. I hope that this changes. It looks cool (although I don’t like the syntax – I’m a fan of the curly bracket. So shoot me).

It seems that PHP was in the right place at the right time. When it arrived, the dynamic web was still in its infancy – you had PERL and that was it. It allowed pretty complex systems to be built easily and cheaply (sure, ColdFusion and ASP were around then, but were usually out of reach of the tinkerer), and now it has so much history that no one wants to let go.

So we are left with old-trusty PHP…

Now, don’t get me wrong – PHP has treated me well. In fact, up until recently, PHP accounted for 95% of the coding that I did. But the lack of some object oriented features irked me – in particular exceptions and interfaces. It is very difficult to implement design patterns properly without the ability to abstract, overload or interface.

Now before any of you point out that PHP 5 has implemented most of this stuff, let me cut you off at the pass. I know PHP 5 supports many of the things that I’ve been waiting for, but I’m still stuck using PHP 4 for day-to-day work. Why? BECAUSE YOU CAN’T RUN THEM BOTH AT THE SAME TIME – well, not easily anyway.

I’ve been working with the ASP.NET 2 Beta for the last couple of months and it has no problems co-existing with ASP.NET 1.x; all you do is change a setting in IIS and you are away. Why then, do you have to jump through so many hoops to get PHP 4 and 5 to co-exist? It is for this reason that most web hosting companies will not support 5 for a long time. From an economic point-of-view, they would be stupid to. Most applications run on 4. Not all of these applications will run on 5.

I have come up with a work-around, which is a pretty neat solution (I’ll present it later on) – it does require two different apache binaries to run and is a bit of a hassle to set up, but it does the job quite well (it would do it even better if I had more than one IP address). But I shouldn’t have to use work-arounds.

Perhaps we will be left stifled by the shortcomings of PHP 4, at least in our bread-and-butter jobs, for a while yet. Perhaps one of the “killer” frameworks will eventually hit critical mass and knock it off its podium. Then again. Maybe not…

Blog number 1 Part II

Talking to the folk at Port80 tonight, I have been re-inspired to start blogging again. This may have something to do with all of the excitement around web essentials 05 (which I missed – Phooey!). You see, all of those from Perth lucky enough to go have gotten me even more excited about the web industry, and I want to contribute back. So in a way you have Miles, Kay and Adrian to thank for this… he he he.

This blog will mainly be work related. Some mis-guided ramblings I suppose, but that is what blogging is all about, isn’t it?

I was amused to notice that the last time I started a blog was almost exactly one year ago (10/10/2004 to be exact). The reasons I did it last time surely had to do with avoiding work on my thesis. Ironically, I have another thesis to hand in in a couple of weeks – similar topic, different degree… Looking into Gestalt theory again, but this time, I’m adding a part about remote usability testing. I’ve written a cool little AJAX based data logger, which I might talk about another time.

Anyway – I’m going to try to get some sleep :)
