27
Aug

Another purpose of preservation…?

Interesting snippet from the news this week: BBC Micros used in retro programming class

Old machines at the National Museum of Computing are being used as part of the Computing A-Level. Old tech beating new tech in the pursuit of education. What a novel and great use of preserved hardware. Nothing in the article about preserved software though – perhaps as the focus of the class was to code a game of battleships!

16
Aug

Flexible values

Canon in 2D

The effort and cost required to implement software preservation measures can seem high when compared to the potential benefit. The costs are self-evident, whereas the benefits are not. One way of dealing with this impasse is not to splurge now, but to invest a little to give you the option of doing something more in the future. In essence, get value from flexibility.

How does this work in practice? Take for instance the end of a project where you’ve built and used some fragile and not-particularly-well-documented software. It did the immediate job, but isn’t ready for anything else. The follow-on project that should build on the software doesn’t start until next year, or is contingent on getting more grant funding. The traditional options might be to either do nothing with the software and go straight into the next activity (i.e. abandon ship and let the software and tacit knowledge fester), or to spend a couple of weeks fully documenting the software and making it robust.

Instead, do the minimum necessary for high-level documentation and basic test data. Commit to reviewing the software’s prospects in a year or so. If there has been subsequent interest (internal or external) in the software that could lead to sizeable benefits (institutional reuse, kudos through external use, etc.) then whatever additional effort is required to achieve those benefits can be considered. If not, then little has been lost. Without the upfront effort, you either face an uphill struggle starting from scratch (= big loss!) or you’d have saved a small amount of effort (= small win).
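As a rough illustration of the trade-off above, here is a minimal sketch in Python. Every number in it (days of effort, the chance that anyone wants the software again) is invented purely for the example, and the simple "effort now plus likely effort later" model is an assumption, not a costing method.

```python
def expected_effort(upfront_days, revival_days, chance_of_reuse):
    """Effort spent now, plus the effort needed to revive the software
    later, weighted by the chance that anyone wants it again."""
    return upfront_days + chance_of_reuse * revival_days


# Hypothetical figures, chosen only to illustrate the argument.
CHANCE_OF_REUSE = 0.3

strategies = {
    # Do nothing now; reviving the code later means an uphill struggle.
    "abandon": expected_effort(0, 30, CHANCE_OF_REUSE),
    # High-level documentation and basic test data only.
    "minimal": expected_effort(2, 8, CHANCE_OF_REUSE),
    # A couple of weeks of full documentation and hardening.
    "full": expected_effort(10, 1, CHANCE_OF_REUSE),
}

best = min(strategies, key=strategies.get)
print(best, strategies)
```

With these made-up figures the minimal option has the lowest expected effort. Change the chance of reuse and the answer can shift, which is exactly why the review in a year's time matters.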

Giving yourself some flexibility – by leaving code in a recoverable state rather than abandoning it – could well be an effective strategy for many. It has minimal upfront costs but retains the potential benefits. The value of such flexibility can be sizeable: a stitch in time saves nine.

09
Aug

Where have all my photos gone?!

Image courtesy of auggie tolosa under Attribution-Share Alike 2.0 Generic

A preservation expert recently told me that he and his team used to have trouble conveying the imperative for digital content preservation. There was no everyday story that made the issue real for the man on the street. Then digital photography took off, and people – everyone – started to store digital photos on their computers. Instantly there was a great example: everyone can relate to the fragile nature of beloved photos being stored on a computer, which is prone to breakdowns, viruses, incompatible updates and other techno-disasters. Losing the only photo of that family gathering on the hot summer’s day when little Johnnie said his first words is a horribly vivid thought.

Persuading people to keep photos and other files in a non-proprietary format and to make regular backups has become much, much easier. In other words, it has become a whole lot easier to sell the concept of digital preservation. My question is: what’s the equivalent for software preservation? Is there any software so ubiquitous and so valuable that it can be used to sell the concept of software preservation? Not necessarily to the masses, but at least to a broader audience? Answers please on a digital postcard…

02
Aug

Sustainability: is it always about the money?

The recent JISC Innovation Forum, held at Royal Holloway on 28-29 July 2010, had a whole theme devoted to Sustainability and Impact. I was invited on to one of the panels, along with Ross Gardler (OSS-Watch and Apache Software Foundation), Sarah Fahmy (Strategic Content Alliance) and Richard Goddard (talking about MrCute). There were some excellent presentations given (see below for links) but one interesting question raised during the discussion was whether the topics of sustainability and funding could and should be split.


02
Aug

Does Agile Development lead to throwaway code?

Image courtesy of boocal

At the JISC’s Innovation Forum last week (JIF2010) there was a debate on Agile Development versus PRINCE2, especially for software development(*). This gave me a chance to question one of the major worries that many have expressed over Agile Development approaches like Scrum: namely, do they lead to less robust code that doesn’t last as long and is harder to sustain and preserve? After all, if one of the main benefits touted for Agile Development is that it offers working software more quickly, one has to wonder what shortcuts were taken and if the flip side of a short-term strength is a long-term weakness.

The main responses to this provocation were:

  • A lot of Agile Development is for pre-production code, and wouldn’t be considered for production systems and services anyway. Because it is used for invention and proof-of-concept work, most of the projects will fail anyway. If a heavier-weight approach had been used at these stages, then a lot more effort would have been wasted. Instead a lot more experimentation and testing can be done for the same effort to find something that is really worth preserving.
  • Agile Development doesn’t mean no documentation. The JISC’s Rapid Innovation Strand emphasised the use of blogs for regular, short progress reports. This meant that key design decisions were recorded (and publicly at that) so that someone could go back later and reconstruct the thinking behind the software development. The use of blogs also had the bonus of encouraging more personal reflection and lesson identification. All this fits very nicely into our principle of ‘knowledge not just code for software preservation’. Of course these blogs need to be kept available for an appropriate length of time!
  • Where software development projects succeed they are likely to do so because they have naturally built a community around them in an ‘opt-in’ rather than ‘dragging along’ fashion. This increases the likelihood of sustainability. At this point further thought and investment can be made in future-proofing the software.

So it seems that (as with any other approach) Agile Development may not offer intrinsic sustainability or preservation of software. However, when used well there are strong arguments that it will help, not hinder, this agenda…

* The consensus, if you’re interested, is that neither Agile Development nor PRINCE2 is better. Each has its place in different types of project, and should be used in a pick’n’mix style anyway.

27
Jul

Preservation in the cloud

Image courtesy of ewen and donabel

Cloud services are talked about a lot at the moment, and it is interesting to consider if there are any consequences for software preservation and sustainability. Three immediate thoughts come to mind:

First, cloud services that are Infrastructure-as-a-Service (IaaS) can very easily provide virtualised environments for running code. You can even choose free pre-built virtual machine images (in effect, snapshots of a machine at a point in time) if you don’t want to build your own. It would make sense for cloud service providers to continue to allow old images to be run if customers express demand for them. This could provide very useful quasi-emulators for a few years, with zero overheads and no migration issues for the developer. Some questions remain over the current lack of standards for virtual machine image creation.

Second, cloud services that are Platform-as-a-Service (PaaS) offer a standardised environment for running code. Again, a great boon to the developer who can focus on developing good and interesting code rather than the execution environment… until the platform is subtly changed by the cloud service provider with no warning. I can’t imagine guarantees being given on continued code compilation and execution, but the cloud provider does have an incentive to keep the customer happy.

Finally, cloud services that are Software-as-a-Service (SaaS) wouldn’t usually offer access to any code. Indeed, ensuring access rights to the data on such services is often an issue. Service upgrades again happen outside of your control and sometimes without you even knowing. If you’re using SaaS, is there any way of ensuring that you or anyone else is able to re-execute a particular version of the underlying software at an unspecified point in the future? It seems unlikely. Fortunately a major use case – code reuse – isn’t applicable, so re-execution of an old version is perhaps less necessary. For other use cases, for example audit and accountability, it might be that SaaS providers will build such guarantees as additional services – i.e. you pay to have retro-functionality on demand.

There you have it: three ideas about software preservation and cloud services, each with advantages and disadvantages. What other long-term issues to do with software and the cloud are you aware of?

19
Jul

(Computer Aided) Design for Preservation

Image courtesy of Todd Ehlers

Last week I attended the superbly run Digital Preservation Coalition event ‘Designed to Last: Preserving Computer Aided Design’. This came across to me as a new and important topic, and the event served as a focal point for various interested parties. CAD underpins several disciplines and industries: architecture/construction, product design/manufacture, and archaeology were represented.

The focus of the event was on data preservation, and CAD software is seen as an enabler of data preservation. A very long-term view is necessary: in each of the disciplines and industries represented ‘long-term’ means decades or longer. CAD software is mainly commercial, and there is one dominant supplier. However a general lack of backwards compatibility is making data preservation hard. A key message coming out of the event is to understand and address the preservation requirement early on.

Creating heritage value was one purpose of preserving CAD. Interestingly there was something of a mismatch between architects (who are content with PDF outputs for seeing their work) and archivists (who want the complete CAD model).

Another purpose was around improving estates management. The move from using CAD for drawing to using CAD for modelling means there is a lot of useful through-life building information handled in CAD models. With better preservation (the term ‘sustainment’ was also used) this information could be used for ongoing estates management: finding defects, recording later construction/decoration work, optimising energy usage, etc.

The third clear purpose of preservation was around achieving regulatory compliance. An aerospace example (the LOTAR project) highlighted that being able to prove flight worthiness to the aviation authorities, not just now but in 10, 20, 30 or more years’ time, is thought critical to keeping planes in the air. Or to put it another way, if an incident happens in 30 years’ time and the design and test data is not available and executable, the fleet will be grounded and the aircraft manufacturer’s business model destroyed. Validation and verification across different versions of CAD software is fundamental. This example also showed again that preserving design rationale is more important (and harder) than software preservation or data preservation.

All in all, thought-provoking stuff for this project – and a big challenge to those who work in these areas!

14
Jul

Digital Lives and Software Preservation

Image courtesy of gnuchris2

Last week (before our workshop, hence the delayed blog post!) I was invited to talk at a seminar by the Digital Lives Research Project which is funded by the AHRC and led by the British Library. My talk and discussion on software preservation went well and the topic was described as ‘very important and relevant to us archivists and curators’ by the organiser.

The theme of the seminar was ‘Authenticity, Forensics, Materiality, Virtuality and Emulation’. There is a lot of interesting work going on in this area as ‘born digital’ material becomes increasingly prevalent. Whilst data preservation is the main requirement, software curation is starting to become important. The US is possibly ahead of the UK, with an early example being the Stephen Cabrinety collection with 16,000 games titles. NIST are forensically imaging and taking hashes of each of these titles on behalf of Stanford University where the collection resides. Another great example is the Salman Rushdie collection at Emory University. Here they have emulated three of Salman’s computers and allow researchers to search and use his computers as he would have. Since the computers and software are no longer supported this is a superb way of preserving and presenting the past.
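To make ‘taking hashes’ a little more concrete, here is a minimal sketch of the general idea in Python, using only the standard library. It is not a description of how NIST or Stanford actually work; the function names, the choice of SHA-256 and the flat directory of image files are all assumptions made for the illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(image_dir):
    """Map each file name in a (hypothetical) directory of images to its hash."""
    return {
        item.name: sha256_of_file(item)
        for item in sorted(Path(image_dir).iterdir())
        if item.is_file()
    }


def changed_files(image_dir, manifest):
    """List the files whose current hash no longer matches the stored
    manifest, including files that have gone missing."""
    current = build_manifest(image_dir)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

The manifest would be stored alongside the images when they are first taken. Re-running `changed_files` at a later date then gives a simple check that nothing has been altered or lost in the meantime.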

Perhaps the key thing I took away for this project was the importance of the legal and authenticity requirements, for example:

  • maintaining the chain of custody (e.g. tracking the taking off of the shrink-wrap)
  • having the rights to use the software (e.g. going beyond ‘fair use’ can happen very quickly)
  • recording the dependency chain (e.g. deciding and recording which version of the software to keep)

This goes beyond preserving the functionality (perfect reproduction or otherwise) and provides a clear set of requirements from the archiving community. After mainly looking at research software so far there is a useful compare and contrast exercise to do…

09
Jul

Four quick things

Image courtesy of qisur

I’ve just reviewed my notes from the workshop, as I start to pull together all the information we collected in order to process it properly. Four things stand out clearly:

  1. The size of the challenge – the vision of universal good practice around software preservation and sustainability requires significant behavioural and cultural change. People seemed agreed though that the size of the challenge is matched by the benefit. On a much smaller scale, this point was captured nicely by a conversation on whether documenting your code costs or saves you time…
  2. Not just a challenge for developers – for such a change to happen, many things will need to be done by many people. At the top level, there is a need for expectations (and policy?) to be set by funding and research councils. Below them, institutional norms, training, professional practice, etc were mentioned several times.
  3. Understanding the requirement(s) and use case(s) – an analogy with steering oil tankers (!) was illuminating for me: it is possible, but you need to know where you want to be ahead of time because last-minute changes (e.g. building in preservation) are really hard. As well as different purposes, should we be talking more about different requirements and use cases?
  4. The similarities between transparency and preservation – the idea of moving from closed to open was mentioned many times, particularly with research software. Openness provides transparency as a bonus benefit! Open Source licensing and Open Development practices make it easier to preserve software by removing barriers to others taking on the preservation of the code, and make it more likely that the understanding of the code is also captured. It’s not a silver bullet though!

I’m sure there’ll be much more to extract from all the great contributions we had, and this blog is the place to see them…

09
Jul

Preserving software slides are now available on SlideShare

The three presentations given at the Preserving Software workshop on 7 July are now available on the Software Sustainability Institute’s SlideShare account.


Keep an eye on this blog for further information from the software preservation workshop.