The recent JISC Innovation Forum, held at Royal Holloway on 28-29 July 2010, had a whole theme devoted to Sustainability and Impact. I was invited onto one of the panels, along with Ross Gardler (OSS-Watch and Apache Software Foundation), Sarah Fahmy (Strategic Content Alliance) and Richard Goddard (talking about MrCute). There were some excellent presentations given (see below for links) but one interesting question raised during the discussion was whether the topics of sustainability and funding could and should be split.
At the JISC’s Innovation Forum last week (JIF2010) there was a debate between Agile Development and PRINCE2, especially for software development(*). This gave me a chance to question one of the major worries that many have expressed over Agile Development approaches like Scrum: namely, does this lead to less robust code that doesn’t last as long and is harder to sustain and preserve? After all, if one of the main benefits touted for Agile Development is that it offers working software quicker, one has to wonder what shortcuts were taken and whether the flip side of a short-term strength is a long-term weakness.
The main responses to this provocation were:
- A lot of Agile Development is for pre-production code, and wouldn’t be considered for production systems and services anyway. Because of its use in invention and proof-of-concept, most of the projects will fail anyway. If a heavier-weight approach had been used at these stages, then a lot more effort would have been wasted. Instead, a lot more experimentation and testing can be done for the same effort to find something that is really worth preserving.
- Agile Development doesn’t mean no documentation. The JISC’s Rapid Innovation Strand emphasised the use of blogs for regular, short progress reports. This meant that key design decisions were recorded (and publicly at that) so that someone could go back later and reconstruct the thinking behind the software development. The use of blogs also had the bonus of encouraging more personal reflection and lesson identification. All this fits very nicely into our principle of ‘knowledge not just code for software preservation’. Of course these blogs need to be kept available for an appropriate length of time!
- Where software development projects succeed they are likely to do so because they have naturally built a community around them in an ‘opt-in’ rather than ‘dragging along’ fashion. This increases the likelihood of sustainability. At this point further thought and investment can be made in future-proofing the software.
So it seems that (as with any other approach) Agile Development may not offer intrinsic sustainability or preservation of software. However, when used well there are strong arguments that it will help, not hinder, this agenda…
* The consensus, if you’re interested, is that neither Agile Development nor PRINCE2 is better. Each has its place in different types of project, and should be used in a pick’n’mix style anyway.
Cloud services are talked about a lot at the moment, and it is interesting to consider if there are any consequences for software preservation and sustainability. Three immediate thoughts come to mind:
First, cloud services that are Infrastructure-as-a-Service (IaaS) can very easily provide virtualised environments for running code. You can even choose free pre-built virtual machine images (in effect, snapshots of a machine at a point in time) if you don’t want to build your own. It would make sense that cloud service providers will continue to allow old ones to be run if customers express demand for them. This could provide very useful quasi-emulators for a few years, with zero overheads and no migration issues for the developer. Some questions over the current lack of standards for virtual machine image creation remain.
Second, cloud services that are Platform-as-a-Service (PaaS) offer a standardised environment for running code. Again, a great boon to the developer who can focus on developing good and interesting code rather than the execution environment… until the platform is subtly changed by the cloud service provider with no warning. I can’t imagine guarantees being given on continued code compilation and execution, but the cloud provider does have an incentive to keep the customer happy.
Finally, cloud services that are Software-as-a-Service (SaaS) wouldn’t usually offer access to any code. Indeed, ensuring access rights to the data on such services is often an issue. Service upgrades again happen outside of your control and sometimes without you even knowing. If you’re using SaaS, is there any way of ensuring that you or anyone else is able to re-execute a particular version of the underlying software at an unspecified point in the future? It seems unlikely. Fortunately a major use case – code reuse – isn’t applicable so re-execution of an old version is perhaps less necessary. For other use cases, for example audit and accountability, it might be that SaaS providers will build such guarantees as additional services – i.e. you pay to have retro-functionality on demand.
There you have it: three ideas about software preservation and cloud services, each with advantages and disadvantages. What other long-term issues to do with software and the cloud are you aware of?
Last week I attended the superbly run Digital Preservation Coalition event ‘Designed to Last: Preserving Computer Aided Design’. This came across to me as a new and important topic, and the event served as a focal point for various interested parties. CAD underpins several disciplines and industries: architecture/construction, product design/manufacture, and archaeology were represented.
The focus of the event was on data preservation, and CAD software is seen as an enabler of data preservation. A very long-term view is necessary: in each of the disciplines and industries represented ‘long-term’ means decades or longer. CAD software is mainly commercial, and there is one dominant supplier. However, a general lack of backwards compatibility is making data preservation hard. A key message coming out of the event is to understand and address the preservation requirement early on.
Creating heritage value was one purpose of preserving CAD. Interestingly there was something of a mismatch between architects (who are content with PDF outputs for seeing their work) and archivists (who want the complete CAD model).
Another purpose was around improving estates management. The move from using CAD for drawing to using CAD for modelling means there is a lot of useful through-life building information handled in CAD models. With better preservation (the term ‘sustainment’ was also used) this information could be used for ongoing estates management: finding defects, recording later construction/decoration work, optimising energy usage, etc.
The third clear purpose of preservation was around achieving regulatory compliance. An aerospace example (the LOTAR project) highlighted that being able to prove flight worthiness to the aviation authorities not just now, but in 10, 20, 30 or more years, is thought critical to keeping planes in the air. Or to put it another way, if an incident happens in 30 years’ time and the design and test data is not available and executable, the fleet will be grounded and the aircraft manufacturer’s business model destroyed. Validation and verification across different versions of CAD software is fundamental. This example also showed again that preserving design rationale is more important (and harder) than software preservation or data preservation.
All in all, thought provoking stuff for this project – and a big challenge to those who work in these areas!
Last week (before our workshop, hence the delayed blog post!) I was invited to talk at a seminar by the Digital Lives Research Project which is funded by the AHRC and led by the British Library. My talk and discussion on software preservation went well and the topic was described as ‘very important and relevant to us archivists and curators’ by the organiser.
The theme of the seminar was ‘Authenticity, Forensics, Materiality, Virtuality and Emulation’. There is a lot of interesting work going on in this area as ‘born digital’ material becomes increasingly prevalent. Whilst data preservation is the main requirement, software curation is starting to become important. The US is possibly ahead of the UK, with an early example being the Stephen Cabrinety collection with 16,000 games titles. NIST are forensically imaging and taking hashes of each of these titles on behalf of Stanford University where the collection resides. Another great example is the Salman Rushdie collection at Emory University. Here they have emulated three of Salman’s computers and allow researchers to search and use his computers as he would have. Since the computers and software are no longer supported, this is a superb way of preserving and presenting the past.
Perhaps the key thing I took away for this project was the importance of the legal and authenticity requirements, for example:
- maintaining the chain of custody (eg tracking the taking off of the shrink-wrap)
- having the rights to use the software (eg going beyond ‘fair use’ can happen very quickly)
- recording the dependency chain (eg deciding and recording which version of the software to keep)
This goes beyond preserving the functionality (perfect reproduction or otherwise) and provides a clear set of requirements from the archiving community. After mainly looking at research software so far there is a useful compare and contrast exercise to do…
I’ve just reviewed my notes from the workshop, as I start to pull together all the information we collected in order to process it properly. Four things stand out clearly:
- The size of the challenge – the vision of universal good practice around software preservation and sustainability requires significant behavioural and cultural change. People seemed to agree, though, that the size of the challenge is matched by the benefit. On a much smaller scale, this point was captured nicely by a conversation on whether documenting your code costs or saves you time…
- Not just a challenge for developers – for such a change to happen, many things will need to be done by many people. At the top level, expectations (and policy?) need to be set by funding and research councils. Below them, institutional norms, training, professional practice, etc were mentioned several times.
- Understanding the requirement(s) and use case(s) – an analogy with steering oil tankers (!) was illuminating for me: it is possible, but you need to know where you want to be ahead of time because last-minute changes (eg building in preservation) are really hard. As well as different purposes, should we be talking more about different requirements and use cases?
- The similarities between transparency and preservation – the idea of moving from closed to open was mentioned many times, particularly with research software. Openness provides transparency as a bonus benefit! Open Source licensing and Open Development practices make it easier to preserve software by removing barriers to others taking on the preservation of the code, and make it more likely that the understanding of the code is also captured. It’s not a silver bullet though!
I’m sure there’ll be much more to extract from all the great contributions we had, and this blog is the place to see them…
The three presentations given at the Preserving Software workshop on 7 July are now available on the Software Sustainability Institute’s SlideShare account.
Keep an eye on this blog for further information from the software preservation workshop.
Yesterday, we held a workshop on preserving software at Brettenham House in London. We were extremely happy to find that the enthusiasm and energy that the 28 attendees brought with them were paralleled only by their ideas and insights into the benefits, drawbacks and solutions for preserving software.
We are approaching software preservation and sustainability in a new way. However, it appears to be one that is of great interest to the development community. We had no trouble filling the spaces for the workshop (in fact, we were a good deal oversubscribed), but – most importantly – the people who managed to get a place at the workshop showed a great deal of passion for the subject.
So what next? The first thing is to disseminate the slides from the workshop. These will soon be added to the SSI’s SlideShare account – keep an eye on this blog for details. We will then start to digest the huge amount of information we collected and identify a set of examples of software preservation. After getting in touch with the relevant projects, we will write up our findings as a set of case studies. Of course, we’ll keep adding to this blog as we discover other outcomes – and any other thoughts we’ve had about the workshop.
Thanks, once again, to everyone who attended.
A big part of this project is to raise awareness about software sustainability. Whilst intellectual arguments and weight of evidence are great for ‘sealing the deal’ – they aren’t that useful for grabbing attention. That’s the job of a ‘hook’, and in this post I want to explore what a good hook might look like.
First up is the name ‘software preservation’. It’s a very new term, so it’s not widely used (yet!) and may not be meaningful to many people. Does it get developers’ juices flowing? Given this project’s focus on developers, we need something that resonates with them. I’ve also seen similar and related terms used: software curation and software evolution. Finally, does it really capture the first of our two meanings of software preservation – that of living or active preservation? Software sustainability is perhaps better for this purpose, though the Software Sustainability Institute have cleverly snapped this one up as their core activity.
We could have a tag line that relates to the benefits of software preservation. However, none of the benefits identified so far are universal. In some instances one might seek a particular benefit, such as efficiency through reuse, and in other instances one might seek another benefit, like intrinsic heritage value. There is no common benefit across all use cases and scenarios. This means we might need multiple hooks covering different audiences with different needs.
Would we want a tag line that focuses on the benefit and opportunity of extending the life of software, with none of the negative connotations of the extra design, coding and maintenance efforts that may be required? After all, of all the people I know who work with code, not one of them refers to their role as a software maintainer – even though maintenance accounts for over 50% of all software costs.
Perhaps we should abstract from the activity and the benefits, and look at the core concept. For me, this is about the long-term dimension: whether software still works in the future, and whether it can be made to work. The tag line we’ve trialled for our workshop on 7 July is future proof your software which captures that concept nicely. It is aspirational – since all software becomes obsolete eventually – but it perfectly conveys the concept of our work.
Do you have a better idea? Our search will go on, and this is just one of the areas on which we would like to hear your ideas and feedback.
It’s one week to go until our workshop for software developers. We’ve been amazed by the level of interest in it – clearly showing that developers are keen to contribute to good practice on how to make software more durable and longer-lived. We’ve been busy planning how to make this event a great success and just want to share a few practical details at this point.
The agenda has firmed up, and looks like this:
1030hrs – Registration opens
1030-1100hrs – Tea, coffee and initial networking
1100hrs – Welcome and introduction to the workshop
1115hrs – Activity on software development for long-term use and reuse
1145hrs – Activity on different approaches to software preservation
1300-1345hrs – Lunch
1345hrs – Activity on arguing the case for software preservation
1500hrs – Activity on delivering software preservation
1545hrs – Final summary and thank you
1600hrs – End of workshop
Each activity is a mix of information, discussion and summarising designed to produce developer-led good practice in software preservation and sustainability. By building on attendees’ experiences and views we aim to offer effective guidance that helps others. For attendees there will be a great chance to contribute to our and others’ work, and to learn from others.
Over the next week we’ll be putting the finishing touches to the presentations and exercises… watch this space!