7 Feb outputs: final discussion


After the formal elements of the day were over, we kicked back, ate brownies and had a final full group discussion. This one had no set agenda; the team took questions, answered them as best they could and let the discussion flow.

The key questions and points arising were:

  • Is it better to preserve the source code or the byte code? Preserving source code is good, especially if the developers' documentation and commenting are helpful. Such documentation and commenting must be independent of the format or language of the source code: one attendee had the fantastic example of being given well-written code to maintain, but all the commenting was in Portuguese! Source code is one component of a bigger picture, and preserving it would also require consideration of preserving compilers, IDEs, etc. The best practical answer to the question is to try to keep both source code and byte code!
  • It is possible that user power can help sustain software. For example, the Windows 98 End of Life was much later than Microsoft initially wanted: because there were sufficient users and usage, the date was pushed back.
  • Are there any good examples of software reuse? Yes! NASA has done a lot of work on reuse, and has developed guidance including a set of Reuse Readiness Levels. The computer games industry often reuses libraries (in part or whole). The NAG libraries are a classic case of software reuse: reusing processes, reusing chunks of code, encouraging better documentation and offering best practice support of code.
  • How does software development differ between academia and the commercial world? There is a distinct difference between software development and research. For example, if you are a researcher rather than a developer, then software development is often about making programs for quick and easy use, rather than building something for maintainability, portability, reuse, etc. In addition, within libraries and universities there are often project constraints, which can lead to software being functional but unstable.
  • Data preservation is easier than software preservation. Whilst sometimes the software is needed (eg to see data), it is possible to make software preservation into more of a data preservation problem. Reducing the unknowns (eg documenting the significant properties) and adding structure (eg modularisation, use of common platforms, etc) can make software preservation more like data preservation.
  • In some circumstances software that needs to be preserved will be linked to specialist equipment (mass spectroscopy, NMR, robotics). In this instance, the software and its data only make sense in the presence of this external hardware. Both specialist knowledge of the data and practical knowledge or experience of the hardware are necessary when carrying out software preservation.
  • An extreme approach to software preservation is to give the entire responsibility for preservation to the research group developing and using the code. The opposite extreme is to give the problem and responsibility to a specialist preserver. An in-between approach is for the specialist preserver and research group (or other developer if in the research domain) to work together.
  • Because of the complexity of software preservation, and in particular the difficulty of knowing in advance what key aspects are relevant to a particular case, it is inadvisable to think of software preservation as a one-off decision with easy answers and a predictable outcome. Instead, software preservation should be seen as a learning experience with an associated learning curve. Very often the realisation of what doesn’t work will come too late. The more that is learnt within one team or organisation about what works and what doesn’t, the better placed they will be for making future decisions.
  • Are some particular platforms and/or languages better than others for software preservation? One answer is that platform and language choice is dependent on the community, as use drives sustainability; therefore ask ‘what is everyone else using?’. Another answer is to ask more abstractly whether the language has a future and whether it is easy to sustain. Many would argue that Perl is difficult to sustain and that there are very few COBOL developers left. This might mean that Perl and COBOL would be poor choices. However, there is volatility in language use; for example, many thought that C was on the decline, but with the rise of the iPhone and iPad a new life has been given to the C language (via Objective-C, which is the primary language for Apple’s Mac OS X and iOS).