With more and more libraries seeing fewer students use their physical facilities, libraries are increasingly moving towards digital repositories that are meant to last forever. A digital repository has clear advantages: titles are easy to search for; if a copy is in the archive, you never have to worry about it being unavailable, provided you have the appropriate authorisation; and many people can check out the same copy of a work at once. There are probably a whole host of other reasons why it's a good idea.

On the flip side, setting up a long-term storage system with archival qualities is a nightmare. There seems to be a range of issues, including:

  • data preservation
  • tagging data with metadata
  • authentication and/or authorisation for copyrighted works
  • disaster recovery issues
  • backups
  • possible audit trails if metadata is edited
  • vendor lock-in: you need to commit to a system (software and/or hardware vendors) for a long time, so who and what do you commit to?
  • preserving data formats and tools to manipulate and convert these formats
  • whether you need HSM (hierarchical storage management), and whether spinning disks are better than tapes

The list of problems probably goes on. So how do people deal with this problem? On the software side, the only games in town right now seem to be DSpace and Fedora Commons. The hardware side is vaguer. I get the impression people just buy robust mid-range kit that mostly fits their archival needs, replace it with equivalent hardware a few years down the road, and then migrate or replicate the data to the new system.
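To make the "migrate or replicate" step a bit more concrete, here is a minimal sketch of the kind of check I would want after copying an archive to new hardware: record a SHA-256 checksum for every file on the old system, then confirm that the copy matches. This is only an illustration under my own assumptions; the function names (`sha256_of`, `build_manifest`, `verify_copy`) are made up for this post and are not part of DSpace, Fedora Commons or any vendor tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest: dict[str, str], new_root: Path) -> list[str]:
    """List files that are missing or altered under new_root."""
    problems = []
    for rel, expected in manifest.items():
        target = new_root / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(target) != expected:
            problems.append(f"checksum mismatch: {rel}")
    return problems
```

The idea would be to call `build_manifest` on the old storage, keep the result somewhere safe, and run `verify_copy` against the new system after each migration. An empty list means every recorded file arrived intact. It says nothing about extra files on the new side, or about whether the formats are still readable, which is a separate problem.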

How do people tackle these issues? Do they build on top of DSpace or Fedora Commons, with mid-range hardware from a tier 1 vendor such as IBM, HP or Sun? Or do they outsource the whole thing to one of those tier 1 companies?
