I am underway with the first bit of work for SHARD. This involves building up a knowledge base from various investigations, one of which looks at legacy data. Before any of this can happen, of course, we must get our hands on the data: a basic yet essential requirement. Because of its sheer size, most data is understandably kept locally and often doesn't make it to centralised storage or a repository. Findings and conclusions, by contrast, are usually well maintained and stored in a suitable repository, which ensures access to them over time.
Even a simple procedure, such as moving a dataset internally from one drive to another, may not be as simple as it sounds.
This made me think up some rules for internal data sharing.
1. Context matters. Different people have different needs; there is no one-size-fits-all approach. Not all data is equal: some is more special or unique than the rest.
2. Procedures matter. There should be procedures, and these should be agreed on. The oral tradition of remembering belongs in folklore, not in a data archive.
3. Metadata matters. Information about the data matters almost as much as the data itself, so ensure that it is shared along with the data. Otherwise the data can be meaningless and without context.
4. Trust: hard to win, easy to lose and very hard to regain. Personal connections often build trust, so be nice to each other.
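To make rules 2 and 3 a little more concrete, here is a minimal sketch of what an agreed copy procedure might look like in Python. It is not anything SHARD actually uses: the function names (`sha256_of`, `share_dataset`), the sidecar file name and the metadata fields are all my own assumptions for illustration. The idea is simply that a dataset never moves without a checksum check and a small metadata file travelling alongside it.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def share_dataset(source: Path, destination_dir: Path,
                  description: str, contact: str) -> Path:
    """Copy one data file and write a JSON metadata sidecar next to the copy.

    Hypothetical helper: the metadata fields below are illustrative only.
    """
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps

    # Verify the copy before recording anything about it.
    source_hash = sha256_of(source)
    if sha256_of(target) != source_hash:
        raise IOError(f"Checksum mismatch after copying {source}")

    metadata = {
        "file": source.name,
        "sha256": source_hash,
        "size_bytes": target.stat().st_size,
        "description": description,  # the context that would otherwise be lost
        "contact": contact,          # who to ask about this data
        "copied_at": datetime.now(timezone.utc).isoformat(),
    }
    sidecar = destination_dir / (source.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar
```

Calling something like `share_dataset(Path("survey.csv"), Path("/shared/legacy"), "2009 field survey", "data owner's name")` would leave `survey.csv` and `survey.csv.metadata.json` side by side in the destination folder; the paths and values here are made up.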
That's all for now. More very soon!