11 thoughts on “Metadata does not exist”

  1. Thanks very much for a helpful post with some great examples.

    Could I please just clarify a point that I found a little confusing? To quote your post:

    “Now, if we agree that metadata does not constitute a separate kind of things, and coming back to Liskov substitution principle, then we must conclude that metadata should be treated as any other data.”

    This is what’s confused me… Inheritance is about moving from the general (i.e. data) to the specific (i.e. metadata) – hence metadata (by your argument) should have all the properties of data, PLUS a few special ones of its own that distinguish it from other forms of data.

    To use your own example – saying “metadata should be treated as any other data” is equivalent (or at least dangerously near) to saying: “Plums are purple. Plums are fruit. Therefore all fruit is purple”, isn’t it?

    So should “metadata should be treated as any other data” read something more like: “all the basic, fundamental ways of treating data generally can be applied to metadata”? If so, that’s a bit different to saying “metadata can be treated the same as any other type of data”… Isn’t it?

    On an incidental note, if you fancy a properly brain-mangling discussion of the whole ‘meta meta meta meta’ thing, I can (sort of) recommend ‘Gödel, Escher, Bach’ by Douglas Hofstadter.

    Thanks once again – the examples you’ve given are very helpful indeed.


    1. Many thanks for your comments, David.

      When I say that “metadata should be treated as any other data”, I mean something similar to “plums should be treated as any other fruit”. That is, we don’t need special tools to deal with plums; those of generic fruit should work fine.

      But please note that I also said in my post that this is a bit of an exaggeration, and that metadata still may benefit from specialised tools in some circumstances, very much like a pineapple benefits from a specialised pineapple corer over a generic fruit tool.

      I hope my point is clearer now!

      As an aside, please bear in mind that the Liskov Substitution Principle is about subtyping rather than inheritance. These two are often confused, but they are very different things.

      And thanks for the recommendation of Gödel, Escher, Bach. I have read it three times!


  2. I always enjoy hearing how others explain the concept of metadata. I wholly agree that the standard definition of metadata as “data about data” is unhelpful.

    I feel your explanation tends to get the concept backwards. Rather than saying “metadata doesn’t exist”, it is more accurate to say “data doesn’t exist without metadata”. The source of the confusion is the failure to distinguish discussion of things in the real world from information objects about things. Metadata *is* a big deal in the IT world when discussing information objects, such as emails or digital photos. Other IT people create data models without necessarily using the term metadata, but may use an equivalent term such as a data dictionary. Data models can look a lot like metadata schemas, though they are sometimes problematic because the data model doesn’t define what the entities and properties refer to, only how they fit together.

    Let’s consider your example of data and things. You mention data about dreams. Dreams have no data. Dreams live in your imagination only. However, an informational description of a dream could have metadata. The information object relating to a dream could be a text description or an electroencephalogram, but the raw data of the information object will only be meaningful and comparable to other similar ones through the use of metadata. Metadata allows data to be collected, compared and summarized. Data can’t exist without metadata, which is why the “data about data” definition is so confusing.

    I agree there’s a difference between the entity and instance levels of metadata. What metadata does is provide a framework to indicate what the properties are about. That’s necessary both at the instance level and when characterizing types of instances as entity categories. One can’t assume that real world things emit data on their own. A shared understanding of what properties exist, and what they are called, is necessary.


    1. I agree that “data can’t exist without metadata” only if you take “metadata” in the first sense as per my post. That is, I agree that data can’t exist unless you have established a data structure first. No problem there.
      However, I suggest in my post that this first sense of “metadata” should not be used, because it is confusing and because we have better and clearer terms for the concept, such as “data schema” or “data structure”.
      Now, if we move to the second sense of “metadata”, then I disagree with you, since metadata (in this sense) is optional. For example, I can collect data about my dreams (when they happened, how long they lasted, what themes appeared in them, etc.) and not collect metadata (such as when each dream was documented or who documented it). In this regard, metadata is an optional layer of added information that may be very valuable (even indispensable) for some goals, but it’s clear that the dependency runs from metadata to data: metadata describes data, so there can’t be metadata without data, and not the other way around.


  3. Thanks for this post. I largely agree with your assessment that the distinction between data & metadata is not a property inherent in the (meta)data, but derived from the role the data plays in scenarios. The problem I see, however, is in your conclusion, where you write “we must conclude that metadata should be treated as any other data. We can use the same tools, languages, approaches and techniques to deal with metadata as we do for plain data”. It appears to me that you first conclude that data should be discussed as function rather than essence, but then in your conclusion you return to essence to say that, because the essence is the same, we do not need specialised tools. However, since you wrote earlier that metadata is a function, it seems to me we still need “different tools, languages, approaches and techniques” for these functions. Maybe we should not speak of data & metadata, but of data with functions & meta-functions?


    1. You are right. As I say in my final sentences, I was exaggerating a bit, and I agree that metadata, as a special “kind” of data, may benefit from a specific treatment. Specialized things can often be treated with specialized tools. I was emphasising the point that the same kinds of tools should work, and that we should be critical of those saying that they don’t. Sorry if I went over the top with my exaggeration.


      1. Thanks for replying. I think we pretty much agree; whenever you want to do a “data-analysis”, you can use any data-tool, even when you’re working on what’s traditionally “metadata”.
        An example I find very clarifying in this discussion is full-text search: traditionally, metadata was the stuff to locate books, but with full-text search, you search & find books by their contents, traditionally the data, so the data has become the metadata. But that is not to say that suddenly you should start including the entire contents of the book as part of the metadata.
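
The dream example discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything taken from the post: the class names `Dream` and `DreamRecordInfo`, the helper `where`, and all sample values (dates, themes, names) are made up. It shows two of the points argued above: the same generic tool works on data and on metadata alike, and metadata is an optional layer that points at the data it describes.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Any, Iterable


@dataclass
class Dream:
    """Data: facts about the dream itself (hypothetical fields)."""
    dream_id: int
    night: date
    minutes: int
    theme: str


@dataclass
class DreamRecordInfo:
    """Metadata in the second sense: facts about how a dream was documented."""
    dream_id: int  # refers to the Dream this record describes
    documented_on: date
    documented_by: str


def where(records: Iterable[Any], **criteria: Any) -> list[Any]:
    """Generic filter: keep records whose fields equal the given values."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in criteria.items())
    ]


# Made-up sample data.
dreams = [
    Dream(1, date(2015, 3, 1), 20, "flying"),
    Dream(2, date(2015, 3, 4), 5, "exams"),
    Dream(3, date(2015, 3, 9), 12, "flying"),
]

# Metadata is optional: dream 2 has none, and the data is still usable.
record_info = [
    DreamRecordInfo(1, date(2015, 3, 2), "Ana"),
    DreamRecordInfo(3, date(2015, 3, 9), "Ben"),
]

# The same generic tool handles both kinds of records.
flying = where(dreams, theme="flying")
by_ana = where(record_info, documented_by="Ana")

# The dependency runs from metadata to data: every metadata record
# must point at an existing dream, but not every dream needs metadata.
known_ids = {d.dream_id for d in dreams}
metadata_is_grounded = all(m.dream_id in known_ids for m in record_info)
```

Nothing here prevents a specialised tool from being added later (the "pineapple corer" of the first reply); the sketch only shows that the generic one already works for both.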

