Microdata, RDFa or JSON-LD: Appropriate or best usage?

Solution 1

Schema.org is a vocabulary that can, like any other vocabulary, be used in many forms. The website http://schema.org/ has examples using Microdata and the RDF syntaxes RDFa and JSON-LD, but these are not the only syntaxes it can be used with. You could, for example, use it with any other RDF syntax like Turtle or RDF/XML.
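To illustrate the point that the vocabulary is independent of the syntax, here is a minimal sketch that writes one small description using Schema.org terms in two RDF syntaxes, JSON-LD and Turtle. The subject, the example.com values and the helper names (to_jsonld, to_turtle) are made up for illustration. The JSON-LD side uses an inline @vocab context instead of the context published by schema.org, so it does not depend on fetching anything.

```python
import json

SCHEMA = "http://schema.org/"

# One description as (property, value, is_iri) rows.
# All example.com values are hypothetical.
SUBJECT = "http://example.com/#alice"
TYPE = "Person"
PROPERTIES = [
    ("name", "Alice Example", False),
    ("url", "http://example.com/", True),
]


def to_jsonld():
    """Serialize the description as a JSON-LD document."""
    doc = {"@context": {"@vocab": SCHEMA}, "@id": SUBJECT, "@type": TYPE}
    for prop, value, is_iri in PROPERTIES:
        # IRIs are written as node references, literals as plain strings.
        doc[prop] = {"@id": value} if is_iri else value
    return json.dumps(doc, indent=2)


def to_turtle():
    """Serialize the same description as Turtle."""
    lines = [f"@prefix schema: <{SCHEMA}> .", "", f"<{SUBJECT}> a schema:{TYPE} ;"]
    for i, (prop, value, is_iri) in enumerate(PROPERTIES):
        # json.dumps is used only as a quick way to quote the string;
        # that is adequate for plain ASCII literals like the ones here.
        obj = f"<{value}>" if is_iri else json.dumps(value)
        end = " ." if i == len(PROPERTIES) - 1 else " ;"
        lines.append(f"    schema:{prop} {obj}{end}")
    return "\n".join(lines)


print(to_jsonld())
print(to_turtle())
```

Both outputs describe the same resource with the same Schema.org type and properties; only the serialization differs.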

There is no best syntax. They all have advantages and disadvantages. See for example my answer about differences between Microdata and RDFa. Note that you can use different syntaxes (and vocabularies) in the same document.

Now, if you have a specific consumer in mind, you should consult their documentation. However, support of syntaxes comes and goes, and not everything they might support is necessarily documented, and not everything that is documented necessarily works.

In the case of Google, you are probably interested in their Rich Snippets. Their documentation about Rich Snippets mentions Microdata, Microformats and RDFa. However, note that not all linked examples use the Schema.org vocabulary; some use the older Data-vocabulary.org or Microformats instead (you can't use vocabularies like Schema.org or Data-vocabulary.org with Microformats). And there are also some Rich Snippets that aren't listed on that page, like the Sitelinks Search Box, for which they even recommend the JSON-LD syntax.
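As a rough sketch of what such a JSON-LD block can look like, the snippet below builds a Schema.org WebSite description with a SearchAction, which is the shape Google documented for the Sitelinks Search Box, and wraps it in a script element. The function name and the example.com URLs are assumptions for illustration, and the properties a consumer requires may have changed since, so check their current documentation before relying on it.

```python
import json


def sitelinks_searchbox_script(site_url, search_url_template):
    """Return a <script> element containing WebSite/SearchAction JSON-LD."""
    data = {
        "@context": "http://schema.org",
        "@type": "WebSite",
        "url": site_url,
        "potentialAction": {
            "@type": "SearchAction",
            # The template must contain the placeholder named in query-input.
            "target": search_url_template,
            "query-input": "required name=search_term_string",
        },
    }
    # Escape "</" so a value can never close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical site and search URL.
snippet = sitelinks_searchbox_script(
    "https://www.example.com/",
    "https://www.example.com/search?q={search_term_string}",
)
print(snippet)
```

Because the block is generated separately from the visible markup, it can be dropped into an existing template without touching the surrounding HTML.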

As general advice: Search engines typically favor visible content over hidden metadata. For example, having keywords as hidden metadata easily allows authors to claim that their documents are about something different than they really are (either because they are trying to trick the search engine, or because they forget to update the content in both places). Therefore, uncoupling the metadata from the content, as is the case with JSON-LD, could (possibly!) lead to the same issues current search engines have with hidden metadata. (Whether search engines actually handle it like that, and which ones do, is a question that is off-topic on Stack Overflow.)

Another possible advantage of coupling the metadata with the content (for example, with RDFa) is that you could easily and automatically generate the same information in JSON-LD, Turtle etc., because everything's just RDF. Just parse the RDFa, convert it to the formats of your preference, and embed it (in a script element) or link it (with rel-alternate) if it makes sense.
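The parse-and-convert idea can be sketched with the standard library alone. The extractor below is deliberately tiny and is not a conforming RDFa processor: it assumes a single flat item, property elements that contain only text, and no prefixes, datatypes or nested items, and it does not distinguish IRIs from literals. The class name and the sample markup are made up for illustration; a real project would use a proper RDFa parser.

```python
import json
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Collect property/value pairs from one flat RDFa Lite item."""

    def __init__(self):
        super().__init__()
        self.item = {}
        self._open_property = None  # property whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "vocab" in a:
            self.item["@context"] = a["vocab"]
        if "typeof" in a:
            self.item["@type"] = a["typeof"]
        prop = a.get("property")
        if prop is None:
            return
        # Prefer an attribute value when present, else use the element text.
        for attr in ("content", "href", "src"):
            if attr in a:
                self.item[prop] = a[attr]
                return
        self._open_property = prop
        self._text = []

    def handle_data(self, data):
        if self._open_property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first end tag closes the open property,
        # so property elements must not contain child elements.
        if self._open_property is not None:
            self.item[self._open_property] = "".join(self._text).strip()
            self._open_property = None


# Hypothetical page fragment marked up with RDFa Lite.
HTML = """
<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Alice Example</span>
  <a property="url" href="http://example.com/">Homepage</a>
</div>
"""

extractor = TinyRDFaExtractor()
extractor.feed(HTML)
jsonld = json.dumps(extractor.item, indent=2)
print(jsonld)
```

The resulting string could then be embedded in a script element or served as a separate resource, so the visible markup stays the single source of the data.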

But yes, adding RDFa is often more complex than adding a JSON-LD blob, because you have to adapt it to the existing markup. (However, it should not "break validation" unless you’re making mistakes.)

Solution 2

The lines between Microdata, RDFa, and JSON-LD are indeed currently very blurry, and there is still no widely accepted de facto standard among the three. That will have to wait for now, perhaps for a couple of years or more.

Meanwhile, Microdata should not be labeled with Schema.org like you mentioned because those two are different things. Schema.org is a vocabulary so it can be used for Microdata, RDFa, and JSON-LD.

Using Schema.org as the vocabulary and JSON-LD as the data representation is probably the most anticipated pairing, because the two have two aspects in common:

  1. Easy to read for humans; and
  2. Lightweight and machine-readable

but even so, there are still disconnects between the two, like this example.

Regarding JSON-LD support: since Bing, Google, Yahoo!, and Yandex all acknowledge the use of schema.org, it is perhaps safe to say they also support it, like in this example.

2017 Update

Google has been very proactive in promoting JSON-LD with schema.org over the past two or three years.

Solution 3

It seems Google is leaning towards the use of JSON-LD but it hasn't implemented it for every use-case!

Google is in the process of adding JSON-LD support to more markup-powered features. So far, JSON-LD is supported for all Knowledge Graph features, sitelink search boxes, Event Rich Snippets, and Recipe Rich Snippets; Google recommends the use of JSON-LD for those features. For the remaining Rich Snippets types and breadcrumbs, Google recommends the use of microdata or RDFa.

http://developers.google.com/structured-data/schema-org

Solution 4

Google uses JSON-LD as reference examples for Structured Data SEO for their Knowledge Graph (companies and people). See https://developers.google.com/structured-data/customize/overview

I personally use a combination of JSON-LD and Microdata for my sites (for the time being).

I would say they have other means to identify if the information you provide through JSON-LD is relevant to their search engine (like checking your page is actually talking about what it claims to talk about).

Solution 5

(updating answers!)

About "popularity", please see this question/answers.

Microdata is currently the most popular: in a universe of 34 million domains, 5.63 million (~17%) use "content markup" (I will use the jargon markup) via RDFa (0.9 million), Microdata (2.5 million) or Microformats, and fewer than half as many use separated semantic descriptors, the most popular of which is JSON-LD, with 2.12 million (6%).
PS: we prefer per-domain statistics (instead of per-page statistics) because pages in the same domain generally share templates and other locally enforced conventions.

In a universe of "domains expressing semantics" (7.75 million) the statistical profile is:

  • 73% markup semantic
  • 27% separated semantic
  • (... intersection as mix "separated+markup" can be zero to simplify...)
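The percentages above follow directly from the quoted domain counts. As a quick check, here is the arithmetic spelled out, with the same simplification that the two groups do not overlap (the variable names are mine):

```python
# Domain counts in millions, as quoted above.
TOTAL_DOMAINS = 34.0
MARKUP = 5.63      # RDFa, Microdata or Microformats
SEPARATED = 2.12   # separated descriptors such as JSON-LD

# Simplification from the text: treat the two groups as disjoint.
semantic = MARKUP + SEPARATED


def pct(part, whole):
    """Percentage of whole, rounded to the nearest integer."""
    return round(100 * part / whole)


shares = {
    "markup / all domains": pct(MARKUP, TOTAL_DOMAINS),
    "separated / all domains": pct(SEPARATED, TOTAL_DOMAINS),
    "markup / semantic domains": pct(MARKUP, semantic),
    "separated / semantic domains": pct(SEPARATED, semantic),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

If the overlap between the two groups were counted, the total of "domains expressing semantics" would be smaller than 7.75 million and the two shares would no longer sum to 100%.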

Rule of thumb in 2017

Use markup semantic with Microdata and, after it, if you need to express something more to machines, use JSON-LD.


Use markup semantic because it is the most popular, and because marked-up content can be verified and audited by humans and machines at the same time.

Important: remember that Microdata, RDFa (a W3C standard) and JSON-LD (a W3C standard) can be (easily) translated to RDF, so all these formats are compatible.


PS: for HTML tables see also W3C's tabular-metadata. For open non-HTML resources, such as CSV files, use the RDF-compatible W3C tabular-data-model and/or frictionlessdata/specs.

Author: Grzegorz

Updated on January 30, 2020

Comments

  • Grzegorz
    Grzegorz over 4 years

    I have been wondering which of those formats is "best". Schema.org, Microdata, and RDFa are a bit of a pain to implement. They can break validation and require quite an effort to put into documents.

    JSON-LD is, at least for me, a much better way to implement structured data. But does it work? What level of support is there for it (at least by Google)?

  • Grzegorz
    Grzegorz over 9 years
    About mistakes: I had a problem with schema.org/openingHours, as their example uses <time datetime="">, whose value should be in ISO format to be valid. But schema.org has its own format, for example "Mo-Tu 11:00-22:00", which is not compatible. Anyway, very good answer. Thank you for your time. I did not know about the difference between syntax and vocabulary. And indeed, JSON-LD could be abused the way meta tags and descriptions were. But Microdata can be too (with content hidden via CSS, for example). And I think it is easier to tell the difference between content and JSON-LD than between content and content.
  • unor
    unor over 9 years
    @Gacek: I reported the issue you mentioned last month; note that this is not an error with Microdata or Schema.org per se, it’s only their example that is wrong. You can (and should), of course, use the openingHours property with any other suitable element.
  • SuperUberDuper
    SuperUberDuper about 8 years
    Microdata is deprecated
  • Arnaud Leyder
    Arnaud Leyder about 8 years
    What is your source?
  • Hendy Irawan
    Hendy Irawan about 8 years
    Google does recommend the JSON-LD representation. However, I see no mention of Bing, Yahoo!, or Yandex supporting JSON-LD. While they do support the schema.org vocabulary, historically they have supported the Microdata representation. CMIIW.
  • dhaupin
    dhaupin about 8 years
    @HendyIrawan Truth. Bingbot [still] doesn't understand JSON-LD. Bing structured markup tester also does not register the data. This is prob the same for Yahoo. bing.com/webmaster/help/…
  • Samia Ruponti
    Samia Ruponti almost 8 years
    What's the current status? It seems like Google is very slow in updating its documentation, and I still see in the docs that they are "in the process" of updating to JSON-LD.
  • djmj
    djmj over 5 years
    In the overview section there is a "recommended" label next to JSON-LD.
  • Nathan
    Nathan over 2 years
    As of January 29, 2021, data-vocabulary.org markup will no longer be eligible for Google rich result features. To be eligible after January 29, 2021, you need to replace data-vocabulary.org markup with schema.org markup. developers.google.com/search/docs/advanced/structured-data/…