Abandoned Power Station

Data Is or Data Are

A recent episode of one of my favourite podcasts, More or Less, discussed the changing use of the word “data”, and whether it is used as a mass noun (like rice) or a count noun (like sheep). If you’re a Latin purist then data is obviously the plural of datum, and so data is already plural (“the data are available”, “let’s see what the data tell us”), but gradually lots of people have moved toward using data as a singular mass noun (“the data is available”, “let’s see what the data tells us”). Recently the FT changed their style guide to treat data as a mass noun, but changes have been brewing for longer than that.

In literature (British English books from Google Ngrams), data as a singular noun has been gaining traction steadily for decades. Since the “big data” era from the 2000s onwards, the plural form has dropped in frequency rapidly, to around the same level as the singular.

Google Ngrams for data is vs data are

And looking at more colloquial use in search (via Google Trends), the singular option has been dominant for as long as the data is/are available.

Google Trends for data is vs data are

The changing use makes sense to me. When data were a set of known facts (each one constituting a single datum), it made sense to think of data as a plural noun. Now that we often think of data as the new oil / gold / currency / bacon (according to Google autofill), where a single data point is as atomic as, well, an atom, the singular form for the substance that is data seems to fit better.

Mar 1, 2023