The Marketing Technologist
Posts
An Introduction to JSON for Marketers

An Introduction to JSON for Marketers

Step-by-step guide on how to read and write JSON, and when to use it instead of CSV.

Javier Leung
April 25, 2024

If your job involves working with data, you might be familiar with CSV files as a way to transfer data between one application and another. But what about its squiggly cousin, JSON?

Many non-technical marketers see the curly braces and decide that it's better left for developers. Indeed, JSON’s code-like syntax can be intimidating, especially for a marketer who doesn’t write code. If you’ve experienced the fear of deciphering what’s between the squiggly brackets—you’re not the only one!

But fear no more, my friend! In this post, I’m going to break down CSV and JSON from first principles so that you can start working with them like a pro. Once you get the hang of its basic syntax, JSON is surprisingly straightforward—and it’s more similar to CSV than you might think.

Knowing how to read and write JSON is an extremely useful skill because it’s the de facto data format of the modern web. As a marketer, knowing JSON can help you:

Read web tracking and analytics data
Understand and set up integrations to APIs
Take Generative AI to the next level

By the end of this post, you should be able to understand and confidently translate between these two formats. Let’s dive in.

What is CSV?

CSV, or Comma-Separated Values, is a file format that has been used since the early days of personal computers to store tabular data. Since its introduction in 1972, CSV has grown to become the default data exchange format for exporting data from databases or data processing applications.

The simplicity of CSV has led to its widespread adoption in situations where a simple, readable, and portable format is necessary without the complexity of spreadsheet or database software.

Despite its name, CSV is not strictly tied to commas. Other delimiters like semicolons or tabs can also be used, depending on the regional settings or specific requirements of the software being used.

Excel spreadsheets, relational databases and CSV files are three of the most common ways to manage structured data.

CSV Records, Headers and Rows

Each line in a CSV file corresponds to a data record, and each record consists of fields separated by commas. This format allows data to be saved in a straightforward, text-only format, which makes it easy to import and export in many applications and systems.

For example, you can easily modify data in a spreadsheet editor like Excel or Google Sheets, export the data to a CSV file and import the file into a system such as your CRM or marketing automation tool.

Because CSV doesn't standardize many details, there are variations in how CSV files handle data encapsulation, encoding, and escaping. These variations can sometimes complicate the use of CSV files between different programs that do not handle these differences in the same way.

Let’s look at the following dataset, which you might find in an Excel spreadsheet. The dataset has three fields and only contains one record.

FirstName	LastName	EmailAddress
Barbara	Roberts	[email protected]

When you export this data to a CSV file, it looks like this:

FirstName,LastName,EmailAddress
Barbara,Roberts,[email protected]

The first row of this file is a CSV “header”, which is a list of column names separated by commas. Each subsequent row is a separate record, with field values separated by commas and listed in the same order as the columns.

Next, let’s look at how we would translate this same structure into JSON.

What is JSON?

JSON, short for JavaScript Object Notation, emerged in the early 2000s as a simpler alternative to XML (eXtensible Markup Language) for transmitting data between servers and web applications.

JSON was designed to be easily parsed and generated by JavaScript, the prevailing programming language for web applications. Its syntax is derived from JavaScript “object literals” (i.e. textual representation of JavaScript’s data structures), making it both familiar for developers working with JavaScript and easily interoperable with JavaScript code.

Put simply, JSON is a data format that can be easily written and read by both computers and humans, making it ideal for web applications, APIs and software configuration files.

Douglas Crawford, the developer who named JSON, intended it to be pronounced like “Jason”. In spite of that, JSON is pronounced "JAY-sawn" in the technical community.(Source: Wikipedia)

JSON Objects

Let’s take a look at how the following tabular data translates to JSON.

FirstName	LastName	EmailAddress
Robert	Oppenheimer	[email protected]

{
  "FirstName": "Robert",
  "LastName": "Oppenheimer",
  "EmailAddress": "[email protected]"
}

This data structure is called a JSON object, and it corresponds to a record or row in a CSV file. Here are a few things might stand out:

Curly braces: Each JSON object (i.e. record) is enclosed within a pair of curly braces {}. The opening curly brace { indicates the beginning of a record definition, and the closing curly brace } indicates the end of a record definition.
No header: Unlike CSV, where columns such as FirstName and LastName are placed in a header row, in JSON this data is defined as part of the record itself, such as “FirstName”: “Barbara”. This structure is called a key-value pair.
Double quotes: All strings (i.e. text) within the object need to placed between double quotes. This goes for both the record’s field names as well as field values.
Commas: There is a comma after each key-value pair, except the last one in the object. Like CSV, commas are used to delimit data—but whereas in CSV they delimit values, in JSON they delimit key-value pairs.

Key-Value Pairs

A key-value pair stores two related pieces of data: a key, which tells us what the data is about, and a value, which is the data associated with the key. Each key is a string which acts as a unique identifier for a value. Values can be one of several data types — strings, numbers, lists, objects, etc. — and do not need to be unique across pairs.

You can think of keys as being like columns in a spreadsheet, but instead of being defined only once in the header like in CSV, the columns in JSON are defined in every record of the dataset.

Each key-value pair is separated by a colon. To the left of the colon is the key, and to the right of the colon is its corresponding value. If an object has more than one key-value pair, then each key-value pair is separated by a comma.

"FirstName": "Robert",
"LastName": "Oppenheimer"

Whitespace such as newlines and indentations can be freely inserted between key-value pairs for readability. For example, the same object can be compressed into a single line by eliminating whitespace, like so:

{"FirstName":"Robert","LastName":"Oppenheimer","EmailAddress":"[email protected]"}

Data Types

Unlike CSV, which stores all data as text, JavaScript supports several data types which helps enforce reliability and efficiency in computer programs, and also leads to better long-term data quality.

In an object, keys are always strings, and values can be one of the following data types:

Object
Array
String
Number
True
False
Null

Nested Objects

You might be thinking—what?? An object within an object? Impossible!

Nested objects are in fact very common in JSON and are what allows it to store rich, semi-structured data. There is no theoretical limit to how many levels deep JSON objects can be nested, but there are practical limits determined by how much computing memory is available for a decoder to parse a JSON file. That being said, it’s very typical to work with JSON data that is nested 3 to 4 levels deep.

To given an example of a nested object, let’s say we want to store address information with three sub-fields, “Type”, “City” and “State”.

Using JSON, we can store the address as its own object with three fields, “Type”, “City” and “State”, like so:

{
  "Type": "Home",
  "City": "Berkeley",
  "State": "California"
}

To add this information to parent object, we simply create a new key, “Address”, and append it to the existing object as a value.

{
  "FirstName": "Robert",
  "LastName": "Oppenheimer",
  "EmailAddress": "[email protected]",
  "Address": {
    "Type": "Home",
    "City": "Berkeley",
    "State": "California"
  }
}

In a table format, the data would look like this:

Even with one-level nested objects, we are already seeing an example where JSON can handle data that CSV cannot support natively. In order to translate this data in CSV, we would need to transform the Address columns so that they are flattened out to the top level (e.g. “AddressType”).

Arrays

What if we want to store multiple addresses? For example, let’s say we also have a work address, and we want to store both the home and work address together as a list on the record.

In JSON, lists are stored using the array data type, which is defined using square brackets []. Here we have an array that contains two Address objects— home address and work address.

[
  {
    "Type": "Home",
    "City": "Berkeley",
    "State": "California"
  },
  {
    "Type": "Work",
    "City": "Santa Fe",
    "State": "New Mexico"
  }
]

An array is an ordered list of elements. Elements can be any of the supported data types, and you can have elements of multiple different data types within the same array.

Just like with nested objects, we can add an array to an existing object by defining a key and adding the array as its associated value. Let’s rename the Address key to Addresses and append the above array to our object.

{
  "FirstName": "Robert",
  "LastName": "Oppenheimer",
  "EmailAddress": "[email protected]",
  "Addresses":
  [
    {
      "Type": "Home",
      "City": "Berkeley",
      "State": "California"
    },
    {
      "Type": "Work",
      "City": "Santa Fe",
      "State": "New Mexico"
    }
  ]
}

In a table format, the data looks like this:

This is again a relatively simple data structure, but something that is impossible to capture in CSV.

Booleans, Numbers and Nulls

Since JSON is heavily used by programs, it has certain data types which make it possible to be processed efficiency. These include booleans, numbers and nulls.

Boolean is a data type that has one of two possible values: true and false. You might be thinking—couldn’t you also store these values as strings, like “true” and “false”? Indeed you can.

But since the primary use case for boolean fields is conditional logic—i.e. “If the contact has marketing consent, then add to campaign”—using strings to store truth values is not only inefficient, but highly prone to error.

For example, imagine all the different ways in which a truth value can be recorded by humans and programs:

true: “True”, “true”, "T”, “Yes”, “Y”, “1”
false: “False”, “false”, “F”, “No”, “N”, “0”

Source: Reddit

Not only would programs need to add an extra step to parse the truth value from a string, but they would also need to make sure that they are parsing the truth value in the same way so that no values are missed out. Using the boolean data type enforces both data quality as well as efficiency.

Similarly, JSON’s number data type allows programs to execute operations such as summation on number fields efficiently, without having to parse numbers from strings.

In JSON, empty fields are not allowed. When a key is defined, there needs to be an associated value. So how do we encode records which have missing or empty field values? JSON’s null data type serves exactly this purpose, allowing you to specify that no value exists for the key.

JSON vs CSV

So when should you use JSON over CSV?

Choosing between JSON and CSV for data storage or interchange depends on the requirements of your application and the nature of your data. Here are some scenarios where JSON is generally preferred over CSV:

Hierarchical or Nested Data: As we saw with the object data type, JSON supports complex data structures with nested or hierarchical data. If your data includes multi-level objects or arrays, JSON can represent these directly and clearly, whereas CSV struggles with this and might require complex workarounds.
Data Interchange in Web Technologies: Since JSON integrates seamlessly with JavaScript and many modern programming languages offer built-in support for parsing JSON, it’s a natural fit for data interchange in web technologies. Use cases like API integrations and content personalization typically use JSON.
Key-Value Pairs: JSON’s key-value pairs provide clear mapping between field names and values, which makes it especially useful when the data consumer needs explicit and reliable field identification. This not only helps applications dynamically parse incoming data, but also allows humans to easily read and update data on individual records.
Schema Flexibility: The fact that JSON is schema-less—meaning each record can contain a different set of fields—gives it a flexibility that is beneficial for applications where the data might evolve over time, allowing for additions or deletions of fields without affecting existing records.
Metadata and Additional Context: JSON can include metadata and additional context in a straightforward way by adding extra key-value pairs. This can be helpful for passing extra settings or options along with data.

In contrast, CSV is typically preferred when the following features are important.

Simplicity and Lightweight: CSV is a simpler format and usually takes up less space than JSON when dealing with large volumes of flat, tabular data.
Data Analysis and Processing: CSV files are widely supported by data analysis tools and software, including spreadsheet applications like Microsoft Excel, making them a popular choice for data analysis tasks.
Uniform Data Structure: When all data records have the same structure and there are no hierarchical data requirements, CSV is an efficient, straightforward choice.

That’s a wrap!

In the next issue, we’re going to get hands-on with your newly acquired JSON skills by exploring how we can use it to level up our usage of Generative AI.

Make sure to subscribe to stay updated.

EDIT: In true developer fashion, a previous version of this article had a JSON example containing misplaced commas. Those have now been fixed.