Glossary

Data types and structures

Tabular data - Anything that can be represented in cells, organized into columns and rows (like a spreadsheet).

CSV and TSV - Common formats for tabular data. Cells are separated with commas (.csv or comma-separated values) or tabs (.tsv or tab-separated values).
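As a minimal sketch of how the two formats relate, the following Python snippet parses the same small table from CSV text and from TSV text using the standard library's csv module. The sample data and the helper name read_table are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: the same table written in both formats.
csv_text = "name,city\nAda,London\nGrace,New York\n"
tsv_text = "name\tcity\nAda\tLondon\nGrace\tNew York\n"


def read_table(text, delimiter):
    """Parse delimited text into a list of rows, each row a list of cell strings."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


csv_rows = read_table(csv_text, ",")
tsv_rows = read_table(tsv_text, "\t")

# Once parsed, both formats yield the same rows and cells.
print(csv_rows == tsv_rows)
```

Only the delimiter differs between the two calls. The first row here is treated as ordinary data; whether it is a header is a convention of the file, not of the format.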

Graph data - Data that can be represented by nodes and edges or branching structures.

JSON - (JavaScript Object Notation) A text format for representing and storing graph data as nested objects and arrays. Its syntax is derived from JavaScript, but the format is language-independent and is supported by most programming languages.
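To illustrate the branching structure JSON can hold, here is a small Python sketch that parses a JSON string with the standard json module and walks the resulting tree. The document contents and the function name count_nodes are assumptions chosen for the example.

```python
import json

# Hypothetical document: a small tree built from nested objects and arrays.
text = """
{"name": "root", "children": [
    {"name": "a", "children": []},
    {"name": "b", "children": [
        {"name": "c", "children": []}
    ]}
]}
"""

tree = json.loads(text)  # JSON objects become dicts, JSON arrays become lists


def count_nodes(node):
    """Count this node plus every node beneath it."""
    return 1 + sum(count_nodes(child) for child in node["children"])


total = count_nodes(tree)
print(total)
```

Each object plays the role of a node, and the "children" array holds the edges to the nodes below it.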

XML - (eXtensible Markup Language) A data format for representing graph data that uses syntax similar to HTML.
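The HTML-like tag syntax can be seen in a short Python sketch that uses the standard library's xml.etree.ElementTree module. The element names (library, book, title) and the id attribute are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: nested elements with attributes, much like HTML tags.
xml_text = """<library>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</library>"""

root = ET.fromstring(xml_text)

# Each <book> element is a child node of <library>; each <title> is a child of a <book>.
titles = [book.find("title").text for book in root.findall("book")]
ids = [book.get("id") for book in root.findall("book")]

print(titles, ids)
```

Attribute values such as the id are always read back as strings, so any numeric interpretation is up to the program.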

Other definitions

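Tidy data - A set of conventions, developed by Hadley Wickham, for structuring data for statistical analysis: each variable forms a column, each observation forms a row, and each type of observational unit forms a table. It is most closely associated with the R programming language. See the original paper or an informal version.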
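A common tidying step is reshaping a "wide" table, where values of a variable are spread across column names, into a "long" one with one observation per row. The sketch below shows the idea in plain Python rather than R, to stay consistent with the other examples. The table contents and the function name to_tidy are hypothetical.

```python
# Hypothetical "wide" table: one row per country, one column per year.
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def to_tidy(rows, id_key, var_name, value_name):
    """Turn every non-identifier column into its own row (one observation per row)."""
    tidy_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on each output row instead
            tidy_rows.append({id_key: row[id_key], var_name: key, value_name: value})
    return tidy_rows


tidy = to_tidy(wide, "country", "year", "cases")
for row in tidy:
    print(row)
```

After reshaping, "year" is a column of its own instead of being hidden in the column headers, so each row describes exactly one country in one year.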