Data Cleaning Terms Explained
This glossary covers common operations for cleaning data files, including validations, repairs, and transformations. Entries marked with an icon include code snippets in JavaScript.
Concatenating Fields
Merging two or more fields in a dataset into a single field.
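A minimal sketch of concatenation, assuming rows are plain objects. The field names and the `concatFields` helper are hypothetical and only illustrate the idea.

```javascript
// Hypothetical sample rows; field names are illustrative only.
const people = [
  { firstName: "Ada", lastName: "Lovelace" },
  { firstName: "Alan", lastName: "Turing" },
];

// Join the selected fields with a separator, skipping empty values.
function concatFields(row, fields, separator = " ") {
  return fields
    .map((field) => row[field])
    .filter((value) => value !== undefined && value !== null && value !== "")
    .join(separator);
}

// Add a combined field while keeping the original fields.
const withFullName = people.map((row) => ({
  ...row,
  fullName: concatFields(row, ["firstName", "lastName"]),
}));
```

Skipping empty values avoids stray separators when one of the source fields is blank.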
Dropping Invalid Rows
Removing rows from your dataset that do not meet certain validation rules or conditions.
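One way this might look, assuming a hypothetical rule that `quantity` must be a positive integer:

```javascript
// Hypothetical sample rows.
const orders = [
  { id: 1, quantity: 3 },
  { id: 2, quantity: -1 },
  { id: 3, quantity: "abc" },
];

// Illustrative validation rule: quantity must be a positive integer.
const isValidOrder = (row) => Number.isInteger(row.quantity) && row.quantity > 0;

// Keep only the rows that pass the rule.
const validOrders = orders.filter(isValidOrder);
```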
Dynamic Data Validation
Validating data against a dynamic set of rules or conditions.
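A rough sketch of one interpretation: the rules are data that is assembled at runtime, and a rule may depend on other values in the same row. The `buildValidator` helper, the rule shape, and the country list are all assumptions made for illustration.

```javascript
// Build a validator from a rule list that is only known at runtime.
// Each rule has a field, a check(value, row) function, and a message.
function buildValidator(rules) {
  return (row) =>
    rules
      .filter((rule) => !rule.check(row[rule.field], row))
      .map((rule) => rule.message);
}

// Hypothetical allowed values; in practice these could be loaded from elsewhere.
const allowedCountries = new Set(["US", "CA"]);

const rules = [
  {
    field: "country",
    check: (value) => allowedCountries.has(value),
    message: "Unknown country",
  },
  {
    // Conditional rule: state is required only when the country is US.
    field: "state",
    check: (value, row) => row.country !== "US" || Boolean(value),
    message: "State is required for US rows",
  },
];

const validateRow = buildValidator(rules);
```

Calling `validateRow(row)` returns the list of messages for the rules that failed, so an empty array means the row passed.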
Field Name Matching
Matching the fields in a dataset to a set of expected fields.
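A simple sketch of header matching, assuming a hand-written alias table. The `expectedFields` table and helper names are hypothetical; real matching is often fuzzier than this exact comparison.

```javascript
// Hypothetical expected fields, each with a list of accepted aliases.
const expectedFields = {
  email: ["email", "emailaddress", "e-mail"],
  firstName: ["firstname", "first", "givenname"],
};

// Lowercase and strip everything except letters and digits.
const normalizeHeader = (header) => header.toLowerCase().replace(/[^a-z0-9]/g, "");

// Map each incoming header to an expected field, or null when nothing matches.
function matchHeaders(headers) {
  const mapping = {};
  for (const header of headers) {
    const key = normalizeHeader(header);
    const match = Object.keys(expectedFields).find((field) =>
      expectedFields[field].some((alias) => normalizeHeader(alias) === key)
    );
    mapping[header] = match ?? null;
  }
  return mapping;
}

const headerMapping = matchHeaders(["E-mail", "First Name", "Phone"]);
```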
Finding Text Within a String
Identifying and possibly replacing specific sequences of characters (substrings) within a larger string.
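A short sketch using built-in string methods; the sample text is made up.

```javascript
const note = "Acct #123 - acct closed";

// Case-insensitive presence check.
const mentionsAcct = note.toLowerCase().includes("acct");

// Position of the first exact (case-sensitive) match, or -1 if absent.
const firstLowercaseIndex = note.indexOf("acct");

// Replace every occurrence, ignoring case.
const expandedNote = note.replace(/acct/gi, "Account");
```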
Flagging Invalid Rows
Identifying and marking rows that do not meet certain validation criteria.
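Unlike dropping, flagging keeps every row and attaches the validation result. A minimal sketch, assuming a deliberately loose email check and hypothetical `isValid` and `errors` properties:

```javascript
// Hypothetical sample rows.
const contacts = [{ email: "a@example.com" }, { email: "not-an-email" }];

// Loose, illustrative email check; not a full address validator.
const looksLikeEmail = (value) =>
  typeof value === "string" && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);

// Keep all rows and record why a row failed.
const flaggedContacts = contacts.map((row) => {
  const errors = looksLikeEmail(row.email) ? [] : ["email: invalid format"];
  return { ...row, isValid: errors.length === 0, errors };
});
```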
Formatting Numbers
Modifying how a number is displayed without altering the underlying data.
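A brief sketch using the built-in `Intl.NumberFormat` and `Number.prototype.toFixed`. The locale and currency are assumptions chosen for the example; the stored number is left unchanged.

```javascript
const amount = 1234.5;

// Currency display for the en-US locale.
const usdFormatter = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
});
const displayAmount = usdFormatter.format(amount); // "$1,234.50"

// Fixed number of decimal places, without grouping separators.
const fixedAmount = amount.toFixed(2); // "1234.50"
```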
Imputing Missing Values
Replacing missing or null values with substituted values.
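Mean imputation is one common choice among several (others include a constant default or the most frequent value). A sketch, assuming the field holds numbers and using a hypothetical `imputeWithMean` helper:

```javascript
// Treat null, undefined, and empty strings as missing.
const isMissing = (value) => value === null || value === undefined || value === "";

// Replace missing values in a numeric field with the mean of the present values.
function imputeWithMean(rows, field) {
  const present = rows.map((row) => row[field]).filter((value) => !isMissing(value));
  if (present.length === 0) return rows; // nothing to average, leave rows untouched
  const mean = present.reduce((sum, value) => sum + value, 0) / present.length;
  return rows.map((row) => (isMissing(row[field]) ? { ...row, [field]: mean } : row));
}

// Hypothetical sample rows.
const readings = [{ temp: 20 }, { temp: null }, { temp: 24 }];
const imputedReadings = imputeWithMean(readings, "temp");
```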
Normalizing Cases
Converting all text data to a uniform case, such as lower case or upper case.
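A small sketch with built-in string methods. The `toTitleCase` helper is a hypothetical extra for ASCII text only.

```javascript
// Hypothetical sample values.
const emails = ["Ada@Example.COM", "ALAN@example.com"];

// Lowercase every value so comparisons are case-insensitive.
const lowerEmails = emails.map((email) => email.toLowerCase());

// Uppercase variant, e.g. for country codes.
const countryCode = "us".toUpperCase();

// Capitalize the first ASCII letter of each word.
const toTitleCase = (text) =>
  text.toLowerCase().replace(/\b[a-z]/g, (letter) => letter.toUpperCase());
```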
Removing Duplicates
Eliminating duplicate entries from your dataset.
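A sketch that keeps the first row seen for each key, assuming duplicates are defined by a caller-supplied key function. The `dedupeBy` helper and the sample rows are hypothetical.

```javascript
// Keep the first row for each key; later rows with the same key are dropped.
function dedupeBy(rows, keyFn) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = keyFn(row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Hypothetical sample rows; emails differ only by case.
const signups = [
  { email: "ada@example.com", plan: "free" },
  { email: "ADA@example.com", plan: "pro" },
  { email: "alan@example.com", plan: "free" },
];

const uniqueSignups = dedupeBy(signups, (row) => row.email.toLowerCase());
```

Normalizing the key first (here, by lowercasing) catches duplicates that differ only in formatting.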
Requiring Values
Ensuring that specific fields in a dataset are not empty or undefined.
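A minimal sketch, assuming a hypothetical list of required field names and treating whitespace-only strings as empty:

```javascript
// Hypothetical list of required fields.
const requiredFields = ["id", "email"];

// A value counts as blank if it is undefined, null, or a whitespace-only string.
const isBlank = (value) =>
  value === undefined ||
  value === null ||
  (typeof value === "string" && value.trim() === "");

// Return the names of required fields that are blank in the given row.
const missingFields = (row) => requiredFields.filter((field) => isBlank(row[field]));
```

For example, `missingFields({ id: 7, email: "  " })` reports `email` as missing, while the numeric `id` passes.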
Splitting Fields
Separating the contents of a single field into multiple separate fields.
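A sketch that splits a full name on whitespace, assuming the first word is the first name and everything after it is the last name. That assumption does not hold for all names, and `splitName` is a hypothetical helper.

```javascript
// Split on runs of whitespace; the first token becomes firstName,
// the remaining tokens are rejoined as lastName.
function splitName(fullName) {
  const [firstName, ...rest] = fullName.trim().split(/\s+/);
  return { firstName, lastName: rest.join(" ") };
}

const nameParts = splitName("Ada  King Lovelace");
```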
Standardizing Date Formats
Converting dates written in varying formats into a single, consistent format.
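A sketch that converts two input formats to ISO 8601 (`YYYY-MM-DD`). It assumes slash-separated dates are US-style month/day/year, which is a guess that must be confirmed for real data; `toIsoDate` and `buildIsoDate` are hypothetical helpers.

```javascript
const pad = (n) => String(n).padStart(2, "0");

// Validate the parts as a real calendar date and format as YYYY-MM-DD.
function buildIsoDate(yearText, monthText, dayText) {
  const year = Number(yearText);
  const month = Number(monthText);
  const day = Number(dayText);
  const date = new Date(Date.UTC(year, month - 1, day));
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? `${yearText}-${pad(month)}-${pad(day)}` : null;
}

// Return an ISO date string, or null when the input is not recognized.
function toIsoDate(text) {
  const value = text.trim();
  let m = value.match(/^(\d{4})-(\d{1,2})-(\d{1,2})$/);
  if (m) return buildIsoDate(m[1], m[2], m[3]);
  m = value.match(/^(\d{1,2})\/(\d{1,2})\/(\d{4})$/); // assumed month/day/year
  if (m) return buildIsoDate(m[3], m[1], m[2]);
  return null;
}
```

Round-tripping through `Date.UTC` rejects impossible dates such as February 30 instead of silently rolling them over.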
Trimming Fields
Removing unnecessary characters, typically whitespace, from the start and end of a string.
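A short sketch using the built-in `String.prototype.trim`, applied to every string field of a row. The `trimRow` helper and sample row are hypothetical.

```javascript
// Trim leading and trailing whitespace from every string value in a row.
// Non-string values are passed through unchanged.
function trimRow(row) {
  return Object.fromEntries(
    Object.entries(row).map(([key, value]) => [
      key,
      typeof value === "string" ? value.trim() : value,
    ])
  );
}

const trimmedRow = trimRow({ sku: "  Widget-42 \t\n", qty: 3 });
```

Note that `trim` leaves whitespace inside the string untouched.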
URL Validation
Checking if a field's value matches the structure of a valid URL.
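A sketch that relies on the built-in `URL` constructor, which throws on input it cannot parse. Restricting the result to `http` and `https` is an assumption that suits typical web links; `isValidHttpUrl` is a hypothetical helper.

```javascript
// Return true only for absolute http(s) URLs that the URL parser accepts.
function isValidHttpUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}
```

Values without a scheme, such as `example.com`, fail this check, so a cleaning step may choose to prepend `https://` before validating.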