Understanding ENUM in Databases: When and How to Use It
When building a database schema, you will often encounter situations where a column should only contain specific values. For example, if you're managing user roles, payment statuses, or different types of accounts, you'd want to restrict the possible entries to valid options, such as 'Admin', 'Member', etc. This is where ENUM
fields in SQL databases can be incredibly useful.
In this post, I'll dive into what ENUM
is, when to use it, and how to implement it effectively in your database. I'll also discuss the pros and cons of ENUM
fields, complete with examples to illustrate its functionality.
What Is ENUM
?
ENUM
is a special data type in Postgres databases that allows you to define a column with a predefined set of values. Only these values can be stored in the column. If a value outside this set is attempted, the database will reject the operation and return an error.
This is very similar to how we define and use enums in other programming languages, like C# and TypeScript.
Think of ENUM
as a built-in way to validate data at the database level, ensuring your columns contain only valid entries.
Example:
Let’s say you’re building a marketplace and need to track the type of Stripe account users have:
- Possible values:
standard
, express
, custom
You first need to create a new ENUM type in your database:
CREATE TYPE account_type AS ENUM ('member', 'admin', 'guest');
You can define an ENUM
column to enforce these values:
CREATE TABLE stripe_accounts (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
user_id BIGINT UNSIGNED NOT NULL,
stripe_account_id VARCHAR(255) UNIQUE NOT NULL,
account_type account_type NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
Here, the account_type
column can only contain standard
, express
, or custom
as values. If you try to insert something like premium
, the database will reject it.
When to Use ENUM
1. Small, Fixed Sets of Values
If the possible values for a column are limited and unlikely to change, ENUM
is a good choice. Examples include:
- Payment statuses:
pending
, completed
, failed
- User roles:
admin
, editor
, viewer
- Product categories:
physical
, digital
, subscription
If the values are more 'dynamic' however and have a higher chance of being modified, then ENUM's might not be the way to go. More on that below.
2. Data Validation at the Database Level
By using ENUM
, you reduce the risk of invalid data entering your database. This built-in validation logic can you save a few extra lines of code. Also having validation at the database level reduces the chances of malformed data causing issues down the line.
3. Readable Queries
ENUM
values make your queries easier to read compared to numeric codes or foreign keys.
SELECT * FROM stripe_accounts WHERE account_type = 'express';
This is much clearer than querying with numeric values like account_type = 2
.
It can also, potentially, increase performance as you are reducing the amount of joins needed when working with lookup tables.
Pros of Using ENUM
- Validation: Prevents invalid data from being stored in the column.
- Compact Storage: Internally,
ENUM
values are stored as integers, making them space-efficient.
- Readable Queries: Makes SQL queries more intuitive compared to using numeric codes or unrelated foreign keys.
- Performance: Since
ENUM
is stored as an integer, comparisons and lookups are faster than plain VARCHAR
fields.
Cons of Using ENUM
If you are going to use enumerations though, there might be a couple of downsides to consider.
- Harder to Modify: If you need to add, remove, or rename an
ENUM
value, you must alter the table structure. This can be cumbersome for frequently changing datasets.
- Portability Issues: Not all databases support
ENUM
, so if you’re planning to migrate to a different database (e.g., from PostgreSQL to SQL Server), it might cause compatibility issues.
- Limited Flexibility: Unlike a separate lookup table,
ENUM
doesn’t allow for additional metadata (e.g., descriptions or relationships).
Examples of ENUM
Usage
Creating a Table with ENUM
Here’s how you define an ENUM
column in PostgreSQL:
-- Create ENUM type if it doesn't exist yet
CREATE TYPE order_status AS ENUM ('pending', 'completed', 'canceled');
-- Create the orders table
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY, -- Replaces BIGINT UNSIGNED AUTO_INCREMENT
user_id BIGINT NOT NULL CHECK (user_id > 0), -- No UNSIGNED in PostgreSQL
status order_status NOT NULL, -- Uses the ENUM type
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Inserting Data
You can only insert one of the predefined values:
INSERT INTO orders (user_id, status) VALUES (1, 'pending');
Attempting to insert an invalid value will result in an error:
INSERT INTO orders (user_id, status) VALUES (2, 'processing');
-- Error: Invalid value for ENUM column 'status'
Querying Data
To retrieve orders with a specific status:
SELECT * FROM orders WHERE status = 'completed';
Altering an ENUM
If you need to add a new status, you’ll need to alter the enum type. However, it is important to note that you cannot remove values from any ENUM type once they have been added.
ALTER TYPE order_status ADD VALUE 'processing';
Alternatives to ENUM
If you anticipate frequent changes to the allowed values, consider using a lookup table instead of ENUM
. For example:
Lookup Table:
-- Create the order_statuses lookup table
CREATE TABLE order_statuses (
id SERIAL PRIMARY KEY, -- SERIAL replaces TINYINT UNSIGNED AUTO_INCREMENT
status_name VARCHAR(50) NOT NULL UNIQUE
);
-- Create the orders table
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY, -- BIGSERIAL replaces BIGINT UNSIGNED AUTO_INCREMENT
user_id BIGINT NOT NULL CHECK (user_id > 0), -- No UNSIGNED in PostgreSQL
status_id INT NOT NULL, -- Use INT instead of TINYINT UNSIGNED
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (status_id) REFERENCES order_statuses(id) ON DELETE CASCADE
);
With this setup, you can easily add new statuses by inserting a row into the order_statuses
table without altering the orders
table.
Conclusion
ENUM
is a powerful tool for ensuring data integrity and simplifying queries when you’re dealing with small, fixed sets of values. It’s especially useful for fields like user roles, account types, and payment statuses. However, its rigidity can be a downside if your data is prone to frequent changes.
When deciding whether to use ENUM
, consider the trade-offs between simplicity and flexibility. If you’re confident the values won’t change often, ENUM
is a great choice. Otherwise, a lookup table might be a more future-proof solution.
Do you use ENUM
in your database schema? Let me know your thoughts or share examples from your own projects in the comments below!