Learning Focus

Use this lesson to understand Storage Consideration with practical syntax and examples.

MySQL Data Type Storage Considerations: Comprehensive Guide

Numeric Data Types

Integer Types

Type	Storage (Bytes)	Signed Range	Unsigned Range
`TINYINT`	1	-128 to 127	0 to 255
`SMALLINT`	2	-32,768 to 32,767	0 to 65,535
`MEDIUMINT`	3	-8,388,608 to 8,388,607	0 to 16,777,215
`INT`	4	-2,147,483,648 to 2,147,483,647	0 to 4,294,967,295
`BIGINT`	8	-9.2e18 to 9.2e18	0 to 1.8e19

Considerations:

Use the smallest integer type that fits your data (e.g., TINYINT for ages).
UNSIGNED doubles the positive range but doesn't affect storage size.

DECIMAL

Precision (Digits)	Storage (Bytes)
1-9	4
10-18	8
19-27	12
28-36	16
37-45	20
46-54	24
55-63	28

Formula:

Storage = 4 bytes per 9 digits (rounded up).

Example:

DECIMAL(10,2) -> 10 total digits -> 8 bytes.

Best Practices:

Avoid excessive precision (e.g., DECIMAL(65,30) wastes space).
Prefer DECIMAL for financial data (exact precision).

String Data Types

CHAR vs VARCHAR

Type	Storage	Max Length	Use Case
`CHAR`	Fixed: N bytes per character	255 chars	Fixed-length codes (e.g., ISO country codes)
`VARCHAR`	Variable: 1-2 bytes (length) + data	65,535 bytes	Variable-length text (e.g., names, emails)

Examples:

CHAR(10) with utf8mb4: 10 4 = 40 bytes (even if storing 'A').
VARCHAR(255) with 'Hello' (5 chars in latin1): 1 (length byte) + 5 = 6 bytes.

Row Size Limit: Total row size (all columns) cannot exceed 65,535 bytes.

Example:

A table with two VARCHAR(32767) columns in utf8mb4 will exceed the limit (32767 4 2 = 262,136 bytes).

TEXT and BLOB Types

Type	Max Size (Bytes)	Storage Overhead	Use Case
`TINYTEXT`	255	1 byte	Short notes
`TEXT`	65,535	2 bytes	Articles, comments
`MEDIUMTEXT`	16.7 million	3 bytes	Large JSON/XML data
`LONGTEXT`	4.3 billion	4 bytes	Books, logs
`BLOB` types	Same as `TEXT`	1-4 bytes	Binary data (images, files)

Considerations:

TEXT/BLOB are stored off-page in InnoDB if > ~8KB, reducing row size limits.
Avoid SELECT * on TEXT/BLOB columns to minimize memory usage.

ENUM and SET

Type	Storage (Bytes)	Max Values	Use Case
`ENUM`	1-2	65,535	Single-choice lists (e.g., statuses)
`SET`	1-8	64	Multi-choice flags (e.g., permissions)

Example:

ENUM('red', 'green', 'blue') uses 1 byte (3 options 255).
SET('read', 'write', 'delete') uses 1 byte (3 bits).

Date/Time Types

Type	Storage (Bytes)	Format	Range
`DATE`	3	`YYYY-MM-DD`	1000-01-01 to 9999-12-31
`TIME`	3	`HH:MM:SS`	-838:59:59 to 838:59:59
`DATETIME`	5-8*	`YYYY-MM-DD HH:MM:SS`	1000-01-01 00:00:00 to 9999-12-31 23:59:59
`TIMESTAMP`	4	`YYYY-MM-DD HH:MM:SS`	1970-01-01 00:00:01 to 2038-01-19 03:14:07 UTC
`YEAR`	1	`YYYY`	1901 to 2155

DATETIME uses 5 bytes in MySQL 5.6.4+, 8 bytes in older versions.

Best Practices:

Use TIMESTAMP for auto-updating audit columns (e.g., created_at).
Prefer DATETIME for dates outside 1970-2038.

Spatial and JSON Types

Type	Storage	Use Case
`GEOMETRY`	Variable	Geographic shapes
`JSON`	LONGTEXT-like	Structured data (validated)

Considerations:

JSON adds validation overhead but supports efficient querying via generated columns.

Storage Optimization Tips

Choose the Smallest Type:
- Use SMALLINT instead of INT if values 65,535.
- Prefer VARCHAR(50) over TEXT for short text.
Avoid Over-Precision:
- Use DECIMAL(10,2) instead of DECIMAL(20,5) for currency.
Character Sets:
- Use utf8mb4 for Unicode, latin1 for compactness if possible.
Row Format:
- Use COMPRESSED or DYNAMIC row formats in InnoDB for large rows.
Indexing:
- Avoid indexing large VARCHAR/TEXT columns. Use prefix indexes:
```
CREATE INDEX idx_name ON users (name(10));
```
Normalization:
- Store repetitive values in lookup tables (e.g., status codes).

Common Mistakes

Wasted Space: Using CHAR(255) for variable-length strings.
Row Overflow: Exceeding 65,535 bytes with oversized VARCHAR columns.
Implicit Conversion: Mixing CHAR and VARCHAR in joins, causing full-table scans.

Example: Storage Calculation

Table Schema:

CREATE TABLE users (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,  -- 4 bytes
    username VARCHAR(50) CHARACTER SET utf8mb4,  -- 1 + (50  4) = 201 bytes
    country CHAR(2) CHARACTER SET latin1,        -- 2 bytes
    created_at DATETIME                          -- 5 bytes
);

Row Size: 4 + 201 + 2 + 5 = 212 bytes.

Implications: Efficient for large tables (1 million rows 212 MB).

Summary

Numeric: Use smallest integer/DECIMAL precision.
Strings: Prefer VARCHAR for variable text; avoid oversized CHAR.
Date/Time: Choose based on range and functionality (TIMESTAMP vs DATETIME).
Optimization: Normalize data, index wisely, and monitor row size limits.

Next Steps:

Explore advanced topics like indexing strategies, partitioning, or query optimization to further enhance performance.

Concept Map

flowchart LR
    A[Schema Context] --> B[Storage Consideration]
    B --> C[Query Pattern]
    C --> D[Validation]
    D --> E[Production Use]

Common Pitfalls

Pitfall	Consequence	Prevention
Executing queries without validating sample rows	Logic errors reach production data or reports	Start with `SELECT ... LIMIT 10` and inspect edge cases
Ignoring NULL and duplicate behavior	Aggregations and filters return misleading results	Test with `NULL`, duplicates, and empty sets explicitly
Using advanced syntax before checking schema	Queries fail due to missing columns/indexes	Verify structure with `DESCRIBE table_name;` and adapt query design

Quick Reference

CREATE TABLE sample_storage_consideration (id INT PRIMARY KEY, value STORAGE_CONSIDERATION);
DESCRIBE sample_table;
SHOW CREATE TABLE sample_table;

What's Next

Next: Performance Implication - Continue to the next concept with incremental complexity.
Module Overview - Return to this module index and choose another related lesson.

Numeric Data Types​

Integer Types​

DECIMAL​

String Data Types​

CHAR vs VARCHAR​

TEXT and BLOB Types​

ENUM and SET​

Date/Time Types​

Spatial and JSON Types​

Storage Optimization Tips​

Common Mistakes​

Example: Storage Calculation​

Summary​

Concept Map​

Common Pitfalls​

Quick Reference​

What's Next​

Numeric Data Types

Integer Types

DECIMAL

String Data Types

CHAR vs VARCHAR

TEXT and BLOB Types

ENUM and SET

Date/Time Types

Spatial and JSON Types

Storage Optimization Tips

Common Mistakes

Example: Storage Calculation

Summary

Concept Map

Common Pitfalls

Quick Reference

What's Next