Skip to main content
Learning Focus

Use this lesson to understand Storage Consideration with practical syntax and examples.

MySQL Data Type Storage Considerations: Comprehensive Guide

Numeric Data Types

Integer Types

TypeStorage (Bytes)Signed RangeUnsigned Range
TINYINT1-128 to 1270 to 255
SMALLINT2-32,768 to 32,7670 to 65,535
MEDIUMINT3-8,388,608 to 8,388,6070 to 16,777,215
INT4-2,147,483,648 to 2,147,483,6470 to 4,294,967,295
BIGINT8-9.2e18 to 9.2e180 to 1.8e19

Considerations:

  • Use the smallest integer type that fits your data (e.g., TINYINT for ages).
  • UNSIGNED doubles the positive range but doesn't affect storage size.

DECIMAL

Precision (Digits)Storage (Bytes)
1-94
10-188
19-2712
28-3616
37-4520
46-5424
55-6328

Formula:

Storage = 4 bytes per 9 digits (rounded up).

Example:

  • DECIMAL(10,2) -> 10 total digits -> 8 bytes.

Best Practices:

  • Avoid excessive precision (e.g., DECIMAL(65,30) wastes space).
  • Prefer DECIMAL for financial data (exact precision).

String Data Types

CHAR vs VARCHAR

TypeStorageMax LengthUse Case
CHARFixed: N bytes per character255 charsFixed-length codes (e.g., ISO country codes)
VARCHARVariable: 1-2 bytes (length) + data65,535 bytesVariable-length text (e.g., names, emails)

Examples:

  • CHAR(10) with utf8mb4: 10 4 = 40 bytes (even if storing 'A').
  • VARCHAR(255) with 'Hello' (5 chars in latin1): 1 (length byte) + 5 = 6 bytes.

Row Size Limit: Total row size (all columns) cannot exceed 65,535 bytes.

Example:

  • A table with two VARCHAR(32767) columns in utf8mb4 will exceed the limit (32767 4 2 = 262,136 bytes).

TEXT and BLOB Types

TypeMax Size (Bytes)Storage OverheadUse Case
TINYTEXT2551 byteShort notes
TEXT65,5352 bytesArticles, comments
MEDIUMTEXT16.7 million3 bytesLarge JSON/XML data
LONGTEXT4.3 billion4 bytesBooks, logs
BLOB typesSame as TEXT1-4 bytesBinary data (images, files)

Considerations:

  • TEXT/BLOB are stored off-page in InnoDB if > ~8KB, reducing row size limits.
  • Avoid SELECT * on TEXT/BLOB columns to minimize memory usage.

ENUM and SET

TypeStorage (Bytes)Max ValuesUse Case
ENUM1-265,535Single-choice lists (e.g., statuses)
SET1-864Multi-choice flags (e.g., permissions)

Example:

  • ENUM('red', 'green', 'blue') uses 1 byte (3 options 255).
  • SET('read', 'write', 'delete') uses 1 byte (3 bits).

Date/Time Types

TypeStorage (Bytes)FormatRange
DATE3YYYY-MM-DD1000-01-01 to 9999-12-31
TIME3HH:MM:SS-838:59:59 to 838:59:59
DATETIME5-8*YYYY-MM-DD HH:MM:SS1000-01-01 00:00:00 to 9999-12-31 23:59:59
TIMESTAMP4YYYY-MM-DD HH:MM:SS1970-01-01 00:00:01 to 2038-01-19 03:14:07 UTC
YEAR1YYYY1901 to 2155
  • DATETIME uses 5 bytes in MySQL 5.6.4+, 8 bytes in older versions.

Best Practices:

  • Use TIMESTAMP for auto-updating audit columns (e.g., created_at).
  • Prefer DATETIME for dates outside 1970-2038.

Spatial and JSON Types

TypeStorageUse Case
GEOMETRYVariableGeographic shapes
JSONLONGTEXT-likeStructured data (validated)

Considerations:

  • JSON adds validation overhead but supports efficient querying via generated columns.

Storage Optimization Tips

  1. Choose the Smallest Type:

    • Use SMALLINT instead of INT if values 65,535.
    • Prefer VARCHAR(50) over TEXT for short text.
  2. Avoid Over-Precision:

    • Use DECIMAL(10,2) instead of DECIMAL(20,5) for currency.
  3. Character Sets:

    • Use utf8mb4 for Unicode, latin1 for compactness if possible.
  4. Row Format:

    • Use COMPRESSED or DYNAMIC row formats in InnoDB for large rows.
  5. Indexing:

    • Avoid indexing large VARCHAR/TEXT columns. Use prefix indexes:

      CREATE INDEX idx_name ON users (name(10));

  6. Normalization:

    • Store repetitive values in lookup tables (e.g., status codes).

Common Mistakes

  • Wasted Space: Using CHAR(255) for variable-length strings.
  • Row Overflow: Exceeding 65,535 bytes with oversized VARCHAR columns.
  • Implicit Conversion: Mixing CHAR and VARCHAR in joins, causing full-table scans.

Example: Storage Calculation

Table Schema:

CREATE TABLE users (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, -- 4 bytes
username VARCHAR(50) CHARACTER SET utf8mb4, -- 1 + (50 4) = 201 bytes
country CHAR(2) CHARACTER SET latin1, -- 2 bytes
created_at DATETIME -- 5 bytes
);

Row Size: 4 + 201 + 2 + 5 = 212 bytes.

Implications: Efficient for large tables (1 million rows 212 MB).


Summary

  • Numeric: Use smallest integer/DECIMAL precision.
  • Strings: Prefer VARCHAR for variable text; avoid oversized CHAR.
  • Date/Time: Choose based on range and functionality (TIMESTAMP vs DATETIME).
  • Optimization: Normalize data, index wisely, and monitor row size limits.

Next Steps:

Explore advanced topics like indexing strategies, partitioning, or query optimization to further enhance performance.

Concept Map

flowchart LR
A[Schema Context] --> B[Storage Consideration]
B --> C[Query Pattern]
C --> D[Validation]
D --> E[Production Use]

Common Pitfalls

PitfallConsequencePrevention
Executing queries without validating sample rowsLogic errors reach production data or reportsStart with SELECT ... LIMIT 10 and inspect edge cases
Ignoring NULL and duplicate behaviorAggregations and filters return misleading resultsTest with NULL, duplicates, and empty sets explicitly
Using advanced syntax before checking schemaQueries fail due to missing columns/indexesVerify structure with DESCRIBE table_name; and adapt query design

Quick Reference

CREATE TABLE sample_storage_consideration (id INT PRIMARY KEY, value STORAGE_CONSIDERATION);
DESCRIBE sample_table;
SHOW CREATE TABLE sample_table;

What's Next