Use this lesson to understand Performance Implication with practical syntax and examples.
Performance Implications of MySQL Data Types
Storage Efficiency
- Smaller Data Types: Reduce disk I/O by fitting more rows per page.
- Example: Use
TINYINT(1 byte) instead ofINT(4 bytes) for status flags. - Impact: A table with 1 million rows saves 3 MB with this change.
- Example: Use
- Fixed vs Variable Types:
CHARwastes space for variable-length data but is faster for comparisons.VARCHARsaves space but adds 1-2 bytes of length overhead.
Indexing Impact
-
Index Size: Smaller data types -> smaller indexes -> faster scans.
- Example: Indexing a
BIGINT(8 bytes) vsINT(4 bytes) doubles index size. - Use
SMALLINT(2 bytes) for foreign keys if possible.
- Example: Indexing a
-
Prefix Indexes: For
TEXT/VARCHAR, index only the necessary prefix:CREATE INDEX idx_name ON users (name(10)); -- First 10 characters
Query Execution
-
Data Type Conversions: Implicit conversions in
WHEREclauses cause full scans.-- [X] Slow: phone (VARCHAR) vs INTEGER
SELECT * FROM users WHERE phone = 123456789;
-- [OK] Fast: Match types
SELECT * FROM users WHERE phone = '123456789'; -
Temporal Types:
TIMESTAMP(4 bytes) vsDATETIME(5-8 bytes): Smaller but limited to 1970-2038.DATE(3 bytes) is optimal for date-only storage.
Join Operations
- Key Size Impact: Larger keys slow down joins.
- Example: Joining on
VARCHAR(255)vsINTincreases memory usage. - Best Practice: Use integer foreign keys where possible.
- Example: Joining on
Memory Usage
- Temporary Tables: Queries using
TEXT/BLOBmay spill to disk, slowing sorting/grouping.- Mitigation: Avoid
SELECT *when unnecessary.
- Mitigation: Avoid
Specialized Types
-
ENUM: Stored as integers but string-readable. Faster than
VARCHARfor filtering.-- ENUM('red','green','blue') uses 1 byte vs VARCHAR(6) using 6+ bytes. -
SET: Bitmask storage for multi-value columns. Compact but limited to 64 options.
Collation & Character Sets
- utf8mb4: Uses 4 bytes/character but supports emojis.
- latin1: 1 byte/character. Use for ASCII-only data to save space.
- Impact: A
VARCHAR(255)inutf8mb4uses 1,020 bytes (2554) vs 255 bytes inlatin1.
NULL Handling
- Overhead: Each
NULLcolumn adds a bit to the row's NULL bitmap.- Example: 8
NULLcolumns -> 1 byte overhead. - Best Practice: Use
NOT NULL DEFAULTwhere appropriate.
- Example: 8
Floating-Point Tradeoffs
- FLOAT/DOUBLE: Faster math but approximate.
- DECIMAL: Exact precision but slower computations.
- Use
DECIMALfor financial data,FLOATfor scientific calculations.
- Use
BLOB/TEXT Considerations
-
Off-Page Storage: InnoDB stores >~8KB
BLOB/TEXTexternally, requiring extra I/O. -
Retrieval Overhead:
-- [X] Slow:
SELECT content FROM articles WHERE ...;
-- [OK] Better:
SELECT id, title FROM articles WHERE ...;
Performance Optimization Checklist
- Downsize Columns: Use the smallest type that accommodates your data.
- Index Wisely: Prefer numeric types for keys and limit index length.
- Avoid Implicit Conversions: Match data types in joins/filters.
- Normalize Data: Replace repetitive strings with integer IDs.
- Monitor Row Size: Stay under 65,535 bytes to prevent row overflow.
Example: Optimizing a User Table
Before:
CREATE TABLE users (
id BIGINT PRIMARY KEY, -- 8 bytes
email VARCHAR(255) CHARACTER SET utf8mb4, -- 1020 bytes
country VARCHAR(50), -- 50 bytes
created_at DATETIME -- 5 bytes
);
Total Row Size: ~1,083 bytes
After:
CREATE TABLE users (
id INT UNSIGNED PRIMARY KEY, -- 4 bytes
email VARCHAR(100) CHARACTER SET latin1, -- 100 bytes
country CHAR(2), -- 2 bytes
created_at TIMESTAMP -- 4 bytes
);
Total Row Size: 110 bytes (90% reduction)
Next Steps:
- Explain Plans: Use
EXPLAINto analyze query performance. - Benchmark: Test with realistic data volumes.
- Profile Storage: Monitor disk usage with
SHOW TABLE STATUS.
By aligning data types with use cases and storage needs, you can achieve 10x+ performance gains in large-scale applications.
Concept Map
flowchart LR
A[Schema Context] --> B[Performance Implication]
B --> C[Query Pattern]
C --> D[Validation]
D --> E[Production Use]
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Executing queries without validating sample rows | Logic errors reach production data or reports | Start with SELECT ... LIMIT 10 and inspect edge cases |
| Ignoring NULL and duplicate behavior | Aggregations and filters return misleading results | Test with NULL, duplicates, and empty sets explicitly |
| Using advanced syntax before checking schema | Queries fail due to missing columns/indexes | Verify structure with DESCRIBE table_name; and adapt query design |
Quick Reference
CREATE TABLE sample_performance_implication (id INT PRIMARY KEY, value PERFORMANCE_IMPLICATION);
DESCRIBE sample_table;
SHOW CREATE TABLE sample_table;
What's Next
- Previous: Storage Consideration - Review the previous lesson to reinforce context.
- Module Overview - Return to this module index and choose another related lesson.