Data Types
Choosing the correct data type can lead to a better performing database, for example comparing numeric types takes less time than comparing character strings types because character string types have character set and collation considerations, also the smaller the data the faster it will be processed and less I/O is used making queries perform even better.
MySQL includes many of the ISO SQL 2003 standard data types and adds more data types of its own, the data types can be categorized by the following
I am only going to give a quick discussion on each as it is best to refer to the latest MySQL documentation or to a specific version that you have, I will try and give examples but I do not have examples for every possibility.
Before we begin a quick review on how to create a table in MySQL, which can come in a number of flavors
Standard table | create table table_name ( ... ); |
Temporary table | create temporary table_name ( ... ); |
Memory table | create table table_name ( ... ) ENGINE = MEMORY; |
You can also display how the table was created
Display table construction | show create table table_name; |
Just some things that you should be aware of regarding tables
Lastly on tables is the contraints which I discuss in my indexes section.
We start with character types, I have already discussed data types in my Oracle data types section if you wish to compare.
String Type | Character string type | Size |
Example |
Fixed-width strings | character(length) | create table string_type ( fixwidth1 CHARACTER(10) NOT NULL DEFAULT '', fixwidth2 CHAR(10) NOT NULL DEFAULT'', fixwidth3 CHAR(10) CHARACTER SET utf16 COLLATE utf16_general_ci, varilength1 CHARACTER VARYING(10) NOT NULL DEFAULT '', varilength2 CHAR VARYING(10) NOT NULL DEFAULT '', varilength3 VARCHAR(10) NUL NULL DEFAULT '' ); |
|
char(length) | 0-255 |
||
variable-length strings | character varying(length) | ||
char varying(length) | |||
varchar(length) | 0-65,535 |
||
tinytext | 0-255 |
||
text | 0-64Kb |
||
meduimtext | 0-16Mb |
||
longtext | 0-4Gb |
||
Character objects | character largeobject | ||
char large object | |||
CLOB |
Remember that the length of the string does not mean the size of the string as this depends on the character set used, for example latin1 character set uses one byte to store each character whereas the utf8 character set uses up to four bytes per character, double check the MySQL version that you are using as the number of bytes may vary and could have an impact on the size of your database. The system parameter pad_char_to_full_length changes the behavior of retrieving fixed-width strings from a table, if set the string is retrieved is the full length of the field.
find the length of a string and checking the padding | select concat('/',fixwidth1,'/'), char_length(fixwidth1) from string_type; |
National character sets are pretty much the same as above except that the character set uses the utf8
String Type | Character string type | Size |
Example |
Fixed-width strings | national character(length) | create table string_type ( fixwidth1 NATIONAL CHARACTER(10) NOT NULL DEFAULT '', fixwidth2 NCHAR(10) NOT NULL DEFAULT'', varilength1 NCHAR(10) NOT NULL DEFAULT '', ); |
|
national char(length) | |||
nchar(length) | 0-255 |
||
variable-length strings | national character varying(length) | ||
national char varying(length) | |||
nchar(length) | 0-65,535 |
||
Character objects | national character largeobject | ||
nchar large object | |||
NCLOB |
Binary large objects is the least restrictive data type, they are byte strings and are ordered by each bytes value.
Binary type | Size |
Example |
binary | 0-255 | create table string_type ( bin1 BINARY(100) NOT NULL DEFAULT '', blob1 BLOB(100) NOT NULL DEFAULT'', ); |
varbinary | 0-65,532 | |
tinyblob | 0-255 | |
blob | 0-65,532 | |
meduimblob | 0-16Mb | |
longblob | 0-4Gb |
Numeric types is probably the most vast, as there are many different types, these types all affect data precision and depending on the application type this category is the one that you will send more time on, I will categorize them in exact number type and approximate numeric type
Numeric Type | Type | signed range | unsigned range | Size (bytes) |
Example |
Exact | numeric(g,f) | maximum limits depend on the hardware and O/S max for g = 65, max for f = 30 |
depends |
create table num_test ( bt1 BIT(10), fl1 FLOAT(10), dl DECIMAL(5,4), uty1 TINYINT(10) unsigned NOT NULL AUTO_INCREMENT ); |
|
decimal(g,f) | maximum limits depend on the hardware and O/S max for g = 65, max for f = 30 |
depends |
|||
smallint | -32,768 to 32,767 | 0 to 65,535 | 2 |
||
integer | -2,147,483,648 to |
0 to 4,294,967,295 | 4 |
||
bigint | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 0 to 18,446,744,073,709,551,615 | 8 |
||
tinyint | -128 to 127 | 0 to 255 | 1 |
||
meduimint | -8,388,608 to 8,388,607 | 0 to 16,777,215 | 3 |
||
bit(x) | 1 to 64 |
1-64 |
|||
serial | alias for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE KEY |
n/a |
|||
approximate | float(p) | maximum limits depend on the hardware and O/S max for p = 24 |
4 |
||
real | alias for DOUBLE or FLOAT |
n/a |
|||
double(g,f) | maximum limits depend on the hardware and O/S max for g = 53 max for f = 30 |
8 |
Boolean data types have two options the standard BOOLEAN and the MySQL version called BOOL which is an alias for TINYINT(1);
Datetime types have the following, you can specify any of the valid dates as below
you can also put microseconds in as well
Datetime type | Range | Size (bytes) |
Zero value |
date | '1000-01-01' to '9999-12-31' | 3 |
'0000-00-00' |
datetime | '1000-01-01 00:00:00' to '9999-12-31 23:59:59' | 8 |
'0000-00-00 00:00:00' |
timestamp | '1970-01-01 00:00:00' to '2038-01-18 22:14:07' | 4 |
'0000-00-00 00:00:00' |
time | '-838:59:59' to '838:59:59' | 3 |
'00:00:00' |
year(2) | 00 to 99 | 1 |
'00' |
year(4) | 1901 to 2155 | 1 |
'0000' |
You need to be aware of a number of date, time and numeric conversion issues, here I point you to the MySQL documentation.
Lastly we come to the interval data types, these strictly speaking are not data types that can be stored in a table, they are used as part of an expression in a time based calculation.
Interval expressions | # below would translate to '2012-04-08' # below would translate to 2017-07-08 |
||
Interval data types |
|||
Name | Format |
Example | |
microsecond | n |
INTERVAL 5 MICROSECOND | |
second | n |
INTERVAL 5 SECOND | |
minute | n |
INTERVAL 5 MINUTE | |
hour | n |
INTERVAL 5 HOUR | |
day | n |
INTERVAL 5 DAY | |
week | n |
INTERVAL 5 WEEK | |
month | n |
INTERVAL 5 MONTH | |
quarter | n |
INTERVAL 5 QUARTER | |
year | n |
INTERVAL 5 YEAR | |
second_microsecond | 'n.n' |
INTERVAL '5.4' SECOND_MICROSECOND | |
minute_microsecond | 'n.n' |
INTERVAL '5.4' MINUTE_MICROSECOND | |
minute_second | 'n:n' |
INTERVAL '5:4' MINUTE_SECOND | |
hour_microsecond | 'n.n' |
INTERVAL '5.4' HOUR_MICROSECOND | |
hour_second | 'n:n:n' |
INTERVAL '5:4:3' HOUR_SECOND | |
hour_minute | 'n:n' |
INTERVAL '5:4' HOUR_MINUTE | |
day_microsecond | 'n.n' |
INTERVAL '5.4' DAY_MICROSECOND | |
day_second | 'n n:n:n' |
INTERVAL '5 4:3:2' DAY_SECOND | |
day_minute | 'n n:n' |
INTERVAL '5 4:3' DAY_MINUTE | |
day_hour | 'n n' |
INTERVAL '5 4' DAY_HOUR | |
year_month | 'n-n' |
INTERVAL '5-4' YEAR_MONTH |
MySQL has two data types that are very different to the above data types, they are similar to foreign keys in that the values inserted must be one of the field values set
ENUM | this is a enumerated list of 1 to 65,535 strings which indicate the allowed values for the field, only one of the values can be stored in the list |
SET | this is a enumerated list of 1 to 64 strings which indicate the allowed values for the field, any combination of the strings in the enumerated list can be stored as a comma-delimited list. |
All trailing spaces will be removed from enum and set data types, the system parameter sys_mode can affect the data types, see below for more information on sql modes.
sql_mode system parameter
|
This parameter affects both the enum and set data types, if the parameter is set to any of the below then an error will occur if a duplicate entry during a field creation, otherwise you will get a warning
|
here are some examples on how to use these data types
ENUM example | Example One INSERT INTO Countries (name,continent) VALUES('Kenya','Africa'); select name,continent,continent+0 'Index position' from countries; Example Two insert into bike values ('0001', 'chopper', 'red', 'rack,light'); select id, model, color, options from bike; |
SET example | CREATE TABLE allergy (symptom SET('sneezing','runny nose','stuffy head','red eyes') );
INSERT INTO allergy (symptom) VALUES(''); select symptom,symptom+0 'Index position' from allergy; |
There are number of things that you should understand about these data types
Not many developers use enum and set mainly because they are hard to alter and do require table rebuilds if altered, most developers will use foreign keys but I reference them here so that you know that you have other options.
Finally the below is a quick table detailing what attributes a data type can use, also check the MySQL documentation of the version you are using.
Data Type | Attributes that can be used |
Character Strings | NOT NULL, NULL, DEFAULT and BINARY |
National Character Strings | NOT NULL, NULL, DEFAULT and BINARY |
Binary | NOT NULL, NULL and BINARY (only binary can use this) |
Numeric data | NOT NULL, NULL, DEFAULT, AUTO_INCREMENT, SIGNED, UNSIGNED, ZEROFILL, SERIAL DEFAULT VALUE |
DateTime | NOT NULL, NULL and DEFAULT |
Enum and Sets | NOT NULL, NULL and DEFAULT |
MySQL has a system parameter that allows you to change the default behavior of data validation and allowed SQL syntax, it can be set on a global level or on a session level, by default the sql_mode is set to '' (blank).
sql_mode | # You can set it at a global or sessions level
|
By default MySQL will allow you to enter invalid data and will change it to make it valid, this of course means that is may invalidate data integrity, lets see an example
invalid data entry | create table test_sql_mode ( |
As you can see in the example above, values that are too large to fit in a field are truncated to the closest value that does fit, similarly for string data if they are too large. So what you need MySQL is to throw an error instead of a warning this were the sql_mode comes into play.
You have to be careful when using NULL values in tables, something's to beware of are
Remember that NULL means no value has been determined it does not mean zero or an empty field, it means unknown.
Using the PROCEDURE ANALYSE statement you can obtain the table information regarding used space on each field, this will be displayed in the optimal_fieldtype field of the output, you may then want to reduce certain columns to optimize them, when analyzing tables make sure that you have a good sample size.
optimal data types | # The PROCEDURE ANALYSE takes two options
## Lets create a simple table using a high value for varchar thus we will be wasting space ## Now insert some data, the data will be well within the limits of the varchar insert into accounts values (11,'top shop inc 11'); insert into accounts values (21,'top shop ltd'); ## Lets insert some NULL's ## Lets insert some empty fields ## Now run the procedure analyse and see what we get, from the output we can see the NULL,s and the |
This tool can be useful if you are planning to migrate to a new system and want to reduce table column sizes so that not to waste space, if you already have a system you can always alter the table providing it is not to large or you have an enough time to alter the table. Be careful when using the procedure analyse function it does not always get it right, sometimes you are the best person to know what is right for your data optimization as you would know the data.
There are a number of tasks that you can perform on an tables and indexes which are available in MySQL
check table | check table tablename [option] ## You can use one of the following options
|
repair table | repair table tablename [option] ## You can use one of the following options
|
checksum table | checksum table tablename [option] ## You can use one of the following options
|
analyze table | analyze table tablename |
optimize table | optimize table tablename |