SQL Server BIT vs TINYINT for booleans

max105 · Jan 12, 2010

I've been told that it's better (or more efficient?) to use TINYINT data types instead of BIT data types for boolean columns in SQL Server. Can anyone verify this and explain to me why that is?

PTNL · Jan 12, 2010

This question sounds like it is the tip of an iceberg for a bigger question. Exactly what efficiencies or benefits were you given for using TINYINT versus BIT?

calebb · Jan 12, 2010

If your choice is a single BIT vs a single TINYINT column, then it makes no difference - the minimum unit of storage is a byte. But if you have 3 true/false conditions (for example), you can have 3 BIT fields which will still store 1 byte per record. (3 TINYINT columns would use 3 bytes per record)

And, of course, less space per record is advantageous because then you can fit more records per Page, lower disk i/o, etc. For small scale projects, you'll likely never notice a difference. On a larger scale, 1 or 2 bytes per record can have a much larger impact.

max105 · Jan 12, 2010

PTNL said:
This question sounds like it is the tip of an iceberg for a bigger question. Exactly what efficiencies or benefits were you given for using TINYINT versus BIT?

I wasn't really given any real reasons other than that someone said it's best practice to do so.

max105 · Jan 12, 2010

calebb said:
If your choice is a single BIT vs a single TINYINT column, then it makes no difference - the minimum unit of storage is a byte. But if you have 3 true/false conditions (for example), you can have 3 BIT fields which will still store 1 byte per record. (3 TINYINT columns would use 3 bytes per record)

And, of course, less space per record is advantageous because then you can fit more records per Page, lower disk i/o, etc. For small scale projects, you'll likely never notice a difference. On a larger scale, 1 or 2 bytes per record can have a much larger impact.

I have the same argument too. Space-wise it seems more efficient to be using bits instead of tinyints. Sure, it probably won't matter whether you're using bit vs tinyint for a small table. But on that point, why WOULDN'T you use a bit to be on the safer side? Is there some weird inefficiency with the way SQL Server handles bits?

PTNL · Jan 12, 2010

max105 said:
I wasn't really given any real reasons other than that someone said it's best practice to do so.

What exact "best practice"? What supporting documentation or business need motivated this statement?

If a column is a true/false or yes/no scenario, then BIT would likely be appropriate; if more than two values are possible, then TINYINT could be appropriate. However... it's important to consider the business needs in addition to query and database schema optimizations; so treat that first sentence as a guidelines and not a blanket statement rule. Also consider that BIT is more self-documenting than TINYINT. And as a general rule, I try to avoid allowing NULLs for BIT columns.

Some other technical thoughts...
BIT fields can be used for filtering and ordering, but cannot be used for math calculations and aggregations. An example would be that "WHERE MyBitField=1" and "ORDER BY MyBitField" would work, but "MyBitField + 1" and "GROUP BY MyBitField" would not work (unless you wrapped MyBitField in a CAST statement to another numeric type, but that's getting a little off-topic). Plus, there is the space saving consideration with BIT over TINYINT.

max105 · Jan 12, 2010

PTNL said:
What exact "best practice"? What supporting documentation or business need motivated this statement?

I've asked myself this same question and Google'd around trying to find anything. I'm as skeptical as you are of this so-called "best practice". I don't feel like I ever get straight answers when I ask about it, and the other person is more senior than I at our office. Before I go getting myself into a debate over the use of BIT vs TINYINT, I thought I'd ask around to see if anybody knows/read/heard anything remotely related to this topic.

If a column is a true/false or yes/no scenario, then BIT would likely be appropriate; if more than two values are possible, then TINYINT could be appropriate. However... it's important to consider the business needs in addition to query and database schema optimizations; so treat that first sentence as a guidelines and not a blanket statement rule. Also consider that BIT is more self-documenting than TINYINT. And as a general rule, I try to avoid allowing NULLs for BIT columns.

Regarding the business needs, the column is only meant to hold TRUE/FALSE or YES/NO values so I don't see any reason why shouldn't be using BIT data types.

Some other technical thoughts...
BIT fields can be used for filtering and ordering, but cannot be used for math calculations and aggregations. An example would be that "WHERE MyBitField=1" and "ORDER BY MyBitField" would work, but "MyBitField + 1" and "GROUP BY MyBitField" would not work (unless you wrapped MyBitField in a CAST statement to another numeric type, but that's getting a little off-topic). Plus, there is the space saving consideration with BIT over TINYINT.

Maybe there's something to this last part here that the other person isn't able to explain to me. We won't need to do any mathemtical computations on the field so that's a moot point for us, but we do group on BIT fields. I just tried running an aggregate query based on the BIT field and it seems to work fine. However, is grouping on a BIT field maybe inefficient because it needs to auto-cast the data type into an INT/TINYINT first?

Sorry if you feel I'm asking dumb questions, but I'm just trying to find ANY reason/argument for using TINYINT instead of BIT. If nobody can come up with an answer, I feel I can build a stronger case for using the BIT data types.

keenan · Jan 12, 2010

If the implementation provides a more specific data type for the type of data you're trying to represent (e.g. BOOLEAN or BIT vs. INTEGER), and it's not severely broken performance-wise, I don't think there's much argument for using it as the default choice for that kind of data. Perhaps in a particular case it could be useful if performance is shown to be better in that situation, but really unless the implementation is broken, performance should be equivalent (at least in this case). Using the more specific type is both more self-documenting and enforces more implicit constraints (and adding explicit constraints should easily negate any performance advantage of using a less-specific type) in addition to usually being more space efficient and depending on the types in question, possibly offering additional functionality (for example PostgreSQL's support for IP addresses).

The one argument I think holds water would be in favour of using ANSI SQL data types wherever possible to make the application easier to port to another RDBMS, but in this case TINYINT is not ANSI SQL (which does provide for BIT arrays, but not boolean types like this one).

PTNL · Jan 12, 2010

max105 said:
Regarding the business needs, the column is only meant to hold TRUE/FALSE or YES/NO values so I don't see any reason why shouldn't be using BIT data types.

max105 said:
Sorry if you feel I'm asking dumb questions, but I'm just trying to find ANY reason/argument for using TINYINT instead of BIT. If nobody can come up with an answer, I feel I can build a stronger case for using the BIT data types.

You seem to have a grasp on when to use BIT or boolean types, as well as a numeric type. So if there are some column(s) that are still up for debate, then let us know what column(s) you are debating about, how it ties to the application, and why you're still on the fence (or which way you are leaning). We may be able to provide more business-geared answers instead of simple technical differences and nuances.

mikeblas · Jan 13, 2010

max105 said:
I've asked myself this same question and Google'd around trying to find anything.

You should have asked the question of the person who offered you this advice.

BIT is preferred over TINYINT because BIT packs. If you use one TINYINT per flag, you're wasting space (and therefore I/O) on the row because you could have used BIT, packing the BITs into a single byte of storage.

If you manually pack flags into an integer type like TINYINT, you can achieve the same result. You have to think through your indexing scheme a little more carefully, though, if you want an index that covers the flag in question.

The person who made the recommendation to you is the one who would be best able to explain that recommendation, though.

max105 · Jan 13, 2010

Regarding the business needs for the column types, I can confidently say that we are only using these columns in a binary/boolean fashion. In fact, I'm the one writing the applications that consume these fields.

However, every time I submit my new table schemas for deployment, my BIT column types are all converted over to TINYINT column types. These people are not in any better shoes to be able to better define the business needs than me so their reasons are purely technical regarding BIT vs TINYINT.

As I mentioned before, I've raised this issue before but haven't able to get a straight answer. That's why I came here asking/poking around to see if anybody else could explain why TINYINT is more advantageous than BIT data types from a purely technical point of view. It seems like everybody here is making the same argument that I have for using BITs (i.e. can pack and save space, self documenting/enforcing, etc.) over TINYINT. I feel more confident with this information from the community before bringing up this debate topic again with the other people.

mikeblas · Jan 13, 2010

max105 said:
These people are not in any better shoes to be able to better define the business needs than me so their reasons are purely technical regarding BIT vs TINYINT.

And what reason do they give for their changes? Indeed, you said you've asked in the past, but you didn't think you were getting straight answers. Why not? What answers did you get, and what, in particular, bothered you about them?

SQL Server BIT vs TINYINT for booleans

max105

n00b

PTNL

Supreme [H]ardness

calebb

[H]F Junkie

max105

n00b

max105

n00b

PTNL

Supreme [H]ardness

max105

n00b

keenan

2[H]4U

PTNL

Supreme [H]ardness

mikeblas

[H]ard|DCer of the Month - May 2006

max105

n00b

mikeblas

[H]ard|DCer of the Month - May 2006