How to replace magic numbers with named constants in C++?

HOCP4ME

2[H]4U
Joined
Jul 1, 2005
Messages
2,959
Hello everyone. We all know we're never supposed to use "magic numbers" in our code and should instead replace them with named constants to make the code more readable and future changes much easier. The problem is, I can't think of a satisfactory way to do this in C++. The choices I've thought of so far include:

-Use preprocessor #defines, which offer the best readability but have no knowledge of scope and completely violate OO programming principles.

-Use const data inside the class, but this prevents compiler optimizations and unecessarily stores a simple number in memory multiple times.

-Use static const data inside the class with a "get()" function, but this ruins readability (i.e. "if (myObject.getValue() == myClass::getMax())") and seems to me like it would hamper compiler optimizations.

-Use static const data inside the class, but make it public, which improves readability and optimization but violates the "no public data" rule.

None of these options seem to fit the principles of OO programming. Which would you choose? Is there another option that I'm missing here?

Thanks for any assistance.
 
Note, I am not a programmer; I am a hardware engineer who does programming from time to time. There are much more experienced people on these boards than I.

What I do:

If they are especially magical I typically use #define; especially if they are not likely to be changed (for example Boltzmann, or Euler's constant). I'll make a .h file with my constants and then include it in whatever requires them.

I'm sure this is non-ideal and violates OO principles, but my projects are not particularly complex, nor would I be programming in C++ if I was fanatically concerned with adhering to OO principles.

On another note, do you believe that looking up the values via a function is going to slow down your program in any noticeable way? Is this going to cause you to violate your performance requirements? If you're concerned about performance you can run your code through a profiler and see where your bottlenecks are. You're probably already blocking all sorts of other compiler optimizations just based on your coding style...
 
If you are concerned about scope of defines, you can use namespaces.

At work, we use defines or enum lists throughout our products.
 
if you are concerned about adhering to OO, use a static class member. what you posted is absolutely no less readable, and if it is a constant, make it public. The point of information hiding isn't to make programs brain dead and boring. The point is to prevent programmers from modifying something they shouldn't be modifying, or becoming unnecessarily concerned with details of implementation. if you declare it as const public, you have achieved the same result. just be sure to be consistent about your choice.
 
Figure out the needed scope of the constant. If it's not necessarily restricted to working with a single class why not make it global? (Although putting it in a namespace might not be a bad idea)

If it's instead to be used when working with a specific class, why not make it a public static const member of that class? Before you retort with because data should never be public, you might revist *why* it's a good idea to make data non-public... and ask yourself if those reasons really apply to constants _intended_ for outside consumption. Or to say it another way... read what nameless_centurian said. :)
 
enums are generally the way to go. enums can be scoped to a function for "local" use, namespace/file scope for slightly larger use. If you want to use magic numbers in OO, use getMagicNumber() or something along those lines (eg, Math.getPi()). Although, if you have to do that, you should probably re-think your design -- do you *really* need to expose "Pi", or can the calculations that use it be wrapped into the "Math" class? I don't recommend sharing enums across multiple files unless you absolutely have to.
 
I don't recommend sharing enums across multiple files unless you absolutely have to.
Why not? You give a bunch of advice, but provide absolutely no rationale for any of it, so I'm burning with curiosity.

Ironically, you say that "enums are the way to go", but you use pi as an example. How do you propose using an enum to define a float or double constant?
 
Why not? You give a bunch of advice, but provide absolutely no rationale for any of it, so I'm burning with curiosity.

Ironically, you say that "enums are the way to go", but you use pi as an example. How do you propose using an enum to define a float or double constant?

When do you *have* to share constants across multiple files? The only time that I have encountered this is when you are defining array sizes for pre-STL APIs in C, and also true constant values that will never ever change (Pi, C, avogadro's number, etc). Pi was a bad example to use in my previous post, my bad.

Everything else (well, almost) that I have stumbled upon is a result of laziness (or lack of time/resources) in terms of writing proper code or making a solid enough design. Whenever the code refers to some constant, that code becomes coupled with that constant -- which is good if it is in the same header file, and IMHO gets worse and worse as the API get further away from the header. Unless there is a reason for this coupling (radius and PI are coupled by definition, so that is okay), I'd avoid it if possible.

The best way to describe it is with contexts... some values are context-free (science / math / etc), some have minimal context (red is "#FF0000" in HTML, 000000FF in AABBGGRR, etc), and others have meaning only in a given context (array size only has meaning in the context of the API that takes it as a parameter). If this doesn't make sense, remove all context from the value and slowly add context to it until you get its meaning. 3.141519 by itself is just another random number. Add the context of "PI" to it and its meaningful. What about 6.626068 × 10-34? Calling it "h" isn't enough, its not obvious what it means. Add "planck's constant" in there somewhere and its better. Put it into chemisty.h and you're set.

These are examples of "good" globals, here's an examples of "bad" globals:

The radius of class MyCircle is hardcoded to "5". You can either a) have a global enum of MyCircle_radius = 5 in MyCircle.h, or b) have a MyCircle.getRadius() function. Both are valid from a programming perspective, but I prefer MyCircle.getRadius() as the context of MyCircle is immediately applied to "radius", conveying the appropriate meaning. If you think that globally scoping the enum is "ok", consider the case of everyone on [H] defining their own circle with radius joebob_radius. In order to figure out what the radius of the circle is, you'll need a giant switch statement checking for the type of circle it is and then pulling up the appropriate enum. BAD. This doesn't scale well, especially when the distance between the value and the context increases.


Regarding non-integer constants, const float/double/std::string/etc are better than #defines as the type of the value (aka context :) ) gets carried along.


Hope that satisfies your burning curiosity :)
 
When do you *have* to share constants across multiple files?
Whenever you need the same value in two different files. In medium or large projects, that's all the time.

the only time that I have encountered this is when you are defining array sizes for pre-STL APIs in C, and also true constant values that will never ever change
When you start writing code in C++ that requires efficiency, I think you'll notice this case a lot more. You'll want to preallocate buffers to their maximum dynamic size, or to some sensible default size, to avoid reallocations, for example.

You can either a) have a global enum of MyCircle_radius = 5 in MyCircle.h, or b) have a MyCircle.getRadius() function.
Why have you limited me to only those two choices? There's at least a couple of other ways to skin this cat.

Regarding non-integer constants, const float/double/std::string/etc are better than #defines as the type of the value (aka context :) ) gets carried along.
const std::string is pretty retarded. A #define can carry as much type information as a const variable. But you said enums were the way to go, not const variables. And enums carry no type information, as I've pointed out--they're ints, and that's that. If you stress context so much, why do you also recommend enums, which provide none of (one of your definitions of) context?

Hope that satisfies your burning curiosity
Not really. Why is it that you recommend not sharing enums across files? Files are pretty arbitrary in C++, so this seems like especially curious advice. Does your advice change if it's const variables that are in question, instead of enums? Or the constant accessor function you're recommending? Or a preprocessor macro?

The advantages you cite for enums are also applicable to const variables. And #defines can do most of 'em, too. So then isn't your promotion of enums really an issue primary of aesthetic style or personal preference?
 
Whenever you need the same value in two different files. In medium or large projects, that's all the time.

But in which cases (specifically) do you need to access the value directly? Are you sure that its really necessary to share the direct value between files? or can you abstract that logic away into a class/function/etc so the value is never directly accessible?

When you start writing code in C++ that requires efficiency, I think you'll notice this case a lot more. You'll want to preallocate buffers to their maximum dynamic size, or to some sensible default size, to avoid reallocations, for example.

I can't see how this is applicable. Pre-allocating buffers is a valid point, but isn't it better to allocate the buffer IN the routine that will be populating it (as opposed to creating a container + allocating space, then passing it (by reference, of course) into to that routine to populate)? I don't think we're on the same page here, can you give an example?

Why have you limited me to only those two choices? There's at least a couple of other ways to skin this cat.

There are plenty of ways to solve any problem, I chose the two opposites that happen to fit with the point I wanted to make. Going through all possible variations is overkill, I thought it would be pretty obvious that there are options c) d) e) f) g) etc.


const std::string is pretty retarded. A #define can carry as much type information as a const variable. But you said enums were the way to go, not const variables. And enums carry no type information, as I've pointed out--they're ints, and that's that. If you stress context so much, why do you also recommend enums, which provide none of (one of your definitions of) context?

std::string's are great, unless you're writing extremely streamlined and optimized code, you're not going to notice the speed difference between std::string and char * and #define. what's so bad about them? other than the overhead, whats not to love about getting rid of pointers?

#define's do not carry any type information at all. its a simple find-replace by the pre-processor. for example:

#define RED "#FF0000"
#define RED 0xFF0000

vs

const std::string RED("#FF0000");
const int RED = 0xFF0000;

How can you possibly say that #define's can carry as much type information as a const variable?

I prefer: enums where possible, followed by const variables, and #defines if you truly have no other choice (which VEEEEEEEERY rarely is the case).


Not really. Why is it that you recommend not sharing enums across files? Files are pretty arbitrary in C++, so this seems like especially curious advice. Does your advice change if it's const variables that are in question, instead of enums? Or the constant accessor function you're recommending? Or a preprocessor macro?

Because its a shortcut that encourages bad design and messy code. Variables in general should have as small of a scope as possible. It's the same thing as having public data members. It saves you a lot of code and lets you manipulate things directly, which, in certain cases is the right solution. But generally, you'll get giant spaghetti code thats not maintainable after enough people have worked on it.

Files shouldn't be "pretty arbitrary" in C++. Each file should have a defined purpose. If you need to get at some global value from another file, are you sure that everything is properly abstracted away?

The advantages you cite for enums are also applicable to const variables. And #defines can do most of 'em, too. So then isn't your promotion of enums really an issue primary of aesthetic style or personal preference?

there are some things you can do with enums that you cant do as cleanly with const variables:

enum
{
TBLEMPLOYEE_COLUMN_INDEX_EMPID= 0,
TBLEMPLOYEE_COLUMN_INDEX_NAME = 1,
TBLEMPLOYEE_COLUMN_INDEX_ADDRESS = 2,
TBLEMPLOYEE_COLUMN_INDEX_PHONE = 3,
TBLEMPLOYEE_COLUMN_MAX = 4
};

How much information does this enum convey? It implies that there is a 4 column table "TBLEMPLOYEE", with columns "EMPID", "NAME", "ADDRESS", "PHONE", in that order. Can all this information be expressed as clearly and succinctly with const variables (including all the subtle implications of an enum vs const var :))? If so, please illustrate.

The reason I brought up enums is that the original question was regarding "magic numbers". In my experience, magic numbers tend to be integers. For example, some index into a DB query, or some state or some flag or something else. enums tend to be the best solution in those cases, but they aren't a blanket solution for everything. That's why I recommend them, and if they don't fit.. they don't fit. That's perfectly reasonable, it's up to the programmer to make the correct decision as to how to write their code.

I don't like #defines. I think they create more issues than they solve, especially when there are better solutions available. They are last resorts if there are no other options remaining.
 
Few thoughts...

An enum is not the way to declare a constant. An enum should be used to declare a type with a restricted set of abstract values. The distinction here can be a really fine line; however, as a general rule if you're never creating an instantiation of an enum you're misusing it.

Enums created for the appropriate purpose should most definately be used across multiple files; the same way that you'll reference your Circle class in multiple files. It's a type that you've created to be used elsewhere.

consts are preferable over #defines as they allow for more type-checking at compile time and they force you to think about the size of the constant. This is a good thing.

#defined constants let you do stuff that consts don't... however you're usually getting pretty hacky if you're using their ability to do so. (Personally I'm thinking of a particular hack I did with derived classes in a toolset that didn't allow templates...)

there are some things you can do with enums that you cant do as cleanly with const variables:

enum
{
TBLEMPLOYEE_COLUMN_INDEX_EMPID= 0,
TBLEMPLOYEE_COLUMN_INDEX_NAME = 1,
TBLEMPLOYEE_COLUMN_INDEX_ADDRESS = 2,
TBLEMPLOYEE_COLUMN_INDEX_PHONE = 3,
TBLEMPLOYEE_COLUMN_MAX = 4
};

I dunno... I kinda think that:

const int TBLEMPLOYEE_COLUMN_INDEX_EMPID= 0;
const int TBLEMPLOYEE_COLUMN_INDEX_NAME = 1;
const int TBLEMPLOYEE_COLUMN_INDEX_ADDRESS = 2;
const int TBLEMPLOYEE_COLUMN_INDEX_PHONE = 3;
const int TBLEMPLOYEE_COLUMN_MAX = 4;

Serves exactly the same purpose. Especially as your enum was unnamed I didn't even need to put them in a namespace.

As for allocation of memory... get involved with hard realtime systems. From talking with my coworkers it's apparently quite common for your applications to have thier own "memory manager", that will at startup allocate memory of various sized chunks (matching the expected needs of the program) to be available for the use of the application on an as-needed basis. Then you'll design so that instead of your class allocating/deallocating memory itself it will request/release a pre-allocated block from your applications "memory manager". Personally I haven't had to deal with such a situation yet, however I expect it'll happen before too much longer.
 
But in which cases (specifically) do you need to access the value directly? Are you sure that its really necessary to share the direct value between files? or can you abstract that logic away into a class/function/etc so the value is never directly accessible?
The value has to be accessible, otherwise it's not useful. I'm not sure what you're asking here.

Pre-allocating buffers is a valid point, but isn't it better to allocate the buffer IN the routine that will be populating it (as opposed to creating a container + allocating space, then passing it (by reference, of course) into to that routine to populate)? I don't think we're on the same page here, can you give an example?
Say you have a hash table class, and you've got a pretty good one. It can decide how many buckets to allocate, and even which hash function to use. It really helps this function to know how much data to expect so that it can make the best choices for bucket count and hash function; so that it doesn't need to reallocate often, and so that it doesn't need to rehash as it is being populated. You can pass a value into the constructor so that it knows what to preallocate:

Code:
CHashTable< CUserRecord > tableUsers;				// some arbitrary, probably small, default size
CHahsTable< CUserRecord > tableUsers2( k_cMaxUsersPerTeam );	// the size I know I'll need as a maxmimum
CHashTable< CUserRecord > tableUsers3( k_cMedianUsersPerTeam );	// the size I know I'll need most often, to compromise time and memory

You don't understand my example, I expect, because you've assumed the caller will allocate the buffer used by the collection. It doesn't, since the better design is to provide an allocator interface (perhaps as a template) to the collection rather than point it at a buffer.

I think it's pretty easy to imagine that the constants I've used above appear in lots of different code throughout the project, and combining that code into the same file simply because it uses this one constant seems rather silly.

There are plenty of ways to solve any problem, I chose the two opposites that happen to fit with the point I wanted to make. Going through all possible variations is overkill, I thought it would be pretty obvious that there are options c) d) e) f) g) etc.
Indeed, it's obvious that there are other options -- which is why I asked why you were considering only two. You explicitly limited the reader to only two rather than say "of the available options". Since you also precluded the suggestions I had been making, it means you're not entertaining them in your scenario. I'm trying to figure out why you've discarded them out of hand, then, instead of trying to reason them out for yourself.

std::string's are great, unless you're writing extremely streamlined and optimized code, you're not going to notice the speed difference between std::string and char * and #define. what's so bad about them? other than the overhead, whats not to love about getting rid of pointers?
While std::string has some advantages, the class also has many problems that preclude its use in lots of different situations. Here, we're specifically talking about constants, which means that you'd want to code a constant std::string. You'd end up coding this in some CPP file:

Code:
const std::string g_strName( "Some name here" );

and then providing a declaration in a header:

Code:
extern const std::string g_strName;

We haven't really "gotten rid of pointers"; a pointer to g_strName is availble, and you can call the awkward g_strName::c_str() to get a pointer to the string that lives inside. We've created a few subtle problems for ourselves, though.

First, we've got a global object. (And even if we scoped this to a class, we'd end up with a static in the class). Global objects are pretty bad for a few different reasons. This declaration allocates memory, which means that we need to make sure whatever memory subsystem we use is initialized and avaialble very early in the application's load time. The allocation might cause the object to throw an exception during load, which is impossible to catch and handle properly, and can be a serious concern depending on what requirements for reliability are on the application.

Depending on the platform being used, global objects might construct at just the wrong time. Such as in Windows: if this code is in a DLL, the constructor is going to run in response to DllMain(). And that can cause some dicey side-effects.

It also has a strong influence on load time, again depending on the platform. In Windows, executables are divided into pages. Pages are loaded on demand. The static initializer list the runtime walks in order to fire coded constructors ends up paging in random code here and there, all over the place. These page faults take lots of time, and end up meaning blocking I/O for the process, and so on.

Developers often make the mistake of putting the first, initializing declaration into a header. Sure, this is incorrect, but people do it all the time. It ends up causing a subtle problem: since const data is implicitly static, each module that includes that header gets its own private copy of the data. That means, if you're not careful, you can have hundreds of instances of this object floating around in your executable.

We've also lost some information. I have a string. What's it's length? If I need that (and I probably do!), I have to call g_strName.length(), which walks the string and finds its length. Since the string is constant and known at compile-time, why is its length not constant and known at compile time?

std::string implements dynamically-allocated strings. If you have lots of string constants, then you've got lots of small little memory allocations lying around all over the place. This isn't great for fragmentation, and isn't exactly a benefit when trying to diagnose problems.

std::string implements mutable strings. The majority of the members of std::string want to change the value of the string, but now they don't work because you're trying to use an immutable string. So, what real value have I gained after paying for all these penalties?

How can you possibly say that #define's can carry as much type information as a const variable?
How can you possibly say that they don't? The first define is a string literal. It has the type (const char *). The second one is an integer; depending on your compiler, it might be an (unsigned int), though it isn't useful as an l-value. (Then again, neither are your const declarations). In C++ and C alike, literals have type.

I prefer: enums where possible, followed by const variables, and #defines if you truly have no other choice (which VEEEEEEEERY rarely is the case).
This is a little more sensible than your previous advice, though I'm still not sure I follow: where do constant value accessor functions fit in your stack rank?

Because its a shortcut that encourages bad design and messy code. Variables in general should have as small of a scope as possible. It's the same thing as having public data members. It saves you a lot of code and lets you manipulate things directly, which, in certain cases is the right solution.
If I expose an enum in a header file to multiple compilands, what is it exactly that I can now manipulate directly that I shouldn't be able to?

Files shouldn't be "pretty arbitrary" in C++. Each file should have a defined purpose. If you need to get at some global value from another file, are you sure that everything is properly abstracted away?
If the value is global, by definition, I have access to it from another file. If it wasn't intended to be global, it would have been declared static.

How much information does this enum convey?
It directly conveys a lot less information than that. What you've asserted that it conveys is either something you've accepted as your own guess, or something that you already know, since you wrote the declaration. The enum certainly doesn't enforce any of the things that you claim it conveys. For example, the odrer in which the columns are stored in the database is completely up to the database, not this enum. The enum you provide also manages to convey some confusion: why is MAX declared 4, when 4 is never used? Shouldn't that be 3? Or should it be named COLUMN_NEXT, or COLUMN_LAST, instead of COLUMN_MAX ?

Point is, "bad code" isn't a result of particular low-level style choices like this. It's the result of failing to comment. It's the result of choosing poor algorithms, or implementing them poorly.

enums tend to be the best solution in those cases, but they aren't a blanket solution for everything.
I'm glad you've realized the problem of your initial advice, and have come up with something more sensible.
 
mikeblas:

About defines, if you use them, then anything that includes that header through a long chain inclusion will see the define, but if you use class variables or an enum, the scope problem becomes a bit more manageable. If you use an enum, even though you can still see it in a header chain, enums can be qualified in such a way that they don't take up additional names that you might use if say, you were to use a simple define.
 
If you use an enum, even though you can still see it in a header chain, enums can be qualified in such a way that they don't take up additional names that you might use if say, you were to use a simple define.
Right; since the preprocessor is unrestricted in scope, that's the way it works. You can use #undef to have a bit of control, or simply not put the #define in a commonly used header and gain scoping that way. It's not unmanageable, but it's certainly not as controllable as other constructs.

On the other hand, enums have the problem of being integer types, only; and not necessarily very controllable integer types, at that. You might shy away from enums just for that reason, if data size is a concern. In Visual Studio, they have the advantage of being symbolic in the debugger -- so evaluating some enum type in a watch expression shows you the symbolic name instead of the value.

There's lots of tradeoffs to consider; some of them with practical effect, and some of them more aesthetic or convenient.

Hey, some of us appreciate your posts.
I know I always learn something new from reading them.
Thank you for your kind words.
Although I still think you're wrong on #defines vs consts :p
In what regard? Scoping, or typing? What's your opinion?
 
Right; since the preprocessor is unrestricted in scope, that's the way it works. You can use #undef to have a bit of control, or simply not put the #define in a commonly used header and gain scoping that way. It's not unmanageable, but it's certainly not as controllable as other constructs.

On the other hand, enums have the problem of being integer types, only; and not necessarily very controllable integer types, at that. You might shy away from enums just for that reason, if data size is a concern. In Visual Studio, they have the advantage of being symbolic in the debugger -- so evaluating some enum type in a watch expression shows you the symbolic name instead of the value.

There's lots of tradeoffs to consider; some of them with practical effect, and some of them more aesthetic or convenient.

I think they need dual-purpose enumerable types (int, some kind of const char array) since there are so many places where I could use such functionality, but would want to avoid creating a class and defining mapping functions for it (which really takes away the simplicity and elegance of the code).
 
Back
Top