Beginner C++ Questions (references and const)

Bop

I've been learning Java for about four months now, and a few days ago I purchased some books on C++ (currently reading C++ Primer Plus). Initially a few things threw me off (like how C++ passes objects by value instead of passing references by value), but either I'm a slow learner or I'm simply not understanding this next concept correctly.

Please correct me if I am wrong on any of these statements. I have a headache and my brain is fried; I will probably say something stupid. :D

I understand that using const at the end of a member function, like void method() const, allows a const object to use that function. A method with a const parameter is prevented from modifying any data in that object (ex. void method(const Parameter1 & prm)).
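A minimal sketch of those two uses of const, for readers following along. The Counter class and the readOnly function are made-up names for illustration and are not from the book:

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate const member functions.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int get() const { return value_; }   // const member: callable on const objects
    void increment() { ++value_; }       // non-const member: needs a modifiable object
private:
    int value_;
};

// const reference parameter: the function can read c but not modify it.
int readOnly(const Counter & c) {
    // c.increment();   // would not compile: c is const here
    return c.get();
}

void demoConstMember() {
    Counter c(1);
    c.increment();
    assert(readOnly(c) == 2);

    const Counter frozen(10);
    assert(frozen.get() == 10);   // fine: get() is a const member
    // frozen.increment();        // would not compile: frozen is const
}
```

Calling demoConstMember() from main should run without tripping any assertion.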

My confusion begins when you use const and references.

I don't think I understand how returning references works.

I know that
int a = 5;
int & b = a; (b "points" to the value at the address of a)

is just about equivalent to:
int a = 5;
int * b = &a; (b points to the address of a)

but what exactly happens step by step when you return a reference? For example let's say the function is int & getIntRef(int & a){return a;}. I could use getIntRef(a) = 5 to change the value at that memory address.

I know what the end result is, but what is going on at a deeper level? Is there such a thing as "copying" references or am I thinking about it the wrong way?

When const is thrown into the mix I get even more confused. If I reuse the previous function and add const in front like so: const int & getIntRef(int & a){return a;} I know a few things. I can't type something like getIntRef(a) = 5. I can type int b = getIntRef(a). Does the const mean the reference is const (which seems redundant since you cannot reassign references) or the value it is pointing to become const? Wouldn't that mean I cannot modify the original "int a" that was passed anymore (even through other methods or directly)? Or I just can't modify the value through that specific returned reference?

I *think* I understand pointers more right now, what would be the equivalent if written with pointers?

Sorry for the wall of text or any stupid questions. :p
 
What normally happens is that, for a by-value (non-reference) parameter, the value is pushed on the stack before the jump to the function occurs. If you then change the value inside the function, the change does not come back to the caller. If you pass by reference, the address of the value is pushed on the stack before the jump, and when you change the value, the memory that address points to is changed, so the change is essentially passed back. This is the in-process case, and the order in which arguments are pushed (C style or Pascal style) depends on the calling convention. Out of process is more complicated and requires data marshaling for by-reference parameters. And runtime engines like Java and the .NET CLR handle memory very differently from a compiler that creates machine code from C++, because the runtime owns and manages the memory, whereas a C++ program owns and manages its own memory.

This I believe is bad code (you can never set the address of a variable, because it's the address and an address can never change; it's always the same. But you can change the address that a pointer points to, if that makes sense):
int & b = a; (b "points" to the value at the address of a)

You can never use the & (address of) on the left side of a operator.
int *b = a; is the correct way to code that, I believe, even though I don't think you would ever code this way.

And in your second example you would need **b to get to the value of a if *b points to the address of a.

Also, one last note: memory pointers are different for 16-bit and 32/64-bit, which gives you the capability to address more memory by the pointer size and at the page level.
 
In general, the idea of "const" exists only inside the compiler. When the compiled program is running, the specific memory addresses involved aren't labeled as const or protected in any way. const just allows the compiler to do analysis and validate certain assumptions.

You should think of passing a variable by reference as essentially the same thing as passing a pointer. Thus, the following functions are equivalent:
Code:
void myFunc1(int *a) {
    *a = 6;
}

void myFunc2(int &a) {
    a = 6;
}
The difference is only in the calling syntax. By contrast, if you call a function with a non-reference/non-pointer argument, a copy of the value is passed.
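To make the calling-syntax difference concrete, here is a small sketch that reuses myFunc1 and myFunc2 from the post above. The callBoth helper is a made-up name added for illustration:

```cpp
#include <cassert>

void myFunc1(int *a) {
    *a = 6;          // write through the pointer
}

void myFunc2(int &a) {
    a = 6;           // write through the reference; no dereference syntax
}

// Hypothetical helper showing both call styles side by side.
void callBoth() {
    int x = 1;
    int y = 1;
    myFunc1(&x);     // the caller takes the address explicitly
    myFunc2(y);      // the caller passes the variable itself
    assert(x == 6);
    assert(y == 6);
}
```

Both variables end up as 6; only the spelling at the call site differs.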
 
In general, the idea of "const" exists only inside the compiler. When the compiled program is running, the specific memory addresses involved aren't labeled as const or protected in any way. const just allows the compiler to do analysis and validate certain assumptions.

You should think of passing a variable by reference as essentially the same thing as passing a pointer. Thus, the following functions are equivalent:
Code:
void myFunc1(int *a) {
    *a = 6;
}

void myFunc2(int &a) {
    a = 6;
}
The difference is only in the calling syntax. By contrast, if you call a function with a non-reference/non-pointer argument, a copy of the value is passed.

Actually, I think that code is bad too. It's been years, but I used to program C/C++ for IBM, and that code is not equivalent. If you pass the address of a to a function you want to say *a = 6 in the second example but you really never have a & in the function signature if I remember correctly but I understand your point. The & is used on what you pass into the pointer through the function call you have defined in your first example.

Here is a good example from wiki(http://clc-wiki.net/wiki/C_language:Terms:Pass_by_reference):
#include <stdio.h>
void foo(int *x);

int main(void) {
    int i = 5;

    printf("In main(): %d\n", i);
    foo(&i);
    printf("In main(): %d\n", i);

    return 0;
}

void foo(int *x) {
    printf("In foo(): %d\n", *x);
    *x = 10;
    printf("In foo(): %d\n", *x);
}

Prints
In main(): 5
In foo(): 5
In foo(): 10
In main(): 10

Here is the code modified for pass by value:
#include <stdio.h>
void foo(int x);

int main(void) {
    int i = 5;

    printf("In main(): %d\n", i);
    foo(i);    /* passes a copy of i */
    printf("In main(): %d\n", i);

    return 0;
}

void foo(int x) {
    printf("In foo(): %d\n", x);
    x = 10;    /* changes only the local copy */
    printf("In foo(): %d\n", x);
}

Prints
In main(): 5
In foo(): 5
In foo(): 10
In main(): 5 (Here is the difference)
 
FYI constants in C/C++ are replaced in the code with the actual values during compilation.
 
but what exactly happens step by step when you return a reference? For example let's say the function is int & getIntRef(int & a){return a;}. I could use getIntRef(a) = 5 to change the value at that memory address.

I know what the end result is, but what is going on at a deeper level? Is there such a thing as "copying" references or am I thinking about it the wrong way?
What's going on at a deeper level is up to the compiler, really. It can decide how to implement the code any way it wants to, as long as that implementation is compatible with the semantics of the language. With that said, most compilers implement references in a similar way to the way they implement pointers.

The calling code you provide gets the address of the variable "a", and pokes a 5 into the data at that address. If we think about the complete implementation of your function, the compiler gets the address of "a" and passes it to the getIntRef() function. That function immediately returns that address. The address is used as the target for the assignment operator.
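Those steps can be sketched in code. getIntRef is the function from the question; getIntAddr is a made-up pointer version showing roughly what many compilers generate, which is an assumption about typical implementations and not something the language requires:

```cpp
#include <cassert>

// The function from the question: returns a reference to its argument.
int & getIntRef(int & a) { return a; }

// Hypothetical pointer spelling of the same thing: an address in, the same address out.
int * getIntAddr(int * a) { return a; }

void demoReturnRef() {
    int a = 1;

    getIntRef(a) = 5;              // assigns through the returned reference
    assert(a == 5);

    *getIntAddr(&a) = 7;           // same effect, spelled with pointers
    assert(a == 7);

    // The returned reference names the original object; no int was copied.
    assert(&getIntRef(a) == &a);
}
```

The last assertion is the closest thing to "copying a reference": what travels is the identity of the object, not a second int.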

When const is thrown into the mix I get even more confused. If I reuse the previous function and add const in front like so: const int & getIntRef(int & a){return a;} I know a few things. I can't type something like getIntRef(a) = 5. I can type int b = getIntRef(a). Does the const mean the reference is const (which seems redundant since you cannot reassign references) or the value it is pointing to become const?

The way you've used const means that the integer is constant, not the reference. A reference can never be reseated after it's initialized, so there is nothing for a const on the reference itself to add; standard C++ actually rejects int & const, though some compilers only warn about it. A trick I like to use is reading the declaration backwards -- right to left -- in order to get the constness right. If we consider const int & getIntRef(int & a) and read it backwards, we might say "getIntRef returns a reference to an int that's constant".

Let's think of simple variable declarations, as that's a little easier. References only come in two modes; pointers add the other two:

Code:
int fooey = 39;
int & refBing = fooey;
const int & refBaz = fooey;
int * const ptrBing = &fooey;
const int * const ptrZing = &fooey;

If we apply the "read backward" technique, we'll read these declarations: refBing is a reference to an integer. The integer can be changed through it. refBaz is a reference to an integer that's constant. The integer can't be changed through refBaz. ptrBing is a constant pointer to an integer. It can't be re-pointed because it's const, but the integer can be changed through it. ptrZing is a constant pointer to an integer that's constant. Neither the integer nor the pointer can be changed.

Wouldn't that mean I cannot modify the original "int a" that was passed anymore (even through other methods or directly)? Or I just can't modify the value through that specific returned reference?
Indeed, constness follows the reference. You could have a const reference and a non-const reference to the same thing. We did that above in the example.
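A short sketch of that point. getConstIntRef is a made-up name for the const-returning version of the function from the question:

```cpp
#include <cassert>

// Hypothetical name for the const-returning variant from the question.
const int & getConstIntRef(int & a) { return a; }

void demoConstView() {
    int a = 1;
    const int & view = getConstIntRef(a);
    // view = 2;          // would not compile: view is a reference to const int

    a = 2;                // the original variable is still an ordinary, modifiable int
    assert(view == 2);    // the const reference sees the change

    int & alias = a;      // a non-const reference to the same int
    alias = 3;
    assert(view == 3);
}
```

The const only restricts what can be done through view; it says nothing about a itself.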

I *think* I understand pointers more right now, what would be the equivalent if written with pointers?
Well, at this point, I'm confused. The equivalent to which declaration? If you mean this one: const int & getIntRef(int & a){return a;} then the answer is const int * getIntPtr(int* a){return a;}.
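Here is a minimal sketch of how that pointer version behaves at the call site. Only getIntPtr comes from the reply; demoConstPtr is a made-up name for illustration:

```cpp
#include <cassert>

const int * getIntPtr(int * a) { return a; }

void demoConstPtr() {
    int a = 1;
    const int * p = getIntPtr(&a);
    // *p = 5;                 // would not compile: p points to const int

    a = 5;                     // the int itself is still modifiable
    assert(*p == 5);

    int b = *getIntPtr(&a);    // copying the value out is fine, like int b = getIntRef(a)
    assert(b == 5);
}
```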

If you pass the address of a to a function you want to say *a = 6 in the second example but you really never have a & in the function signature if I remember correctly but I understand your point. The & is used on what you pass into the pointer through the function call you have defined in your first example.
Arainach's example is correct as provided. The second example declares a function taking a reference to an integer.

FYI constants in C/C++ are replaced in the code with the actual values during compilation.
Not always -- not even in general. You might be confusing const values with preprocessor macros.

In general, the idea of "const" exists only inside the compiler. When the compiled program is running, the specific memory addresses involved aren't labeled as const or protected in any way. const just allows the compiler to do analysis and validate certain assumptions.
This is true, but not always. You say "in general", but there's a notable exception -- statically initialized data is usually (or, at least, can be) put into a section of the executable that is marked read-only with protections by the OS when it's loaded. This is implementation, but happens on all the modern PC compilers I use -- gcc and VC++. If you cast away constness, and make an assignment, you end up with an exception.

"const" exists in the source code, too. It really doesn't buy anything for optimization or the compiler validating anything. It does help code document itself, though. If I show you my DoSomething( const char *pstrString ) function, you can guess that I'm not going to change the string I'm passing it. If I do, you should tell me to fix either the code or the declaration.
 
#defines which are not macros are done in the intermediate code that is generated before compilation; during compilation, consts are replaced with their values like I said. This holds true for all compilers I have used: IBM, MS (they wrote IBM's originally until C++, then IBM wrote their own), GNU, etc.

I was wrong on the code; int & is allowed. int *p is a pointer passed by value and int *&p is a pointer passed by reference, so you can actually do things like allocate memory inside the function (like I said, it's been a while for me).
 
#defines which are not macros are done in the intermediate code that is generated before compilation; during compilation, consts are replaced with their values like I said. This holds true for all compilers I have used: IBM, MS (they wrote IBM's originally until C++, then IBM wrote their own), GNU, etc.
#define defines a macro; it defines either an object-like macro or a function-like macro. You can read more about this in section 16.1 of the C++ language standard.

Object-like macros don't take parameters, and are typically what a developer would use for a numeric replacement:

#define MAX_WINDOWS 64

Macro expansions are not emitted in intermediate language. They're emitted by the preprocessor, whose output is consumed by the compiler's front end. The compiler front end is what produces intermediate language, and that's consumed by the back end. The back end itself might produce object code, or in the case of the Microsoft compilers, code generation might be completed by the linker to enable better global optimization.

It's easy to demonstrate cases where VC++ doesn't replace the value of a const declaration directly. Non-trivial initializers, floating point numbers, arrays, object types, and so on, all result in storage being used and might result in run-time initialization. Note further that you can always take the address of a const variable, which means that the value of the variable is irrelevant -- it still needs to have an address, and therefore storage.
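A small sketch of two such cases, assuming nothing about any particular compiler's output. The names kMaxWindows, addressOfMax, computeAtRunTime and demoConstStorage are all made up for illustration:

```cpp
#include <cassert>

// Hypothetical constant, mirroring the MAX_WINDOWS macro above.
const int kMaxWindows = 64;

// Taking the address means kMaxWindows must exist as an object with storage.
const int * addressOfMax() { return &kMaxWindows; }

// Hypothetical helper: its result isn't known until run time.
int computeAtRunTime(int seed) { return seed * 2 + 1; }

void demoConstStorage(int seed) {
    // const, but initialized at run time, so it can't simply be replaced by a literal
    const int runtimeConst = computeAtRunTime(seed);
    const int * p = &runtimeConst;
    assert(*p == seed * 2 + 1);

    assert(*addressOfMax() == 64);
}
```

Whether the compiler also substitutes 64 directly where kMaxWindows is used by value is an optimization decision; the sketch only shows that an address, and so storage, has to be available.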

gcc works in the same way, though the tipping points for the decision to use a run-time initializer are different. Since all compilers have to implement the same language, I'd expect they all end up eventually using storage for cases where the address is taken or the const is declared extern.
 
The preprocessor, like I said, does the #define resolutions in the code output. There is no front end of a compiler other than the preprocessor: http://en.wikipedia.org/wiki/C_preprocessor

You can call #defines macros, but technically they are not like macros in Excel, is what I meant (to me, #define ABC 5 is not a macro; it's functionally a constant). They are defined as any valid data, datatype, function, etc. in C/C++ and substituted in the code, then compiled to be part of the machine code that is executed at runtime.
 
I don't believe that will work, I believe the compiler will complain that you are trying to assign a value to the function rather than the result.

but what exactly happens step by step when you return a reference? For example let's say the function is int & getIntRef(int & a){return a;}. I could use getIntRef(a) = 5 to change the value at that memory address.
 
The preprocessor, like I said, does the #define resolutions in the code output. There is no front end of a compiler other than the preprocessor: http://en.wikipedia.org/wiki/C_preprocessor
Modern compilers are implemented in two phases. The front end is the first phase, and it translates the C or C++ language into intermediate language. The IL output is given to the back end, and the back end implements optimizations and translates the target-neutral IL into target-specific object code. This makes it easier for the compiler implementer to support different targets. The front end is the same for the X86 and X64 compilers, for example; they just use a different back end that generates X86- or X64-specific code.

If you check out your Visual Studio BIN directory, you'll find the different files:

Code:
C:\Program Files (x86)\Microsoft Visual Studio 8\VC\bin>dir c*.dll
 Volume in drive C is Apex
 Volume Serial Number is 3A63-7C22

 Directory of C:\Program Files (x86)\Microsoft Visual Studio 8\VC\bin

12/02/2006  00:24           630,784 c1.dll
12/02/2006  00:06           778,240 c1ast.dll
12/02/2006  00:24         2,285,568 c1xx.dll
12/02/2006  00:06         2,498,560 c1xxast.dll
12/02/2006  00:10         2,265,088 c2.dll
               5 File(s)      8,458,240 bytes
               0 Dir(s)  109,522,255,872 bytes free

c1 is the front end for C, while c1xx is the front end for the C++ language. C2 is the shared back-end. The gcc guys are aggressive about publishing a spec describing the interface between their front end and back ends. Given the spec, developers can implement new languages pretty rapidly, since they don't have to worry about getting the back end done -- they just have to implement a front end that produces compatible IL.

Similarly, the back-end is mostly language-neutral. In VB6, the VB compiler started emitting IL that was compatible with the C++ compiler's back end. At that point, the VB language got many of the optimizations the C++ compiler had because most all of them were implemented in the back-end.

In fact, Microsoft implemented the .NET frameworks using MSIL. They took the IL standard they developed and got it ratified by ECMA, so anyone can produce a compiler that emits MSIL (or CIL, really) byte codes which can be executed by the .NET runtime platform.

If you've got Visual C++ installed, you might be able to demonstrate the different passes to yourself by doing a command-line build with the undocumented (and unsupported!) /Bd option on the CL command line. If I build a little sample program I have lying around with that option:

Code:
C:\foo>cl /Bd /EHsc ptrs.cpp

I see this output:

Code:
ptrs.cpp
`C:\Program Files (x86)\Microsoft Visual Studio 8\VC\BIN\c1xx.dll -zm0xFCDB0000 -il C:\Users\mikeblas\AppData\Local\Temp\_CL_4f1d6954 -f ptrs.cpp -W 1 -Ze -D_MSC_EXTENSIONS -Zp8 -ZB64 -D_INTEGRAL_MAX_BITS=64 -Gs -Ot -Foptrs.obj -pc \:/ -Fdvc80.pdb -D_MSC_VER=1400 -D_MSC_FULL_VER=140050727 -D_WIN32 -D_M_IX86=600 -D_M_IX86_FP=0 -GS -GR -D_CPPRTTI -Zc:forScope -Zc:wchar_t -Bd -EHs -D_CPPUNWIND -EHc -clrNoPureCRT -D_MT -I C:\Program Files (x86)\Microsoft Visual Studio 8\VC\ATLMFC\INCLUDE -I C:\Program Files (x86)\Microsoft Visual Studio 8\VC\INCLUDE -I C:\Program Files (x86)\Microsoft Visual Studio 8\VC\PlatformSDK\include -I C:\Program Files (x86)\Microsoft Visual Studio 8\SDK\v2.0\include'
`C:\Program Files (x86)\Microsoft Visual Studio 8\VC\BIN\c2.dll -il C:\Users\mikeblas\AppData\Local\Temp\_CL_4f1d6954 -f ptrs.cpp -W 1 -Gs4096 -Ob0 -dos -Foptrs.obj -Fdvc80.idb -GS -Bd -EHs -MT'
?/out:ptrs.exe

ptrs.obj

`"C:\Program Files (x86)\Microsoft Visual Studio 8\VC\BIN\link.exe" /link /errorreport:queue @C:\Users\mikeblas\AppData\Local\Temp\_CL_81cba2e4lk'
Microsoft (R) Incremental Linker Version 8.00.50727.762
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:ptrs.exe 
ptrs.obj

You can clearly see that the compiler is invoking the front end (in c1xx.dll), the back end (in c2.dll) and then the linker.

Without the architectural split between the compiler front end and compiler back end, which you insist doesn't exist, these architectures and savings wouldn't be possible. The front-end is not the same as the preprocessor. Indeed, the front-end lives behind the preprocessor, and consumes the preprocessor's output.

You can call #defines macros, but technically they are not like macros in Excel, is what I meant (to me, #define ABC 5 is not a macro; it's functionally a constant). They are defined as any valid data, datatype, function, etc. in C/C++ and substituted in the code, then compiled to be part of the machine code that is executed at runtime.
I call #defines macros just like the C and C++ language standard documentation does. Technically, they are macros because the technical documentation says so.

Excel is a user application; it uses a user-oriented definition of "macro", which is just a script that performs a repetitive task. I'm a programmer, so I use the definition of "macro" that programmers use, and that definition describes a simple replacement mapping between characters input and characters output. The Microsoft Macro Assembler, along with the C and C++ preprocessors, are probably the most notable application of the term "macro" in this context.

Macros in the preprocessor don't reach machine code directly. Their deterministic lifetime ends when they're done being replaced by the preprocessor.

In your example, ABC is a macro; 5 isn't a constant -- it's a literal. C++ preprocessor macros are not constrained to data types, functions, or "valid data" as you assert, and that fact is easily demonstrable.
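One way to demonstrate it: the macros below expand to a brace, another brace, and an operator with its operand, none of which is a value, a type, or a function on its own. ABC is the macro from the thread; BEGIN_BLOCK, END_BLOCK, TIMES_TWO and the functions are made-up names for this sketch:

```cpp
#include <cassert>

#define ABC 5                 // from the thread: expands to the literal 5
#define BEGIN_BLOCK {         // expands to a lone opening brace
#define END_BLOCK }           // expands to a lone closing brace
#define TIMES_TWO * 2         // an operator plus an operand: a token fragment

int doubled(int n)
BEGIN_BLOCK
    return n TIMES_TWO;       // the preprocessor turns this into: return n * 2;
END_BLOCK

int five() { return ABC; }    // the compiler proper only ever sees: return 5;

void demoMacros() {
    assert(doubled(21) == 42);
    assert(five() == 5);
}
```

By the time the compiler's front end runs, none of the macro names are left in the source text it sees.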
 
I don't believe that will work, I believe the compiler will complain that you are trying to assign a value to the function rather than the result.

Works fine; the returned reference is adequate as an assignable l-value.
 
You're describing what a parser does, generating an intermediate language, which has nothing to do with the preprocessor. The preprocessor generates the replacements of the #defines even with a parser separated in a different executable.

Compilers and runtimes still work the exact same way today as they used to: Preprocessor->Lexical Analyzer->Parser->Generation (Machine code/Intermediate language). Then, as the last step, you either have an exe that executes the machine code generated or a runtime engine that executes the intermediate language generated. That holds no matter the implementation, separate exes or not. And last but not least, for the exe, before it runs there is the linker for external references like libs/DLLs, etc. I don't want to leave that off; it is different from the runtime intermediate language, which dynamically loads external references.

And #define abc 5 is a constant in the original C language, and has been replaced in more modern languages by the const tag/statement. I have programmed enough C/C++ and now C# to know that. Even though I don't code C/C++ anymore for IBM, creating software sold off the shelf, I still code C# daily as a computer consultant.
 
You're describing what a parser does, generating an intermediate language, which has nothing to do with the preprocessor. The preprocessor generates the replacements of the #defines even with a parser separated in a different executable.
Yes, I'm describing the parser generating the IL, and I know that has nothing to do with the preprocessor. I've offered that description in response to your claim that "there is no front end of a compiler other than the preprocessor". Compilers most certainly have front ends.

Compilers and runtimes still work the exact same way today as they used to. Preprocessor->Lexical Analyzer->Parser then as the last step you either have an exe that executes the machine code generated or a runtime engine that executes the intermediate language generated as the last step.
You're skipping object code and the linker.

And #define abc 5 is a constant in the original C language, and has been replaced in more modern languages by the const tag/statement. I have programmed enough C/C++ and now C# to know that. Even though I don't code C/C++ anymore for IBM, creating software sold off the shelf, I still code C# daily as a computer consultant.
Indeed, in C, what we call "literals" in C++ are generally called "constants". Let's remember that this thread is a question about C++, though, not a question about C. The fact remains that "literal" or "constant" describes the token generated by your macro in substitution, not the macro itself. Just the same, your assertion that constants in C and C++ are replaced with code values is not correct.

Not to be rude, but I think you've made several incorrect statements in this thread; another one is that "You can never use the & (address of) on the left side of a operator." It's not hard to cook up examples where you can do so. One is where the operator appears on the left side of the equals, but is dereferenced again. Another is where the operator is overloaded to return a non-pointer value, like this:

Code:
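class CFoo
{
public:
	CFoo(){ }

	// Overloaded address-of: returns the object itself rather than a pointer.
	CFoo& operator&() { return *this; }

	CFoo& operator=(int n) { return *this; }
};

int main()
{
	int n = 35;
	*(&n) = 99;	// & on the left of =, but dereferenced again

	CFoo f;
	&f = 9;		// & on the left of =, via the overloaded operator
}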

Oh, and if you really want to learn compilers like I did (I wrote one), you should get the Dragon Book: http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools. It's a great book.
The Aho-Sethi-Ullman book is a seminal reference, yes.
 
Compilers have what I listed: Preprocessor->Lexical Analyzer->Parser->Generation (Machine code/Intermediate language), not frontends (that's a generic term that means nothing). I added the linker before you posted. The object code is the machine code; I did not skip that. Trust me, I am not too rusty on compilers like I might be on C/C++ syntax.
 
Object code is not machine code. Object code can't be executed, as it hasn't been bound, located, or resolved. The remaining symbolic references are what prevent it from being executed directly. Machine code has these references removed and resolved.

While I could continue finding errors in your post and try to help you with them, I think we're drifting afield from getting the original poster the help he's asking for.
 
Object code is machine code with external references period. Once linked, it executes. You are playing semantics; the fact is, if I have no external references, the object code can be executed. But I think the OP now has more than he asked for to learn from.
 
Object code is machine code with external references period.
Except that it isn't. Object code contains public references as well as external references. Public references are unresolved, and may not be resolvable; that's a huge practical difference, not only a semantic difference.
 
Whatever. From one who calls the compiler front end and back end instead of what it does, I can poke at you all day long. I tire of this; like I said, the OP has more than what he requested.

Edit:
I have to add, there is no doubt in my mind you're a smart dude, prolly smarter than me :)
 
What have I called the compiler front-end besides the "front end" ?
 
Thanks for the advice guys, I definitely understand const and references more now. I don't have much of a clue what the discussion led to, though. I'll check out that book on compilers after I read through Primer Plus and Algorithms by Sedgewick.

That leads me to a question about programming and computer science in general... does it ever become less intimidating? Even in Java I still don't understand much going through other programmers' source code. I gave up a career in law to go back to college for a degree in CS. I'll be out next year with a second BA and I feel even by then I won't truly know enough to be a half-way decent programmer.

I was also considering going for a MS in CS afterwards, but from what I've heard from several people it's not valuable and I'd be better off spending those two years just getting experience. (I'll also be creeping toward my late 20's by then as well...) Should I place any weight in their opinions?
 
You got great advice. There's no substitute for actually coding when it comes to really learning to code. The sooner you get a job doing it, the more you'll learn. No further degree is required, since you already seem to have one. But a BS in CS will put you on the right track, although you could probably start your programming career now. It's the hottest field in the world and the best, IMO.

Pointers are hidden in today's languages, and Java and/or C# are the top dogs as far as languages go; I would put C# on top.
 
You should write code with other people, on a team. Even if it's only a couple of other guys, it helps you understand what working on a development team means. You should also do something that has actual customers. With such a goal, you'll learn to balance demands for features, collecting requirements, and implementing solutions with just coding for the sake of it.
 