C is Not C++!!!
A superset of nothing useful
I often hear those who really should know better giving the advice that before you learn to code in C++ you should first learn to code in C. At face value this would seem like reasonable advice; after all C++ is a superset of C and so by learning C you’ll be learning some of C++. Unfortunately, this advice overlooks some fundamental but very important differences between C and C++ that may very well damage the learning curve of the student.
The main problem is that to say C++ is a superset of C greatly overstates the relationship. It is a superset but only insofar as the core syntax of the languages is very similar. As programming models go the two languages could hardly be further apart. It’s like arguing the case for learning to ride a push-bike before learning to drive a truck because both have wheels. In fact, if you’re going to go on the basis of the languages having similar syntax, one could also argue a case for learning C# or even Java before learning C++. Frankly, either of those two languages would still be a better stepping stone than starting with plain old C programming!
The thing is that C and C++ differ greatly in their approaches to software development. The C programming language is a procedural language whose main focus is on being very small and very fast. The code is very linear and has a start, a middle and an end. This is not how you write C++ code (at least, not if you are writing it properly). C is a very powerful, small and fast language, but also very unforgiving. It is very easy to write really very bad C code because the language offers little in the way of protection for the unwary developer.
A stringy mess
The most frequently cited example of things that catch out the unwary is the simple “string” data type as used in C. The C programming language has no real concept of a “string” type. What it considers to be strings are really nothing more than arrays of “char” types, with the very last item in the array being set to the null character ('\0'). In this way, the C runtime knows when it’s reached the end of a string by virtue of the fact it has discovered a null character. Unfortunately, not only does this make coding with C-style strings very messy, since we need to use special stand-alone functions to perform even simple string manipulation, it’s also incredibly dangerous.
Consider what happens if we accidentally overwrite the terminating null character with a non-null value. Suddenly our simple string is now of an arbitrary length. The string functions that will be looking for the null terminator will only stop when (if!) they hit the next arbitrary null byte in memory. The result is undefined, but you can bet it isn’t going to be pretty. This has been the source of many a “buffer overrun” in badly-written C programs. Such defects can often lead to exploits that can be used to compromise systems.
In general, anything to do with “string” manipulation in C is considered to be (certainly by anyone who isn’t a hard-core C programmer) unsafe. The potential for something to go wrong is far too high and the consequences are, altogether, far too dangerous. The question you have to ask yourself is why would you recommend a newbie, who is wanting to learn programming, subject himself to such a dangerous and unnecessary environment?
By contrast, C++ has a proper string type. It’s a first class object that can be passed around and it has full string type semantics. No need to call upon dark functions of witchcraft to do simple things like concatenate two strings. No need to perform random acts of memory allocation to ensure we don’t cause buffer overruns. No need to free these additional allocations (or, worse, forget to free them), because the string object does it all for you.
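To make that concrete, here is a minimal sketch of concatenation with std::string. The function name make_greeting is my own invention for illustration; the point is only that nothing here sizes a buffer, calls strcat or frees memory.

```cpp
#include <string>

// Hypothetical helper (not from any library): builds a greeting by
// concatenating std::string values.
std::string make_greeting(const std::string& name) {
    std::string greeting = "Hello, ";
    greeting += name;  // the string grows as needed; no fixed-size buffer to overrun
    greeting += "!";
    return greeting;   // returned by value; its storage is released automatically
}
```

Calling make_greeting("world") yields a string equal to "Hello, world!", and the caller never has to think about who owns the memory.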
Scott Meyers (very famous author of C++ books) once gave a speech in which he argued that there is just no need to teach C++ programmers about C-style strings nor C-style arrays. He went on to say that so many defects that exist in C++ code could be avoided if C++ programmers just unlearned (or never learned in the first place) about the existence of unsafe C programming types. C++ has proper object types, provided in the Standard Template Library, that replace these, providing safe and reusable components that just don’t suffer from the serious issues of their C-style equivalents.
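As a rough illustration of what those replacement types buy you, the sketch below uses std::vector in place of a C-style array. The helper value_or is a made-up name for this example; it relies on the standard at() member, which checks the index and throws std::out_of_range instead of silently reading past the end.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the element at `index`, or `fallback`
// when the index is outside the vector.
int value_or(const std::vector<int>& values, std::size_t index, int fallback) {
    try {
        return values.at(index);  // bounds-checked access
    } catch (const std::out_of_range&) {
        return fallback;          // a bad index is reported, not silently read
    }
}
```

With a three-element vector, asking for index 3 gives back the fallback rather than whatever happens to live in the next bit of memory.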
I’m not suggesting programmers should not learn of the dangers of things such as “buffer overruns”! Of course they should. What they don’t need to learn (at least in the early days) is how to create them. We don’t give junior doctors scalpels and set them loose in the ER. They have to build up to the scary stuff; learn the best practices first. Only once they have that mastered do they learn the gory things.
As mentioned before, C is a procedural programming language. The basic structural type of a C program is a “function”. Functions contain units of reusable code. They (normally) take arguments as input parameters and (normally) return results. The problem with this is that functions are not first class objects in C. They contain no state (other than static local variables, which are not really the same thing) and they cannot be passed around as units of functionality.
It is possible to pass around function pointers, but this is not the same thing either. A pointer to a function is nothing more than an alias for it. It’s still not a first class function type. The problem with this model is that it doesn’t really make for reusable code. It is a long way from being either “functional” or “object oriented”.
By contrast, C++ is a full-blown object oriented programming language. Actually, more specifically, it is a multi-paradigm programming language. Unlike C, C++ can support many different styles of programming. For example, it has a concept of function objects (functors), which are first class types. This means it’s possible to write “functional” C++ code should one desire.
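Here is a small sketch of a function object, assuming nothing beyond the standard library. The names AboveThreshold and count_above are hypothetical; what matters is that the object carries its own state (the threshold) and can be handed to an algorithm like any other value.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical function object: remembers a threshold and answers
// whether a given value exceeds it.
class AboveThreshold {
public:
    explicit AboveThreshold(int threshold) : threshold_(threshold) {}

    // Calling the object like a function runs this operator.
    bool operator()(int value) const { return value > threshold_; }

private:
    int threshold_;  // state carried around with the "function"
};

// Hypothetical helper: counts the values greater than `threshold`
// by passing the function object to std::count_if.
int count_above(const std::vector<int>& values, int threshold) {
    return static_cast<int>(
        std::count_if(values.begin(), values.end(), AboveThreshold(threshold)));
}
```

For the values 1, 5, 10 and 15 with a threshold of 5, count_above returns 2.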
Of course, more than that, it also has support for proper Object Oriented Programming (OOP). This means that rather than writing your code to be long and linear, you build your code out of reusable objects that model the problem you are trying to solve. It’s a completely different framework and one that makes code so much more robust.
Now the question is, why force a newbie to learn to code procedurally when he will eventually be jumping into OOP? The two styles are so very different and jumping from one to the other can be really quite tricky. Why not just learn OOP from the start? Not only does it end up teaching him bad programming practices from an OOP point of view, but it also teaches him bad habits that are really very hard to break!
I’ve seen so much badly written C++ code that was implemented by C programmers who decided to cross over but had no real clue what object orientation is about. They ended up implementing poor object models, classes that were not cohesive and interfaces that were not loosely coupled. This makes for very brittle object oriented code and is not the way to write C++.
Strong vs. loose typing
C++ is a strongly typed language. The compiler knows all the types (both inbuilt and user defined) and it is able to use its type system to do cool things, such as support function overloading. By contrast, C is not a strongly typed language (at least not in the same sense); it is a loosely typed language. This means that whilst it does have a type system, the compiler doesn’t really make much use of it beyond performing some basic static compile time checks. The compiler is not able to use the type system to do (amongst other things) overloaded function resolution. This means that even doing simple things like outputting stuff to the console requires knowledge of witchcraft and the black-art of format specifiers.
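A quick sketch of overload resolution, using a hypothetical function called type_name: three functions share one name and the compiler picks between them purely from the type of the argument.

```cpp
#include <string>

// Hypothetical overload set: the compiler chooses which one to call
// from the static type of the argument.
std::string type_name(int) { return "int"; }
std::string type_name(double) { return "double"; }
std::string type_name(const std::string&) { return "string"; }
```

So type_name(1) returns "int" and type_name(1.5) returns "double", with no format specifier in sight.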
For example, to output anything other than a simple C-style string requires using a function such as printf. This function has to be told via a “format specifier” what types it is being asked to output. If the types it is told do not match the types it is given, the result is undefined (and that is never good). By contrast, C++ has streams, and these streams are type safe. You don’t need to tell a stream what type something is when you send it to be output to the console because the C++ type system already knows. You can’t get the type wrong because the compiler selects the right output operator for you.
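The sketch below shows the stream side of that comparison. It writes to a std::ostringstream rather than the console so the result is easy to inspect, but std::cout accepts exactly the same operators. The function name describe and its parameters are made up for this example.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a label, a count and a ratio.
// No format specifiers are needed; operator<< is chosen per argument type.
std::string describe(const std::string& label, int count, double ratio) {
    std::ostringstream out;
    out << label << ": " << count << " items, ratio " << ratio;
    return out.str();
}
```

With default stream formatting, describe("apples", 3, 0.5) produces "apples: 3 items, ratio 0.5". The printf equivalent would need "%s", "%d" and "%f" (or similar) to line up with the arguments by hand.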
Again, I’ve seen so much poor C++ code that contains a mixture of some C++ and a mixture of unsafe C code, where the programmer has held on to his use of printf (and scanf) with his last dying C programming breath. The results are not only very hard to read and maintain, but they are a disaster waiting to happen. Like most C functions that work with strings, these functions are also subject to the same problems of buffer overrun as most of the others.
In fact, these functions are worse because they also have the added complication of format specifiers. The point is that nearly all things in C (apart from the core shared syntax) are semantically different from C++. Forget C, it is a completely different programming language. Jump right in and just learn C++ (and learn it properly, not a half-baked C hybrid of it).
What’s the alternative?
Learn the semantics and forget the syntax
If you are advising someone who is just starting to learn programming and he wants to know what language to learn to benefit him when learning C++ (often incorrectly perceived as not being a good language for beginners), recommend something like Python. Sure, it won’t teach him the syntax of C++ but who cares? Syntax is syntax; semantics are what count. By learning Python he will learn how to write object oriented code in a safe programming environment with a language that will hold his hand. Once he has the concepts, he is ready to learn the syntax of C++ and to take the good programming skills he learned in Python and apply them to the power of C++.
For example, one could learn to speak Spanish (assuming you don’t already) to a level that would be perfectly acceptable without really needing to worry about the syntax. Does knowing the syntax help? Sure it does. Does not knowing it prevent you from learning to speak the language? No, of course it doesn’t. Let me put it even more simply. Hands up who knows the difference between a transitive verb and an intransitive verb. If your hand is up, well done you! If it’s not, please don’t worry as I promise you’ll still be able to continue speaking English (or Spanish) without ever knowing the difference.
Learn the semantics of object oriented programming and how that applies to C++ and then, when you are ready, figure out the dark corners of the language that are shared with C. As far as the core language goes, you only need to learn those bits that work with C++; you don’t need to learn nor should you care about all the stuff that is in C++ just to make it backwards compatible with C. That stuff was only left in there to make porting C code over to C++ a lot easier. You’re not learning to port code, you’re learning to code in C++ so forget about C and all its weirdness! Learning C is not a short cut to learning C++; rather, it is a hindrance.
In defense of the C programming language
Finally, I just want to note that if this article comes across as berating the C programming language it wasn’t meant to. The simple fact is C and C++ are two very different programming languages that just happen to share some similar syntax. The goal of the languages is very different and the C programming language excels at what it was originally designed for: to write small, fast and very tight code that requires very little resource.
By contrast, C++ is a bit of a bloated beast and is not the language of choice if you are looking, for example, to write code for an embedded device (choose C for that). The only point this article tries to make is that C is not C++ and C++ is not C. They are very different, both have their pros and cons, and learning one is highly unlikely to make learning the other that much easier. Don’t waste your time; if you want to learn C++ then learn C++. C is not C++, never was and never will be!
C99 is not ANSI C
It should also be noted that this discussion was mainly aimed at true C (ANSI C) and doesn’t really consider the C99 standard. This is a greatly enhanced version of C and does add a lot of the nice things that C++ provides. Unfortunately, the semantics and syntax of C99 are still very different from C++ and so the same advice still applies: if you want to learn C++ just learn C++.
Oh, and don’t even get me started on Objective-C — that is a topic for another article I think!