Tuesday, July 5, 2016

C - First Steps

In this article, we'll see some basic uses of C: variable declaration and usage; the print function; flow control (if-else, for-loop, while-loop, switch-case); matrices and pointers; reading and writing from files or the keyboard; and functions.

Variables.

In any programming language you can (of course) create variables and assign values to them. In C, you can create different kinds of variables, each of which stores a specific type of data. There are four main data types: booleans (available as bool through stdbool.h since C99), integers, floating point numbers, and characters (strings are arrays of characters). For integer values, there are several sub-types depending on the range of values you want to store: short, int, long and long long. There are also sub-types for floating point numbers: float and double. Any integer type can be declared as signed or unsigned: an unsigned type holds only zero and positive values, while a signed type holds both negative and positive ones. For example, a signed short is guaranteed to cover at least the range [−32767, +32767], and an unsigned short at least [0, 65535].
Here is an example of how to declare and operate on variables and values:
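As a minimal sketch of these declarations and operations: the function names integer_sum and float_product and the concrete values are illustrative assumptions, and the code is wrapped in small functions so each part can be called on its own, which means its line numbers do not correspond to the ones cited in the walkthrough.

```c
/* Integer example: declare first, assign later, then add. */
int integer_sum(void)
{
    int a, b, c;    /* declared without any value */

    a = 1;          /* a is 1 */
    b = 2;          /* b is 2 (assumed value) */
    c = a + b;      /* c is 3 */

    return c;
}

/* Floating point example: same pattern, with a multiplication. */
float float_product(void)
{
    float x, y, z;  /* declared without any value */

    x = 1.5f;       /* x is 1.5 */
    y = 2.0f;       /* y is 2.0 */
    z = x * y;      /* z is 3.0 */

    return z;
}
```

Calling integer_sum() gives 3 and float_product() gives 3.0; neither function prints anything, so the comments carry the values.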


Let's examine the code. First of all, I've made this example with Geany (https://www.geany.org/), which is available on all platforms (Linux, Mac and Windows). The application generates no output, so I've used comments to indicate the value of each variable (a single-line comment starts with "//" and a multi-line comment starts with "/*" and ends with "*/").
At the first line, I've included the header stdio.h to be able to use some important functions (this example uses none of them, so I could have left this line out). At line 3, I've created the main function with two arguments; this is the entry point of our application. The two arguments, argc and argv, are the number of input parameters and the list of input parameters passed when calling our program (this example uses neither, so I could have left them out).
At line 5, I declared the integer variables a, b and c without any value. At line 6, I assigned the value 1 to the variable a, and at line 7 I assigned a value to b in the same way. At line 8, I assigned the sum of a and b to the variable c.
From lines 10 to 13, I've made another example with floating point variables and a multiplication. Finally, at line 15, there is the exit code of our application (zero means OK, and a non-zero value signals an error).

Print.

The previous example would be more useful if we could see the values of the variables instead of relying on comments. To do so, we'll use the printf function, which prints text and variables in the terminal where the application is run.
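A minimal sketch of printf with integer and floating point values could look like this. The helper names print_sum and print_product are assumptions made for illustration; each one returns whatever printf returns, which is the number of characters printed.

```c
#include <stdio.h>

/* Prints "a + b = c" using %i, the placeholder for int values. */
int print_sum(int a, int b)
{
    int c = a + b;

    return printf("%i + %i = %i\n", a, b, c);
}

/* Prints "x * y = z" using %g, a placeholder for floating point values. */
int print_product(float x, float y)
{
    float z = x * y;

    return printf("%g * %g = %g\n", x, y, z);
}
```

Calling print_sum(1, 2) prints "1 + 2 = 3" and print_product(1.5f, 2.0f) prints "1.5 * 2 = 3", each on its own line thanks to the trailing \n.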

Now the first line with the include is necessary, as printf is declared in the header stdio.h. To print values in the terminal, printf takes the text to show as its first argument, followed by all the variables to print. The tricky point here is that you must specify the type of each variable inside the text, and the variables are printed in the given order. For the first example, all the variables are integers, so the type code is %i; for the second example, it is %g for floating point numbers. If you want to print a character, you must use %c, and for a long, %li. At the end of the text passed to printf, I've added \n to insert a new line character; otherwise both printf calls would print their results on the same line in the terminal. Let's see our result:


Flow control.

As your application becomes bigger and more complex, you will need some flow control to drive the runtime execution through the statements you need. To do so, you have if-else, the for-loop, the while-loop and switch-case.
The simplest flow control is the if-else statement. If some condition is true, we execute a defined block of code; if not, another one (or none). Let's see this with an example:
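A minimal sketch of an if-else chain, assuming a hypothetical helper named safe_difference that always subtracts the smaller value from the larger one:

```c
/* Returns the difference between a and b without ever going negative. */
int safe_difference(int a, int b)
{
    int result;

    if (a > b) {
        result = a - b;
    } else if (a < b) {
        /* This condition could be replaced by a plain else:
           when a equals b, b - a is 0 as well. */
        result = b - a;
    } else {
        result = 0;
    }

    return result;
}
```

For example, safe_difference(5, 3) and safe_difference(3, 5) both give 2, and safe_difference(4, 4) gives 0.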


This way, the subtraction will never lead to negative values. The if-else statement can be used in another manner, with more conditions, as shown in line 12 (this condition could be deleted without changing the result). We can chain multiple if-else statements this way, and nest more of them inside the current ones.
To run a piece of code several times, we have loops (the for-loop and the while-loop). In the for-loop you specify the condition to stay in the loop and an operation which will be performed at each iteration (this is usually the operation that brings you closer to the end of the loop). In the while-loop, that operation must be written explicitly inside the loop body. This can lead to an infinite execution if the operation is never executed. Let's see this in an example:
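As a rough sketch of both loops, the two hypothetical helpers below add up the numbers from 0 to limit - 1, once with a for-loop and once with a while-loop. The names and the summing task are assumptions chosen for illustration.

```c
/* for-loop: start value, condition and increment sit on one line. */
int sum_with_for(int limit)
{
    int total = 0;
    int i;

    for (i = 0; i < limit; i++) {
        total += i;
    }

    return total;
}

/* while-loop: only the condition is in the header, so the increment
   has to be written inside the body. */
int sum_with_while(int limit)
{
    int total = 0;
    int j = 0;

    while (j < limit) {
        total += j;
        j++;    /* without this line the loop would never end */
    }

    return total;
}
```

Both sum_with_for(10) and sum_with_while(10) give 45.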


In the for-loop, we've declared the initial condition that the value of the variable i must be 0; the condition for the loop to keep running is that the value of i must be less than 10; and the operation performed at each iteration is to increase i by one (a unit increment or decrement can be abbreviated as ++ or -- respectively). In the while-loop, there is only the condition to stay inside the loop. If you forget to increase the value of j inside the loop, your application will never exit the while-loop; that's one reason why the for-loop is more widely used than the while-loop (although sometimes you may want this behavior). There is another version of the while-loop where the condition is checked after the block inside the loop is executed: the do-while-loop.
Another variant of the if-else statement is the switch-case, which is very similar and more compact, but only accepts a certain type of condition. Let's see an example:
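A minimal sketch of a switch-case that maps error codes to messages. The function name error_message, the codes and the message texts are all assumptions made up for this illustration.

```c
/* Returns a text describing the given error code. */
const char *error_message(int code)
{
    const char *message;

    switch (code) {
    case 0:
        message = "OK";
        break;
    case -1:
        message = "File not found";
        break;
    case -4:
        message = "Permission denied";
        break;      /* without this break, execution would fall into default */
    default:
        message = "Unknown error";
        break;
    }

    return message;
}
```

Passing the result to printf with %s, as in printf("%s\n", error_message(-4)), prints the matching message.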


Here is a common block of code showing how to handle error codes and print the appropriate message. The switch-case only accepts a condition that evaluates to an integer. The possible values are listed in the case statements, and any value not handled by a case goes into the default statement. The break statement means that you exit the current block of code, the switch-case in this example. If you forget to add the break at line 19, when the error code -4 arrives, both messages of lines 18 and 21 will be printed (execution continues until another break is found or the end of the block is reached). The break statement is only valid inside a switch-case or a loop; you will mainly see it in switch-case statements, but it can also be used in loops for a quick exit.

Matrix.

A matrix (or a vector of n dimensions; an array, in C terms) allows you to store values of the same type in a list. At the declaration, you must specify the type of the elements and the size. This can easily be understood with a simple example:
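As a small sketch, the hypothetical function below declares a list of float values that is initialized on the spot, works out how many elements it holds, and computes their mean. The values are assumptions picked so that the result is exact.

```c
/* Computes the mean of a small, fixed list of float values. */
float mean_of_values(void)
{
    /* The size is deduced from the initial values. */
    float values[] = {1.5f, 2.5f, 3.0f, 5.0f};

    /* Bytes used by the whole list divided by the bytes of one element. */
    int count = (int)(sizeof(values) / sizeof(values[0]));

    float sum = 0.0f;
    int i;

    for (i = 0; i < count; i++) {
        sum += values[i];   /* element at index i */
    }

    return sum / count;     /* 12.0 / 4 */
}
```

Here mean_of_values() gives 3.0. Writing to an element uses the same syntax as reading, for example values[2] = 4.0f.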


In the example, at line 5, I've declared a list of floats already initialized with the given values. I could have declared it there and filled it later. An empty list must be declared with its size, and you can't go beyond that size when filling it. By initializing it directly, I don't have to bother about the size, only about the values (you can use both approaches; it depends on whether you already have the values to put into the list or have to retrieve them programmatically). At line 9, I've retrieved the size of the list with sizeof. I have to use it twice, as the first use gives me the number of bytes the list uses in memory, and the second gives the number of bytes each element uses in memory (this will be explained in more detail in the Pointers section). From line 10 to 12, I've made a for-loop to get the sum of all the elements of the list. As you can see at line 11, I get the value of a single element just by its index in the list (be careful not to use indexes outside the limits of the list: you will not necessarily get an error, but you may read wrong values or crash, because you don't know what's stored next to the list). Finally, at line 14, I calculated the mean value, and I print it at line 16. I've used float variables, as the division then leads to a more accurate result.
To write values into a list, you do the same as with a single variable, just adding the "[]" symbols with the index of the position you want to write to. For multi-dimensional lists, just use multiple square brackets ("[][]" for 2 dimensions, "[][][]" for 3 dimensions...).

Pointers.

As said in a previous article, C is a low level programming language. What does that mean? In many of the commonly used current languages (Java, Python, Ruby...) you don't care about the position where a variable is stored. With C, you can manage this information. You can create a variable which points to a specific memory address, and then navigate to the next contiguous memory address. You can access both the memory address and the value stored there. That's why we had to divide the size of our list by the size of the float type: C gives us the lower level information. Sometimes we have to operate with memory addresses and calculate how much space our data occupies to know where the next data is stored.
Let's see an example:
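A minimal sketch of a pointer in action, assuming a hypothetical function pointer_demo and an initial value of 1 for a; its line numbers do not correspond to the ones cited in the walkthrough.

```c
#include <stdio.h>

/* Changes a variable through a pointer and returns its final value. */
int pointer_demo(void)
{
    int a = 1;      /* a normal variable (initial value assumed) */
    int *b;         /* a pointer to an int */

    b = &a;         /* b now stores the memory address of a */

    printf("%i\n", *b);     /* value found where b points: 1 */

    *b = 5;         /* write 5 into the address b points to, which is a */

    printf("%i\n", a);      /* 5 */
    printf("%i\n", *b);     /* 5, since b still points to a */

    return a;
}
```

The function prints 1, 5 and 5 on separate lines and returns 5.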


Pointers are declared with the symbol "*", as seen at line 6. When used on an existing pointer, the symbol "*" means "the value at position ..."; to get a memory address, we have to use the symbol "&", as in line 8. What's done at line 8? We get the memory address of the variable a (it's not a pointer, but we also have access to its memory address), and store it in the pointer. Then, at line 10, we print the value that b points to. The variable b stores a memory address, not a value; this memory address is occupied by the initial value of a. At line 12, we replaced the value that b points to (the content of a) with the value 5. Lines 14 and 15 print the value of a and the value that b points to (which is a itself). Here is the output:


Pointers are a very powerful tool in C, but quite hard to understand, as you're managing memory addresses and not the values directly. With some practice you will understand them better.

Read/write.

A first way to read data is through the input arguments given when running our application. In the previous examples, the main function declaration had two arguments which were never used (int argc, char** argv): the number of arguments passed when running your application, and the arguments themselves as a pointer to pointers to characters. And to write output, until now I've only used printf to print the results in a terminal. What about the keyboard? And files?
Let's start with the keyboard:
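As a rough sketch of formatted reading: scanf(format, ...) behaves like fscanf(stdin, format, ...), so the hypothetical helpers below take the input stream as a parameter, and passing stdin makes them read from the keyboard. The names read_name and read_name_and_age and the 64-character buffer size are assumptions.

```c
#include <stdio.h>

/* Reads one word into name, which must have room for 64 characters.
   Returns 1 if a word was read, 0 otherwise. */
int read_name(FILE *in, char *name)
{
    /* %63s stops at whitespace and never writes more than 63
       characters plus the terminating '\0'. */
    return fscanf(in, "%63s", name) == 1;
}

/* Reads a word followed by an integer, separated by whitespace.
   Returns 1 only if both values were read. */
int read_name_and_age(FILE *in, char *name, int *age)
{
    /* The integer needs its address (&), the char array does not. */
    return fscanf(in, "%63s %i", name, age) == 2;
}
```

A call such as read_name(stdin, name) waits for the user to type a word and press Enter; the result can then be printed with printf("%s\n", name).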


To read from the keyboard, we have the function scanf. First we need to create a variable in which to store the result of the keyboard input (in this case, name). The function accepts two arguments: the first one is the format of the input we want to receive (in this case, it's just a string, denoted by %s), and the second one is the variable where the result is stored. If you wanted a string followed by an integer, the first argument would be "%s %i" (if you want them separated by a space), followed by one variable for each value. And finally, we print the stored result.
Let's go now to files. When operating with files, you first create a stream through which the data will flow from and to the file. This stream consumes memory, and you must close it when you have finished your work. There are three functions to write and three to read: fputc, fputs and fprintf to write; and fgetc, fgets and fscanf to read. What are the differences? The first of each group reads/writes a single character, the second reads/writes a string, and the third reads/writes according to a given format. After opening the stream, you pass it as an argument to one of those functions, which will read/write through the stream and tell you through its return value whether the operation worked. The stream is handled through a pointer to the type FILE. An example should make this clearer:
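A minimal sketch of the three reading functions, assuming a hypothetical helper read_demo that receives the path of a text file. The buffer sizes and the choice of reading one line with each of the first two functions are assumptions for illustration.

```c
#include <stdio.h>

/* Reads from the file at path with fgetc, fgets and fscanf, printing
   what it reads. Returns -1 if the file cannot be opened; otherwise
   1 for the fgetc pass plus 1 for each later read that succeeded. */
int read_demo(const char *path)
{
    FILE *file;         /* the stream */
    int c;              /* int rather than char, so EOF can be detected */
    char line[255];
    char word[64];
    int steps = 0;

    file = fopen(path, "r");    /* "r" opens the file in read mode */
    if (file == NULL) {
        return -1;
    }

    /* fgetc: one character at a time, until the end of the first line */
    while ((c = fgetc(file)) != EOF && c != '\n') {
        putchar(c);
    }
    putchar('\n');
    steps++;

    /* fgets: a whole line, or at most sizeof(line) - 1 characters */
    if (fgets(line, sizeof(line), file) != NULL) {
        printf("%s", line);
        steps++;
    }

    /* fscanf: formatted input, here a single word */
    if (fscanf(file, "%63s", word) == 1) {
        printf("%s\n", word);
        steps++;
    }

    fclose(file);       /* always close the stream */
    return steps;
}
```

With a file of at least three lines, read_demo prints the first line, the second line and the first word of the third line, and returns 3.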


First, at line 5, we create the stream variable where all the file properties will be stored, and at line 7 we open the file in read mode (the mode is the second parameter). At line 14, I've used the fgetc function to read a single character. I do so in a do-while-loop until the character \n (new line) is read, storing the value of each character in the variable c and printing it at line 15. At line 26, I've used the function fgets to read a complete line, or up to the 255-character limit if the line is longer, storing the result in the vector of characters declared at line 24. And finally, at line 35, I've used the fscanf function to read single words (single words because that is the text format I specified as the second argument). Here's the output:


Let's see now how to write into a file.
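A minimal sketch of the three writing functions, assuming a hypothetical helper write_demo that receives the path of the file to create. The texts written are made up for illustration.

```c
#include <stdio.h>

/* Writes a few lines to the file at path with fputc, fputs and fprintf.
   Returns 0 on success, or -1 if the file cannot be opened. */
int write_demo(const char *path)
{
    FILE *file;
    char letter;

    file = fopen(path, "w");    /* "w" creates the file or empties it */
    if (file == NULL) {
        return -1;
    }

    /* fputc: one character at a time. The condition letter < 'z'
       stops before 'z'; use <= to include it. */
    for (letter = 'a'; letter < 'z'; letter++) {
        fputc(letter, file);
    }
    fputc('\n', file);

    /* fputs: a whole string, with no new line added automatically */
    fputs("A plain string\n", file);

    /* fprintf: like printf, but writing into the stream */
    fprintf(file, "%i + %i = %i\n", 1, 5, 1 + 5);

    fclose(file);
    return 0;
}
```

After calling write_demo with a writable path, the file holds three lines: the letters a to y, the plain string, and "1 + 5 = 6".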


As you can see, I've used fputc, fputs and fprintf to write into a file; let's explain how each one works. First of all, as when reading a file, we must open a stream (line 7), this time in write mode, to write to the file through the FILE object. Then, at line 11, I've used a for-loop to write all the letters into the file on a single line (a single line, as there is no new line character until we exit the for-loop). If you look carefully, you'll see that not all the letters will be printed: you'll miss the 'z', as the condition to stay inside the for-loop is to be less than 'z'. But how can I loop over a character? Each character has an associated ASCII code, and that's what we iterate over. In this case, we iterate from 97 (the value of 'a') up to, but not including, 122 (the value of 'z'). With fputs we print into the file the string which is passed as argument. There is no new line at the end of what fputs writes, so it must be added explicitly. And finally, fprintf prints a formatted string with variables (if you want). If no variables are given to fprintf, it does the same as fputs, as long as the text contains no % character. Here is the result:


Functions.

Until now, I've written all the code in the main function, which is a bad practice. That's why I'm now introducing functions and how to use them. With functions you can move your code into a function of your own (just like printf) and keep a minimum of code in the main function, which makes the application more readable. A function is composed of:
  • The return type. What your function will return when you call it (it can be void).
  • The function name. It should describe concisely what the function does.
  • The input parameters. A list of the values the function needs in order to do its work, passed in when calling it.
  • And the function body. Where all the function code will be placed.
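As a minimal sketch of these four parts, here are two small functions named sum and difference. The declarations at the top are an assumption about layout: they let code placed earlier in the file, such as main, call the functions before their bodies appear.

```c
/* Declarations: return type, name and input parameters. */
int sum(int a, int b);
int difference(int a, int b);

/* Returns the sum of two integers. */
int sum(int a, int b)
{
    return a + b;
}

/* Returns the first integer minus the second one. */
int difference(int a, int b)
{
    return a - b;
}
```

They are called like any other C function: sum(1, 5) gives 6 and difference(5, 1) gives 4.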

In the previous example, I've created a sum function at line 17 and a difference function at line 21. Each one returns an integer and has two integers as input parameters. This time, the return statement returns the expected result of each function (the sum of two integers in the first case, and the difference of two integers in the second). Using them is as easy as seen in lines 8 and 9 (just like other C functions). To avoid some warnings, I had to create a file main.h with the declaration of each function and include it in main.c at line 2.


Header files will be explained in another article, but basically they let the compiler see all the functions that exist in your application, and they are processed during the preprocessing step of the compilation.

In the previous example, I've used functions whose input parameters are values. Another way to declare (and use) functions is with input parameters as references. What's the difference? In our case, I only needed the values 1 and 5, no more. But I could have needed a list of values and the sum of all of them; in that case, I need the reference (the memory address) of the first element, and inside the function I iterate over the list to obtain the total sum:
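A minimal sketch of this idea, assuming a hypothetical function sum_list that receives the address of the first element together with the number of elements, and a helper sum_of_four that builds a four-element list with malloc. The values 1 to 4 are made up for illustration.

```c
#include <stdlib.h>

/* Sums length integers starting at the address held in values. */
int sum_list(const int *values, int length)
{
    int total = 0;
    int i;

    for (i = 0; i < length; i++) {
        total += values[i];     /* same syntax as with an array */
    }

    return total;
}

/* Allocates a list of 4 integers, fills it and returns its sum.
   Returns -1 if the memory could not be allocated. */
int sum_of_four(void)
{
    int *values = malloc(4 * sizeof(int));  /* room for 4 integers */
    int total;

    if (values == NULL) {
        return -1;
    }

    values[0] = 1;
    values[1] = 2;
    values[2] = 3;
    values[3] = 4;

    total = sum_list(values, 4);

    free(values);   /* give the memory back once it is no longer needed */
    return total;
}
```

Here sum_of_four() gives 10. The same sum_list function also works with an ordinary array, since the array name is passed as the address of its first element.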


I've modified the previous example so that the sum function has an input parameter passed by reference. First of all, I've initialized the variable at line 8 (malloc needs the include of line 2). What's malloc? Memory allocation. It allocates the amount of memory specified; in this case, as we need 4 elements in the list, I've multiplied 4 by the size of the data type stored (integers). Memory allocation is needed whenever a pointer must have storage of its own instead of pointing to an existing variable: it reserves the next N bytes for your use, so the application does not write other values into them. Memory obtained this way should be released with free once you are done with it. Accessing the values of the list is as simple as with arrays (I could use * and &, but I think this way is easier to understand for now). When passing input parameters by reference, you always need to pass the length of the data too, as there is no other way to know how far you have to iterate.
