1 Lecture 2: Elements and syntax Ali reza Masoum amasoum@alum. sharif Urmia University Of Technology 2007
2 Lexical elements of C Basic vocabulary consists of tokens, of which there are six kinds: keywords (reserved words you can’t use for anything else) identifiers (e.g. variable names, function names like “main”, “cos” ...) constants (e.g. the number 5) string constants (e.g. “Hello\n”) operators (e.g. +, -, =, and the parentheses following function names) punctuators (e.g. {})
3 Comments Comments are strings of symbols placed between the delimiters /* and */. The compiler changes each comment into a single blank character. /* */ /* A multi-line comment */ /************************************** * A fancy framed * * comment, for example * * for the title of the program * **************************************/
4 Keywords Keywords are reserved words with strict meanings that may not be redefined or used in other contexts. They are: auto break case char const continue default do double else enum extern float for goto if int long register return short signed sizeof static struct switch typedef union unsigned void volatile while. Some of them name data types (e.g. char, int, float, double); others control the flow of the program (e.g. if, else, for, while, switch).
5 Machine Storage A bit is simply the smallest unit of storage. It holds either a ‘1’ or a ‘0’. A byte has 8 bits. Bits and bytes are always of length 1 and 8 respectively. A machine word (usually simply called word) is 4 bytes (on most machines). In C, an identifier (variable) assigns a name to an area of memory. The statement: int inches, feet, fathoms; provides names for three different areas of memory. Identifiers of type int tell the compiler that the memory will hold integers (whole numbers).
6 Identifiers An identifier – a variable name or function name – is a sequence of letters, digits and underscores; no other characters. Rules: The first character must not be a digit; you are also strongly advised not to begin identifiers with an underscore, as it may cause conflict with some system names. Lower- and uppercase characters are distinct. Give variables names that make sense! Call them things like “relative_angle” or “spin_z”, not just “q” or “spz” which won’t make immediate sense to you later on. Names of standard library functions such as printf should not normally be redefined. Note that on some old systems, only the first 8 characters of an identifier are used; in ANSI C, at least the first 31 are used. Exercise: Which of the following are identifiers? k _id _ parameter#1 101_south am_i_an_identifier how-about-me A _x
7 More on Identifiers In the depth program, inches, feet and fathoms are variables. A variable is an area of storage of a specified data type, named by an identifier. Statements read, set and update the values of variables. Our variables are of type int (i.e. integers). In algebra, the = symbol means two things have the same value. In C (and a lot of other programming languages) it means to assign the value of the expression on the right to the variable on the left. fathoms = 7; assigns the constant value 7 to the variable fathoms. feet = 6 * fathoms; multiplies the value of fathoms by 6 and assigns the result to the variable feet.
8 Identifiers An identifier is a sequence of letters, digits, and the special character _. A letter or underscore must be the first character of an identifier. For this class, don't use identifiers that begin with an underscore. C is case-sensitive: Apple and apple are two different identifiers.
9 Identifiers (cont.) Valid variable names: n x _id num1 a_long_identifier Invalid variable names: var.1 num!2 not#this 126East +more
10 Declarations All variables must be declared before use. A declaration specifies a type, and contains a list of one or more variables of that type. int lower, upper, step; int c, line; A variable may be initialized in its declaration. int i = 0;
11 Constants In addition to variables, which change as the program runs, you will often need to use constants, e.g. e = e-19, that do not change. The kinds of constants are: floating-point constants, e.g. 123.4e-2; decimal integer constants, e.g. 123; octal integer constants, which start with 0, e.g. 0123 (decimal 83); hexadecimal integer constants, which start with 0x, e.g. 0x123 (decimal 291); character constants, e.g. 'a' and '\n' (newline, an "escape" character); constant expressions, e.g. -123.
12 String constants A sequence of characters enclosed by double quotes, e.g. "abc", is a string constant or string literal. It is stored by the compiler as an array of characters, ending with the null character '\0'. String constants are not the same as character constants: thus, 'a' is different from "a": the latter is an array of two characters (the letter a and the terminating '\0'), whereas the former is just a character. If a double quote is to appear in a string, it must be preceded by a backslash ("\""); likewise the backslash character itself ("\\"): "This string contains one \\ backslash". String constants may not include a (literal) linefeed: thus, a quoted text such as "this is not a string" that is broken across two lines of the source file is not a string. If you want to stretch your string over a line break, you can end the line with a backslash; this has the effect of continuing it to the next line: "this is \ a string". String constants separated by white space are concatenated into one string: "this " "is a string" is equivalent to "this is a string".
13 Operators Arithmetic operators: +, -, *, /, %. The operators +, -, *, / are for addition, subtraction, multiplication, and division respectively. % is for modulus (divide and take the remainder): 5 % 3 is 2, 7 % 2 is 1. Logical operators: !, ||, &&. ! is NOT, || is OR, && is AND. There are also other operators to manipulate bits, addresses ...
14 Assignment Operator = The assignment operator = in C looks like (and is usually referred to by the same name as) the mathematical equals sign; but they are not equivalent. The mathematical equation x + 2 = 0 is not a legal assignment expression, since the left-hand side is an expression, not a variable, and it may not be assigned a value in this way. In contrast, the statement x = x + 1; is perfectly legal in C; it adds 1 to the current value of x, and assigns the new value to x. Mathematically, though, it makes no sense, as you can see by subtracting x from both sides. Assignment / arithmetic shortcuts: x += 1 means x = x+1; x *= 5 means x = x*5; x -= 2 means x = x-2; x /= 3 means x = x/3 (the right side is always evaluated first).
15 Increment and Decrement Operators The increment and decrement operators are ++ and --, respectively. They can be used as both prefix and postfix operators. Example: c = 1; a = ++c; printf("a = %d, c = %d", a,c); increments c, so c becomes 2, and gives a the new value, 2, so it prints out a = 2, c = 2. In contrast, c = 1; a = c++; printf("a = %d, c = %d", a,c); sets a to be the present value of c, i.e. 1, and then increments c by 1: it prints out a = 1, c = 2.
16 Arithmetic Operators The binary arithmetic operators are +, -, *, /, and the modulus operator %. x % y produces the remainder when x is divided by y. 11 % 5 = 1 20 % 3 = 2 Arithmetic operators associate left to right.
17 Precedence and Associativity of Operators Operator precedence: x = 1 + 2 * 3; (What is the value of x?) x = 1 + (2*3); x is 7 x = (1+2) * 3; x is 9 Associativity (left to right): 10 - 3 + 7 is evaluated as (10 - 3) + 7; 15 / 5 * 2 is evaluated as (15 / 5) * 2
18 Assignment Operator The general form is variable = expression. The expression can simply be a constant or a variable: int x, y; x = 5; y = x; x = 6; The expression can be an arithmetic expression: x = y + 1; y = x * 2;
19 Assignment Compatibility int x; double y; x = y; error: the fractional part of y is lost (C accepts the assignment and silently truncates). y = x; it is okay. Likewise, assigning a floating-point constant to x loses the fraction, while assigning an integer constant to y is okay. x = 10/4; x is 2. y = 10/4; y is 2.0 y = 10/4.0; y is 2.5
20 Increment and Decrement Operators The increment operator ++ adds 1 to its operand. ++i; equiv. i = i + 1; i++; equiv. i = i + 1; The decrement operator -- subtracts 1 from its operand. --i; equiv. i = i - 1; i--; equiv. i = i - 1;
21 Increment and Decrement Operators Suppose n = 5. Each of the following statements, used on its own, sets n to 6: n++; n = n + 1; ++n;
22 Increment and Decrement Operators (cont.) When ++a is used in an expression, a is incremented first and the new value is used in the expression. When a++ is used, the expression is evaluated with the current value of a and then a is incremented. Similarly with --a and a--.
23 Increment and Decrement Operators (cont.) Suppose n = 5. x = n++; /* sets x to 5 and n to 6 */ is equivalent to 1. x = n; 2. n = n + 1; whereas x = ++n; /* sets x and n to 6 */ is equivalent to 1. n = n + 1; 2. x = n;
24 /* Preincrementing and postincrementing. */ #include
25 /* increment and decrement expressions. */ #include
26 Arithmetic Assignment Operators Assume: int c = 3, d = 5, e = 4, f = 6, g = 12;
Operator   Expression   Explanation   Assigns
+=         c += 7       c = c + 7     10 to c
-=         d -= 4       d = d - 4     1 to d
*=         e *= 5       e = e * 5     20 to e
/=         f /= 3       f = f / 3     2 to f
%=         g %= 9       g = g % 9     3 to g
27 Expressions Expressions are sequences of constants, variables, and function calls which may be combined by operations. Example expressions: 7 /* Constant expression */ fathoms /* Variable expression */ 6 * fathoms /* expression * expression */ 12 * feet /* Also expression * expression */ Operations include addition (+), subtraction (-), multiplication (*), division (/) and modulus (%). The expression 10+3 has value 13, 10-3 has value 7, 10*3 has value 30, and 10%3 has value 1 (the remainder when 10 is divided by 3). An expression having a binary operation on two ints is understood to be of type int. Thus the expression 10/3 has the value 3 (the fractional part is discarded).
28 Precedence and Associativity An operator is something that performs an operation on one (unary operators ++, --, !) or two (binary operators +, -, *, /) operands to produce a result. In a given expression, some operators have priority. Rule 1: Multiplication and division are done before addition and subtraction. Thus, 1 + 2 * 3 is 7, not 9. Example: j *= k + 3; is equivalent to j = j * (k + 3); and not to j = j * k + 3; this is because + has higher precedence than *=. A clear case for using parentheses! Rule 2: For anything else, use brackets to avoid bugs! The table lists the operators from highest priority (top) to lowest (bottom):
Operators                                        Associativity
( )  ++ (postfix)  -- (postfix)                  Left to right
+ (unary)  - (unary)  ++ (prefix)  -- (prefix)   Right to left
*  /  %                                          Left to right
+  -                                             Left to right
<  <=  >  >=                                     Left to right
==  !=                                           Left to right
&&                                               Left to right
||                                               Left to right
?:                                               Right to left
=  +=  -=  *=  /=  %=                            Right to left
29 Assignment Expressions The assignment expression k = 3 is actually a self-contained unit that has the value 3. The whole expression (not just the variable k) has a value (in this case, 3). Example 1: b = 2; c = 3; a = b + c; may be condensed to a = (b = 2) + (c = 3); the expression (b = 2) not only assigns the value 2 to the variable b, but also has as a whole the value 2; likewise (c = 3) has as a whole the value 3. Example 2: a = b = c = 1; is shorthand for a = (b = (c = 1)); first c, and then the expression (c = 1) are assigned the value 1; then b, and in turn the expression (b = 1) and finally the variable a are all assigned the value 1. Note that the associativity is right-to-left: as the operators all have equal precedence, operations are carried out in that order.
30 Statements If a semicolon is put at the end of an expression, it becomes a statement. Note that statements do not have a value. Thus, in the statement k = 3; the expression k = 3 does two things: it assigns the value 3 to the variable k, and it takes the value 3 as a whole. However, the statement "k=3;" (with semicolon) does not have a value. A compound statement groups statements in braces: { a = 1; b = 2; } A semicolon on its own, ; is the empty statement.
31 Binary Counting A decimal number (e.g. 38742) is written as $d_n d_{n-1} \ldots d_1 d_0$, where the $d_i$ represent individual digits, and it may be evaluated as $d_n \cdot 10^n + d_{n-1} \cdot 10^{n-1} + \cdots + d_1 \cdot 10^1 + d_0 \cdot 10^0$. Likewise, in binary, a number $d_n d_{n-1} \ldots d_1 d_0$ may be evaluated as $d_n \cdot 2^n + d_{n-1} \cdot 2^{n-1} + \cdots + d_1 \cdot 2^1 + d_0 \cdot 2^0$; but the $d_i$ here are either 0 or 1. Computers store and manipulate their data in this format. Thus, a number is represented on an 8-bit machine by the eight binary digits $d_7 \ldots d_0$ of this sum.
32 Data Types At the start of a function, all variables must be declared, and optionally they may be initialized: int a, b = 12, c; float x, y = 10.5, z = -6.0; This tells the compiler to set aside an appropriate amount of space in memory to store the values associated with each variable; it also enables the compiler to instruct the machine to perform specific operations correctly. At the machine level, adding integers is a different operation from adding floating-point numbers. The fundamental data types are: Character variables: char, signed char, unsigned char. Integer variables: short, int, long. Unsigned integer: unsigned short, unsigned, unsigned long. Floating-point: float, double, long double.
33 Size of Data Types (unix)
char           1 byte
short int      2 bytes
int            4 bytes
long int       4 bytes
unsigned int   4 bytes
float          4 bytes
double         8 bytes
long double    16 bytes
* 8 bits = 1 byte
34 Data Type and Sizes Characters are treated as small integers. char c = 'A'; /* 'A' has ASCII encoding 65 */ int i = 65; /* 65 is ASCII encoding for 'A' */
35 Fundamental Data Types There are two types of representation for integer numbers: unsigned and signed. char: unsigned char, signed char. int: signed int, unsigned int. In the signed types the sign bit is 0 for a positive number and 1 for a negative number.
36 Data Types and Sizes An int lies between -2,147,483,648 and 2,147,483,647 if it is stored in 4 bytes, and between -32,768 and 32,767 if it is stored in 2 bytes. There are a number of qualifiers that can be applied to these basic types. short and long can be applied to integers. short int sh; long int counter; The intent is that short and long should provide different lengths of integers where practical (e.g. to save memory). Other qualifiers such as unsigned and signed may be applied to char or int.
37 Data Type int Old machines used 16 bits (2 bytes) to store int variables; most machines nowadays use 32 (or occasionally 64) bits. Of these n bits, one is used to store the sign, and the remainder to store the value itself. (The data type unsigned assumes the number is positive, and uses the extra bit to double its range.) It is possible, when the programmer is careless, for numbers to exceed the possible range of values that may be stored in a variable, in which case an integer overflow occurs; the program will usually continue to run, but with results that are nonsense. In addition to decimal integer constants, there are also hexadecimal (base 16; digits 0-9, followed by A-F; specified by leading 0x, as in 0x1a) and octal (base 8; digits 0-7; specified by leading 0, as in 053). Suffixes can be appended to integer constants to specify their type: e.g., 37U specifies an unsigned integer constant, 456ul an unsigned long integer constant.
38 Characters Variables of any integral type – in particular, char and int – can be used to represent character variables. In C there are no constants of type char: characters such as ‘a’ and ‘+’ are of type int. In addition to representing characters, a variable of type char can be used to hold small (1 byte) integer values. Most machines use the ASCII character code to represent letters and other characters as small integer numbers. Note that there is no correspondence between a character representing a digit and that digit’s intrinsic value: the value of ‘2’ is not 2 but 50. Letters do appear in alphabetical order, which makes sorting much easier. Some characters are the so-called escape codes; e.g. the alert character (beep) \a has integer value 7.
39 The Data Type char Each character is stored in a machine in one byte (8 bits). 1 byte is capable of storing $2^8$ or 256 distinct values. When a character is stored in a byte, the contents of that byte can be thought of as either a character or as an integer.
40 The Data Type char A character constant is written between single quotes. ‘a’ ‘b’ A declaration for a variable of type char is char c; Character variables can be initialized char c1=‘A’, c2=‘B’, c3=‘*’;
41 In C, a character is considered to have the integer value corresponding to its ASCII encoding.
lowercase    'a' 'b' 'c' ... 'z'    ASCII value  97 98 99 ... 122
uppercase    'A' 'B' 'C' ... 'Z'    ASCII value  65 66 67 ... 90
digit        '0' '1' '2' ... '9'    ASCII value  48 49 50 ... 57
other        '&' '*' '+' ...        ASCII value  38 42 43 ...
42 Characters and Integers There is no relationship between the character '2' (which has the ASCII value 50) and the constant number 2. '2' is not 2. 'A' to 'Z' have the values 65 to 90; 'a' to 'z' have the values 97 to 122. Examples: printf("%c",'a'); printf("%c",97); have the same output. printf("%d",'a'); printf("%d",97); also have the same output.
43 The Data Type char Some nonprinting and hard-to-print characters require an escape sequence. For example, the newline character is written as \n and it represents a single ASCII character.
Name of character   Written in C   Integer value
alert               \a             7
backslash           \\             92
double quote        \"             34
horizontal tab      \t             9
44 Floating Types There are three floating data types: float, double and long double. The default working floating type in C is double, not float. Suffixes may be appended to floating constants to specify a type other than double: a constant ending in f is a float, and 3.14l is a long double. Integers may be represented by floating constants, but they must be written with a decimal point (123.). Exponential notation is also available: a trailing e6 means multiplication by $10^6$. A float is usually stored in 4 bytes, and is therefore accurate to typically 6 decimal places, with a range of approximately $10^{-38}$ to $10^{38}$. A double is usually accurate to about 15 decimal places, and has a range of approximately $10^{-308}$ to $10^{308}$.
45 The Floating Types A float on many machines has an approx. range of $10^{-38}$ to $10^{38}$. A double on many machines has an approx. range of $10^{-308}$ to $10^{308}$. The working floating type in C is double.
46 Precedence of Operators in C
47 Types in C
48 Typecasting The implicit type of an expression is not always what one would expect; it is determined by the types of the operands: int i=4,j=3; float x,y; y = i/j; since only integer numbers are involved, the result of the division is 1, not 1.333... To obtain the floating-point result, the type of a sub-expression must be explicitly specified, by putting the desired type in parentheses in front of it: x = (float) i/j; casts i, and therefore the division, to float. Any types can be cast in this way, as long as they are compatible: y = (float) ( (int) x + i ); but not y = (unsigned) ( - fabs(x) ); (result undefined)
49 Conversions Consider the following expressions. x + y where x and y are of type int; x + y where x and y are of type short. In both situations, x + y has type int: operands of type short are first converted to int. In general, if all the values of the original type can be represented by an int, the value is converted to an int; otherwise it is converted to an unsigned int (the integral promotion).
50 Arithmetic Conversions Arithmetic conversions can occur when the operands of a binary operator are evaluated. In i + f, where i is an int and f is a float, i is converted to float and the result is float.
51 Examples
52 Type Casting In addition to implicit conversions, there are explicit conversions called casts. If i is an int, then (double) i will cast the value of i so that the expression has type double. Casts can be applied to expressions. x = (float)((int)y+1); y = (float) i + 3;
53 Common errors Assume a machine with a 4-byte word. int a = 1, b = 2147483648; the initializer of b is one more than the largest 4-byte int, 2147483647, so it does not fit. Dividing two integers will always give an integer: int j = 2, k = 5; j/k is 0; (double) j/k is 0.4.
54 The sizeof Operator The operator sizeof yields the number of bytes needed to store an object. Such storage requirements vary from machine to machine, but it is always the case that sizeof(char) = 1 sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) sizeof(signed) = sizeof(unsigned) = sizeof(int) sizeof(float) <= sizeof(double) <= sizeof(long double) The value yielded by sizeof has the unsigned integer type size_t, as there's no need to make allowance for the possibility of negative sizes!
55 The Use of typedef C provides the typedef facility so that an identifier can be associated with a specific type. For example, typedef int color; makes color a type that is synonymous with int, and it can be used in declarations just as other types are used.
56 printf Example printf(“%12.2e divided by %12.2e is %12.2e.\n”,x,y,x/y); The argument to printf( ) has two parts: the control string, and the rest. The control string contains specifiers such as %c, %d and so on. Control chars also can include precision and width specifiers: %12.2f specifies a number of width (at least) 12, of which 2 are decimal places. Each control specifier must correspond to an argument in the remainder of the list. The function itself returns the number of characters printed, although that is not often used.
57 printf() conversion characters
58 scanf Example scanf("%d%f",&x,&y); The function scanf( ) likewise has a control string and an arbitrary number of other arguments. The other arguments must be addresses in memory: places to write to (i.e., usually preceded by &, although the name of an array is an address in itself so doesn't need the &). To input a value for the variable x, scanf must know where x is stored: if the command were scanf("%d", x), the function would receive only the current value of x, not its location. With scanf("%d", &x), you give scanf the address of x directly. Forgetting the & is a common source of errors amongst novice programmers. White space in the control string matches any amount of white space in the input stream; most conversions skip leading white space and then read non-white-space characters. It is assumed that enough space has been allocated to hold any strings that are read in: this is the programmer's responsibility. The function returns the number of successful conversions performed.
59 scanf() scanf() is a function that reads input from the keyboard. It takes multiple arguments, the first is a control string, the rest are identifiers that should receive the keyboard input. /* Read an integer and store it in the identifier labeled x */ scanf(“%d”, &x); /* Note the %d is a format as in printf() */ Note the & must precede the variable name. It is a common but deadly mistake to forget it. This is a tough one to spot and difficult to debug, so be careful. The scanf() function needs its format to match the type of argument. The conversion characters are c, d, f, lf, Lf, and s and they match char, decimal integer, float, double, long double, and string.
60 scanf() example program #include
61 Mathematical functions There are no built-in mathematical functions in C. Functions such as sqrt() exp() log() sin() cos() tan() are defined in the mathematics library. All of these take one argument of type double, and return a value of type double; pow() takes two arguments, base and exponent, of type double and returns a double. In order to use functions from the standard maths library, you should #include <math.h>
62 Programming style You should write your code in a tidy way to make it nicely readable. In particular, make sure your indentation is correct: this will take care of a lot of debugging for you. Whenever you open a curly bracket (i.e. you start a compound statement), you should indent further, and keep the lines underneath at the same level until you either open a new bracket or close the existing one. xemacs knows the rules about how to indent properly. Hit the tab key on every line of your code, and xemacs will indent it properly for you. Missing semicolons, parentheses that have been opened but not closed and so on will then stick out, as they will appear to be badly indented. Every program you write should have comment lines at the top stating the name, author, date and purpose of the code. Every function in your program should have a comment stating its purpose. Include plenty of comments throughout the code to explain what's going on. Don't use global variables, i.e. variables outside main( ), if there is no compelling reason.
63 An example of really bad C style A valid C-program (code in obscure.c) #include
64 How not to lose marks on your homework Read the good programming practice / structure of a C program notes. Even though the compiler doesn't care, marks will be deducted for poor indentation and for poorly-commented code. Code must have a comment at the top stating author, date, purpose. Don't use global variables inappropriately. Always free dynamically-allocated memory space before finishing the program (you'll learn about this later on). Always close files you have opened (also later on). And of course your code should compile, and the compiled code should run!