Thursday, 13 February 2014
LITTLE ENDIAN vs BIG ENDIAN
For example, consider a 4-byte LongInt whose bytes, from most significant to least significant, are:
Byte3 Byte2 Byte1 Byte0
Intel processors (those used in PCs) use "Little Endian" byte order, in which the least significant byte is stored at the lowest memory address. Our LongInt would then be stored as:
Base Address+0 : Byte0
Base Address+1 : Byte1
Base Address+2 : Byte2
Base Address+3 : Byte3
Motorola processors (those used in older Macs) use "Big Endian" byte order, in which the most significant byte is stored at the lowest memory address. The same LongInt would be stored as:
Base Address+0 : Byte3
Base Address+1 : Byte2
Base Address+2 : Byte1
Base Address+3 : Byte0
An advantage of Little Endian is that, because of the 1:1 relationship between address offset and byte number (offset 0 is byte 0), multiple-precision math routines are correspondingly easy to write.
Common file formats differ in their endian order as well: for example, BMP and GIF files are Little Endian, JPEG files are Big Endian, and TIFF files may be either (the byte order is recorded in the file header).
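As a quick illustration (not part of the original post), here is a minimal C sketch that detects the byte order of the machine it runs on by inspecting the first byte of a known 32-bit value:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t value = 0x01020304;                 /* Byte3 = 0x01 ... Byte0 = 0x04 */
    unsigned char *p = (unsigned char *)&value;  /* examine the value byte by byte */

    if (p[0] == 0x04)
        printf("Little Endian: lowest address holds Byte0 (0x%02X)\n", p[0]);
    else
        printf("Big Endian: lowest address holds Byte3 (0x%02X)\n", p[0]);

    return 0;
}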
Wednesday, 22 January 2014
Multi Threading Basics
MULTI THREADING:
In most modern operating systems it is possible for an application to split itself into many "threads" that all execute concurrently. It might not be immediately obvious why this is useful, but there are several good reasons.
When a program is split into many threads, each thread acts like its own individual program, except that all the threads work in the same memory space, so all their memory is shared. On a machine with more than one CPU, multiple threads can run on multiple CPUs at the same time, providing a performance improvement.
Multithreading is the ability of a program or an operating system process to manage its use by more than one user at a time, and even to manage multiple requests by the same user, without having to run multiple copies of the program on the computer.
Each user request for a program or system service (and here a user can also be another program) is kept track of as a thread with a separate identity.
As the program works on behalf of the initial request for that thread and is interrupted by other requests, the status of that thread's work is tracked until the work is completed.
A multithreaded application works just as well on a single-CPU system, but without the added speed. As multi-core processors become commonplace, such as Dual-Core processors and Intel Pentium 4's with HyperThreading, multithreading will be one of the simplest ways to boost performance.
Secondly, and often more importantly, it allows the programmer to divide each particular job of a program up into its own piece that operates independently of all the others. This becomes particularly important when many threads are doing blocking I/O operations.
A media player, for example, can have a thread for pre-buffering the incoming media, possibly from a hard drive, CD, DVD, or network socket, a thread to process user input, and a thread to play the actual media. A stall in any single thread won't keep the others from doing their jobs.
For the operating system, switching between threads is normally cheaper than switching between processes. This is because the memory management information doesn't change between threads, only the stack and register set do, which means less data to copy on context switches.
Multithreaded applications often require synchronization objects. These objects are used to protect memory from being modified by multiple threads at the same time, which might make the data incorrect.
The first, and simplest, is an object called a mutex. A mutex is like a lock. A thread can lock it, and then any subsequent attempt to lock it, by the same thread or any other, will cause the attempting thread to block until the mutex is unlocked.
These are very handy for keeping data structures correct from all the threads' points of view. For example, imagine a very large linked list. If one thread deletes a node at the same time that another thread is trying to walk the list, it is possible for the walking thread to fall off the list, so to speak, if the node is deleted or changed.
Using a mutex to "lock" the list keeps this from happening. Computer scientists will tell you that "mutex" stands for mutual exclusion.
Technically speaking, only the thread that locks a mutex can unlock it, but sometimes operating systems will allow any thread to unlock it. Doing this is, of course, a Bad Idea.
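As a small illustration (not from the original post, and assuming POSIX threads as the API), here is a minimal sketch of two threads incrementing a shared counter under a mutex; compile with -pthread:

#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&counter_lock);   /* enter the critical section */
        counter++;                           /* only one thread at a time touches counter */
        pthread_mutex_unlock(&counter_lock); /* leave the critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);      /* always 200000 with the mutex in place */
    return 0;
}

With the lock and unlock calls removed, the two threads' increments can interleave, and the final count would typically come out below 200000.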
Similar to the mutex is the semaphore. A semaphore is like a mutex that counts instead of locks. If it reaches zero, the next attempt to access the semaphore will block until someone else increases it. This is useful for resource management when there is more than one resource, or if two separate threads are using the same resource in coordination. Common terminology for using semaphores is "uping" and "downing", where upping increases the count and downing decreases and blocks on zero.
Unlike mutexes, semaphores are designed to allow multiple threads to up and down them all at once. If you create a semaphore with a count of 1, it will act just like a mutex, with the ability to allow other threads to unlock it.
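Again purely as an illustration (assuming POSIX semaphores, which the post does not name), a counting semaphore initialized to the size of a resource pool lets at most that many threads hold a resource at once; sem_wait is the "down" and sem_post the "up" described above:

#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

#define POOL_SIZE 3

static sem_t pool;                        /* counts how many resources are still free */

static void *worker(void *arg)
{
    long id = (long)arg;
    sem_wait(&pool);                      /* down: blocks if all resources are in use */
    printf("thread %ld acquired a resource\n", id);
    /* ... use the resource ... */
    sem_post(&pool);                      /* up: release the resource */
    return NULL;
}

int main(void)
{
    pthread_t threads[5];
    sem_init(&pool, 0, POOL_SIZE);        /* start with POOL_SIZE resources available */
    for (long i = 0; i < 5; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);
    for (int i = 0; i < 5; i++)
        pthread_join(threads[i], NULL);
    sem_destroy(&pool);
    return 0;
}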
The third and final structure is the thread itself. More specifically, thread identifiers. These are useful for getting certain threads to wait for other threads, or for getting threads to tell other threads interesting things.
Computer Scientists like to refer to the pieces of code protected by mutexes and semaphores as Critical Sections. In general, it's a good idea to keep Critical Sections as short as possible to allow the application to be as parallel as possible. The larger the critical section, the more likely it is that multiple threads will hit it at the same time, causing stalls.
Look Ahead in Parser
Look Ahead in Parsing:
Lookahead establishes the maximum number of incoming tokens that a parser can use to decide which rule it should use. Lookahead is especially relevant to LL, LR, and LALR parsers, where it is often explicitly indicated by affixing the lookahead to the algorithm name in parentheses, such as LALR(1).
Most programming languages, the primary target of parsers, are carefully defined in such a way that a parser with limited lookahead, typically one, can parse them, because parsers with limited lookahead are often more efficient. One important change to this trend came in 1990 when Terence Parr created ANTLR for his Ph.D. thesis, a parser generator for efficient LL(k) parsers, where k is any fixed value.
Parsers typically have only a few actions after seeing each token. They are shift (add this token to the stack for later reduction), reduce (pop tokens from the stack and form a syntactic construct), end, error (no known rule applies) or conflict (does not know whether to shift or reduce).
Lookahead has two advantages.
1. It helps the parser take the correct action in case of conflicts, for example resolving the dangling-else problem: deciding which "if" an "else" clause belongs to.
2. It eliminates many duplicate states and eases the burden of an extra stack. A C language non-lookahead parser will have around 10,000 states, while a lookahead parser will have around 300 states.
Example: Parsing the Expression 1 + 2 * 3
The set of expression parsing rules (the grammar) is as follows:
Rule1: E -> E + E (an expression is the sum of two expressions)
Rule2: E -> E * E (an expression is the product of two expressions)
Rule3: E -> number (an expression is a simple number)
Rule4: + has lower precedence than *
Most programming languages (except for a few such as APL and Smalltalk) and algebraic formulas give higher precedence to multiplication than addition, in which case the correct interpretation of the example above is (1 + (2*3)). Note that Rule4 above is a semantic rule. It is possible to rewrite the grammar to incorporate this into the syntax. However, not all such rules can be translated into syntax.
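For instance, one standard rewrite (not part of the original rule set) folds the precedence of Rule4 into the syntax by introducing separate non-terminals for terms and factors:

E -> E + T | T      (addition, lowest precedence)
T -> T * F | F      (multiplication binds tighter)
F -> number

With this grammar, 1 + 2 * 3 can only be derived as 1 + (2 * 3), so no separate precedence rule is needed.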
Simple non-lookahead parser actions:
Initially Input = [1,+,2,*,3]
Shift "1" onto stack from input (in anticipation of rule3). Input = [+,2,*,3] Stack = [1]
Reduces "1" to expression "E" based on rule3. Stack = [E]
Shift "+" onto stack from input (in anticipation of rule1). Input = [2,*,3] Stack = [E,+]
Shift "2" onto stack from input (in anticipation of rule3). Input = [*,3] Stack = [E,+,2]
Reduce stack element "2" to Expression "E" based on rule3. Stack = [E,+,E]
Reduce stack items [E,+,E] to "E" based on rule1. Stack = [E]
Shift "*" onto stack from input (in anticipation of rule2). Input = [3] Stack = [E,*]
Shift "3" onto stack from input (in anticipation of rule3). Input = [] (empty) Stack = [E,*,3]
Reduce stack element "3" to expression "E" based on rule3. Stack = [E,*,E]
Reduce stack items [E,*,E] to "E" based on rule2. Stack = [E]
The parse tree and the code resulting from it are not correct according to the language semantics: the parser reduced 1 + 2 before considering the *, so the expression is evaluated as (1 + 2) * 3 = 9 instead of 1 + (2 * 3) = 7.
To correctly parse without look ahead, there are three solutions:
1. The user has to enclose expressions within parentheses. This often is not a viable solution.
2. The parser needs extra logic to backtrack and retry whenever a rule is violated or incomplete. A similar method is followed in LL parsers.
3. Alternatively, the parser or grammar needs extra logic to delay reduction and reduce only when it is absolutely sure which rule to reduce first. This method is used in LR parsers. It correctly parses the expression, but with many more states and increased stack depth.
Lookahead parser actions:
Shift 1 onto stack on input 1 in anticipation of rule3. It does not reduce immediately.
Reduce stack item 1 to the expression "E" on input + based on rule3. The lookahead is +, so we are on the path to E +, and it is safe to reduce the stack to E.
Shift + onto stack on input + in anticipation of rule1.
Shift 2 onto stack on input 2 in anticipation of rule3.
Reduce stack item 2 to the expression "E" on input * based on rule3. The lookahead is *, which can only be preceded by an expression E.
Now the stack has E + E and the next input is still *. The parser has two choices: shift based on rule2 or reduce based on rule1. Since * has higher precedence than + (rule4), it shifts * onto the stack in anticipation of rule2.
Shift 3 onto stack on input 3 in anticipation of rule3.
Reduce stack item 3 to Expression after seeing end of input based on rule3.
Reduce stack items E * E to E based on rule2.
Reduce stack items E + E to E based on rule1.
The parse tree generated is correct, and the parse is simply more efficient than with non-lookahead parsers. This is the strategy followed in LALR parsers.
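As a rough sketch (a hand-written recursive-descent parser, not the LALR machinery described above), the following C program shows how a single character of lookahead (peek) is enough to give * higher precedence than + when evaluating 1+2*3:

#include <stdio.h>

static const char *input;          /* current position in the expression */

static char peek(void) { return *input; }   /* the one-token lookahead */
static char next(void) { return *input++; } /* consume the current token */

/* factor := digit */
static int parse_factor(void)
{
    return next() - '0';
}

/* term := factor ('*' factor)*   -- '*' binds tighter than '+' */
static int parse_term(void)
{
    int value = parse_factor();
    while (peek() == '*') {        /* lookahead decides whether to keep multiplying */
        next();
        value *= parse_factor();
    }
    return value;
}

/* expr := term ('+' term)* */
static int parse_expr(void)
{
    int value = parse_term();
    while (peek() == '+') {        /* lookahead decides whether to keep adding */
        next();
        value += parse_term();
    }
    return value;
}

int main(void)
{
    input = "1+2*3";
    printf("1+2*3 = %d\n", parse_expr());   /* prints 7, i.e. 1 + (2*3) */
    return 0;
}

Running it prints 1+2*3 = 7, matching the correct parse described above.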
Bitwise Operators in C
BITWISE OPERATORS:
Bitwise operators are used to manipulate one or more bits of integral operands such as char, int, short, and long.
C language supports the following bitwise operators.
| – Bitwise OR
& – Bitwise AND
~ – One’s complement
^ – Bitwise XOR
<< – left shift
>> – right shift
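As a quick illustration (not from the original post), the classic set / clear / toggle / test idioms combine these operators with shifts:

#include <stdio.h>

int main(void)
{
    unsigned char flags = 0x00;

    flags |= (1u << 3);                  /* set bit 3    -> 0000 1000 */
    flags |= (1u << 0);                  /* set bit 0    -> 0000 1001 */
    flags &= (unsigned char)~(1u << 0);  /* clear bit 0  -> 0000 1000 */
    flags ^= (1u << 7);                  /* toggle bit 7 -> 1000 1000 */

    if (flags & (1u << 3))               /* test bit 3 */
        printf("bit 3 is set, flags = 0x%02X\n", flags);

    return 0;
}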
Note on shifting signed and unsigned numbers:
When shifting, if the operand is a signed value, an arithmetic shift is used on most compilers (strictly speaking, the C standard leaves the right shift of a negative signed value implementation-defined). If the type is unsigned, a logical shift is always used.
An arithmetic shift preserves the sign bit (the MSB); a logical shift does not. Let's see this via an example.
#include <stdio.h>

int main()
{
    signed char a = -8;
    signed char b = a >> 1;   /* arithmetic shift: the sign bit is copied in */
    printf("%d\n", b);        /* prints -4 */
    return 0;
}
In the above code, we are right shifting -8 by 1. The result will be -4. An arithmetic shift is applied here since the operand is a signed value.
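In bit terms, -8 as a signed char is 1111 1000; an arithmetic right shift by 1 copies the sign bit into the vacated position, giving 1111 1100, which is -4 in two's complement.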
#include <stdio.h>

int main()
{
    unsigned char a = -8;     /* stored as the unsigned value 248 (1111 1000) */
    unsigned char b = a >> 1; /* logical shift: a zero is shifted in */
    printf("%d\n", b);        /* prints 124 */
    return 0;
}
Negative numbers are represented using the 2's complement of their positive equivalent.
The 2's complement of +8 is
1111 1000
Right shifting by 1 yields,
0111 1100 ( 124 in decimal )
The above code will result in 124 (a positive value). A logical shift is applied here since the operand is unsigned, and it does not preserve the MSB of the operand.
To summarize: an arithmetic right shift copies the sign bit into the vacated high-order bits, so a negative signed value stays negative after the shift. A logical right shift fills the vacated bits with zeros, so the most significant bit of an unsigned value is cleared after the shift.
Forward Declaration of Structures
Forward Declaration:
Forward declaration is a declaration that precedes the actual definition, usually for the purpose of being able to refer to the declared type when the full definition is not available.
Of course, not everything can be done with a declared-but-not-defined structure, but in certain contexts it is possible to use it.
Such a type is called incomplete, and there are a number of restrictions on its usage.
For example:
struct X; // forward declaration
void f(struct X *p) { } // usage of the declared, undefined structure: a pointer to it is fine
// void f(struct X) { } // ILLEGAL
// struct X x; // ILLEGAL
// int n =sizeof(struct X); // ILLEGAL
// later, or somewhere else altogether
struct X { /* ... */ };
This can be useful, for example, to break circular dependencies or to cut down compilation time, since the definitions are usually significantly larger and require more resources to parse.
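For example (a sketch with hypothetical names, not from the original post), a forward declaration breaks the circular dependency between two structures that refer to each other, since a pointer to an incomplete type is allowed:

struct Employee;                        /* forward declaration: Employee is incomplete here */

struct Department {
    const char      *name;
    struct Employee *manager;           /* pointer to an incomplete type is allowed */
};

struct Employee {
    const char        *name;
    struct Department *dept;            /* Department is already fully defined */
};

int main(void)
{
    struct Department d = { "R&D", 0 };
    struct Employee   e = { "Alice", &d };
    d.manager = &e;                     /* the circular link is now complete */
    return 0;
}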
Batch File Concepts
BATCH FILE:
These are simple text files containing some lines with commands that get executed in sequence, one after the other. These files have the special extension BAT or CMD.
Files of this type are recognized and executed through an interface (shell) provided by a system file called the command interpreter. In Windows XP/ Vista the command interpreter is the file cmd.exe.
The large assortment of versatile commands available in Windows XP/Vista/7 makes batch files a powerful tool. Constructing a batch file consists of nothing more than opening any text editor, such as the accessory Notepad, entering some lines containing commands, and saving the file with the extension BAT or CMD.
Basic Commands Are,
@echo off - Normally each command in a batch file is displayed on the screen as it executes. The command "echo off" turns off this display for the whole script, except for the "echo off" command itself. The "at" sign "@" in front makes the command apply to itself as well.
Rem - Very often batch files contain lines that start with "Rem". This is a way to enter comments and documentation. The computer ignores anything on a line following Rem.
set /p variable= [string] - "Variable" is the name of the variable that will be assigned to the data that we want the user to input. "String" is the message that the user will see as a prompt. If desired, "string" can be omitted.
Here is an example that asks the user to enter his or her name:
set /p name= What is your name?
This will create a variable %name% whose value is whatever the user enters. Note that the user must press the "Enter" key after typing the input.
Conditional branching with "If" statements:
Batch files can make decisions and choose actions that depend on conditions. This type of action makes use of a statement beginning with "If".
The basic meaning of an "If" statement is:
If something is true then do an action (otherwise do a different action)
The second part of the statement (in parentheses) is optional.
Otherwise, the system just goes to the next line in the batch file if the first condition isn't met.
The actual syntax is:
If (condition) (command1) Else (command2)
The "Else" part is optional. The form "If not" can also be used to test whether a condition is false.
Note that "If" tests for true or false in the Boolean sense.
"If exist" statement:
There is a special "If exist" statement that can be used to test for the existence of a file, followed by a command.
An example would be:
If exist somefile.ext del somefile.ext
We can also use a negative existence test:
if not exist somefile.ext echo no file
"If defined" statement:
Another special case is "if defined ", which is used to test for the existence of a variable.
For example: if defined somevariable somecommand
This can also be used in the negative form, "if not defined".
When comparing variables that hold strings, it is best to enclose both the variable and the string it is compared with in quotes.
For example, use:
if "%1" == "somestring" somecommand
Variables are declared and given a value in a single statement using "set".
The syntax is: set some_variable=some_value
Note that there should be no spaces around the equals sign; any spaces become part of the variable name or of its value.
Localizing variables:
The declaration of a variable lasts as long as the present command window is open. If you are using a batch file that does not close its instance of the command window when the batch file terminates, any variables that the batch file declares remain.
If you wish to localize a variable to a particular set of statements, use the "setlocal" and "endlocal" commands. Thus, to confine a variable declaration to a particular block of code, use:
setlocal
set some_variable=some_value
...some statements
endlocal
...
C PROGRAMMING TRICKS
C Programming Tricks:
1.
Always write the constant on the left of a comparison, like
if ( 0 == i )
rather than
if ( i == 0)
In this way the compiler catches unintended assignments for you: a typo such as
if ( 0 = i )
is rejected with an error, whereas if ( i = 0 ) would compile silently and assign 0 to i instead of comparing.
2.
Don't use strlen in a loop condition:
The strlen function is expensive. For every call it must loop over every character in the input string until a nul terminator is found. Therefore, it is very unwise to call it more often than needed. This is bad code:
for ( int ix = 0; ix < strlen(a_str); ix++)
{
a_str[ix] = tolower( (unsigned char) a_str[ix] );
}
Let's consider a string 1000 characters in length. strlen will be called 1001 times, and each call loops over 1000 characters. That is over one million wasted iterations. If the tolower call and assignment take the equivalent of 10 iterations, we can calculate that the operation takes roughly one hundred times longer than it would if written correctly.
strlen as a loop condition should be replaced with:
for ( int ix = 0; a_str[ix] != '\0'; ix++)
{
a_str[ix] = tolower( (unsigned char) a_str[ix] );
}
or the slightly less efficient:
int len = strlen(a_str);
for ( int ix = 0; ix < len; ix++)
{
a_str[ix] = tolower( (unsigned char) a_str[ix] );
}
As well as removing unnecessary strlen calls from loops, we should try to remove any other expensive function calls.
3.
Swap without a temporary:
Here's a neat trick to swap two variables without creating a temporary:
void swap(int *a, int *b)
{
    if (a != b) {        /* XOR-swapping a variable with itself would zero it */
        *a ^= *b;
        *b ^= *a;
        *a ^= *b;
    }
}
To check correctness, you only need to know that (a^b)^a = b and (b^a)^b = a.