
2. Some general ideas about Compilers

A compiler is a translator that accepts programs in a source language and converts them into programs in another language, most often the assembly code of a real (or virtual) microprocessor. Designing real compilers is a complex undertaking and requires formal training in Computer Science and Mathematics.

The compilation process can be divided into a number of subtasks called phases. The phases involved are:

  1. Lexical analysis
  2. Syntax analysis
  3. Intermediate code generation
  4. Code optimization
  5. Code generation

Symbol tables and error handlers are involved in all the above phases. Let us look at these stages.

  1. Lexical analysis

    The lexical analyzer reads the source program and emits tokens. Tokens are atomic units, each representing a sequence of characters that can be treated as a single logical entity. Identifiers, keywords, constants, operators, punctuation symbols etc. are examples of tokens. Consider the C statement,

                    return 5;
    

    It has three tokens: the keyword 'return', the constant 5, and the punctuation symbol semicolon.
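    The idea can be sketched with a small regular-expression based lexer. This is only an illustration in Python, not how any particular compiler does it; the token category names and the tiny keyword list are assumptions made for the example.

```python
import re

# Hypothetical token categories for a tiny C-like subset.
# A real C lexer recognizes many more keywords, operators and literal forms.
TOKEN_SPEC = [
    ("KEYWORD",     r"\b(?:return|if|else|while|int)\b"),
    ("IDENTIFIER",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("CONSTANT",    r"\d+"),
    ("OPERATOR",    r"[+\-*/=<>]"),
    ("PUNCTUATION", r"[;,(){}]"),
    ("SKIP",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs for the given source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":          # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("return 5;"))
# [('KEYWORD', 'return'), ('CONSTANT', '5'), ('PUNCTUATION', ';')]
```

    The keyword pattern is listed before the identifier pattern so that 'return' is classified as a keyword, while a longer name such as 'returns' still falls through to the identifier rule.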

  2. Syntax analysis

    Tokens from the lexical analyzer are the input to this phase. Every programming language has a set of rules (also known as productions) that defines its grammar. The syntax analyzer checks whether the given input is valid, i.e. whether it is permitted by the grammar. As an illustration, we can write a small set of productions:

    
            Sentence        -> Noun Verb
            Noun            -> boy  | girl | bird
            Verb            -> eats | runs | flies
    

    This grammar has no practical importance (and no real meaning), but it gives a rough idea of what a grammar looks like.

    A parser for a grammar takes a string as input and can produce a parse tree as output. Two types of parsing are possible: top-down parsing and bottom-up parsing. As the names suggest, a top-down parser starts at the root of the tree and works toward the leaves, while a bottom-up parser starts from the leaves and works up to the root.

    In the above grammar, if we are given "bird flies" as input, it is possible to trace back to the root for a valid 'Sentence'. One type of bottom-up parsing is "shift-reduce" parsing. The general method is to take the input symbols and push them onto a stack until the right side of a production appears on top of the stack. That right side is then replaced (reduced) by the left side of the production. Parsing thus consists of shifting the input and reducing whenever possible. An LR parser, which scans the input from left to right, is a bottom-up parser of this kind.
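    A minimal sketch of the shift-reduce idea, applied to the toy grammar above, might look like the following. It is written in Python for illustration and reduces greedily whenever a right side appears on top of the stack; a real LR parser decides between shifting and reducing with a parsing table. The function names are hypothetical.

```python
# Toy grammar from the text: each right-hand side (a tuple of symbols)
# maps to the left-hand side it reduces to.
PRODUCTIONS = {
    ("boy",): "Noun", ("girl",): "Noun", ("bird",): "Noun",
    ("eats",): "Verb", ("runs",): "Verb", ("flies",): "Verb",
    ("Noun", "Verb"): "Sentence",
}


def reduce_stack(stack, trace):
    """Replace a right-hand side on top of the stack by its left-hand side, repeatedly."""
    reduced = True
    while reduced:
        reduced = False
        for rhs, lhs in PRODUCTIONS.items():
            n = len(rhs)
            if len(stack) >= n and tuple(stack[-n:]) == rhs:
                del stack[-n:]
                stack.append(lhs)
                trace.append(f"reduce {' '.join(rhs)} -> {lhs}")
                reduced = True
                break


def parse(text):
    """Return (accepted, trace) where trace lists the shift and reduce steps taken."""
    stack, trace = [], []
    for word in text.split():
        stack.append(word)                 # shift the next input symbol
        trace.append(f"shift {word}")
        reduce_stack(stack, trace)         # reduce whenever possible
    return stack == ["Sentence"], trace


accepted, steps = parse("bird flies")
print(accepted)
for step in steps:
    print(step)
```

    For "bird flies" the trace is: shift bird, reduce bird -> Noun, shift flies, reduce flies -> Verb, reduce Noun Verb -> Sentence. An input such as "flies bird" leaves Verb Noun on the stack, which matches no production, so it is rejected.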

  3. Intermediate code generation

    Once the syntactic constructs are determined, the compiler could generate object code for each construct directly. Instead, it usually creates an intermediate form first. This helps in code optimization and also makes a clear-cut separation between machine independent phases (lexical, syntax) and machine dependent phases (optimization, code generation).

    One form of intermediate code is a parse tree. A parse tree may contain variables as its terminal nodes. A binary operator has a left and a right branch, one for each of its two operands.

    Another form of intermediate code is three-address code. It has the general structure A = B op C, where A, B and C can be names, constants, temporary names etc., and op can be any operator. Postfix notation is yet another form of intermediate code.
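    To make the three forms concrete, the sketch below takes the expression a + b * c as a small tree and derives three-address code and postfix notation from it. The tuple representation (op, left, right) and the temporary names t1, t2, ... are assumptions chosen for this illustration.

```python
from itertools import count


def to_three_address(tree):
    """Return (instructions, result_name) for a nested (op, left, right) tuple tree."""
    code = []
    temps = count(1)

    def walk(node):
        if not isinstance(node, tuple):    # leaf: a name or a constant
            return str(node)
        op, left, right = node
        lhs = walk(left)
        rhs = walk(right)
        temp = f"t{next(temps)}"           # fresh temporary for this operator
        code.append(f"{temp} = {lhs} {op} {rhs}")
        return temp

    result = walk(tree)
    return code, result


def to_postfix(node):
    """Return the postfix form of the tree as a list of symbols."""
    if not isinstance(node, tuple):
        return [str(node)]
    op, left, right = node
    return to_postfix(left) + to_postfix(right) + [op]


# Tree for a + b * c: '+' has the leaf a on the left and the '*' subtree on the right.
tree = ("+", "a", ("*", "b", "c"))
code, result = to_three_address(tree)
print("\n".join(code))                 # t1 = b * c, then t2 = a + t1
print(" ".join(to_postfix(tree)))      # a b c * +
```

    Each instruction has exactly the A = B op C shape described above, and the postfix form lists the operands before the operator that applies to them.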

  4. Code optimization

    Optimization is the technique of improving the object code created from the source program. Many different object codes can be produced from the same source program, and some of them are comparatively better than others. Optimization is a search for a better one (it may not be the best, but better).

    A number of techniques are used for optimization; arithmetic simplification and constant folding are a few among them. Loops are a major target of this phase, mainly because programs spend a large amount of their time in inner loops. Loop-invariant computations, whose results do not change from one iteration to the next, are moved out of the loop.
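    Constant folding and arithmetic simplification can be sketched on a small expression tree as follows. The (op, left, right) tuple form is an assumption of this example, and the simplification rules assume integer arithmetic on operands without side effects.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Return a simplified copy of an (op, left, right) tuple tree."""
    if not isinstance(node, tuple):        # leaf: a variable name or an int constant
        return node
    op, left, right = node
    left, right = fold(left), fold(right)

    # Constant folding: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)

    # Arithmetic simplification: x + 0, 0 + x, x - 0, x * 1, 1 * x, x * 0, 0 * x.
    if op == "+" and right == 0:
        return left
    if op == "+" and left == 0:
        return right
    if op == "-" and right == 0:
        return left
    if op == "*" and right == 1:
        return left
    if op == "*" and left == 1:
        return right
    if op == "*" and (left == 0 or right == 0):
        return 0
    return (op, left, right)


print(fold(("+", "x", ("*", 2, 3))))    # ('+', 'x', 6)
print(fold(("*", ("+", "y", 0), 1)))    # y
```

    The subtrees are folded first, so a constant produced deep inside an expression can enable further simplification higher up.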

  5. Code generation

    The code generation phase converts the intermediate code into a sequence of machine instructions. Simple code generation routines may produce a number of redundant loads and stores. Such inefficient use of resources should be avoided. A good code generator uses its registers efficiently.
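    The redundancy can be seen in a small sketch. It assumes a hypothetical machine with a single register and the made-up instructions LOAD, STORE, ADD, SUB and MUL; it translates each three-address instruction on its own and then removes a load that immediately follows a store of the same name.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL"}


def naive_codegen(tac):
    """Translate 'dst = a op b' strings, one at a time, for a one-register machine."""
    out = []
    for line in tac:
        dst, _, a, op, b = line.split()
        out += [f"LOAD {a}", f"{MNEMONIC[op]} {b}", f"STORE {dst}"]
    return out


def drop_redundant_loads(instrs):
    """Skip 'LOAD x' when it directly follows 'STORE x': x is still in the register."""
    out = []
    for ins in instrs:
        if out and ins.startswith("LOAD ") and out[-1] == "STORE " + ins[5:]:
            continue
        out.append(ins)
    return out


tac = ["t1 = b * c", "t2 = t1 + a"]
naive = naive_codegen(tac)
better = drop_redundant_loads(naive)
print(naive)     # six instructions, including LOAD t1 right after STORE t1
print(better)    # five instructions, the redundant LOAD t1 is gone
```

    Removing the store to t1 as well would require knowing that t1 is never used again, which this sketch does not attempt to determine.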

  6. Symbol table

    A large number of names (such as variable names) appear in the source program. A compiler needs to collect information about these names and use it properly. The data structure used for this purpose is known as a symbol table. All the phases of the compiler use the symbol table in one way or another.

    A symbol table can be implemented in many ways, ranging from simple arrays to complex hashing methods. We have to insert new names and their information into the symbol table and retrieve them as and when required.
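    As a rough sketch of the hashing approach, a symbol table can be built on Python's dict, which is itself a hash table. The class name, the method names and the recorded attributes (type, line) are assumptions for this example; a real compiler stores whatever information its later phases need.

```python
class SymbolTable:
    """A minimal hash-based symbol table mapping a name to its recorded information."""

    def __init__(self):
        self._entries = {}                 # dict gives hashed insert and lookup

    def insert(self, name, **info):
        """Record a new name; declaring the same name twice is reported as an error."""
        if name in self._entries:
            raise KeyError(f"'{name}' is already declared")
        self._entries[name] = info

    def lookup(self, name):
        """Return the information stored for name, or None if it is unknown."""
        return self._entries.get(name)


table = SymbolTable()
table.insert("count", type="int", line=3)
table.insert("total", type="float", line=4)
print(table.lookup("count"))     # {'type': 'int', 'line': 3}
print(table.lookup("missing"))   # None
```

    An array-based table would support the same two operations, but lookup would have to search the entries one by one.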

  7. Error handling

    A good compiler should be capable of detecting errors and reporting them to the user efficiently. The error messages should be easy to understand and flexible. Errors can have many causes, ranging from simple typing mistakes to complex errors introduced by the compiler itself (which should be avoided at any cost).

