Architecture of the Ultimate Extensible Distributed System

Author: Kristopher Blake

1 Architecture of the Ultimate Extensible Distributed System
  Jim Fawcett
  CSE681 – Software Modeling and Analysis
  Fall 2006

2 Your Assignment
  Your supervisor just handed you a spec for implementation of:
    Distributed system with universal connectability using sockets
    Can process an open-ended variety of documents
    Expandable by 5 orders of magnitude in ten years
    Can add new tools easily
    Supports 50 million users a day without gridlock
  You say NO WAY! Well, maybe.

3 Introduction to the Internet and Web
  This presentation addresses two questions:
    Is that possible?
      Well yes – look over there – the web!
    How was it accomplished?
      Processing structure and protocols
      Programming tools
        Web servers and browsers that host:
          Script languages, e.g., JavaScript, VBScript, Perl
          Programming languages: Visual Basic, Java, C++, C#
      And, of course, some very smart people

4 Table of Contents
  Introduction to the Internet and Web
  Internet Design Principles
  Internet and Web History
  Web Technologies
  Pinging Various URLs
  Web Processing Models
  Programming The Web
  Extending The Web
  People in the Web

5 Goals:
  Build distributed system to share documents.
  Support expansion by 5 orders of magnitude in ten years – 200 hosts to 500 million hosts.
  Manage communication between hundreds of millions of machines every day without collapsing from congestion.
  Provide for arbitrary extensions:
    From static text documents to graphics, dynamic content, streaming video, programmable interfaces, voice, …

6 Original Goals of the Web
  Universal readership
    When content is available it should be accessible from any type of computer, anywhere.
  Interconnecting all things
    Hypertext links everywhere.
  Simple authoring

7 Internet Design Principles
  Goal is connectivity
    Achieved with Internet Protocol (IP)
    Stateless, so it survives failures – no need to back up state
  Made scalable with end-to-end intelligence
    Transmission Control Protocol (TCP)
    Sender does not send until receipt is acknowledged
    Amount sent is based on receiver’s current available buffer size – so receiver won’t be flooded.
  Be strict when sending and tolerant when receiving
  Protocol Specific Packet Headers
  Internet Design
  Robustness and the Internet
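
The flow-control idea on this slide (the amount sent is bounded by the receiver's free buffer space) can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the function name `send_with_flow_control` and the parameters `receiver_buffer` and `drain_per_round` are hypothetical, time is modeled as discrete rounds, and no real TCP segments or acknowledgements are involved.

```python
def send_with_flow_control(data: bytes, receiver_buffer: int, drain_per_round: int) -> list:
    """Simulate a sender that never exceeds the receiver's advertised window.

    Returns the number of bytes sent in each round.
    """
    if receiver_buffer <= 0 or drain_per_round <= 0:
        raise ValueError("buffer size and drain rate must be positive")

    rounds = []
    buffered = 0   # bytes currently waiting in the receiver's buffer
    offset = 0     # next byte of data the sender still has to send
    while offset < len(data):
        window = receiver_buffer - buffered        # free space the receiver advertises
        chunk = data[offset:offset + window]       # sender is limited to that window
        offset += len(chunk)
        buffered += len(chunk)
        rounds.append(len(chunk))
        # The receiving application reads some bytes, which frees buffer space.
        buffered -= min(buffered, drain_per_round)
    return rounds


if __name__ == "__main__":
    print(send_with_flow_control(b"x" * 10, receiver_buffer=4, drain_per_round=2))
```

The receiver is never asked to hold more than `receiver_buffer` bytes, which is the property the slide describes as "receiver won't be flooded".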

8 Web Design Principles
  Universal
  Decentralized
  Modular
  Extensible
  Scalable
  Accessible
  Forward/backwards compatibility
  Architecture of World Wide Web

9 Basic Concepts
  Client/Server Model
  Universal Addressing
    TCP/IP, DNS
  Search Engines
  Universal Protocols
    HTTP, URLs, HTML, FTP
  Format Negotiation through HTTP
  Hypertext -> Hypermedia via HTML -> XHTML
    Support for text, images, sound, and scripting

10 Internet and Web History

11 Internet History
  1961 – First paper on packet-switching theory, Kleinrock, MIT
  1969 – ARPANET goes online
    Four hosts, each connected to at least two others
  1974 – TCP/IP, Berkeley Sockets invented
  1983 – TCP/IP becomes only official protocol
  1983 – Name server developed at University of Wisconsin
  1984 – Work begins on NSFNET
  1990 – ARPANET shut down and dismantled
  1990 – ANSNET takes over NSFNET
    Non-profit organization – MERIT, MCI, IBM
    Starts commercialization of the internet
  1995 – NSFNET backbone retired
  1998 – DNS transferred from Dept of Commerce to ICANN
  2000 – Web size estimates surpass 1 billion indexable pages

12 Web History
  1990 – World Wide Web project
    Tim Berners-Lee starts project at CERN
    Demonstrates browser/editor accessing hypertext files
    HTTP 0.9 defined, supports only hypertext, linked to port 80
  1991 – First web server outside Europe
    CERN releases WWW, installed at SLAC
  1992 – HTTP 1.0, supports images, scripts as well
  1993 – Growth phase
  1994 – CERN and MIT agree to set up WWW Consortium
  1999 – HTTP 1.1, supports open-ended extensions

13 Web Growth Phase – 1993
  InterNIC created to provide registration services
  WWW (port 80 HTTP) traffic is 1% of NSFNET traffic
  200 known HTTP servers
  Article on WWW in New York Times
  Mosaic first release

14 Growth of the Web

15 Web Technologies

16 Tools: Servers on the Internet
  HTTP - HyperText Transfer Protocol
    JSP and ASP added dynamic content
    Web Services add RPC program interface
  FTP - File Transfer Protocol
  Gopher - Text and Menus
  NNTP - Network News Transfer Protocol
  DNS - Domain Name System
  telnet - log into a remote computer
  New tools - if they use TCP/IP just add them

17 Network Protocol Stack
  [Figure: client and server protocol stacks, each layered HTTP / TCP / IP / Ethernet]
  Physically, a request goes down the protocol stack on the client, across the network to the server, then up the server’s protocol stack (solid arrows).
  Logically, however, the corresponding layers on each machine “talk” to each other (dashed arrows).

18 TCP/IP Protocol Architecture Layers
  Network Protocols
  This figure depicts the different layers used in networking protocols.

  OSI Model Layers                   | TCP/IP Protocol Architecture Layers | TCP/IP Protocol Suite
  Application, Presentation, Session | Application Layer                   | Telnet, FTP, SMTP, DNS, RIP, SNMP, HTTP
  Transport                          | Host-to-Host Transport Layer        | TCP, UDP
  Network                            | Internet Layer                      | IP, IGMP, ICMP, ARP
  Data Link, Physical                | Network Interface Layer             | Ethernet, Token Ring, Frame Relay, ATM

19 Networks - Transport Layer
  Provides efficient, reliable and cost-effective service
  Uses the Sockets programming model
  Ports identify application
    Well-known ports identify standard services (e.g. HTTP uses port 80, SMTP uses port 25)
  Transmission Control Protocol (TCP)
    Provides reliable, connection-oriented byte stream
  User Datagram Protocol (UDP)
    Connectionless, unreliable
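
As a rough illustration of the Sockets programming model over TCP, the sketch below runs a one-shot echo server and a client on the loopback interface. The function names `run_echo_once` and `echo_roundtrip` are hypothetical, and the port is chosen by the operating system rather than being a well-known port.

```python
import socket
import threading


def run_echo_once(server_sock: socket.socket) -> None:
    """Accept a single connection and send back whatever was received."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def echo_roundtrip(message: bytes) -> bytes:
    """Send a short message to a local echo server and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]    # the port identifies this application

        worker = threading.Thread(target=run_echo_once, args=(server_sock,))
        worker.start()

        # TCP gives the client a reliable, connection-oriented byte stream.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply


if __name__ == "__main__":
    print(echo_roundtrip(b"hello"))
```

A single `recv` call is enough here only because the message is short; a general client would loop until it has read everything it expects.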

20 Communication Between Networks
  Internet Protocol (IP)
    Routable, connectionless datagram delivery
    Specifies source and destination
    Does not guarantee reliable delivery
    A large message may be broken into many datagrams, which are not guaranteed to arrive in the order sent
  Transmission Control Protocol (TCP)
    Reliable stream transport service
    Datagrams are delivered to the receiving application in the order sent
    Error control is provided to improve reliability
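
The division of labor described above (IP may deliver pieces out of order, TCP restores the order for the application) can be sketched with sequence numbers. This is a simplified model, not the real TCP algorithm: the helper names `split_into_datagrams` and `reassemble` are hypothetical, each piece carries a simple counter instead of a byte offset, and loss and retransmission are ignored.

```python
import random


def split_into_datagrams(message: bytes, size: int) -> list:
    """Break a message into (sequence_number, payload) pieces of at most `size` bytes."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(datagrams: list) -> bytes:
    """Put the pieces back in sending order, whatever order they arrived in."""
    ordered = sorted(datagrams, key=lambda piece: piece[0])
    return b"".join(payload for _seq, payload in ordered)


if __name__ == "__main__":
    pieces = split_into_datagrams(b"a large message broken into datagrams", 5)
    random.shuffle(pieces)            # the network may reorder datagrams
    print(reassemble(pieces))
```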

21 Pinging Various URLs
  Ping in network – a few milliseconds
  Ping in Syracuse – a few tens of milliseconds
  Ping to Moscow – a few hundred milliseconds

22 Tracing HTTP Message with Tracert

23 HTTP – Excerpts from W3C Docs
  An application-level protocol with low overhead and the speed necessary for distributed, collaborative, hyper-media information systems.
  It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extensions of its request methods (commands).
  A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.
  The protocol is typically layered on top of TCP/IP in order to guarantee data transfer.
  The protocol consists of a request and response paradigm.

24 HTTP Messages as seen by packet sniffer
  Request Message
    TCP [ :15:20.718]
    GET /ms.htm HTTP/1.1
    Connection: Keep-Alive
    Host:
  Response Message
    TCP [ :15:20.843]
    HTTP/ OK
    Cache-Control: max-age=60
    Content-Length: 669
    Content-Type: text/html
    Last-Modified: Thu, 11 Jul :05:42 GMT
    Accept-Ranges: bytes
    ETag: "be61bb30fd28c21:27b"
    Server: Microsoft-IIS/6.0
    P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
    X-Powered-By: ASP.NET
    Date: Wed, 19 May :15:16 GMT
  Message body
    "http://www.w3.org/TR/REC-html40/loose.dtd"> Microsoft Corporation -- Where Do You Want to Go Today? If your browser can't handle redirect, please click here

25 Typical HTTP Transaction
  Client browser finds a machine address from an internet Domain Name Server (DNS).
  Client and Server open TCP/IP socket connection.
  Server waits for a request.
  Browser sends a verb and an object: GET XYZ.HTM or POST form
    If there is an error, the server can send back an HTML-based explanation.
  Server applies headers to a returned HTML file and delivers it to the browser.
  Client and Server close connection.
    It is possible for the client to request that the connection stay open with HTTP 1.1.
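
The transaction steps above can be walked through on one machine. The sketch below is an illustration only: it uses Python's standard `http.server` module as a stand-in web server, the names `DemoHandler`, `fetch`, and `PAGE` are hypothetical, and the DNS lookup step is skipped because the client connects directly to the loopback address.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello from XYZ.HTM</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Serves one known page and reports an error for anything else."""

    def do_GET(self):
        if self.path == "/XYZ.HTM":
            # Server applies headers to the HTML and delivers it.
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        else:
            # On error the server sends back an HTML-based explanation.
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(path: str) -> bytes:
    """Open a TCP connection, send a verb and an object, read until the server closes."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)   # server waits for a request
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)  # serve exactly one request
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: localhost\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:            # server closed the connection
                break
            chunks.append(chunk)

    worker.join()
    server.server_close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(fetch("/XYZ.HTM").decode("iso-8859-1"))
```

Because the request uses HTTP/1.0 without a keep-alive header, the server closes the connection after one response, which is what lets the client read until end of stream.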

26 A Typical HTTP Transaction
  From my home network I typed:
    telnet
    GET / HTTP/1.0
  On the next page you will see what I received:


29 HTTP Methods
  GET request-URI HTTP/1.1
    Retrieve entity specified in request-URI as body of response message
  POST request-URI HTTP/1.1
    Sends data in message body to the entity specified in request-URI
  PUT request-URI HTTP/1.1
    Sends entity in message body to become newly created entity specified by request-URI
  HEAD request-URI HTTP/1.1
    Same as GET except the server does not send specified entity in response message
  DELETE request-URI HTTP/1.1
    Request to delete entity specified in request-URI
  TRACE request-URI HTTP/1.1
    Request for each host node to report back

30 HTTP Request
  Method, File, HTTP version
    GET /default.asp HTTP/1.0
  Headers
    Accept: image/gif, image/x-bitmap, image/jpeg, */*
    Accept-Language: en
    User-Agent: Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)
    Connection: Keep-Alive
    If-Modified-Since: Sunday, 17-Apr-96 04:32:58 GMT
  Blank line
  Data – none for GET
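
The request layout on this slide (request line, headers, blank line, optional data) can be assembled programmatically. The sketch below is a minimal illustration; `build_request` is a hypothetical helper, not part of any library, and it does no validation of header names or values.

```python
def build_request(method: str, path: str, headers: dict,
                  body: bytes = b"", version: str = "HTTP/1.0") -> bytes:
    """Assemble an HTTP request: request line, headers, blank line, then data."""
    lines = [f"{method} {path} {version}"]                        # method, file, HTTP version
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                        # blank line ends the headers
    return head.encode("ascii") + body                            # data: empty for GET


if __name__ == "__main__":
    request = build_request(
        "GET",
        "/default.asp",
        {"Accept-Language": "en", "Connection": "Keep-Alive"},
    )
    print(request.decode("ascii"))
```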

31 Multipurpose Internet Mail Extensions (MIME)
  Defines types of data/documents
    text/plain
    text/html
    image/gif
    image/jpeg
    audio/x-pn-realaudio
    audio/x-ms-wma
    video/x-ms-asf
    application/octet-stream
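
A server typically chooses the MIME type for a response from the file being served. As a small sketch, Python's standard `mimetypes` module can map file names to types like those listed above; the wrapper `content_type_for` and the fallback to `application/octet-stream` are assumptions made for this example.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess a MIME type from the file extension, with a generic binary fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    for name in ("index.html", "logo.gif", "photo.jpeg", "notes.txt"):
        print(name, "->", content_type_for(name))
```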

32 Request Message
  request line
  headers
  blank line
  body
    GET /pub/index.html HTTP/1.0
    Date: Wed, 20 Mar :00:02 GMT
    Pragma: no-cache
    From:
    User-Agent: Mozilla/4.03
  request methods: DELETE, GET, HEAD, POST, PUT, TRACE
    GET: Retrieves resource indicated by URI
    HEAD: Retrieves ONLY metadata indicated by URI
    POST: Pushes the resource in place of the URI (or creates a new one) or inputs the entity (data part) to the program (e.g., a CGI script) pointed to by the URI

33 HTTP Response
  HTTP version, Status code, Reason phrase
    HTTP/ OK
  Headers
    Date: Sun, 21 Apr :20:42 GMT
    Server: Microsoft-Internet-Information-Server/5.0
    Connection: keep-alive
    Content-Type: text/html
    Last-Modified: Thu, 18 Apr :39:05 GMT
    Content-Length: 2543
  Data
    Some data... blah, blah, blah

34 Response Message
  status line
  headers
  blank line
  body
    HTTP/ OK
    Date: Tue, 08 Oct :31:35 GMT
    Server: Apache/ tomcat/1.0
    Last-Modified: 7Oct :40:01 GMT
    ETag: "20f-6c4b-3da21b51"
    Accept-Ranges: bytes
    Content-Length: 27723
    Keep-Alive: timeout=5, max=300
    Connection: Keep-Alive
    Content-Type: text/html
  Why a human-readable phrase?
    To make extensibility easier. You can add new status codes without waiting for a protocol standard.
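
The four parts of a response message (status line, headers, blank line, body) can be separated with a few string operations. The following is a rough sketch, not a complete HTTP parser: `parse_response` is a hypothetical helper, the sample message in the usage lines is made up for illustration, and features such as folded headers or chunked bodies are ignored.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    head, _sep, body = raw.partition(b"\r\n\r\n")      # the blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")

    parts = lines[0].split(" ", 2)                     # status line: version code reason
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""        # the reason phrase is for humans

    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, code, reason, headers, body


if __name__ == "__main__":
    sample = (b"HTTP/1.1 200 OK\r\n"
              b"Content-Type: text/html\r\n"
              b"Content-Length: 5\r\n"
              b"\r\n"
              b"hello")
    print(parse_response(sample))
```

Only the numeric code drives program logic here; the reason phrase is carried along unchanged, which matches the slide's point that new codes can be introduced without breaking readers of the status line.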

35 Status Codes
  200 OK
  201 Created
  202 Accepted
  204 No Content
  301 Moved Permanently
  302 Moved Temporarily
  304 Not Modified
  400 Bad Request
  401 Unauthorized
  403 Forbidden
  404 Not Found
  500 Internal Server Error
  501 Not Implemented
  502 Bad Gateway
  503 Service Unavailable
  Classes:
    1xx: Informational - not used, reserved for future
    2xx: Success - action was successfully received, understood, and accepted
    3xx: Redirection - further action needed to complete request
    4xx: Client Error - request contains bad syntax or cannot be fulfilled
    5xx: Server Error - server failed to fulfill an apparently valid request
  404, 403, 200, 304
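
Because the class of a status code is carried by its first digit, a client can react sensibly even to a code it has never seen. A minimal sketch of that lookup follows; the names `STATUS_CLASSES` and `status_class` are hypothetical.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class name for a three-digit HTTP status code."""
    return STATUS_CLASSES.get(code // 100, "Unknown")   # the first digit selects the class


if __name__ == "__main__":
    for code in (404, 403, 200, 304):
        print(code, status_class(code))
```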

36 Headers
  Request message
    Request Line
    General Headers
    Request Headers
    Entity Headers
    A Blank Line
    Body
  Response message
    Status Line
    General Headers
    Response Headers
    Entity Headers
    A Blank Line
    Body

37 Headers
  General Headers
    Headers present in HTTP/1.0 & HTTP/1.1: Date, Pragma
    New Headers added in HTTP/1.1: Cache-Control, Connection, Trailer, Transfer-Encoding, Upgrade, Via, Warning
  Request Headers
    Headers present in HTTP/1.0 & HTTP/1.1: Authorization, From, If-Modified-Since, Referer, User-Agent
    New Headers added in HTTP/1.1: Accept, Accept-Charset, Accept-Encoding, Accept-Language, Expect, Host, If-Match, If-None-Match, If-Range, If-Unmodified-Since, Max-Forwards, Proxy-Authorization, Range, TE

38 Headers
  Entity Headers
    Headers present in HTTP/1.0 & HTTP/1.1: Allow, Content-Encoding, Content-Length, Content-Type, Expires, Last-Modified, extension-header
    New Headers added in HTTP/1.1: Content-Language, Content-Location, Content-MD5, Content-Range
  Response Headers
    Headers present in HTTP/1.0 & HTTP/1.1: Location, Server, WWW-Authenticate
    New Headers added in HTTP/1.1: Accept-Ranges, Age, ETag, Proxy-Authenticate, Retry-After, Vary

39 Web Processing Models
  HyperText Markup Language (HTML)
    Web of linked documents
    Unlimited scope of information content
  HyperText Transfer Protocol (HTTP)
    Universal access
    HTTP is a "request-response" protocol specifying that a client will open a connection to a server and then send a request using a very specific format. The server will then respond and close the connection.
  Graphical Browser Client
    Sophisticated rendering makes authoring simpler
  HTML File Server
    Using HTTP, interprets the request and provides an appropriate response, usually a file in HTML format

40 HTML Structure
  HTML tag
  Tagged Head section
    declarations
  Tagged Body section
    Block elements
      Headings, paragraphs, lists
    Forms
      Text fields, Buttons, Menus, …
    Frames
    Images
    Links
    Tables
    Text


43 Extension - Cascading Style Sheets
  Help to separate content from presentation
  Defines styles using C-structure-like notation:
    body { font-family: tahoma; font-size: medium; }
      may apply to specific tags, as above
    .notice { color: red; font-size: large; }
      defines a class called notice
      by default can be applied to any tag

44 Cascading style sheets introduced
  Helps separate content from presentation.

45 Programming The Web

46 Web Programming Model
  Packaged functionality
    Web server supports default and user-supplied controls
  Dynamic content display
    ASP, JSP generate HTML using server data
    Browser interprets client-side scripts
  Machine-to-Machine
    Web services provide RPC interface

47 Programming the Web
  Client-Side Programming
    JavaScript
    Dynamic HTML
      Can modify HTML document using scripts sent from server and interpreted by client.
    .Net controls – need permissions
  Server-Side Programming
    ASP script
    Server components
    C# code-behind
    ADO
    Web controls used on ASPX pages
    Web services

48 Web Programming – Language Model

49 Programming the Web – Server-Side Code
  What is server-side code?
    Software that runs on the server, not the client
    Receives input from
      URL parameters
      HTML form data
      Cookies
      HTTP headers
    Can access server-side databases, servers, files, mainframes, etc.
    Dynamically builds a custom HTML response for a client
  This course will focus on server-side .NET technologies.
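
The course itself uses .NET for server-side code, but the idea is language independent. The sketch below, written in Python purely for illustration, takes two of the inputs listed above (URL parameters and cookies) and builds a custom HTML response. The function `render_greeting`, the `name` parameter, and the `visits` cookie are all hypothetical.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def render_greeting(url: str, cookies: dict) -> str:
    """Build a custom HTML page from URL parameters and cookie data."""
    query = parse_qs(urlsplit(url).query)           # input from URL parameters
    name = query.get("name", ["guest"])[0]
    visits = int(cookies.get("visits", "0")) + 1    # input from cookies

    # Escape client-supplied text before placing it in the page.
    return (
        "<html><body>"
        f"<h1>Hello, {escape(name)}!</h1>"
        f"<p>Visit number {visits}</p>"
        "</body></html>"
    )


if __name__ == "__main__":
    print(render_greeting("http://example.com/hello?name=Ada", {"visits": "2"}))
```

The client only ever sees the generated HTML; the code that produced it stays on the server.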

50 Traditional HTML Serving Model

51 ASP Dynamic Serving Model

52 ASP.NET Serving Model

53 Programming the Web – Server-Side Code
  Why server-side code?
    Accessibility
      You can reach the Internet from any browser, any device, any time, anywhere
    Manageability
      Does not require distribution of application code
      Easy to change code
    Security
      Source code is not exposed
      Once user is authenticated, can only allow certain actions
    Scalability
      Web-based 3-tier architecture can scale out

54 Three Tier Architecture
  Client Tier
    Presentation layer
    Client UI, client-side scripts, client-specific application logic
  Server Tier
    Application logic, server-side scripts, form handling, data requests
  Data Tier
    Data storage and access


56 .Net Controls
  The model of the previous slide is very powerful!
  A browser that knows nothing about some sophisticated server-side processing can take advantage of that by downloading a .Net control that encapsulates all the intelligence necessary to work with the server.
  Similarly, a browser can be given new processing capabilities, simply by loading a local web page that contains controls with the desired abilities.
  Note that web page scripts do the same thing, only not quite so efficiently, and often with limitations on processing capabilities.

57 Displaying ActiveX Controls on a Web Page
  Here is an example of an object tag and attributes for inserting a control on a Web page.
    WIDTH=400 HEIGHT=200 ALIGN=center HSPACE=0 VSPACE=0 >

59 Browser Object Model
  Window
    browser window
  Document
    current HTML page
  Form
    a form holds controls, often used to submit data to server
  Frame
    frame in browser window
  Location
    Location of current web page: URL, domain name, port, path, …
  Navigator
    Browser, itself
  History


61 Some Examples
  Basic HTML pages
  Example #1

62 Server Object Model
  Application Object
    Data sharing and locking across clients
  Request Object
    Extracts client data and cookies from HTTP request
  Response Object
    Sends cookies or calls Write method to place string in HTML output
  Server Object
    Provides utility methods
  Session Object
    If browser supports cookies, will maintain data between page loads, as long as session lasts.

63 Server Components
  Ad Rotator – rotates advertisements
  Browser Capabilities – determines type
  Database Access
    ActiveX Data Objects (ADO) provide common interface to a variety of data sources
  Content Linking
    Creates list of web pages
  File Access Component
    Provides access to server files from scripts


65 Server Side Programming with ASP
  An Active Server Page (ASP) consists of HTML and script.
    HTML is sent to the client “as-is”
    Script is executed on the server to dynamically generate more HTML to send to the client.
  Since it is generated dynamically, ASP can tailor the HTML to the context in which it executes, e.g., based on time, data from client, current server state, etc.


69 ISAPI – Server Side Extensions
  Server Extensions work like CGI scripts to provide server-side processing, but they are DLLs, which reside in the memory space of the HTTP server.
  This is an enormous performance advantage over CGI extensions, which need to spawn a new process each time they are run.
  The extension DLL exports HttpExtensionProc(), which is called by IIS when the user request asks for the extension processing.
  Active Server Page (ASP) scripts are an easier way to accomplish the same thing. One would expect the ASP script to be faster than CGI but slower than an ISAPI extension.

70 Using Controls and Applets
  We’ve already seen how to include an ActiveX control on a web page. Now let’s see how to do that for a Java Applet:
    Java Applet - Sprites

71 Including Java Applet

72 Security Issues
  Threats
    Data integrity
      code that deletes or modifies data
    Privacy
      code that copies confidential data and makes it available to others
    Denial of service
      code that consumes all of CPU time or disk memory
    Elevation of privilege
      code that attempts to gain administrative access


74 Protections
  Least privilege rule:
    Use the technology with the fewest capabilities that gets the job done.
  Digital signing
    Who are you?
  Security zones
    Trusted and untrusted sites
  Secure sockets layer (SSL), Transport layer security (TLS)
    Encryption

75 Extending The Web

76 Current Extensions
  Describe data with XML
  Extend HTML into XHTML
  Separate style from content with CSS
    Cascading style sheets
    Can be included from a file to give uniform style of pages and documents
  Document Object Model – DOM
    Defines a scripting interface

77 The Extensible Web
  Some recent W3C Technologies

78 Areas of Exploration
  XML - Universal Data Services
  TVWeb - merger of features
  MathML - Mathematical Markup Language
  RDF - Resource Description Framework
  Accessibility - for the handicapped
  SMIL - Synchronized Multimedia Integration Language
  Internationalization
  Speech

79 People in the Web
  Web Development
    Web server, HTTP
      Tim Berners-Lee, Robert Cailliau
    Mosaic web browser
      Marc Andreessen
  Internet
    TCP/IP protocol
      Vinton Cerf, Robert Kahn
    Internet flow control
      Larry Roberts

80 References
  World Wide Web Consortium
    Excellent tutorial papers, standards
  Source of several slides used here
    Mark Sapposnek
  webdev.htm
    Tutorials
    Web developer’s links
    Web designer’s links
    Tech details links
  XHTML Black Book, Steven Holzner, Coriolis, 2000
    Aging but comprehensive treatment of HTML, XHTML, JavaScript
  Web Developers Virtual Library
    More tutorials

81 End of Presentation