Introduction to Concurrency: Introduction to Concurrency

1 Introduction to Concurrency: Introduction to Concurrenc...
Author: Dustin Cummings
0 downloads 3 Views

1 Introduction to Concurrency: Introduction to ConcurrencyMajeed Kassis

2 What is Concurrency? The ability to run multiple tasks on a single CPU at the same time. Central to the design of modern operating systems is managing multiple processes: Multi-threading Multi-processing Distributed processing Central issue is managing Concurrency: Communication among processes Sharing of and competing for resources (memory, files, and I/O access) Synchronization of the activities of multiple processes Allocation of processor time to processes

3 Core Terms Atomic Operation: Critical Section: Mutual ExclusionA sequence of indivisible statements; executed without interruptions Critical Section: A section of code that requires access to shared resource Cannot be executed concurrently Mutual Exclusion The ability to allow single access to a critical section at any given time Race Condition: A flaw in the application that occurs if the timing or the ordering of events are important, leading to a wrong result in case the timing or ordering was not enforced Example: Two threads incrementing a shared value in a loop. Condition Variable A variable allowing threads to block as long as the variable value did not change Once the variable value changes, then the condition is met, allowing the blocked threads to continue working.

4 Parallel vs. ConcurrentParallel: Using multiple processing resources (CPUs, cores) at once to solve a problem faster. Example: A sorting algorithm that has several threads each sort part of the array. Concurrent: Multiple execution flows (e.g. threads) accessing a shared resource at the same time. Example: Many threads trying to make changes to the same data structure (a global list, map, etc.). CPUs/cores work CPUs/cores resource

5 Concurrency Unlike parallelism, not always about running faster.Even a single-CPU, single-core machine may want concurrency. Useful for: App responsiveness Example: Respond to GUI events in one thread while another thread is performing an expensive computation Processor utilization (mask I/O latency) If 1 thread is stuck working, others have something else to do Failure isolation Convenient structure if want to interleave multiple tasks and do not want an exception in one to stop the other

6 Concurrency is done by Time SlicingMany threads want to access a single shared resource Instead of giving each time its time to finish working in a queue We assign a “quantum” of time on which each thread may use. Once quantum is used up, the next thread gains access We only ensure time fairness, not performance fairness! If a given piece of code is run by two threads at once: The order of which thread gets to run first is unpredictable. How many statements of one thread run before the other thread runs some of its own statements is unpredictable.

7 Possible Solutions Data Corruption CommunicationThe use of mutual exclusion mechanisms on critical sections prevents data corruption. Causes performance slowdowns! Adds complexity! Communication Prevent racing condition by using conditional variables. The use of conditional variables enables partial ordering in a concurrent system.

8 Concerns in Concurrent SystemsData Corruption Due to the nature of concurrent systems, the shared data may get corrupted. Communication Communication between tasks is needed to avoid conflicts. Runtime Bad design or implementation of the system may cause slowdowns. Complexity Concurrent systems are more complex than sequential ones.

9 Possible Solutions - ContinuedRuntime Minimize the use of mutual exclusion mechanisms. Minimize the size of critical sections. Complexity A big problem due to the use of mutual exclusion mechanisms. Lock-free alternatives?

10 Concurrency in Distributed SystemsMajeed Kassis

11 What is a distributed system?A distributed system is a collection of independent computers that appear to the users of the system as a single system. They communicate by using a network by passing messages. Successful distributed system requires: Networking Processors Memory and Storage Communication Protocols Without the advancement of technology in these five categories, we would not have distributed systems today.

12 Advancements of Technology: NetworkingRobert Metcalfe presents the concept of Ethernet at the National Computer Conference in 1976. Ethernet introduced as networking standard in 1980.

13 Ethernet Progress Throughout the Years1980: 2.94Mbps – 367kB/s 1982: 10Mbps MB/s (thick cable) 1985: 10Mbps – 1.25MB/s (thin cable) 1995: 100Mbps – 12.5MB/s 1999: 1Gbps – 125MB/s 2001: 10Gbps – 1250MB/s 2010: 40Gbps/100Gbps 2017: 400Gbps

14 Advancements of Technology: NetworkingWi-Fi - Wireless Fidelity It is a wireless technology that uses radio frequency to transmit data through the air 1985: 1990: IEEE established group 1997: Legacy, 1/2Mbit/sec, 2.4Ghz 1999: a, 6-54Mbit/sec, 5Ghz 2003: g, 6-54Mbit/sec, 2.4Ghz 2009: n, 65Mbit/135Mbit/sec 2.4Ghz/5Ghz First MIMO (multiple-input/multiple-output) streams! This allows up to 4x4 connectivity! 2013: ac, 866.7Mbit/sec max, 5Ghz 8x8 MIMO 2016: ad, 7Gbit/sec, 60Ghz

15 Wi-Fi Advantages Freedom Setup Cost Flexibility Scalable Mobile AccessYou can work from any location that you can get a signal. Setup Cost No cabling required. Flexibility Quick and easy to setup in temp or permanent space. Scalable Can be expanded with growth. Mobile Access Can access the network on the move.

16 Side effect? Network Size!1985: Large companies and universities Consumers using dial-up Gateways between networks 1961 hosts online! 2017: One large internet Widespread connectivity High speed Over 1 billion hosts online!

17 Side effect? Communication speed!1985: Streaming impractical Downloading one song will take ~15minues using 44Kbps modem Today: Streaming high quality songs is trivial. Google Music, Apply Music, Spotify, etc Downloading one song will take 0.4seconds using 100Mbps connection Streaming 4K Netflix video is instantaneous.

18 Advancements of Technology: Processing PowerCPUs: Smaller Cheaper Power efficient Faster Microprocessors became the core of all systems today. Intel x86 architecture for heaving computing 15Watt – i7 7600U - Ultrabooks 65Watt – i – Desktops 91Watt - i7 7700K – Faster desktops ARM architecture for lower power mobility CPU – 2Watt GPU – 2WAtt MMU – 0.5Watt

19 History of CPUs – Consumer Desktop CPUs1977: 8080, 2Mhz, 6K transistors 1985: 386DX, 33Mhz, 275K transistors 1995: Pentium Pro, 200Mhz, 5.5M transistors 2003: Pentium 4, 3.8Ghz, 125M transistors – Fastest CPU in Mhz 2005: Pentium D, 3.7Ghz, 169M transistors – First dual core CPU 2008: i7 920, 2.67Ghz, 731M transistors – 4 core CPU 2017: i7 7700K, 4.2Ghz, 5.2B transistors – 4 core CPU AMD: Ryzen 1800X, 3.6Ghz, 4.8B transistors – 8 core CPU

20 Advancement of Technology: Storage1975: 5.25” Floppy Disk 1981: 3.5” Floppy Disk 1983: 3.5” HDD 1984: CD-Rom 1988: 2.5” HDD 1995: DVD-Rom 1999: Flash Memory 2003: Blu-Ray

21 Side effect? Storage costs reduced drastically:1985: 50GB would cost $511,640 using over 1279 drives Today: 4TB HDD costs $120 This allowed many companies to massively increase in size base of its customers, mainly customer based content creation websites

22 Advancement of Technology: ProtocolsBase network protocols developed over the years: HTTP – application layer TCP – transport layer IP – internet layer Ethernet – network interface Massive amount of distributed programming APIs developed We will learn some of these APIs Non-Blocking IO AsyncIO /Node ReactiveX Reactive REST

23 Distributed Systems: AdvantagesEconomical Can achieve good price/performance radio in contrary to centralized systems Reliable Built, by design, to be fault tolerant to hardware failures Incremental Growth Allows for performance/storage growth in tiny steps Uses consumer grade hardware! Scalable The system can grow as needed without the need for drastic changes in its architecture

24 Scalability Examples Distributed systems scale in contrary to multiprocessor systems. What does that mean? Google search engine: In 1999 it took 1 month to crawl and build an index of about 50 million pages. In 2012, the same task was accomplished in less than one minute. A single query uses 1,000 computers in 0.2 seconds to retrieve an answer Disney’s Cars 2 movie. Rendering a single frame took 11.5 hours on average. They used 12,500 cores! Monsters University required (on average) 29 hours per frame. Total render time: 100 million CPU hours. They’ve used 5000 AMD processors over 10Gbps networks.

25 Distributed Models: Client-Server ModelClient sends requests to server. Server is a system that runs a service. The server is always on, and processes requests from clients. Clients do not communicate with other clients. Examples: HTTP, FTP,

26 Multi Tier ArchitectureA client server architecture Splits server into three tiers Each tier can be modified and updated independently of the rest of the tiers! Example: (right) 3-Tier architecture for web systems

27 Peer-to-Peer Model: Napster (Centralized)1999: First peer-to-peer application! Clients directly connected to each other Peers act as clients and servers. Peers initiate sessions with each other. Central server for indexing only. Drawback? Advantage? Can search for files.

28 Peer-to-Peer: BitTorrent (Decentralized)No central server! Each node connected to many other nodes. Files are divided into small pieces! Downloading from many users Uploading to many users Can be pure peer-to-peer with super-peers acting as trackers. Drawback? Cannot search for files Advantages Faster downloads Fast distribution of popular content Major users of P2P? Windows Update! Microsoft Skype Blizzard Entertainment

29 Cloud Computing Resources are provided as an internet serviceSoftware – SaaS Remotely hosted software Google Apps, Office 365 Infrastructure – IaaS Scalable, high performance virtual machines Amazon Web Services – AWS, Google Compute Engine Platforms – PaaS Scalable platforms to build web applications Google App Engine, Amazon Elastic Beanstalk Cloud Storage Remote file storage DropBox, Google Drive

30 Distributed Systems: Design ChallengesReliability: Data must never get lost due to failures Fault tolerance – in case of hardware, software, or network failure Availability: Must be online all the time Reduce latency Scalability: Adding more processing power increases performance Handling geographical distance Security Authentication, authorization, encryption

31 Concerns in Distributed SystemsConcurrency of components Each component is independent Has no access to other component resources How to share data? How to modify shared data? Lack of global clock Event ordering issue Independent failure of components Failed component must not affect the system What happens if you are waiting on another process? Failed component must not cause data loss

32