Introduction to Distributed Systems II

9 minute read

Updated:

Centralised vs decentralised systems

Centralised Decentralised
One component with non- autonomous parts Multiple autonomous components
Component shared by users all the time Components are not shared by all users
All resources accessible Resources may not be accessible
Software runs in a single process Software runs in concurrent processes on different processors
Single point of control Multiple points of control
Single point of failure Multiple points of failure
  • Multiple points of control – if you have a true dist sys – if one fails, rest will still work – you HIDE the aspect of failure!

    • In google, it will fail all the time but for us it still works

Types of distributed Systems

1 - High performance distributed computing systems

  • Historically, if we look at parallel computing, aka running code over a large no of processes. We have 2 models - Shared and distributed memory model. We emphasized on two models

    • Shared memory– one memory, and large number of processors –to access the memory they have to go through interconnect (like a middleware). → it ensures the consistency of data.
      • OpenMP – famous – shared memory system
      • PThreads – another shared memory system
    • Private memory – everyone has their own memory
      • MPI – message passing interface
    • Bear in mind the two diff model. But there’s a third one – distributed shared memory!

Cluster computing- to do with group of machines connected through LAN

  • Have all process, os, BUT there must be some form of node, a master node, it has application to run, has always compute nodes, and do some computation, and give me the result.

Grid computing (decentralize) – idea is that if you take number of resources form various resources, why cant you bring them all together to be shared by individuals → enabling communities which becomes a part of virtual organization → idea of sharing resources, geographically!

  • E.g. 1998 [p8] – grid comp was v important in the US. Teragrid – XSEDE now.
    • The resources are spread all over the country. What they do is they put all machines to a high-speed network→ resource sharing!! And you add the job to the queue, you don’t know where its gonna be executed. → it’s a one single organization.
    • Interesting is that, you will benefit when managing and accessing the resources when you need them!
  • Technical challenges of grid computing

    • Security and trust – no central control, so many admins -> you gotta trust them all
    • Resource stability – resources come in and out and change over time.
    • Complex distributed apps – workflow (do I have an app that starts with 0 -00-00- back to 0) when im done with task 1, run 23 paralley, can I have a resource scheduler??? whats interesting is if you have computing capability of a grid – you can do that fairly quickly!
    • Resource heterogeneity – how heterogenous ur grid is – in terms of hardware, cpu, gpu – what kinda resource do you have on ur grid

Layered grid architecture – TCP/IP protocol stack

  • image
  • Fabric layer – compute resources, storage, network → control locally
  • Connectivity layer – talking to things, enabling ppl to have single sign on – imp aspect of sec.
  • Resource layer – is it possible now to share resources? What are the protocols that will allow me to configure my grid so that anyone could submit and run on these machines on the grid?
  • Collective layer – co-schedule and allocate resources

Cloud computing (centralized)

  • Computing resources by default are virtualized. And a lot of the services we access on phone will reside on a cloud.
  • Q: Example of virtualization
    • Hyper-V by Microsoft, amazon web services, doctor containers
  • 3 services of cloud computing
    1. Infrastructure as service <→ map with datacentre (hardware, cpu), computation, storage, like amazon s3
    2. Platform layer <–> with storage framework, platform tools so that cloud provide you the platform you to just deploy it on that platform – MS azure
    3. Software as a service – applications, web services

2 - Distributed information systems

  • Always need to integrate apps from here and there – apps that are usually networked, so the idea of interoperability was always problematic.
    • Apps are on server available to clients, and simple integration will consist all clients and requests from all apps, then send that off, finally present the copy of result to user
    • Can we have diff apps in diff servers to talk to each other? -enterprise application integration

  • E.g. Distributed Transaction Processing
    • Database applications where operations that are carried out as a transaction.
    • Transaction in database will have number of actions to execute
      • Begin, end, abort transaction
    • ALL OR NOTHING IS VERY VERY IMP!!
      • ACID – atomic (will happen indivisibly, happen or not happen), consistent (doesn’t lead to incorrect result), isolated, durable (when a commit is there, its permanent)
      • E.g. holiday - we gonna book a flight AND you needa book hotel too. They go together!
        • If you decide to abort it, and ur not interested, you needa cancel both! They have to abort. Because they need to abort, ACID holds.


3 - Distributed Pervasive Systems

  • The node devices are small, mobile, and embedded in a larger system → naturally blends into the user’s environment → Its everywhere.
    • E.g. Internet of things – devices that are there to send some data.
    • E.g. Car that is always connected, smart city, fridge,
  • Three overlapping subtypes:
    • Ubiquitous- has to do with devices being all over the place, and as a user we can continuously communicate w/ the system.
    • Mobile- Devices are inherently mobile. It tends to move from one place to other. Then how do you trace them???

      • E.g. Mobile phone network where the millions of users move around, but are constantly connected.
    • Sensor networks – store, analyse, and make sense of data. Its important! As you can potentially make decisions.

      • E.g. Smart city-Leeds. Number of sensors to detect pollution, air quality, etc

E.g. IoT: Telemedicine Application - Eg of an ubiquitous system

  • A person living on his own, bc of the issue, this person has some sensors on the body, like a body sensor node or a wearable device. What they do is they send data! Ideally to the doctor in the hospital.
    • Consider: Is patient mobile data connected? Is wifi connected? Zigbee is used for communication btwn a sensor and a control system
  • Data is sent out, its saved in cloud, but the fact that the data is collected gives opportunities to doctors to analyse the data.
  • And there’s a software that analyses the data. Then you expect the doctor to be notified automatically through his phone to take further actions.

E.g.2 – Connected Car

  • A lot of cars are connected! E.g. Jaguar. And its expected that 100% of cars would be connected by 2025
  • When you drive this car, it will generate all sorts of data. As cars have substantial amount of data.
  • AND you HAVE TO analyse it very very quickly!!!! Inside a car, there’s a computing infrastructure that makes sure that car does analyse the data. But sometimes the car can’t analyse everything due to the amount of data.
  • If the data is too big, send it to cloud and analyse it! It has to be quick as it could be related to accidents

Ubiquitous Computing Systems

  • If you look at any system, it’s likely that it will ALL support 5 different characteristics
  • Distribution – Devices are networked, distributed, accessible in a transparent manner – all devices, computer, cloud, etc are distributed, and the fact that they are a part of d.s. it can be viewed as one single coherent system
  • Interaction – interaction btwn users and devices is highly unobtrusive (smooth and user is unaware)
    • If you take an example of a connected car, say this car is driven by wife or a husband. When its time for the husband to drive, the car is smart enough to sense that it’s the husband! (position, weight). The car will make sure that this interaction will lead to no. of events – seating or mirror arrangements
      • Input device are used to identify situation → input analysis leads to actions
    • The interface is seemingly hidden in ubiquitous computing
  • Context awareness – system is aware of user context. Context BOARD written. To take context in which interactions take place into account. Raw data that’s collected by sensors is lifted and abstracted so that it can be used by applications
    • Where - location
    • Who – who is the user?
    • When – am I interacting now? Or later?
    • What – event or an action
  • Autonomy – system operates autonomously with minimum human interaction.
    • Take sensor network and you add one more device. Fact that you add a device and it interacts with the rest → system is autonomous!!
      • Universal Plug and Play Protocol (adding devices to an existing system)– devices can discover each other and make sure they can set up communications between them.
      • Address allocation - If the device is mobile, its likely that the IP address is gonna change. DHCP allocates IP address to any of these devices automatically.
  • Intelligence – can handle a wide range of dynamic actions and interactions using the methods and techniques from the field of AI.

Mobile Computing Systems

  • Fact that we have mobile devices that changes location and that they can be added to a network. → Change of local services and reachability
  • They need to be reachable!! And communication can become hard in certain scenario. E.g. no signal. Therefore might not be reachable → connectivity is not always guaranteed

Sensor Networks

  • Sensor network consists a large number of small nodes, equipped with sensing devices, they are usually battery powered, and uses wireless comm. It has small computing power and the communicating capacity is limited
  • image

  • Model A – the data collected in sensor network is sent directly to operator. The operator receives the data and potentially stores it and performs computation.
    • Easy architecture
    • Drawback - Since you are constantly using network resources to send the data, its a waste! you may end up with so much data but not know what to do
  • Model B- The sensor is clever bc they can sense the data, store and process it locally!! When operator asks to ‘send me resources relating to x? ’ then the sensor network can send the result of the following query
    • Drawback – since the storage capacity in the sensor is usually limited → when operator sends a query, it may not rely on the aggregation of the data collected. Data might not be there! overwritten

Summary

  • We looked at diff types of distributed system
  • Emphasized the importance of
    • Computations
    • Information Processing
    • Pervasiveness

Leave a comment