© Distribution of this video is restricted by its owner
00:04 and trying to call Cologne this foot ? Okay, So at closely about

00:19 I today, which is being used for programming clusters or, in general, distributed

00:29 type systems so far. So I'll talk a little bit about the

00:34 background, um, a little more than last time, and then start

00:44 talk about various aspects of MPI. And I'll see how far I

00:49 right. So, yes, as a reminder again: now the target

00:55 of platforms for MPI is a collection of complete computer systems,

01:05 , like servers or PCs, and then hooked together, typically a in

01:12 or some form; in principle it could be a bus. But buses are usually

01:17 capable of keeping up with the requirements on the communication capability between independent nodes,

01:29 so the difference is, as I said, that now there are definitely separate

01:41 memory spaces. So there is no global address space, and also, in

01:46 ways it's somewhat similar to when we talked about heterogeneous node programming. But in

01:52 case, um, first, it tends to be two or very few

02:00 memory spaces, though one can have more than one GPU attached

02:06 to it in terms of accelerators. So not necessarily just two. It could be

02:10 a few, but it's usually few. In this case there may be hundreds

02:15 thousands or even tens of thousands of memory spaces. Uh, so,

02:25 , and I think this is pretty process on this line that is a

02:29 busy, but the notion is again that there is no global address space but

02:37 collection of separate address spaces. So that means the global picture is simply something

02:48 the code's programmer basically keeps in the head. Hopefully, in writing the

02:55 code, it's the question of stitching together the separate things happening in the various

03:05 There's sometimes a little bit of confusion between processors and processes. But strictly

03:14 MPI is about processes communicating, cooperating, and solving a problem, regardless

03:24 how they're then mapped onto execution units or processors. Okay. And

03:32 you will see, in the examples and in common practice, the way in which cluster

03:40 programming is done is, as I mentioned, this so-called SPMD model,

03:45 single program, multiple data. And so, um, in this case

03:58 , the parallel computation is a collection of processes. As someone said,

04:06 when we talked about OpenMP, it was a collection of threads, and threads

04:14 kind of, in some ways, like processes, and I'll remind you shortly

04:19 the distinction. Um, but there are fundamental differences between how interaction between threads

04:31 happening and how interaction between processes is . So when it comes to

04:41 really, for them to share that information, some corns and then form

04:49 has to be done through some form of communication, which might be system

04:55 Or it may be called the which is the case when it comes

04:59 MPI. So here's, I guess, a

05:06 about the distinction between threads and processes, in the sense that threads, um,

05:16 really lightweight versions, and they have less context than the process, which also

05:24 , in many cases, accounting information, to know to whom the use of resources should

05:30 charged, but also what the privileges are for the different users; that is usually

05:36 to the account information. And processes do not share code. So each

05:43 has also its own code and its own memory space and its own

05:50 of data. So this is, I guess, hopefully making the distinction

06:00 little bit clearer if it wasn't. So, as I said before,

06:07 only thing that is global is whatever the programmer keeps in the head. And

06:15 means, from the start, data needs to be divided up among the processes, and

06:25 today, but in some coming lecture I'll talk about tools for figuring out

06:30 to, in a good way, carve data structures into pieces and allocate parts

06:41 the global data to each of the processes. But and then, uh,

06:50 . If it's not just a collection of independent jobs, so to speak, that

06:55 do not need to share data, the programmer also needs to explicitly

07:04 manage the interaction between the processes to get the correct answer. So

07:11 quite different from the context of OpenMP, where there was a shared memory and

07:18 address space was common to all the threads, and sharing information between threads is simply

07:28 accessing variables in the shared memory. And the message passing model

07:37 a citizen inside it, Very flexible this was used way, way back

07:47 terms of doing operating systems for distributed . So there was one operating system

07:58 Multex. I can't remember what somewhere the sixties and seventies used the message

08:05 between the different processes, uh, by the operating system to manage the

08:13 and otherwise, in essence, it's nontrivial to implement, and so hopefully that will become

08:22 And so I'll talk a little bit more about how this message passing system works.
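A minimal sketch of the data division described earlier in the lecture, where the global data is carved into pieces and each process owns one piece. This is plain Python rather than MPI code. The function name `block_range` and the contiguous block layout are assumptions for illustration, since the lecture defers the actual partitioning tools to a later session.

```python
def block_range(n_items, n_procs, rank):
    """Return the half-open range [start, stop) of a global array owned by `rank`.

    Assumed layout: contiguous blocks, with any remainder spread over the
    lowest-numbered ranks so that block sizes differ by at most one item.
    """
    base, extra = divmod(n_items, n_procs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Each "process" only ever touches its own slice; the global picture
    # exists only in the programmer's head.
    for rank in range(4):
        print(rank, block_range(10, 4, rank))
```

Each process would evaluate this with its own rank, so no process needs to hold or even know about the whole array.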

08:31 any questions on this very general background on MPI, then I'll talk a

08:37 bit more about the principle of message passing. If not, so: maybe some or

08:55 all of you have taken networking, and that was about TCP, for

09:00 , and IP, or how TCP does communication. But

09:10 it puts together packets that have information on source and destination, and all kinds

09:19 other things about the data that is being , and MPI, or message

09:27 in general works on that idea. so the So that's what's this

09:34 tries to really illustrate: the very basic model is that

09:42 is kind of first a handshake between the sender and the receiver to make sure

09:50 the receiver is ready to accept some . And if that's the case, an

09:57 acknowledgement to the sender is sent to, say, that's okay, I'm

10:04 , now, and send whatever you . And then they actually, ah

10:11 . All the data in this case being sent. So this is the

10:16 simple model, and that was the basic one in the first couple of iterations of

10:22 MPI. Now there is also what's known as the one-sided form

10:30 where only one party is kind of active, and you can decide to write

10:38 into some other process's memory space or, uh, retrieve data from some other

10:46 . And I will talk about , but that will be on Wednesday

10:51 today. Today I will just cover or basic aspect on the That's his

11:03 . So things that have to be agreed upon for this message passing, like in TCP,

11:11 is how you describe the data to be , and that may seem, uh,

11:23 Foreman and it is perhaps straightforward to do. But it's important to understand

11:29 what level you actually do describe the data, and as I will talk about in

11:35 bit, that for MPI is to do the description of the data

11:43 the level that is consistent with how it is used or defined in the

11:53 programming language, not taking into account, uh, what the machine

12:01 of the data actually is. And that is a very important aspect when you think

12:08 , um, distributed systems, where it is by no means clear that the different

12:17 run on the same type of hardware, uh, and potentially the

12:24 processors have different representations of integers, et cetera. So in that case,

12:33 , by staying at the language level, the programmer at least gets shielded from

12:37 that, and it's the business of the, of the supporting routines to make

12:47 that the right things actually happen. Other things are to figure out how

12:54 identify, like source and destination and , like in the TCP. I

12:59 in that case, and I'll be since and in terms of clusters,

13:05 may also need to have ways of identifying source and destination. We'll talk about

13:14 , and then there are other issues too. If there are

13:18 of messages coming around, how do you kind of make sure that when you

13:24 messages, it's exactly what the sender meant you to receive? So there needs

13:31 be some way of matching up what is being sent and received. And

13:39 , you know, finally, what that mean? Things to be

13:44 And there are lots of different options I will talk about how this is

13:50 being done on scope. Send on line. Years basically thio this

14:01 as I said, for clusters or systems that in principle could be in

14:12 locations over a local wide area So it may not necessarily. We

14:20 together with the dedicated network, and was see a number of slides that

14:31 is. Yeah, MPI has grown. At this point, it's probably justified to

14:38 it kind of full-featured, but what is important, I guess, as I

14:46 to, is that things are kept at the language level and

14:52 of the hardware on which it runs. So the idea is that the

14:59 code should be portable as long as the corresponding compilers know what to do and

15:06 right set of libraries is being used. MPI is based

15:14 library calls to carry out interprocess communication. So it's not directive

15:24 , like OpenMP or OpenACC. So in this

15:27 it's such a library that has to be both built and installed on the platform

15:34 the interaction between processes to work using MPI. There are

15:42 types of communications that I will talk a bit about, I think,

15:49 the next few slides. So there are two broad categories, I would

15:56 in terms of what one kind of for applications. One is typically known as point

16:02 to point, and another one, as it says on this slide, is collective communication

16:08 the collective communications: one example was talked about when I talked about

16:19 compilers and OpenMP, that is this reduce . There are corresponding things that work across

16:28 not just in a single address . It also allows you to have

16:39 to define data types, and that can be quite useful. And we'll talk

16:45 that as we go through. And as I mentioned, it's supposed to

16:51 across heterogeneous platforms. More clusters on is just a little bit trouble.
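One way to picture the collective reduction mentioned above is a pairwise tree, in which partial results are combined step by step until a single rank holds the answer. The sketch below models that in plain Python with one list entry per process. `tree_reduce` is a hypothetical helper, and this is only one possible scheme: real MPI libraries are free to implement `MPI_Reduce` differently.

```python
import operator


def tree_reduce(local_values, op=operator.add):
    """Combine one value per rank pairwise; the result ends up at rank 0.

    Assumption for illustration: a binary-tree schedule. In every step,
    rank `receiver + step` "sends" its partial result to rank `receiver`.
    """
    values = dict(enumerate(local_values))
    n = len(local_values)
    step = 1
    while step < n:
        for receiver in range(0, n, 2 * step):
            sender = receiver + step
            if sender < n:
                # Models one point-to-point message from sender to receiver.
                values[receiver] = op(values[receiver], values[sender])
        step *= 2
    return values[0]


if __name__ == "__main__":
    print(tree_reduce([1, 2, 3, 4, 5]))       # sum over five "processes"
    print(tree_reduce([3, 9, 2], op=max))     # maximum over three
```

The point of the tree is that the number of communication steps grows with the logarithm of the process count instead of linearly.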

17:01 reference point is the MPI Forum org site, where you will find

17:06 standards, and you'll also find a number of other, you know, presentations and

17:13 Also, if you just Google around on the Web, you will find lots of

17:21 and examples of how to use , and there are several books, and these

17:27 somewhat old. But after has been releases. Usually there are also some

17:34 , and some of the books are and you can download off the Internet

17:38 free. So there's a little bit about when and when not to use

17:51 So let's start with the "not": if you can solve the problem on a

17:56 node, it's probably not wise to use MPI, from a performance

18:04 There's nothing wrong; you will get the . But as I mentioned both when

18:10 talked about OpenMP and today, that interaction between processes is a

18:18 more cumbersome or expensive, from a performance point of view, than just accessing memory

18:27 , which is the way things communicate in OpenMP. If one can solve it

18:32 a single node, um, then it's probably not the right thing to try

18:38 use. So MPI is predominantly used if you want more compute capability

18:48 higher performance than you can get out of a single node. Uh,

18:55 because of, say, floating point, integer, or type capability. Or it could also

19:05 because of memory bandwidth, since more nodes means higher memory bandwidth, or even if

19:11 bandwidth constraints than memory bandwidth constraints, , you may choose to use more

19:18 the minimum number of all the single . For instance, it's also the

19:23 that the working set that you have is larger than fits in the memory of a

19:29 single node. So simply just to be able to work on the problem that

19:35 interested in, all in memory, so to speak, as opposed to paging

19:42 the disk, you may also choose to have a larger number of nodes.

19:47 you get more memory, the more you're actually using you know other things

19:54 that in terms of similar parallel not, uh, it says here

19:58 one of the bullet. That's another that when I choose to do some

20:02 flexible programming model that simply can't problem so on. There is a little

20:13 whether, as I said, MPI is full-featured or not. It started out with

20:19 fairly limited number of functions. And it was a big jump in.

20:24 it comes to about 2015, version three showed up, and people are

20:32 working on the new standard and uh, but it isn't there quite

20:42 . Now it's a lot of which is good. It's a rich

20:47 , on the other hand, many programs can read, be written

20:52 do quite well. But the set of MPI functions or library

21:02 , and we'll talk about at least . What you see here is a

21:06 set in the next several slides. And here's a little bit busy

21:15 , if anyone is interested, that shows a little bit of the architecture, in similar

21:19 spirit to the slide I showed for P I: the various layers in

21:25 implementation. In this case, there's a library, so the application basically sees

21:33 library calls that are at the language level, and the implementers of the

21:39 routines, they have to be aware of the protocol stacks for doing communication

21:48 the network that connects up the nodes, the various network protocols that maybe

21:56 um, the communication hardware. Eventually goes down to sort of a level

22:04 . Um, drivers that put bits on wires, and potentially, that is

22:13 with the network part that does routing in the network that

22:19 up the nodes, such in some are all points on the network will

22:26 go into the details of this in course, but just to be

22:30 So, for instance, one could have TCP/IP being used in

22:39 libraries, um, that the application uses to send and receive messages, but

22:46 TCP/IP is not used because the protocol overhead is quite

22:53 so usually it's something that is much more efficient in terms of using the communication

23:05 . Given that the scope of communication for applications on clusters is different from

23:12 the scope is sending messages in the . So, um, any questions

23:24 this general stuff? So now I get a little bit more to the

23:29 itself. In principle, the way, since we send messages through TCP

23:37 there's no difference between, uh, sending messages to each other versus

23:43 sending a request to Facebook or something like that. The, uh, core

23:54 or function is the same: to , uh, data from one point

24:00 one process in this case to another; that is the same whether they use

24:09 IP or some other, like InfiniBand or Ethernet or that

24:22 used to implement the sending of messages. And if anyone is taking some networking

24:38 , uh, you know that depending on what protocol is being used for network

24:47 , message order may be guaranteed or not guaranteed. In terms of MPI,

24:54 the way things are implemented, they guarantee message order, as we will talk

25:02 , uh, in terms of security, a lesson aspect of sending messages,

25:10 are also handled differently in MPI and TCP/IP. So

25:17 the first business is to get data, in this case, from process to

25:23 process. The processes may actually live on the same node. As I

25:29 , you can use it for a node. So, typically, in clusters

25:37 do not have IP addresses, so one can't use IP addresses for sending

25:45 receiving messages, or finding out source and destination. So they're not Tell another

25:55 . Yeah. Yeah, so part of the reason is security:

26:03 uh, clusters are rarely seen on the Internet. People go to all kinds

26:13 lengths to make sure that node addresses are not accessible or have

26:23 IP addresses, because some of the large they are they want to do denial

26:30 service attack. They are really powerful , to generate tremendous traffic,

26:40 is the problem. Uh, if is in the residence off exploring sort

26:48 distributed computing, so many in the community at times kind off. The

26:54 things cuts, uh, hacked to degree things were hacked or manipulate today

27:04 to be able to build sort of platform across the internet, and then

27:09 was convenient to use term I p . It becomes a bit of a

27:16 and you don't fixed. I p . Well, that's sidebar.

27:24 Any other questions? There's nothing stopping you from, um, using OpenMP

27:32 of an MPI program, right, if you want all the nodes to use their

27:36 cores. So it's a very good point; I was going to come to that

27:41 is actually correct. So any of the processes that use

27:46 MPI can be multithreaded. For that, people probably use OpenMP

27:54 . So that's typically known as a hybrid programming model, where you

27:58 use OpenMP for the nodes internally, to make things more lightweight on the

28:05 . And then you use MPI for the more cumbersome communications

28:09 processes, that typically run on different over a network. Any other

28:21 question? Mhm. Yes, I'd say, you know, the right thing:

28:28 use OpenMP or OpenACC to, right, write code for each

28:36 and then you stitch the things together with MPI across source s

28:50 This is a typical way of categorizing the library functions in MPI

28:58 into these four categories. And so the first one is, obviously, then to

29:07 terminate, and manage the MPI computing environment being created through the MPI

29:21 and I mentioned or their world. also then very useful in when,

29:26 use MPI. And we'll come to that probably next lecture, not

29:31 , and they are illustrated a little today, but that one can define

29:39 own data types that can and then properties of the application beyond the sort

29:48 primitive data types that come with the languages by default, included in the

29:55 But then it comes to the, I guess, core of MPI.

30:00 That is the communication, and there are typically two groups: the point-to-point family,

30:07 pairs of processes that communicate and send messages, and the other one is

30:14 communication between groups of processes, as I mentioned in terms of talking briefly

30:21 reduction as an example. So here is a little bit more detail on what

30:28 I just mentioned: the first is to set things up to enable

30:38 between the processes and, um, out and I'll talk more

30:46 these things, how you identify or source and destination addresses for messages.

30:55 then there are as well ways of forming groups or subgroups of processes, and just

31:06 briefly: since you have been , dealing with matrices in

31:16 previous assignments that certain things you know from applications in that case is that

31:23 may want groups that define columns, , and I will make some other

31:32 that define rows, for of matrices or something. Instructions

31:37 money. What did you mentioned the being able to play, create these

31:47 own data structures, and then there are operations. And I'll talk a lot

31:56 about these things and lecture today and lecture. So the thing to the

32:05 is how this communication works, and there is some, um, entity,

32:18 if you want to call it that, known as a communicator in MPI,

32:22 kind of a context that has to be associated with it. It's a

32:30 and a context, and I'll illustrate what the context is on slides to

32:42 . The group is perhaps familiar. That means to just define a collection

32:50 processes that you would like or the has, uh, kind of a

33:02 context, and you want to Like I said, maybe things within

33:08 is useful entities, so one can between, uh, nodes or

33:15 Sorry, that, uh, hosts in the same column, but they

33:23 the interaction within columns to be separated for different columns. And so

33:30 we're not mixing interactions within one column with interactions within another column. The

33:39 context provides, uh, I would say for how that communication

33:48 is supposed to happen. Now, the process ID, or that is,

33:57 you like, in the context of MPI is known as

34:02 rank, and the rank is unique within each communicator, because you can

34:17 many communicators, as I said, and I will illustrate more of that in slides

34:25 come. And it's also the But coming back to this simple matrix

34:34 that if you look at an element of the matrix, then obviously it's

34:40 a member of a row and a column, and in the same way, a process

34:52 be a member of more than one communicator. But then it has a unique rank

34:59 each of the communicators by which support. And then there is

35:07 this MPI_COMM_WORLD, which is a communicator that is always present. You

35:18 have to define it. And it's the communicator that includes all the processes that

35:26 part of the application. Yeah, so, is the term rank used just

35:37 uniqueness, or is there some sort of or hierarchy implied? There's no hierarchy

35:46 . There is no, I would say, ordering; it's arbitrary

36:02 by default. I need to go back and refresh my memory.
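The row and column groups described above can be modelled as follows. This is a plain-Python sketch, not MPI code: `split` is a hypothetical helper that mimics the color/key grouping of `MPI_Comm_split` (a routine the lecture has not named at this point), and the 2 x 3 process grid is an assumed layout. It shows one process holding different ranks in its row and column communicators, each numbered from zero.

```python
def split(world_size, color, key):
    """Model splitting a world of ranks into sub-communicators.

    Ranks with the same color land in the same group; within a group the new
    ranks are assigned in order of key (ties broken by world rank).
    Returns {world_rank: (color, rank_within_group)}.
    """
    groups = {}
    for world_rank in range(world_size):
        groups.setdefault(color(world_rank), []).append(world_rank)
    result = {}
    for c, members in groups.items():
        members.sort(key=lambda r: (key(r), r))
        for new_rank, world_rank in enumerate(members):
            result[world_rank] = (c, new_rank)
    return result


# Assumed layout: 6 processes arranged row by row as a 2 x 3 grid.
ROWS, COLS = 2, 3
row_comm = split(ROWS * COLS, color=lambda r: r // COLS, key=lambda r: r % COLS)
col_comm = split(ROWS * COLS, color=lambda r: r % COLS, key=lambda r: r // COLS)

if __name__ == "__main__":
    for world_rank in range(ROWS * COLS):
        print(world_rank, "row:", row_comm[world_rank], "col:", col_comm[world_rank])
```

World rank 5, for example, is rank 2 in its row group but rank 1 in its column group, which is the sense in which a rank is only meaningful relative to a communicator.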

36:07 as to what extent you can also, and I think there are ways of

36:12 forcing a particular ordering within the communicator. The general rule is the numbering starts from

36:21 zero, and depending on what there is, , at the hardware level, there

36:31 identities or numbers for the different nodes, which there always are.

36:40 so it's a question how then the MPI library uses those, whether it happens

36:49 do the ordering in the way the happens to be done in the

36:54 Ah, in terms off increasing or order. But, um, in

37:08 of the actual hardware, it doesn't has, um, knowledge, all

37:20 particular topology. And we'll talk more that. I'm not talking about

37:28 Um, So depending upon how the are interconnected, there is also the

37:34 of respect to the interconnect. But , um, at first order,

37:41 . Also has a number that is independent off how things are connected

37:46 And then when you then allocate processes , um, the processors and the

37:56 . And if you come back to simple you matrix example, yes.

38:03 . The how successive processes may have allocated to those notes may not be

38:14 in the hardware process or numbering or number. So for now, I

38:22 a It's arbitrary. And when I refresh my memory exactly what all the

38:26 are, what is a good So every communicator has its own separate

38:39 , and they all start with rank zero. So there is no unique number

38:49 range for any of the communicators. That's a partial answer. Yeah. Thank

39:05 . Thank you. Dr Johnson. go refresh my memory that now it's

39:10 for next lecture. So I think said this, that when it was

39:18 us a kind of generic basic version was, in the first couple of versions

39:24 MPI, was this cooperative thing, that sender and receiver have to agree on

39:29 to do and when things are ready to be done. Um,

39:34 performance reasons, people were not totally happy with that handshake model. So

39:43 one in, uh, MPI version something, two point something, one-sided

39:51 was introduced. Or, as I , it's okay for one process

39:56 you know, write in somebody else's space or retrieve data from another

40:04 . Today I am only talking about the cooperative part, and next lecture

40:10 the one-sided version. The one-sided version is to try

40:16 get a little bit of the notion of how shared memory processes work, where you

40:25 what's known as get and put, which is kind of one

40:34 Whenever we say cooperative, we mean , uh, the parties,

40:39 in real time, right? So one-sided is there? I'm assuming one

40:45 . It has some, like, similar to an SSH key, um, agreement

40:49 advance of the sender? Or how does the one-sided

40:57 work in terms of No, not necessarily. It doesn't, um,

41:04 this case, it needs to be the same communicator. But in

41:07 case, it's up to the programmer to make sure that things don't go

41:14 . Okay, I will cover that next lecture, how this actually works.
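As a rough model of the distinction drawn here, the sketch below contrasts the cooperative (two-sided) transfer, where data reaches the receiver's memory only when the receiver asks for it, with one-sided put and get, where only the initiating side acts. It is plain Python with hypothetical names (`Process`, `send`, `recv`, `put`, `get`). MPI's own one-sided routines are `MPI_Put` and `MPI_Get`; their actual semantics, including synchronization, are left to the next lecture.

```python
class Process:
    """Toy process with a private memory and a queue of incoming messages."""

    def __init__(self, rank):
        self.rank = rank
        self.memory = {}   # private address space of this process
        self.inbox = []    # messages sent here but not yet received


def send(dest, source_rank, data):
    # Sender's half of the cooperative transfer.
    dest.inbox.append((source_rank, data))


def recv(proc, source_rank, name):
    # Receiver's half: data lands in memory only when the receiver asks.
    for i, (src, data) in enumerate(proc.inbox):
        if src == source_rank:
            del proc.inbox[i]
            proc.memory[name] = data
            return data
    raise LookupError("no matching message has been sent yet")


def put(target, name, data):
    # One-sided: the origin writes into the target's memory; the target does nothing.
    target.memory[name] = data


def get(target, name):
    # One-sided: the origin reads from the target's memory.
    return target.memory[name]


p0, p1 = Process(0), Process(1)
send(p1, p0.rank, [1, 2, 3])
before = dict(p1.memory)      # still empty: p1 has not cooperated yet
recv(p1, 0, "a")              # now the data is in p1's memory
put(p1, "b", 42)              # p0 writes directly, p1 takes no action
```

With `put`, nothing on the target side signals that its memory has changed, which is why the lecture says it is up to the programmer to make sure things do not go wrong.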

41:25 today and even this cooperative is have shades of gray. President probably will

41:34 today. So next it's a message , and that's again, it's like,

41:48 anyone familiar with TCP-type protocols: there's a payload, and then there

41:56 kind of a header, and then one uses, as I said, envelope instead of

42:00 header. And then the payload has its description that we'll talk more about

42:07 subsequent slides. So what's labeled the message body, or the data in

42:19 case, is just a description to try to figure out, uh, in one

42:26 from where to get the data and how much and how. We'll talk more about

42:32 on coming slides, as well as on receiving, and where to put the data

42:40 how. And the envelope, this gives source and destination,

42:48 it has the communicator that defines the processes that are involved in this

43:01 And I'll talk more about that as well as about this thing that's

43:07 a tag, which is important in that it helps do the matching of sending and

43:15 messages on the basis of source, the communicator, and the tag. So this

43:24 that the tag allows sender and receiver, if there are lots of messages going between

43:30 um, for the receiver to perhaps look for a particular type

43:40 message. Since the descriptor, the descriptor just tells, uh, how

43:49 it doesn't label the data as to what the properties otherwise are.

43:57 So that is something we can use the tag for. And this just tells us
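To make the role of the envelope concrete, here is a small plain-Python model of matching a receive against pending messages by source, communicator, and tag. The `Envelope` and `Mailbox` names are hypothetical, and this is not how any particular MPI library stores messages. It only illustrates why a tag lets a receiver pick out one kind of message among several from the same sender.

```python
from collections import namedtuple

# The envelope carries the matching information; the payload is kept separately.
Envelope = namedtuple("Envelope", ["source", "dest", "tag", "comm"])


class Mailbox:
    """Pending messages for one receiving process, kept in arrival order."""

    def __init__(self):
        self._pending = []

    def deliver(self, envelope, payload):
        self._pending.append((envelope, payload))

    def receive(self, source, tag, comm):
        # Return the earliest pending message whose envelope matches, so that
        # messages with identical envelopes are received in the order sent.
        for i, (env, payload) in enumerate(self._pending):
            if env.source == source and env.tag == tag and env.comm == comm:
                del self._pending[i]
                return payload
        return None  # nothing matching has arrived


if __name__ == "__main__":
    box = Mailbox()
    box.deliver(Envelope(source=0, dest=1, tag=7, comm="world"), "boundary values")
    box.deliver(Envelope(source=0, dest=1, tag=9, comm="world"), "final result")
    # The receiver asks for tag 9 first, even though tag 7 arrived earlier.
    print(box.receive(source=0, tag=9, comm="world"))
    print(box.receive(source=0, tag=7, comm="world"))
```

The tag values 7 and 9 and the communicator label "world" are arbitrary choices for the example.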

44:05 little bit what these, uh, pieces are. I think I show

44:10 , I guess, on the next . So basically it requires a pointer to

44:16 address of where data is supposed to be retrieved from, in terms of the sender,

44:22 it's supposed to be deposited, in terms of receiving. Um, and the

44:28 types give a lot more information about how it is being either collected, in

44:36 of retrieving data, or distributed, in terms of receiving or storing what is being

44:47 And the count is the number of elements of the data type that is being

44:54 , and I'll give examples in coming . But before that, I guess

45:01 talk a bit about the execution model, which is similar to what we have in OpenMP

45:15 or not: the processes are , typically doing the same thing, as in

45:23 the model. So every program or process then has the same

45:32 of code, and then they somewhat things redundantly. Except that's not what

45:42 want to do normally. So it works on different data. So that was

45:49 what was implied early on in dividing up the data. And typically what,

45:58 , high level description of MPI is what people often call, sort of,

46:06 want to compute. And so, of course, the processes need to

46:10 But each process, kind of, will do more work on its own

46:18 than getting data from other processes, because things over the network or through message

46:24 are slow compared to memory references. That's why you have this kind of

46:29 computes rule, um, but again, the program tends to be

46:35 same, but the data is , so execution paths are not necessarily the

46:40 , but accepted with Julia programs. it's typical. Then you use this

46:49 mpirun, as shown at the bottom of the slide,

46:53 and then tell how many processes you want instantiated. And this is

47:00 kind of what you use, and in terms of Slurm, in trying to

47:06 resources you want. So here is just a trivial example of what this MPI

47:14 for a number of processes for being the same program is. Then

47:19 she ate it all in the four that than typically is over, allocated

47:28 for process sores from, and then upon again, how many instances you

47:37 , you get the same thing executed many times since against being the

47:46 Um, this just shows also error handling, and I guess it's

47:52 thing to know, practically: unless one does something different, if there is any

47:57 in any one of the processes, the code tends to abort by

48:04 But there are ways of avoiding that, depending on what the programmer wants to happen:

48:09 , perhaps ignore what happened, if that is safe for the application. Or you,

48:16 depending on what the error code is, may want different actions to be

48:20 So that's what can be done. Any question on this execution model? Just

48:32 . Think of it as SPMD: you know, all threads getting

48:37 copy, unless you have a work-sharing directive in OpenMP, and

48:44 that executes the code. But in that case, code doesn't need to be

48:48 because it's in a shared memory. In this case, in MPI, things

48:53 replicated because it's not shared memory, and processes are replicated, and the

49:01 them out. Okay, so here is a little bit then about the program

49:10 and how get my difference? I have a few slides that have both C

49:17 Fortran. They're still, depending upon the students in the class: if you come from

49:22 computational sciences, um, many codes are in Fortran. So for that convenience,

49:31 of them show or try, but most are C examples. But so first,

49:39 as always, include the header to get the proper and the library things,

49:46 I think 10. And then there are two mandatory things, MPI_Init and

49:55 : MPI_Init for creating the parallel environment and MPI_Finalize to the

50:06 what was created for the MPI . So, uh, that's something

50:18 So now, I guess, a little example, uh, in similar spirit to

50:27 I had when I talked about OpenMP, in terms of how to customize work

50:38 of having all processes do exactly the same thing. So in this case,

50:42 task is to have different processes, I guess, compute or generate different powers of the

50:54 X, um, that all the would start, but the value

51:07 So now the process ID is called rank when it comes to MPI. So I'm

51:21 maybe I'll ask what's in my little . I was too skimpy Descriptions of

51:27 MMP I t and want to have . Maybe conceptually, you can tell

51:33 how you condition the computation based on process ID. All right,

51:48 first we need to find out what the rank is. So there is

51:52 this MPI query command that allows the process to find out which rank

52:02 has been assigned within a particular . So here's, kind of, the very

52:14 example in this case, that uses this routine MPI_Comm_rank,

52:22 for this example, I assume that processes are in the scope that you

52:34 so one that uses the default communicator, MPI_COMM_WORLD, and you get

52:43 the rank assigned in my ID . So then you can have your

52:53 , like in OpenMP with the thread IDs, to distinguish

53:04 what it was supposed to do. Now I'm just going to have this

53:08 of if statement that sorts out, on what my ID is, what

53:14 was supposed to do. So that means now the different processes

53:22 compute different powers of X. Then I think, and so this is

53:32 what I said. And there's something on the slide that's just showing a little

53:37 how you can be more compact: instead of a whole lot of if clauses, you just

53:43 um, my ID in an expression and the function. So any question on this
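The rank-dependent branching just described might look like the following. This is a plain-Python sketch that loops over ranks inside one process instead of launching MPI processes, and the mapping of rank r to the power x^(r+1) is an assumption, since the transcript does not preserve which power each process computes. In a real program the rank would come from `MPI_Comm_rank` on `MPI_COMM_WORLD`.

```python
def power_with_branches(rank, x):
    """Every process runs this same code; the rank selects the work."""
    if rank == 0:
        return x
    elif rank == 1:
        return x * x
    elif rank == 2:
        return x * x * x
    else:
        return x * x * x * x


def power_compact(rank, x):
    """Same result, using the rank directly in an expression."""
    # Assumed mapping: rank r computes x to the power r + 1.
    return x ** (rank + 1)


x = 2.0
# Stand-in for four processes that each call the function with their own rank.
results = [power_with_branches(rank, x) for rank in range(4)]

if __name__ == "__main__":
    for rank, value in enumerate(results):
        print("rank", rank, "computed", value)
```

The compact form is the one that scales: it keeps working if the number of processes changes, whereas the if chain has to be edited by hand.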

53:59 Not specifically on this example, but a little more general.

54:04 , so we can use MPI on a single node. And

54:10 only difference would be that it spawns a separate process for each, or whatever you

54:15 with the -np. Right? . So I guess what are the

54:21 between having four processes on a single versus having four processes, what,

54:28 two nodes? Um, okay, so in both cases they should return the

54:43 results. Hopefully it should be correct . See how maney nodes in this

54:49 you work to use if they use . She said four processes. If

54:53 use 123 or four, they will in our case because he doesn't have

54:59 d mainly. But so then, depending on what the application does,

55:09 , one or another may be resulting in the lower execution time. The reason

55:20 is if it's memory bandwidth : the more nodes you have, the

55:26 more bandwidth you get. So if that's the dominating part, and

55:32 is not much communication going on between processes, that could reduce the running

55:42 On the other hand, if memory , it's not an issue,

55:46 then it could still gain if it's compute intensive, because you get more functional

55:55 . So the execution time may go down again as long as you don't have a

56:01 of data interaction between the processes. If there is, on the other hand,

56:07 lots of data interaction between the , then the more heavyweight sharing of information

56:17 message passing will probably result in lower by spreading things out across nodes.

56:28 , so clearly it's application , but I think the more direct question

56:34 had was: MPI treats a process as, it doesn't know whether it's

56:42 a certain node or anything. So it only looks at

56:45 the process level. Good point. . So that depends on how the implementation

56:53 the MP I library's being done. there is no standard that specifies how

56:59 should do it. Because, if it happens to run on the

57:09 note, whether, um how, the message passing is being implemented.

57:20 It depends upon knowing that you can, in principle,

57:23 just, uh, that it is kind of physically in a shared memory. So

57:33 the mechanism for going between them, even though they are separate memory spaces, but they're physically on

57:40 the same node. Uh, they wouldn't need to use TCP/IP,

57:46 for instance, as one would in terms of getting to the other

57:51 memory space. So, depending on how it's implemented, whether you pay the

57:58 full overhead or a reduced overhead if it's on the same node depends on the MPI

58:07 implementation. Okay, that makes sense. So from the programming model perspective, they're all

58:12 first-class citizens, but the implementer might want to do a check and see whether

58:16 or not you can skip the TCP/IP and, okay, use some other primitive.

58:22 Okay, awesome. That clears things up. Thank you. It's a good

58:30 question. You have lots of discussions about this when one looks at performance

58:35 and is trying to figure out what happens. Okay, so that was

58:43 Then a little bit about the language, a little bit about how things are

58:49 written. So let's see: MPI underscore something, where the something first has

58:57 a capital letter, and the rest that follows are all lowercase

59:01 letters. For Fortran it is uppercase. And then

59:06 there are a few constants, and MPI_COMM_WORLD is kind of a

59:11 constant and so is MPI_REAL, and some of the other things, which are both in

59:16 C and Fortran all capitals. So, now about the data types.
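
The naming rule just described can be written down as two patterns. The following is only a sketch in plain Python, not part of MPI: `classify_mpi_name` and its category labels are hypothetical names chosen for illustration, and the regular expressions encode nothing more than the rule as stated in the lecture (a capital letter followed by lowercase for C functions and handle types, all capitals for constants).

```python
import re

# Mixed case after the prefix: C functions and handle types,
# for example MPI_Comm_rank or MPI_Comm.
_MIXED_CASE = re.compile(r"^MPI_[A-Z][a-z0-9_]*$")

# All capitals after the prefix: constants,
# for example MPI_COMM_WORLD or MPI_REAL.
_ALL_CAPS = re.compile(r"^MPI_[A-Z][A-Z0-9_]*$")


def classify_mpi_name(name):
    """Classify a name by the lecture's naming rule (illustrative labels)."""
    if _MIXED_CASE.match(name) and any(ch.islower() for ch in name):
        return "function_or_type"
    if _ALL_CAPS.match(name):
        return "constant"
    return "not_mpi_style"


if __name__ == "__main__":
    for candidate in ("MPI_Comm_rank", "MPI_COMM_WORLD", "MPI_REAL", "mpi_send"):
        print(candidate, "->", classify_mpi_name(candidate))
```

The check is purely lexical; it says nothing about whether a given name actually exists in an MPI library.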

59:24 There is kind of a recursive way of defining things, as

59:31 kind of illustrated on the next few slides, how things can be defined. Some

59:40 of them I can just show on the next few slides. Ah, yeah. One part

59:45 I guess I should stress is that, um, even today, for instance,

60:00 floating point representations are not totally standardized across all platforms or nodes

60:08 used in clusters. So anyone, feel free to speak up if you know

60:18 this notion of little- and big-endian? It has to do with, like, specifying

60:29 whether the left side or the right, right? Yeah. So basically it

60:37 is: is the leftmost bit in a, quote unquote, word in memory the most

60:46 or least significant? So it can be ordered effectively left to right

60:53 or right to left. Um, now IBM has, and for a while

61:02 IBM was a big producer of chips as well. They have their own

61:07 processors now and are limited to one series, but had a wide range of processors

61:13 years back, and Intel on the other hand, and they did not agree on the

61:19 ordering of the bits in several data types. Mhm. And I'm not just picking

61:26 on those two, but they are kind of big players. And so

61:31 there essentially were little- and big-endian wars, and that went on for, I think, a

61:36 couple of decades. Now I think at least the major ones, IBM and Intel and, I

61:47 believe, Nvidia, all agree on the ordering of bits, but not necessarily all

61:54 vendors. So what MPI needs to do is then make sure that the value of a

62:01 data item is interpreted in the correct way, as it was intended from the application's

62:09 point of view, regardless of the platform which it just happens to use.
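
To make the ordering issue concrete, here is a minimal sketch using Python's standard `struct` module. It is not MPI code. It works at the level of byte order within a word, which is what little-endian and big-endian normally refer to; the helper names `encode_int32` and `decode_int32` and the sample value are assumptions made for this illustration.

```python
import struct
import sys


def encode_int32(value, byteorder):
    """Return the 4 bytes of a signed 32-bit integer in the given byte order."""
    fmt = "<i" if byteorder == "little" else ">i"
    return struct.pack(fmt, value)


def decode_int32(raw, byteorder):
    """Interpret 4 bytes as a signed 32-bit integer in the given byte order."""
    fmt = "<i" if byteorder == "little" else ">i"
    return struct.unpack(fmt, raw)[0]


VALUE = 0x01020304

little_bytes = encode_int32(VALUE, "little")  # least significant byte first
big_bytes = encode_int32(VALUE, "big")        # most significant byte first

# A little-endian receiver that reads big-endian bytes without conversion
# sees a different number.
misread = decode_int32(big_bytes, "little")

# Reversing the bytes first (the conversion step) recovers the value.
converted = decode_int32(big_bytes[::-1], "little")

if __name__ == "__main__":
    print("host byte order:", sys.byteorder)
    print("little:", little_bytes.hex(), "big:", big_bytes.hex())
    print("misread:", hex(misread), "converted:", hex(converted))
```

The point of the sketch is only that the same logical value has two physical layouts, and that someone has to convert between them when the two sides differ.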

62:14 When you send something from one platform with one representation to another platform with a different

62:19 representation, then the library routines need to resolve the difference, to basically do this

62:30 conversion, even though it may be floating point data, to make sure it's

62:35 the same value on both platforms. So, yeah, and then over the

62:43 history of computing, there have been some problems that came because these

62:49 conversions were not done correctly. So part of the thing, again, is that the

62:55 MPI library tries to keep things at the language level, so whoever writes applications

63:03 should be able to trust it, and it is up to the library implementers, who

63:06 are closer to the actual platforms, to make sure that the correct thing happens.

63:14 Does that restrict you from using, um, packed structures? Uh, if, for

63:22 instance, you're using a 32-bit integer to specify 32 different Booleans, um,

63:27 as is the case with a lot of libraries where you use an integer

63:31 and set one or two at a time. Yes, so it respects

63:40 the data types. So now if you're kind of clever and, instead of

63:47 treating it as a single integer, treat it as a collection of bits, um, the

63:56 ordering, so whatever is on the sending side, you know, bit number five from

64:03 the right, um, in a 32-bit word, should still be, on

64:11 the other side, bit number five from the right, even though physically

64:18 it may be stored from the left. So the

64:29 ordering that is assumed, if you depend on it, should not change.

64:39 So I'm not going to talk through this, but this is just for

64:47 your reference, I guess: a bunch of these things that are basic MPI data types

64:53 that are available without you having to define them. And it's in the standard;

64:59 there are more of them in the standard, but these are the common ones,

65:04 for C, C++ or Fortran. And then there

65:12 are, as pointed out, some special types. That is, by the way,

65:19 in there, the MPI underscore Comm, that defines the communicator. And that's

65:24 what we can then use, as shown at the bottom of the slide here,

65:30 you can then use it to define your own communicator. You know,

65:37 comm in this case. Um, no. I think the next

65:46 few slides will show some examples of doing your own

65:53 derived data types and how that may be convenient. Uh, okay. I

66:03 the one most longer. Uh, um, in terms of these, there is contiguous and there is

66:10 vector, and then there are things that you need to do: commit, I guess, after you have

66:14 defined it, to make sure that the structure is kind of set up properly

66:21 and allocated properly, and then free. So now just what I thought was

66:28 next. So this is a way you can then, uh, sort of

66:42 construct, based on the structure of things, something that is contiguous in

66:51 some ways. And that is the structure, as you can call it. And then

66:55 the old data type is the piece of structure that is shown for each one

67:01 of the little boxes on here. And here is a little bit of another one, that

67:10 is, to define a vector in a couple of different ways, and as you can

67:17 see what you can do, the vector in this case has a stride, and

67:27 the structure is repeated, in the lower hand corner. And then you can

67:33 commit and are all set. I think that is, actually, on the next

67:37 yes. So here it shows a little of how these things are constructed with a

67:42 vector. You can do it in several different ways. So this

67:49 can be very useful, again, coming back to the simple matrix examples where

68:00 data are laid out sequentially. So if you want to retrieve things in a

68:06 column, you can define a vector, or in a row, depending upon row- or

68:10 column-major ordering, to retrieve things by row or column by having the stride

68:19 and that. Yeah, you can also do a little bit,

68:27 similar to, well, what I would call sort of gather

68:34 type types. In this case, you have displacements that tell you kind of

68:43 the starting position. In this case, the green boxes, that is

68:54 the 9, 13 and 17 positions. Then you have the length, that

69:01 is for each one of the positions, from first to last. One green box,

69:06 fire still being boxes. And so 13. So there's a bit more flexible

69:12 way of sort of defining the data that you would like to use

69:19 for a particular computation or action. And this is another way,

69:30 , why you can be summaries. price. Creating your own data

69:39 Let's see, and then, yes, one thing, I guess, is the clean up

69:48 with free and duplicate. That's an interesting, this is kind of a selection of what

69:58 is predefined, in terms of the named and derived data

70:11 types. So let's see how it is used. On the slide, slightly below

70:17 the middle, the first thing is to get the rank and size. And

70:22 then there's this contiguous, uh, data type that you need to commit before use.

70:29 Then you can do something with it, and eventually you have to free it. Yes, that's the

70:35 structure of how to use derived data types. Hold that for a

70:43 second, I think, before I move on. So again, it's a flexible way

70:53 of defining data types, and those you then use in the, um,

71:00 sending and receiving of messages, and that will give you flexibility on how

71:08 to retrieve data from memory when you're going to send it. So, as I said, you

71:13 can use the vector with stride to collect rows or columns of data which may not be

71:21 contiguous in memory. And if you do sparse matrix computations, then the

71:30 indexed version may be the better version for identifying what things you want to send.

71:40 So is that more or less kind of like defining pointer arithmetic? Yeah,

71:47 yes, I guess that's a good analogy. And it's needed, this is

71:54 only needed because of the fact that MPI provides support for data types.
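
The strided vector idea can be sketched without MPI at all. The plain-Python model below only computes which element offsets a (count, block length, stride) description selects, which is the layout that `MPI_Type_vector` describes in terms of its `count`, `blocklength` and `stride` arguments. The function name `vector_indices`, the `start` parameter, and the 3 by 4 row-major matrix are assumptions made for this illustration.

```python
def vector_indices(count, blocklength, stride, start=0):
    """Offsets selected by `count` blocks of `blocklength` elements,
    with consecutive blocks starting `stride` elements apart."""
    return [start + block * stride + j
            for block in range(count)
            for j in range(blocklength)]


ROWS, COLS = 3, 4
# Row-major storage (as in C): element (r, c) lives at offset r * COLS + c.
matrix = list(range(ROWS * COLS))


def column(c):
    """One column: ROWS blocks of 1 element, COLS elements apart."""
    return [matrix[i] for i in vector_indices(ROWS, 1, COLS, start=c)]


def row(r):
    """One row: a single block of COLS contiguous elements."""
    return [matrix[i] for i in vector_indices(1, COLS, COLS, start=r * COLS)]


if __name__ == "__main__":
    print("column 1:", column(1))
    print("row 2:   ", row(2))
    print("two-wide blocks:", vector_indices(3, 2, 4))
```

With column-major storage, as in Fortran, the roles swap: a column is the contiguous case and a row needs the stride.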

72:01 So, right. So instead of having, uh, um, communication actions only on

72:20 contiguous blocks of data, it allows you to pull fairly flexible pieces of memory

72:28 into a buffer that then ends up being one block of data being sent.

72:38 So it's fairly arbitrary ways, in this case, and that's kind of the support

72:46 for the gather-scatter; it sort of pulls things out of memory into a buffer

72:51 using these data types. And, um, the other part was the count,

73:03 and that tells you how many times you do that in order to fill

73:11 the total buffer. So basically, from the actual communication routine's view of

73:21 the buffer, it basically has a start and a total length computed based on what

73:30 has been gathered into the buffer. And then, um, the communication routines

73:38 have something simple to deal with. They don't need to deal with a lot

73:42 of structure; it is just kind of one block of data. And then the receiving side

73:49 has then enough information, in terms of count and data type, to figure out how to

73:56 unpack, so to speak, or untangle this into the pieces and distribute it into memory

74:03 where you may want it to go. Um, well, it's a couple
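
The gather into one buffer and the untangling on the receiving side can be modelled in a few lines of plain Python. This is a sketch of the idea only, not of how any MPI library implements it. The block lengths and displacements play the role of the corresponding arguments of `MPI_Type_indexed`; the function names, the sample memory contents, and the block lengths `[1, 3, 2]` are assumptions (only the displacements 9, 13 and 17 echo the slide as described).

```python
def pack_indexed(memory, blocklengths, displacements):
    """Gather the described blocks of `memory` into one contiguous buffer."""
    buffer = []
    for length, disp in zip(blocklengths, displacements):
        buffer.extend(memory[disp:disp + length])
    return buffer


def unpack_indexed(buffer, memory, blocklengths, displacements):
    """Scatter a contiguous buffer back into `memory` using a layout."""
    pos = 0
    for length, disp in zip(blocklengths, displacements):
        memory[disp:disp + length] = buffer[pos:pos + length]
        pos += length
    return memory


# Sender side: pull three blocks out of a 20-element array.
sender_memory = list(range(100, 120))
send_buffer = pack_indexed(sender_memory, [1, 3, 2], [9, 13, 17])

# Receiver side: a different layout with the same total number of elements.
receiver_memory = unpack_indexed(send_buffer, [0] * 12, [2, 4], [0, 6])

if __name__ == "__main__":
    print("buffer on the wire:", send_buffer)
    print("receiver memory:   ", receiver_memory)
```

The communication step in between only ever sees `send_buffer`, one contiguous block; the sender's and receiver's layouts may differ as long as the number and type of elements agree.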

74:14 of minutes. So I won't be getting to sending and receiving today. Well,

74:20 I'll try to illustrate a little bit about communicators before I finish today. So

74:26 a communicator has, I guess, a necessary handle. So there's an identifier

74:35 you can use to do things with the communicator, and MPI_COMM_WORLD is the predefined

74:40 one. But then you can use MPI_Comm to define your own. But let me

74:44 see what I want to show you. Pictures are maybe the most effective way of

74:51 getting an intuition for what it is. I think I already said this

74:57 when I talked about some of the earliest slides, that, um,

75:03 a process can be a member of many communicators. I think that's important. And we discussed a

75:08 little bit how the assignment of rank or process ID is done. Uh,

75:15 you start with zero in each communicator, and the rank is independent for

75:21 each communicator. But how the ordering is done within the communicator, I did not know the

75:29 answer. So here is a picture that perhaps illustrates things a little bit.

75:39 MPI_COMM_WORLD is the default communicator that includes all the processes that you have

75:47 in the application you want to run. And then, from the application's point of view,

75:54 you may want to treat different collections of processes in different ways. So maybe,

76:03 from the application's point of view, you define a few groups. In

76:11 this case, each group ended up with five processes. The number of processes

76:16 in the group is totally arbitrary; it is up to the application to decide. In

76:22 this case, if you look in the middle of the slide, where it says Group 1

76:27 and Group 2, you will see that it still uses the labeling or

76:34 numbering of the ranks of the processes in the MPI_COMM_WORLD communicator. So they're

76:43 distinct in Group 1 and Group 2. Uh, it could be the

76:50 case, again, that some processes were part of both groups. That's perfectly fine.
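
A group can be thought of as an ordered list of world ranks. The plain-Python model below mimics the documented behaviour of `MPI_Group_incl` (the process at position i of the new group is the one with rank `ranks[i]` in the old group) and `MPI_Group_excl` (the listed ranks are removed and the remaining order is kept). The world size of 10 and the variable names are assumptions for illustration, not taken from the slide.

```python
def group_incl(group, ranks):
    """Model of MPI_Group_incl: new rank i is the member at old rank ranks[i]."""
    return [group[r] for r in ranks]


def group_excl(group, ranks):
    """Model of MPI_Group_excl: drop the listed old ranks, keep the order."""
    excluded = set(ranks)
    return [member for r, member in enumerate(group) if r not in excluded]


# Hypothetical world of 10 processes, identified by their world ranks.
world = list(range(10))

group1 = group_incl(world, [0, 1, 2, 3, 4])
group2 = group_incl(world, [5, 6, 7, 8, 9])

# Nothing prevents a process from being a member of several groups.
overlapping = group_incl(world, [4, 5, 6])

# Everyone except the first and the last world rank.
middle = group_excl(world, [0, 9])

# The order of the list passed to group_incl decides the new ranks.
reordered = group_incl(world, [3, 1])

if __name__ == "__main__":
    print(group1, group2, overlapping, middle, reordered)
```

In this model the members of `group1` and `group2` are still labelled by their world ranks, which matches the middle of the slide as described.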

77:00 But it's not illustrated on this slide. And then, in this case, it carries

77:04 over to application-defined communicators, Comm 1 and Comm 2, that have the same

77:13 processes, on the left and the right. Now, if you look at the

77:17 numbering in these two bubbles slightly below the middle, you will see that the numbering

77:25 is from zero to four. That also illustrates the point that the numbering is

77:31 unique and local to each one. And, um, for each of

77:37 the communicators. But as I said, the processors, processes, sorry, again it's

77:45 easy to slip, in Group 1, rank zero in communicator one is in fact

77:54 the same as rank zero in the world. Um, I do not know;

78:02 I'll see if I can find an answer for next lecture on how the

78:06 numbering is done. And then the bottom set, the bubble shows what is

78:10 known as intra-communicators, that is, communication within each communicator. And that's perhaps the most

78:20 common way of communicating, probably the reason why you formed these two communicators in the

78:26 first place, for the things that they need to do separately. But sometimes you

78:30 want to do things between the groups, and for that there are also inter-communicators.

78:38 And I think that's pretty much what it says on this slide. Then

78:44 you can do this defining of other communicators. Uh, based on things, you start with

78:53 MPI_COMM_WORLD, and then you can do inclusion and exclusion of processes from

79:03 MPI_COMM_WORLD or any other communicator into the one you want to define. Um,

79:10 in terms of communicators, or you can define them from scratch, as in the

79:15 create. The slide also shows the size and rank query functions that allow you to find

79:23 out the size of the communicator you created, or the rank within the communicator

79:29 you created. And I think I will stop with this, which shows a little bit

79:36 how it's done in terms of code. MPI_Comm_rank takes the name of

79:42 the communicator and then returns the rank in that communicator for the calling process, and

79:50 with size, it's the size of the communicator of which the process is a part.
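
The rank and size queries can be modelled on ordered member lists. This plain-Python sketch is an assumption-laden illustration: the world of 10 processes, the two five-member communicators, and the names `group_rank`, `group_size` and `describe` are all hypothetical. The `UNDEFINED` value stands in for `MPI_UNDEFINED`, which `MPI_Group_rank` returns for a process that is not a member; for a member, `MPI_Comm_rank` and `MPI_Comm_size` on the corresponding communicator report the same rank and size as this model.

```python
UNDEFINED = -1  # stand-in for MPI_UNDEFINED in this model only


def group_size(members):
    """Number of processes in the communicator's group."""
    return len(members)


def group_rank(members, world_rank):
    """Rank of a process inside a group, or UNDEFINED if it is not a member."""
    try:
        return members.index(world_rank)
    except ValueError:
        return UNDEFINED


WORLD = list(range(10))      # hypothetical MPI_COMM_WORLD with 10 processes
COMM1 = [0, 1, 2, 3, 4]      # hypothetical first application communicator
COMM2 = [5, 6, 7, 8, 9]      # hypothetical second application communicator


def describe(world_rank):
    """Ranks of one process as seen from each communicator."""
    return {
        "world": group_rank(WORLD, world_rank),
        "comm1": group_rank(COMM1, world_rank),
        "comm2": group_rank(COMM2, world_rank),
    }


if __name__ == "__main__":
    for r in (0, 5, 7):
        print("world rank", r, "->", describe(r))
    print("sizes:", group_size(WORLD), group_size(COMM1), group_size(COMM2))
```

Ranks run from 0 to size minus 1 in every communicator, so the same process generally has a different rank in each communicator it belongs to.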

79:57 And then I will not go through this example, but I leave it for

80:01 you to, uh, take a look at. But it shows how you

80:05 use, and that includes, the library calls to build two separate

80:14 communicators, in this case, odd and even processes from the global communicator into Comm

80:23 . And I will stop with this and take questions. So next time

80:31 I will talk actually about sends and receives, where there are shades of gray in how

80:34 they actually work, and, uh, point-to-point and the collective communications.

80:44 Then, if time permits, I will get to the one-sided communications next

80:49 time. Otherwise, I will do that the lecture after. I think next time

80:54 Josh will do some demo of how these things work. That's one.

81:02 Any questions? So, you know, Josh is

81:28 available on email if you have questions for him before Wednesday; otherwise he will use

81:36 part of the class on Wednesday to do some demo of MPI.

81:54 Okay? If there are no questions, I'll stop for today.

82:07 Thank you, Dr Johnson. Thank you so much. All right. See you

82:11 Wednesday. Yeah.
