© Distribution of this video is restricted by its owner
00:01 | Yeah. Okay. Okay. I'll show some of uh some |
|
|
00:10 | And you go for a couple of . That might be useful for |
|
|
00:15 | Not just for the assignment but also your projects as well. So I'll |
|
|
00:22 | with some simple examples and a couple of things related to process binding, |
|
|
00:28 | mapping and so on. You can you in case you talk so |
|
|
00:38 | So simply starting with already uh simple and it's not physical things. The |
|
|
00:47 | . Yes. So this is the basic skeleton of what an MPI program looks |
|
|
00:53 | like. So you need to make sure you have the MPI header file |
|
|
00:58 | included in your program, and then any standard C or C++ |
|
|
01:03 | files that you may need. Um most important all that you need to |
|
|
01:10 | with MPI_Init. And here we just do a plain initialization, without |
|
|
01:16 | passing any extra parameters to it. This program here is similar to |
|
|
01:22 | the first OpenMP program, where we just requested the number of threads |
|
|
01:30 | spawned in the parallel section and what their IDs were, and then printed them. |
|
|
01:35 | Here with MPI you can do similar things. But |
|
|
01:38 | first there is this call, MPI_Comm_size, which tells you the number of |
|
|
01:44 | in your uh that belong to this communicator. And this is the first |
|
|
01:52 | communicator that MPI provides you. That's MPI_COMM_WORLD, which contains all the processes |
|
|
01:57 | that you spawn when you run the program. So this call gives you |
|
|
02:02 | number of processes. Then, within those processes, you can also request |
|
|
02:08 | rank of each process by using this MPI_Comm_rank, and again you use the |
|
|
02:14 | as the context for the set of processes that you're referring to when you request |
|
|
02:20 | ranks, and this is the variable that gets the result of this. |
|
|
02:26 | this function call. You can also get some information about the processor name |
|
|
02:32 | using this MPI_Get_processor_name. I'll show you how the output looks |
|
|
02:36 | . And then there is just a simple print statement, hello world. And when you |
|
|
02:44 | with your program you need to call MPI_Finalize to remove all the |
|
|
02:49 | MPI setup that was done for your program. So any MPI program |
|
|
02:54 | look very similar to this. It should have at least this one MPI_Init |
|
|
02:58 | and then one MPI_Finalize. Well, as we |
|
|
03:05 | talked about in the lecture last time, MPI is a standard and there |
|
|
03:10 | several implementations of it. So in this case I'm right now on |
|
|
03:15 | a Bridges-2 cluster compute node, and they have the Open MPI |
|
|
03:23 | of MPI. So to use that implementation you |
|
|
03:29 | need to do a module load of Open MPI. Yeah, and that lets you use |
|
|
03:35 | particular modules. This is the Open MPI module that you need to use on |
|
|
03:40 | trampoline. You also have access to the Intel MPI implementation, the package |
|
|
03:47 | for which goes like I M P and whatever version uh for that package |
|
|
03:53 | be there. Now, when you want to compile your MPI programs, for a C |
|
|
03:59 | program you simply do it with mpicc rather than |
|
|
04:04 | gcc or icc. If I just hit Tab here |
|
|
04:08 | couple of times, it will show the other available commands. Here, |
|
|
04:13 | mpic++ is the C++ compiler for MPI. If you're |
|
|
04:18 | Fortran then you have the Fortran compiler as well, and then mpirun |
|
|
04:23 | is what you will use to execute MPI programs and specify how many processes |
|
|
04:30 | you want to spawn for your program. So here I simply do |
|
|
04:37 | And so yeah, there's nothing special about the compilation here; that's simply just replacing |
|
|
04:44 | your general gcc compiler with the MPI compiler, and that generates your |
|
|
04:52 | Now, there's nothing that you have done in your program. You have only added |
|
|
04:59 | MPI calls to your program. It's not like OpenMP where you specify |
|
|
05:03 | number of threads or anything inside your program. The only time |
|
|
05:09 | your program will be replicated across processes will be when you run it using |
|
|
05:16 | mpirun. So there you provide the -np flag, which stands for |
|
|
05:20 | number of processes. And here let's ask for four processes and then run |
|
|
05:25 | program. So what that's going to do is, it's going to replicate |
|
|
05:30 | program, the entire program, on four processes. And when I run it, |
|
|
05:39 | is weird. Mhm. I didn't any bindings. Okay. Mhm. |
|
|
06:08 | . Yeah. Okay. I don't what's going on here. Yeah, |
|
|
06:17 | know. I know. I asked 65 when I requested these resources. |
|
|
06:29 | sure I have 65. Yeah, requested 65. Of course when I |
|
|
06:33 | getting these resources interesting. Yeah. me see if I can do this |
|
|
06:44 | stamp duty up the there was a . All right. Mhm. I'm |
|
|
06:58 | about that. Okay. Yeah, . This number is coming here. |
|
|
07:17 | . Okay. I'll have to track reminding example was on the bridges |
|
|
07:21 | so I'll need to see what can done to show that. But |
|
|
07:25 | I'll start with these simpler examples So here. Um Yes, rather |
|
|
07:30 | Open MPI, I'm here using Intel MPI, so you get to see both of |
|
|
07:36 | right now because of this problem, . Okay, so yeah minus. |
|
|
07:42 | before hopefully this works nothing but got up. There we go. |
|
|
07:49 | So yeah, you get four processes, each of which got the entire program |
|
|
07:56 | be executed there. The second simple example that we saw last time was how |
|
|
08:05 | do communication between two processes. So again I have simply |
|
|
08:10 | MPI_Init, then size and rank, and I chose one of the processes to |
|
|
08:16 | be the source. There's one if condition that checks whether the world rank equals |
|
|
08:21 | the source rank, and that process becomes the sender in this case. And then I have |
|
|
08:27 | process, which is process one, the destination, which posts a receive statement |
|
|
08:37 | . So yeah, as we saw last time, we need to have both, |
|
|
08:40 | a pair of MPI_Send and MPI_Recv, for data communication with these blocking calls, |
|
|
08:47 | and then in the end you call MPI_Finalize. |
|
|
08:50 | Here the sender is sending just the value five to the receiver |
|
|
08:56 | in this case, and then compilation will be the same. And yeah |
|
|
09:07 | let's do this. Yeah, process zero sent five |
|
|
09:20 | this one and uh receiver received that uh 100 later element here. Uh |
|
|
09:28 | uh this is one of the Again we saw last time that what |
|
|
09:35 | if you have receives and sends that do not match: it causes deadlocks. |
|
|
09:42 | As you can see, both ranks post a receive first, to expect |
|
|
09:48 | data from the other rank. Ranks zero and one are performing communication here, |
|
|
09:54 | and in this case, as we saw last time, that will cause a deadlock here. So |
|
|
10:00 | if I run this program here with two processes, yeah, it will just keep |
|
|
10:09 | , it will not show any output; it will just sit in that deadlock |
|
|
10:13 | because both processes are just waiting to receive some data from the other process |
|
|
10:23 | And the solution I believe is in one where either you can rearrange your |
|
|
10:31 | sends and receives to be correctly posted, or you can use the non-blocking calls. So |
|
|
10:36 | the earlier ones, MPI_Send and MPI_Recv, are blocking calls. That means the |
|
|
10:41 | program does not progress until those calls have finished. With non-blocking calls, they |
|
|
10:47 | just get posted. So the data gets copied to some internal buffer, a local buffer |
|
|
10:52 | processes and then uh progress with whatever instruction they need to execute. Similarly |
|
|
10:59 | for MPI_Irecv, the receive instructions are posted, and then there might be |
|
|
11:05 | other background mechanisms that might handle any data that was received |
|
|
11:11 | from the sender process at a later point in time. So it does not wait |
|
|
11:15 | the Los Angeles actually dissent from that not. Yeah. Yeah, so |
|
|
11:27 | you run this now, hopefully it finishes. As you can see there's no |
|
|
11:34 | no implied order between any of the as well here. The receive and |
|
|
11:40 | for rank one was posted before rank so you need to keep in mind |
|
|
11:45 | that, similar to the case of OpenMP, your outcome or correctness should not be dependent |
|
|
11:52 | the order of the processes in which execute. If you if you want |
|
|
11:57 | make sure, then you can use any barrier constructs that MPI |
|
|
12:02 | But did you want? Yeah. . It's yeah. Okay. So |
|
|
12:17 | is one simple example of how you perform a broadcast and also how |
|
|
12:23 | can do some reduction in this case here. So here what I'm |
|
|
12:29 | is just having a buffer which is initialized with some character values. It's an array |
|
|
12:36 | of characters, and then the MPI_Bcast function call. The MPI_Bcast function is a collective |
|
|
12:44 | . So all the processes that are in this broadcast function need to call |
|
|
12:48 | together. Well, not exactly but at least they they all need |
|
|
12:53 | call it. And as you see, the buffer, that will be just |
|
|
13:01 | will be broadcasted to all the Um and you need to remember that |
|
|
13:07 | receiving processes should also have initialized or declared the same buffer in their memory |
|
|
13:15 | Let's say here, when initializing this buffer, there was an if condition |
|
|
13:22 | so that only rank zero initialized the buffer, or only rank zero declared it. Then if the |
|
|
13:28 | other processes try to request some data it. It will result in an |
|
|
13:33 | because that memory location will not exist unless you declare it. So make sure |
|
|
13:39 | you uh that memory location is accessible all the processes. And here, |
|
|
13:45 | is the number of elements, So that's not the total size of |
|
|
13:51 | array; the total size is computed by MPI, and I'm using |
|
|
13:56 | MPI types you provide in these calls, so in this case that's MPI_CHAR |
|
|
14:01 | , that's basically a character, which is one byte. Right? So n times one |
|
|
14:06 | will be the number of bytes that will be transferred or broadcast to all the |
|
|
14:12 | . And then you also need to specify the root process for your broadcast |
|
|
14:17 | And this zero, the process having rank zero here, is |
|
|
14:22 | the root, and then all of these belong to the MPI_COMM_WORLD context. |
|
|
14:28 | I also need to provide that. Yeah, you need to pass a |
|
|
14:38 | to a memory location, president, would be an area of any |
|
|
14:45 | it could be a pointer to a struct, in which case you need to define |
|
|
14:49 | custom data types as we saw last to define the sizes and types of |
|
|
14:54 | uh infrastructure, right. Start anything, everything points to memory or |
|
|
15:07 | location. Right. It looks to uh it needs to be a memory |
|
|
15:16 | . Right, so this way you broadcast. Now, one other interesting |
|
|
15:20 | that's going on here is these calls, which are MPI_Wtime |
|
|
15:27 | Uh that's being used here. So each each process is computing this local |
|
|
15:34 | uh locally for each each process, I'm doing here is trying to compute |
|
|
15:42 | the maximum, minimum and the total time taken by all the processes to |
|
|
15:50 | this MPI_Bcast function. And this MPI_Bcast function is again a blocking call. |
|
|
15:56 | even though process zero may send data some process, some that process may |
|
|
16:01 | some time. There will be a involved in receiving that data. Receiving |
|
|
16:07 | will not reach this. And uh call here that takes here, finish |
|
|
16:14 | time that takes it to finish. the these local times were different for |
|
|
16:18 | process. Here, by using MPI_Reduce, you can also check what was the |
|
|
16:25 | minimum and total time. You can all the all the processes that were |
|
|
16:29 | more than this broadcast function. here is the simplest way you can |
|
|
16:33 | is buy providing the address to the to the local variable that contains the |
|
|
16:40 | time for each process. Yeah, there is a global uh max I'm |
|
|
16:47 | both, which will be, which end up which will end up having |
|
|
16:53 | maximum reduction value only on process zero. Process zero will get |
|
|
17:02 | maximum value of local time out of the processes there. So let's say |
|
|
17:09 | have four processes: one process took 10 seconds, another one 20, another 30, |
|
|
17:13 | and another 40 seconds. Right? So with this reduction, process zero will get to |
|
|
17:17 | know that 40 seconds was the maximum for some for one of the |
|
|
17:22 | So that's the maximum reduction. With the minimum reduction you can get what was |
|
|
17:27 | minimum time, say 10 seconds, for one of the processes. And then if you |
|
|
17:33 | to get the average time taken by the all the processes to finish this |
|
|
17:38 | function, we can simply do a sum reduction on this local time variable. |
|
|
17:44 | then towards the end, you can do the sum of times divided |
|
|
17:53 | the number of processes, to look at the average time across |
|
|
17:58 | the all the processes. Does that sense? Okay. Yeah. |
|
|
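The max, min, and sum reductions described above can be emulated in plain C, without MPI, to see what the root process would end up with. This is only a sketch under stated assumptions: `reduce_times` and `time_stats` are hypothetical names that do not come from the lecture's program, the 10/20/30/40 second values are illustrative, and the loop over an array stands in for what `MPI_Reduce` with `MPI_MAX`, `MPI_MIN`, and `MPI_SUM` would combine across processes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the reduced values the root would hold. */
typedef struct {
    double max;
    double min;
    double sum;
    double avg;
} time_stats;

/* Emulates three reductions (max, min, sum) over per-process local times.
 * local_times[i] plays the role of the local time measured on rank i. */
time_stats reduce_times(const double *local_times, size_t nprocs)
{
    assert(local_times != NULL && nprocs > 0);

    time_stats s = { local_times[0], local_times[0], 0.0, 0.0 };
    for (size_t i = 0; i < nprocs; ++i) {
        if (local_times[i] > s.max) s.max = local_times[i];
        if (local_times[i] < s.min) s.min = local_times[i];
        s.sum += local_times[i];
    }
    /* The average is not a reduction by itself: sum first, then divide. */
    s.avg = s.sum / (double)nprocs;
    return s;
}
```

In a real MPI program no single process can see the other processes' local times, which is why the combining step has to be a collective call rather than a local loop like this one.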
18:09 | And now, when I run it right, we do. But how |
|
|
18:18 | you compute averages across across the You need to get everyone everyone's a |
|
|
18:25 | of times and then completely average Uh huh. No, no, |
|
|
18:42 | each process is running in its own memory, remember, its own address space. |
|
|
18:47 | So, one process does not know the local time of another process, |
|
|
18:53 | the daughter time in december. each Each process again, as I |
|
|
19:07 | , only knows about its local So, one process may finish |
|
|
19:11 | One event process may finish later All right. So let's say process |
|
|
19:17 | finished in 10 seconds. Some other finished in 40 seconds. How does |
|
|
19:22 | does process zero knows that the other finishing? 40. Right. All |
|
|
19:31 | . All right. Yeah. You'll the difference between uh the timing steak |
|
|
19:41 | so it's not significant right now. was hoping that the richest one which |
|
|
19:47 | . So I can show you something results using from happening. But let's |
|
|
19:50 | when I get to the end. , yeah, as you can |
|
|
19:53 | there's uh some difference between the amount time state and between different processes to |
|
|
19:59 | these broadcast this broadcast corporation here and depends on where your process ends up |
|
|
20:06 | the north or if you are multiple right now, I'm only on the |
|
|
20:11 | , but you can also do communication not something that is the agency |
|
|
20:15 | Then we conclude the interconnect lay agencies the infinite barrel it and that whatever |
|
|
20:22 | being used between the notes of your are communication across the notes. |
|
|
20:27 | Oh, you can go home. Today, these knows uh the slum |
|
|
20:36 | taking quite a while to actually access resources. Yeah. Otherwise I would |
|
|
20:40 | shown up if I okay, you be my resources right now. I |
|
|
20:45 | not get access to it quickly. Yeah, yeah, but again, |
|
|
20:54 | a little bit different. But not much. Yeah. Yeah, I'll |
|
|
20:58 | if I can run it on bridges uh towards the end. Show some |
|
|
21:03 | results there. All right. Uh , this point. Yeah. So |
|
|
21:13 | one it's just another version of Uh So I can show you that |
|
|
21:21 | it's not it's not very interesting. basically what's going on is every processes |
|
|
21:31 | uh addition of some uh elements in vector and N. B. |
|
|
21:36 | New computer reduction. Yeah, pretty the same way that we did the |
|
|
21:41 | of timing. So wasn't that interesting ? Mhm. Yes. So this |
|
|
21:51 | is one of the uh the only I think to uh create groups, |
|
|
21:57 | groups of uh the processes that you in your program. So this is |
|
|
22:04 | I'm trying to do is: I have spawned eight processes and then I'll |
|
|
22:09 | to divide them into groups of of of two, and then it probably |
|
|
22:14 | get its own local communicator rather than going through the world communicator. So, |
|
|
22:23 | do that, first, I have two arrays of ranks; ranks1 |
|
|
22:30 | includes 0, 1, 2 and 3, and ranks2 includes 4, 5, 6, 7. So these |
|
|
22:36 | will be the ranks that will be divided into two different groups from the |
|
|
22:41 | processes that I spawned. Okay. Yes. So, first I |
|
|
22:50 | to get access to the handle uh P I call set for the for |
|
|
22:57 | original group, and to do that, you simply call this function MPI_Comm_group |
|
|
23:02 | and provide the communicator for that, and that gives you the handle inside |
|
|
23:10 | variable, which is of type MPI_Group. So that's where you get |
|
|
23:16 | to the handle of that group. Once you have got that you |
|
|
23:24 | can have an if condition to decide which |
|
|
23:33 | ranks are going to which group. Whichever rank has an ID less than the number |
|
|
23:39 | of processes divided by two. So that's whichever |
|
|
23:45 | is less than has less than Process. I. D. Will |
|
|
23:48 | in this if condition here. And I'm creating the group here by inclusion. |
|
|
23:55 | So that means we call MPI_Group_incl |
|
|
24:00 | and then provide the handle of the original group first, which we |
|
|
24:05 | up here. You also need to specify how many ranks will be part |
|
|
24:12 | this this new group. And then need to provide the ides of these |
|
|
24:18 | these ranks here. Uh That will part of this new group. And |
|
|
24:23 | you also need to provide the handle, an uninitialized handle, to this new group, |
|
|
24:33 | which is also of type MPI_Group here. |
|
|
24:38 | Yeah. So any any process that have uh I. D. More |
|
|
24:45 | more than four here for more than . We'll go to this health condition |
|
|
24:49 | will be part of the other group . The interesting thing is you don't |
|
|
24:54 | to use two handles, one for each of these groups, because the |
|
|
25:02 | gets replicated. Everyone will have one of this new group variable basically. |
|
|
25:09 | whatever they initialized in their new group variable will be local to them. All |
|
|
25:14 | So you only need one variable for all the processes. Once you |
|
|
25:20 | have called these MPI group functions, what you need to do is create a |
|
|
25:27 | for this new group. That you can do by calling MPI_Comm_create: |
|
|
25:31 | provide the communicator for the original group, the handle for the new |
|
|
25:38 | and a new communicator for this new group, which is new_comm, and which |
|
|
25:43 | of type MPI_Comm here. So basically you get the handle and then you |
|
|
25:49 | create a new communicator for that for new groups. Oh, and |
|
|
25:57 | this is this one already uh quality you can do on any local offer |
|
|
26:02 | these newly created ranks here. MPI_Group_rank gives you the |
|
|
26:11 | of a particular process inside a particular group. Initially we were doing MPI_Comm_rank, |
|
|
26:17 | which gives you the global rank, using the |
|
|
26:25 | . And it gives you the rank the process. You can you can |
|
|
26:29 | can do MPI_Comm_rank again on the new communicator that you got, to get the |
|
|
26:34 | rank as well. But this is way to get ranked by using the |
|
|
26:37 | handle here. And this is similar open MPI where the think yeah. |
|
|
26:50 | where the group group thanks uh private do each of the groups here. |
|
|
26:56 | global ranks, as you can see, go from 0 to 7. And then in the |
|
|
27:00 | group rank you have 0, 1, 2 and 3, and 0, 1, 2 and 3. And so |
|
|
27:03 | are local local to each of the . A very simple example here of |
|
|
27:17 | pie computation that we saw in open um um yeah, like they're all |
|
|
27:31 | reduce is basically yes. Uh it basically gets all the values from |
|
|
27:41 | all the processes, and rather than that reduced value ending up on only the |
|
|
27:48 | , it ends up on all the processes that were involved in the reduction |
|
|
27:54 | . So in normal reduction, let's you had five processes, right? |
|
|
28:00 | type of reduction you perform, the reduced value ends up on the root |
|
|
28:05 | node, let's say process zero. And with all-reduce, the reduced value |
|
|
28:10 | up on all the processes that were , not just on the under root |
|
|
28:15 | . That's right. So that's the only difference between reduce and all-reduce. |
|
|
28:20 | Oh, oh, okay. Right. Contact. Okay. |
|
|
28:40 | Uh huh, mm hmm. mhm mm. Okay, mm |
|
|
28:55 | But yes, it computes the total of the global |
|
|
29:04 | ranks in world. So 0 plus 1 plus 2 plus 3 is 6; 4, 5, 6, 7 added |
|
|
29:12 | 22. It always need to make that yeah. Uh huh. |
|
|
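The group split and the per-group totals of 6 and 22 can be checked with a small plain-C emulation. This is a sketch, not the program shown in the lecture: `group_of`, `group_rank_of`, and `group_sum_of_world_ranks` are hypothetical helpers, and the loop over world ranks stands in for an `MPI_Allreduce` with `MPI_SUM` on each group's communicator. It assumes the ranks are included in ascending order, as in the two rank arrays described above.

```c
#include <assert.h>

/* Which of the two groups a world rank falls into: ranks below
 * world_size / 2 form group 0, the rest form group 1. */
int group_of(int world_rank, int world_size)
{
    assert(world_size > 0 && world_rank >= 0 && world_rank < world_size);
    return (world_rank < world_size / 2) ? 0 : 1;
}

/* Rank of the process inside its own group (what a group-rank query
 * would report), assuming ranks are included in ascending order. */
int group_rank_of(int world_rank, int world_size)
{
    int half = world_size / 2;
    return (group_of(world_rank, world_size) == 0) ? world_rank
                                                   : world_rank - half;
}

/* Emulates a sum all-reduce of world ranks over one group's communicator:
 * every member of `group` would receive this same total. */
int group_sum_of_world_ranks(int group, int world_size)
{
    int total = 0;
    for (int r = 0; r < world_size; ++r)
        if (group_of(r, world_size) == group)
            total += r;
    return total;
}
```

With eight processes, world ranks 0 to 3 get group ranks 0 to 3 and a total of 6, while world ranks 4 to 7 also get group ranks 0 to 3 but a total of 22.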
29:26 | Right. So in an mp I yeah, the reason I'm showing this |
|
|
29:31 | , it's not uh nothing special going here. That's basically um yeah, |
|
|
29:37 | MPI_Bcast, which tells all the processes how many iterations we are |
|
|
29:42 | to run for this pi computation. So that's the n in this |
|
|
29:48 | loop iteration here. The important point I want to make from this example |
|
|
29:55 | , is that unlike OpenMP, which will distribute all the indices for |
|
|
30:00 | and you don't need to care about indexes goes to, which uh which |
|
|
30:05 | in the general case, right, with OpenMP. For MPI you always |
|
|
30:12 | to use this methodology of computing the index through these process IDs, |
|
|
30:22 | because the program is replicated across all the processes. Right. So even though the |
|
|
30:29 | know how many processes processes are in in the in the execution environment and |
|
|
30:34 | runtime is not dividing up the loop iterations into smaller chunks, you get |
|
|
30:40 | give them some iteration ideas that they compute without any uh involvement from the |
|
|
30:47 | . So in MPI, whenever you want to parallelize a for loop or |
|
|
30:52 | or any any iterative uh structure, always need to use that they started |
|
|
30:59 | the process ID, and then computing the modulo of the number of |
|
|
31:04 | number of loops or do I blow uh some chung number of implement to |
|
|
31:12 | the uh the elements that you are to compute. Yeah. Cool. |
|
|
31:20 | . Yeah. But listen, we're for the example, the inter computer |
|
|
31:26 | exactly the same as you saw for OpenMP. Again, the iterations are computed |
|
|
31:32 | on the uh based on the process rank and each process stops when it |
|
|
31:39 | values larger than n. And the increment that each process |
|
|
31:47 | is by a number of processes that involved in your in your uh parallel |
|
|
31:53 | of your MPI program, not by only one. And each process |
|
|
32:00 | computes mypi, which is a local part of the value of pi that |
|
|
32:06 | compute. And towards the end, just adds up a local uh locally |
|
|
32:14 | values, mypi, into a variable pi, and that reduced value |
|
|
32:23 | up on process zero, which is the root of this reduce operation. |
|
|
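The index scheme just described (start at the process rank, step by the number of processes, stop past n) can be sketched in plain C. This is an illustration under assumptions, not the lecture's source: it assumes the integral is the usual midpoint-rule approximation of 4/(1+x^2) on [0,1], `partial_pi` and `reduce_pi` are hypothetical names, and the loop over ranks in `reduce_pi` stands in for the sum reduction onto the root.

```c
#include <assert.h>

/* Partial sum computed by one rank: start at rank + 1, stop past n,
 * and step by the number of processes (cyclic distribution). */
double partial_pi(int rank, int nprocs, long n)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs && n > 0);

    double h = 1.0 / (double)n;   /* width of each rectangle */
    double sum = 0.0;
    for (long i = rank + 1; i <= n; i += nprocs) {
        double x = h * ((double)i - 0.5);   /* midpoint of rectangle i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}

/* Stand-in for the sum reduction onto the root: add up every rank's part. */
double reduce_pi(int nprocs, long n)
{
    double pi = 0.0;
    for (int rank = 0; rank < nprocs; ++rank)
        pi += partial_pi(rank, nprocs, n);
    return pi;
}
```

Because every rank derives its own indices from its rank and the process count, no runtime is needed to hand out iterations, and the combined result does not depend on how many processes share the work beyond floating-point rounding.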
32:28 | , yeah, yeah. Uh remind . They say that only you can |
|
|
32:41 | can hard code it but in this I didn't I didn't do it. |
|
|
32:47 | . Generally, yeah. But generally happens is an mp I pE program |
|
|
32:53 | people and women and a lot of the entire argument list, let's say |
|
|
33:00 | have an argument that when the arguments you pass to your program right would |
|
|
33:05 | red size, let's say you're doing multiplication, criticized. What's the block |
|
|
33:10 | that each program should get each process get and so on. Probably any |
|
|
33:15 | of arguments, right? Generally only Zero initialize is those uh those arguments |
|
|
33:23 | it gets from India from from the when they execute their program. The |
|
|
33:29 | step always is to broadcast those arguments uh to all the processes as a |
|
|
33:37 | general way of programming in MPI. You will very, very rarely |
|
|
33:43 | that all the all the arguments are initialized on all the processes. That's |
|
|
33:49 | generally what I see him. I'm sure what the exact reason for that |
|
|
33:53 | , but that's the general programming paradigm people use when writing MPI |
|
|
33:57 | arguments are parsed on process zero and then it broadcasts them to all |
|
|
34:03 | the best processes kind of Yes. Yeah, yeah, yeah, |
|
|
34:18 | yeah, it's okay we're going And uh huh Come on Northern as |
|
|
34:36 | , a mountain is Yeah. Okay. Yeah. Running it |
|
|
35:02 | That's Yeah, I only chose the of processes times for a number of |
|
|
35:07 | that should have been performed so that steps in each process. Just only |
|
|
35:12 | four steps for the entire thing. . Okay, let me see if |
|
|
35:18 | can show you guys the binding examples I want to show. Yeah, |
|
|
35:26 | sure. Not sure what went wrong that execution environment on bridges. |
|
|
35:33 | everything works fine. Until you start them Mosul. Okay, let me |
|
|
35:43 | . Yeah. Okay. Yeah. So that's with Open MPI; |
|
|
35:51 | is a very simple way to check how your processes are bound or pinned to the |
|
|
35:57 | resources, right? Similar to what we saw for OpenMP. All |
|
|
36:03 | . So, so with the FBI parameters that will be using and I |
|
|
36:10 | say in Intel MPI there is a different way of doing it; there you do it |
|
|
36:14 | environment variables which I won't get into now, but we can see that |
|
|
36:21 | and it's a good question, I know. Yeah, so but in |
|
|
36:29 | you do it using uh some of and what many variables that provide access |
|
|
36:34 | ? Yeah, yeah, give them . Oh no, someone was trying |
|
|
36:40 | get in, I think I let in. Yeah. Mhm. |
|
|
36:46 | With Open MPI you can simply do that with some of the flags that mpirun |
|
|
36:52 | provides you, and to get some information about what those flags are |
|
|
36:58 | you simply do mpirun -h. And these are |
|
|
37:04 | general flags that you can use but are some other categories that you can |
|
|
37:09 | can also get some information about a of them that are interesting for us |
|
|
37:14 | now. One of them is mapping and the other one is binding, and |
|
|
37:20 | get some some extra information about those of uh options that NPR and |
|
|
37:27 | has this message here, which says mpirun --help and then mapping, and these |
|
|
37:36 | the flags that you can use here mapping. What it means is when |
|
|
37:43 | have more than one process, mapping tells the runtime where |
|
|
37:50 | the next process should be spawned or bound to. So let's |
|
|
37:56 | if I specify mapping as core, the first process will be spawned or bound |
|
|
38:03 | on core 0, and the next process will be bound to |
|
|
38:08 | next core, as in the next physical core. If I specify mapping as socket, |
|
|
38:16 | let's say you have two sockets, then the first process will be bound to |
|
|
38:20 | zero, and then the second process will be bound to socket one. |
|
|
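The difference between mapping by core and mapping by socket can be pictured with a toy placement function. This is a simplified model under assumptions, not how any MPI runtime is implemented: `place_rank`, `map_policy`, and `placement` are hypothetical names, every socket is assumed to offer the same number of cores, and a real launcher also has to deal with uneven allocations and oversubscription.

```c
#include <assert.h>

typedef enum { MAP_BY_CORE, MAP_BY_SOCKET } map_policy;

typedef struct {
    int socket;  /* socket index the rank lands on */
    int core;    /* core index within that socket */
} placement;

/* Toy model of where consecutive ranks are placed under each policy. */
placement place_rank(map_policy policy, int rank,
                     int nsockets, int cores_per_socket)
{
    assert(rank >= 0 && nsockets > 0 && cores_per_socket > 0);

    placement p;
    if (policy == MAP_BY_CORE) {
        /* Fill consecutive cores; move to the next socket when one is full. */
        p.socket = (rank / cores_per_socket) % nsockets;
        p.core   = rank % cores_per_socket;
    } else {
        /* Round-robin over sockets; advance the core after each full round. */
        p.socket = rank % nsockets;
        p.core   = (rank / nsockets) % cores_per_socket;
    }
    return p;
}
```

With two sockets, mapping by core keeps ranks 0 and 1 next to each other on socket 0, while mapping by socket sends rank 0 to socket 0 and rank 1 to socket 1 before coming back to socket 0 for rank 2.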
38:27 | right. So mapping decides where your process ends up physically |
|
|
38:33 | on the system. Right. And then there's another. Okay. How |
|
|
38:39 | that's the minding. Right. Which physical location out of your sockets or |
|
|
38:48 | trust. Even where your process will up right. Where it will execute |
|
|
38:58 | is a there is a default value everything. Right? So, if |
|
|
39:02 | don't specify whatever the default is for one time that has been uh configured |
|
|
39:08 | he's calling FBI packages that will be generally it's either core or socket or |
|
|
39:14 | . That makes sense. And for applications. Alright, I'll show you |
|
|
39:21 | that looks like I'm not. I'm to get into that before I get |
|
|
39:25 | it. I was just giving us definitions there. And so that was |
|
|
39:30 | binding means: what all locations can your process move between? That is, where can |
|
|
39:37 | the operating system move your process once it is spawned. So |
|
|
39:44 | you as you remember, the operating has the authority to move your processes |
|
|
39:49 | places in the uh in the So if you let's say specify binding |
|
|
39:56 | core, that means the process will stay only on that particular core for |
|
|
40:01 | entire execution period. The operating system cannot move it. If you specify binding as |
|
|
40:09 | , that means the process can move among all the cores inside that socket. |
|
|
40:14 | operating system has the authority to move the process onto any of the cores |
|
|
40:19 | that in that socket. So let show you a couple of examples then |
|
|
40:23 | uh finish with that. Okay? , so let's say Okay, |
|
|
40:36 | with mpirun, if you want to see the mapping, you just need |
|
|
40:40 | provide this --display-map flag; then you need to provide first your mapping |
|
|
40:46 | was okay, that's not good. , just look at my allocation. |
|
|
41:02 | , Sorry? Yeah. Okay, you. Yeah. Okay. All |
|
|
41:17 | . Mhm. Okay, I'm a , this place play up manna by |
|
|
41:27 | this I will choose as good for , um I'll also choose the finding |
|
|
41:34 | the core And then I was on asked for 65 last year, 65 |
|
|
41:42 | uh and then execute that. maybe I there we go. |
|
|
41:52 | so what this output shows you here okay. It will be fine if |
|
|
42:01 | make these things smaller, will actually it more readable I think. Uh |
|
|
42:09 | I know it looks small but it make sense when I show you the |
|
|
42:12 | . Yeah. Yes. So I said map by core. Yeah, I |
|
|
42:23 | it's hard to see but yeah, with me here. Okay. And |
|
|
42:29 | this case apparently I got between these uh square races. That's only one |
|
|
42:36 | here. As you can see. that means I only got one core |
|
|
42:40 | in the first socket when when someone me access to it. And then |
|
|
42:45 | got a bunch of course in the socket. That's between these two square |
|
|
42:50 | here. So because I specified map by core, that means each consecutive process |
|
|
42:58 | up on the next physical core. You can see from this that |
|
|
43:04 | up on these locations. So each process is bound |
|
|
43:10 | to the next available core. And using bind to core, it also |
|
|
43:16 | you that the processes are only bound to a single core. It's not going to |
|
|
43:22 | to any other physical core or any socket here. Right. And so |
|
|
43:27 | all of these 65 processes that I spawned ended up getting bound like |
|
|
43:36 | So all the way up till the end of the second socket, one |
|
|
43:40 | by one core. All right. difference, I will show you now |
|
|
43:50 | if I say map by socket and this case, because I'll show you |
|
|
43:59 | I'm doing this in this case, processes will be handed to the same |
|
|
44:05 | core in this case. So, also need to tell it that you |
|
|
44:10 | allowed to overload a core. That means you are allowed to have |
|
|
44:18 | multiple processes on a single core. So, you need to tell |
|
|
44:22 | otherwise it gives you an error that are trying to map multiple processes to |
|
|
44:26 | same physical core. So this flag lets you do that: core:overload-allowed. |
|
|
44:32 | All right. So here you can see, because I |
|
|
44:40 | map by socket, the first process went to socket zero. The second |
|
|
44:47 | went to socket one. But, rather than going to the next |
|
|
44:52 | and socket, second socket For the process, the third process now came |
|
|
44:58 | to socket zero. Yeah, we are mapping by socket. Earlier, |
|
|
45:07 | we were mapping by core, so that each process went to the next available physical |
|
|
45:14 | Now, because we were doing map socket. Now, you can see |
|
|
45:20 | uh there are uh the next process up on the previous uh relative side |
|
|
45:27 | sockets here. Does that make Because we're happy in my pocket. |
|
|
45:35 | , so as I said, mapping where does your next uh process ends |
|
|
45:40 | on the physical resource? It it goes to the door in the |
|
|
46:00 | next mapping place. So even though said mob eye socket, it went |
|
|
46:07 | zero on first uh for the first And it went to stop zero and |
|
|
46:13 | one. Now, now, because I only got access because of |
|
|
46:19 | one core. The processes live on the same core in socket |
|
|
46:23 | . The third process, but the process actually had access to uh access |
|
|
46:29 | places. Uh, one more core on top of zero. So that's why |
|
|
46:35 | rather than going to the same core on socket, socket one, it went to |
|
|
46:39 | next, next available core on socket one. So the process went to core |
|
|
46:45 | , 0 on socket one. Process went to core one on socket |
|
|
46:50 | . So whichever available core is there, it will go, |
|
|
46:58 | will be, it will be the time you'll see uh this process uh |
|
|
47:04 | to come back to you and you with all all the available on that |
|
|
47:11 | ? Uh, no longer than them, and back to core zero on socket 1. |
|
|
47:16 | if it has more processes to map. Yeah. Is that making sense? |
|
|
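The placement pattern described above can be sketched as a toy model. This is only an illustration of the behaviour narrated in the lecture, not Open MPI's actual mapping algorithm; the function names and the allocation of one core on socket 0 and 64 cores on socket 1 are assumptions made for the example.

```python
def map_by_socket(n_procs, cores_per_socket):
    """Toy model: hand ranks to sockets round-robin. Inside a socket, take
    the next allocated core and wrap around (overload) when they run out."""
    next_core = [0] * len(cores_per_socket)
    placement = []
    for rank in range(n_procs):
        socket = rank % len(cores_per_socket)
        core = next_core[socket] % cores_per_socket[socket]
        next_core[socket] += 1
        placement.append((socket, core))
    return placement


def map_by_core(n_procs, cores_per_socket):
    """Toy model: fill the allocated cores in order, socket after socket."""
    slots = [(s, c) for s, n in enumerate(cores_per_socket) for c in range(n)]
    return [slots[rank % len(slots)] for rank in range(n_procs)]


# Assumed allocation: 1 core on socket 0, 64 cores on socket 1.
allocation = [1, 64]
by_socket = map_by_socket(4, allocation)
by_core = map_by_core(4, allocation)

for rank, (socket, core) in enumerate(by_socket):
    print(f"map-by socket: rank {rank} -> socket {socket}, core {core}")
for rank, (socket, core) in enumerate(by_core):
    print(f"map-by core:   rank {rank} -> socket {socket}, core {core}")
```

In this model ranks 0 and 2 share the single core on socket 0, which is the situation that needs the overload permission, while ranks 1 and 3 take consecutive cores on socket 1.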
47:33 | . Yeah, so shocking, You know what, what is it |
|
|
47:40 | Slurm gave me access to only one now, because I asked for 65 |
|
|
47:46 | on these nodes, you have 64 on the socket. So I got |
|
|
47:50 | got one core on socket zero and 64 on socket one. |
|
|
48:01 | correct. Yes. If I had all the 128 cores. I did |
|
|
48:05 | because again on Bridges I'm getting long wait times because of that. But |
|
|
48:10 | if you were requesting all 128 cores, then you would have seen two |
|
|
48:15 | lines going all the way towards the . Right? Yeah, but |
|
|
48:24 | And just one last example I'll show, then I'll let Dr. Johnson continue. |
|
|
48:30 | was it? Yes, map eye and trying to start it as |
|
|
48:36 | So it was binding to cores, so inside the socket the process can |
|
|
48:43 | between uh entirely inside. Just want that. There's no movement at |
|
|
48:47 | basically. Oh, so if I bind to socket, then the |
|
|
48:53 | looks something like this. I know a little bit hard looking, but |
|
|
48:56 | guide you through it. All Yeah, Yeah. Mhm. All |
|
|
49:02 | . So, this here is processed . Which again, there's only one |
|
|
49:08 | from, it's mapped and bound to socket zero. Yeah, is |
|
|
49:15 | Yeah, in this case it would better area. Yeah. Okay. |
|
|
49:25 | that will make sense. The scroll here. So, so uh zito |
|
|
49:35 | core on the, on the socket zero. But if you look at process |
|
|
49:40 | which is here I think. And yes. So it's bound to |
|
|
49:52 | the cores that were available on, on the second socket. So mapping |
|
|
49:57 | that socket. So each process goes to an alternate socket. But now the |
|
|
50:02 | are bound to a socket rather than a core. So the operating system |
|
|
50:07 | can move the process between any of the physical cores there. So depending on |
|
|
50:15 | you want to do in your application, mappings and bindings can be |
|
|
50:20 | huh deciding, deciding factor in what you get and that's you. |
|
|
50:28 | Mm hmm. Yeah. Let's see this makes sense. So what I |
|
|
50:36 | was run the, uh, the broadcast operation I showed you with 100 megabytes of |
|
|
50:42 | exchange to be done. So this is the execution call that I performed. |
|
|
50:47 | now if you see my assignment, um, in between, uh, two processes you |
|
|
50:54 | see the difference and now you can what happens when you go to. |
|
|
51:01 | Still I'm on the same node, only communication is open between those |
|
|
51:06 | So even with different sockets at least minimum time is at least three times |
|
|
51:11 | less than you can follow them. maximum tender. Uh huh. Uh |
|
|
51:25 | . Here also the process, the time very close to that. |
|
|
51:35 | Yeah, 100 is really not the . All right. Yeah. |
|
|
51:44 | I credit. Oh exactly. Uh Yeah, that's pretty much what |
|
|
52:02 | have, Yeah, yeah. Oh, mhm Yeah, can't do |
|
|
52:19 | . Mm hmm. Mhm Right. , especially him up over the final |
|
|
52:28 | . I will. Mhm Yeah. . And it's on us so greatly |
|
|
52:34 | time and it should be on the represents in terms of projects, upload |
|
|
52:44 | again, if it's not there, think they all done this, that's |
|
|
52:51 | Change. The old one. Is still? Yeah, yes, that's |
|
|
52:58 | I was up to. The deadline on top here. So I the |
|
|
53:04 | to send it up and and since started to talk about it and it's |
|
|
53:08 | to get started early that I wanted be this project description and went |
|
|
53:17 | I think today, no, the structures, so it's not very onerous |
|
|
53:24 | do the project description is just, just think about it. Um and |
|
|
53:30 | there's a bunch of texts here. but the point is um we need |
|
|
53:37 | write no more than a page in terms of what you want to do, what resources |
|
|
53:48 | need to do the project and then dramatic 16 years clusters or computers being |
|
|
53:55 | in the class and even have their and I want to reject it, |
|
|
54:00 | not something that is in process and , there's a good chance you can |
|
|
54:05 | it. And then once I mentioned time, do you need to describe |
|
|
54:12 | data sets you are going to use and how you're going to verify correctness and then |
|
|
54:25 | , so part of the thing is common among students are too and data |
|
|
54:33 | that they have in mind is too small to make sense of, to start to |
|
|
54:40 | out either parallelize or trying to tune the running time, the milliseconds and |
|
|
54:47 | doesn't matter what to do, you're going to measure over. So it |
|
|
54:52 | to be enough of a workload to make some decent observations about performance. |
|
|
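One way to check that a workload is large enough to measure is to grow the problem size until a single run takes long enough. The following is a minimal sketch under assumptions: `time_workload`, `sum_of_squares`, the doubling strategy, and the threshold value are all illustrative choices, not something prescribed in the lecture.

```python
import time


def time_workload(work, n, min_seconds=1.0):
    """Double the problem size n until one run of work(n) takes at least
    min_seconds, then return the size used and the measured time."""
    while True:
        start = time.perf_counter()
        work(n)
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return n, elapsed
        n *= 2


def sum_of_squares(n):
    """Stand-in workload (hypothetical): replace with the real kernel."""
    return sum(i * i for i in range(n))


# A small threshold keeps the demo quick; for real measurements a threshold
# on the order of seconds is the point made above.
n, elapsed = time_workload(sum_of_squares, 1_000, min_seconds=0.05)
print(f"problem size {n} ran for {elapsed:.3f} s")
```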
55:02 | that's what I'm trying to get out of the description. So I'll give your |
|
|
55:07 | but it's a final guy. I you to just our ideas and that's |
|
|
55:14 | much I think all these taxes and I'm here in the long list |
|
|
55:22 | projects students in this class over the years have done, um, some are and it |
|
|
55:31 | basically what the students' interests have been. Uh huh. So it comes from |
|
|
55:40 | different aspects of different applications or some it is more just there have been |
|
|
55:49 | fifties, uh huh. You might interested in another something where this aspect |
|
|
55:57 | it and there's something three slides So there's lots of things to |
|
|
56:03 | Some of it comes from engineering scientific and someone with the systems come from |
|
|
56:10 | image based applications and um thank you some, you know, competition |
|
|
56:21 | second physics or dynamics or what many and disciplines. So these size, |
|
|
56:32 | are kind of the old examples, they should be on the website of |
|
|
56:36 | problem again and so on. This is just to give you an idea. |
|
|
56:42 | right. Yes. And I'm sure they have things in mind that do |
|
|
56:48 | classes or from it or thesis that this system vote from the class |
|
|
56:55 | perfect. The violence of work on them next to the donald. |
|
|
57:00 | That's about me. We're gonna do you're interested in. Mr um so |
|
|
57:11 | can do it. You know, more than mp mp. I clusters |
|
|
57:15 | A C or whatever. Uh programming and and choose choose spine is known |
|
|
57:23 | the appropriate for the project. And as we said in the last |
|
|
57:32 | that the focus is what to learn to the techniques teaching in the |
|
|
57:44 | If you get fabulous performance in the sense of close to peak performance or high |
|
|
57:56 | , that's terrific. But even if you don't, as long as you understand |
|
|
58:01 | you have to fix it, uh uh that's fine too. So long |
|
|
58:12 | . And a high performance is not fraction of deep, high efficiency |
|
|
58:17 | not always easy, most of the time it's not. But the |
|
|
58:23 | of the class is that you should have an idea how to approach it. And |
|
|
58:27 | can tell you no one steps from first inclination what, but, you |
|
|
58:32 | , efficiency of that and then where are, it's time to wrap up |
|
|
58:37 | project. And, um, and then see what else you would have done if you |
|
|
58:43 | more time. So it's yeah, guess I did say that, did |
|
|
58:49 | say, I think that's the beginning the chorus sets, um the written |
|
|
58:56 | on the presentation, so the exam , final presentation time and we want |
|
|
59:02 | get both the written report and I want to say at least 24 |
|
|
59:10 | at the end of the presentation, everything to cut it preparing for feedback |
|
|
59:15 | time from the presentation. Mhm. the I think you said on the |
|
|
59:25 | one here a check, but still the example is on december time. |
|
|
59:35 | it's a monday, I have it to fight. So it's actually, |
|
|
59:41 | think the first thing they sound Oh, but I also know that |
|
|
59:52 | uh students take off of the holiday and if the class of the whole |
|
|
60:04 | to earlier, you can certainly do process, but that's what we kind |
|
|
60:12 | agreed upon the students in the past to what happened for. So I |
|
|
60:25 | questions in relation to projects, that's last time, you know, I |
|
|
60:35 | the individuals projects but to his final , some type of students, Children |
|
|
60:43 | could come back 13 smile, that very nice. The Syrians. |
|
|
60:51 | that's fine. Yes, you, some students have done an MPI and |
|
|
60:59 | OpenMP here, or OpenMP and OpenACC and so on, or the serial versus |
|
|
61:09 | , it's fine. So I just to understand we have whatever problem we |
|
|
61:17 | uh to understand. What efficiency is cold yet. Um it's when it's |
|
|
61:33 | good from the first information, not look under cold. What steps did |
|
|
61:38 | take and why did you take them? And did it actually pay off? |
|
|
61:45 | . So we have talked about some it, I haven't talked about all |
|
|
61:49 | much yet, but the talks a bit, at least some of the |
|
|
61:54 | to write the check to the memory and try to act efficiently. So |
|
|
62:02 | is a little bit, I'll talk about that turns out four months structure |
|
|
62:08 | Soul. The thing we didn't talk was a couple of OpenMP |
|
|
62:14 | you are to share, divide up work among threads and how to also |
|
|
62:25 | of manage memory in the sense of having private variables versus shared variables, race conditions, |
|
|
62:31 | it's also memory utilization those kind of . We talked about talk a little |
|
|
62:38 | from the used GPU that's worried about transfer between the two and my complaint |
|
|
62:42 | producing and they spread a little bit the demos that compilers have gotten better |
|
|
62:48 | last year, whether you use the managed versus non-managed memory yourself and you |
|
|
62:55 | try and see if it can be . Uh that's sort of uh I |
|
|
63:03 | play around, but uh scalability in of others escaped in a number of |
|
|
63:09 | you use or um number of threads also in terms of the problem size |
|
|
63:21 | for manifest the common things wow. the comments of americans, process |
|
|
63:32 | yes, process binding being talked about. Both OpenMP and MPI |
|
|
63:41 | and in particular Mattis for those, know, as shown uh if you |
|
|
63:51 | few processes and pretty much everything is on one or two nodes, frankly it |
|
|
63:57 | doesn't matter too much. And she to reduce the number of nodes. |
|
|
64:04 | , so in the election, find can do it. I'm not to |
|
|
64:10 | whatever Since nowadays there's a fair number course and there's no also say I |
|
|
64:16 | to use 100 or whatever the number of cores, and you can use it on |
|
|
64:21 | minimum number of nodes or you can spread it thin. Yes it did. |
|
|
64:26 | will take only a few times to a bunch of nodes instead of one or |
|
|
64:30 | courses now but you can experiment and the, uh huh, spreading things out |
|
|
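The choice between packing processes onto the fewest nodes and spreading them thin can be made concrete with a little arithmetic. This is a rough sketch: the helper name `nodes_needed`, the 128 cores per node, and the 25 processes per node used for the spread case are illustrative assumptions.

```python
import math


def nodes_needed(n_procs, procs_per_node):
    """Number of nodes required when each node hosts procs_per_node processes."""
    return math.ceil(n_procs / procs_per_node)


cores_per_node = 128   # assumed node size (two sockets of 64 cores)
n_procs = 100          # the "100 or whatever" processes from the discussion

packed = nodes_needed(n_procs, cores_per_node)   # fill each node completely
spread = nodes_needed(n_procs, 25)               # only 25 processes per node

print(f"packed: {packed} node(s), spread: {spread} node(s)")
```

Packing keeps communication on one node, while spreading leaves each process more cache and memory bandwidth; which one wins depends on the application, so it is worth measuring both.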
64:39 | So that was, I think I about it in terms of connectivity in |
|
|
64:44 | of the scatter or our contacts are with different names of the things minimize |
|
|
64:51 | number of nodes you use for the number of threads that you want to use, processes |
|
|
64:57 | you want to spread it out and spread it out to the advantage because |
|
|
65:02 | advantage case for the number of Uh huh Sometimes that's a good |
|
|
65:08 | Sometimes the gun culture cluster, sometimes extra communication as those that otherwise you |
|
|
65:15 | the game became then we're done with communication that manage it from the |
|
|
65:22 | Also fixed and also spreading moving selective . But we're combat things again because |
|
|
65:39 | Level one and on both Bridges Center some speed 11111 to our private and |
|
|
65:51 | . So that means how you might sometimes they're sharing, compete for the |
|
|
65:59 | resource and it's better to spread things again that a memory about that he |
|
|
66:05 | up but it also can result in one so cache lines bouncing around and |
|
|
66:13 | forth, false sharing. So that can degrade performance even if the intention was good. So |
|
|
66:19 | some things again, playing around with . And I'm just so again, |
|
|
66:27 | understanding of some of the tools brought , but it works for the application |
|
|
66:34 | a hand stuff. All right. were also comments and seeing them past |
|
|
66:47 | in europe with an application make justice times. Remember the access by them |
|
|
66:56 | have a significant empire. So the applications in china. Yeah. |
|
|
67:03 | I have a reputation. No. huh. Yeah, memory bound provides |
|
|
67:11 | stream or fixed artists that you have your assignment. What kind of situation |
|
|
67:19 | devices, execution that's not the Keep the change. Well mm |
|
|
67:27 | Find an application of it because it not do what? Optimize the application |
|
|
67:34 | that for shoes. Oh, uh . So coming insane when the other |
|
|
67:50 | needs extreme groups. And the point to, so in that case, |
|
|
67:55 | , about the memory system, I a fairly arbitrary actually planning a number |
|
|
68:01 | distribution of performing well compared to something with the race so that we also |
|
|
68:09 | something that could potentially be a Winchester playing on and see the difference between |
|
|
68:17 | does make its operations as far as operation, different potential structural things. |
|
|
68:27 | huh. Again, good performance at sparse matrix. Um, and again |
|
|
68:37 | and other things so forth. Whenever is some good um, hold up |
|
|
68:46 | open source or it's even if it's some vendor library installed on systems, |
|
|
68:53 | have access to encourage students to invest use those that are presumably highly engineered |
|
|
69:02 | get good performance. That's a reference . Compared to what we're talking about |
|
|
69:09 | when it comes to 50 service. , sign for the rentals or I |
|
|
69:17 | corresponding math library. You are well and it doesn't mean that you can't |
|
|
69:24 | them. But but in general they work really well. So it's a |
|
|
69:30 | sense of getting the sense for our cold complexion. What could be expected |
|
|
69:41 | um, you can't always expect, yes, 100% efficiency. Well, |
|
|
69:51 | kind of ridiculous statements of most But um, in a certain computations |
|
|
70:00 | simply can't do it, bring it because matrix operations and genetics, |
|
|
70:07 | So major protector as a balance. multiplies is perfect for what dominates today |
|
|
70:13 | a fused multiply-add architecture. In a single instruction, you do |
|
|
70:18 | on supplying that. But if you don't have the same number of adds and multiplies, |
|
|
70:27 | , they can't benefit from that. there is no way that you're going |
|
|
70:32 | get that. People look at the line size to there was sort of |
|
|
70:36 | . If this doesn't hold that this the best performance. You can. |
|
|
70:41 | . So then they go that's realism on the application. What's best for |
|
|
70:48 | maximum performance is that You mentioned that and FFTs. The standard model. |
|
|
70:56 | is a complex. The complex at doesn't have the same number of |
|
|
71:01 | and multiplies because of the complex operations. It's, what is it, four multiplies |
|
|
71:06 | and six adds. So if you define peak performance, assuming you can do |
|
|
71:12 | multiplies and adds at the same time, there is not enough multiplies to match up for |
|
|
71:17 | adds. There's no way you're going to get 100% efficiency and there's nothing wrong |
|
|
71:22 | it. But then you need to set your expectations lower. So that |
|
|
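The argument about unbalanced adds and multiplies can be turned into a simple upper bound. The sketch below assumes a simplified machine model in which each cycle can issue at most one multiply and one add, so peak is two floating-point operations per cycle; the function name is hypothetical. The counts of four multiplies and six adds correspond to a radix-2 complex butterfly (one complex multiply plus a complex add and a complex subtract).

```python
def fma_efficiency_bound(multiplies, adds):
    """Upper bound on the fraction of peak under the assumed model: every
    cycle offers one multiply slot and one add slot, and a slot that cannot
    be paired with the other kind of operation stays idle."""
    cycles = max(multiplies, adds)          # the scarcer pairing sets the pace
    return (multiplies + adds) / (2 * cycles)


butterfly = fma_efficiency_bound(4, 6)      # complex butterfly: 4 mul, 6 add
balanced = fma_efficiency_bound(8, 8)       # equal counts, e.g. a matrix kernel

print(f"butterfly bound: {butterfly:.3f}")
print(f"balanced bound:  {balanced:.3f}")
```

Under this model the butterfly cannot exceed 10/12 of peak no matter how well the code is tuned, while a kernel with equal adds and multiplies can in principle reach 100%. Real hardware adds further limits (dependencies, memory traffic), so this is only a ceiling for setting expectations.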
71:29 | you know, and maybe you can for instructions are best about debts and |
|
|
71:35 | do that is just that Make up get to the 10 opposite tastes. |
|
|
71:40 | complex mixed response invitation. So so they should start, you know, |
|
|
71:47 | you can get close to that number it comes to be about 70% and |
|
|
71:52 | still doing I don't know something. then um so it's again try to |
|
|
72:05 | the call them how to relate to architecture and what souls checks can |
|
|
72:12 | Uh huh. The gap right. to write, you know, drive |
|
|
72:20 | sports car figure out over the Yeah. So I mean 1 1 |
|
|
72:38 | to think about it is rather than jumping into parallelizing for other applications. |
|
|
72:46 | . About the members. The way start is the way we started doing |
|
|
72:51 | is can I do things that no a single first step? But then |
|
|
72:56 | my problem is that I don't have compute resources to get good efficiency, |
|
|
73:03 | there it would make sense parallelizing the otherwise. Uh huh. Uh The |
|
|
73:14 | class act. Thanks. Are we the structure of the Yeah. |
|
|
73:26 | Right. Mhm. I'm trying what's the video practice much. Uh |
|
|
73:33 | Well, what's up? They're not much I'm good for? Yeah, |
|
|
73:40 | can. Right. Yeah. So yeah, that's something I |
|
|
73:50 | students to start with, trying to figure out, well, uh, how much work did |
|
|
73:57 | ? But what is the work of moment? A long time we take |
|
|
74:02 | terms of are really successful. And then it should be in it |
|
|
74:07 | the order of seconds at least total that I am in the best possible |
|
|
74:12 | initially. And they were translated to with the measure things any may end |
|
|
74:18 | taking hours in the end, but a starting point to make sure you |
|
|
74:23 | have too small data. Okay. Thank you so much. Well, and |
|
|
74:32 | recording. Stop recording. Mm All right. |
|