© Distribution of this video is restricted by its owner
00:00 | before. So, um, today's essentially focused on what it is like |
|
|
00:17 | work? Uh, and they said the way you will be working for |
|
|
00:26 | assignment and most likely the project as . So and for that, there |
|
|
00:36 | a lot of background. Ah, . I would say that is |
|
|
00:41 | too. Uh, get used to it's, um, what critical in |
|
|
00:49 | these environments are used, as well for the assignment you're going to do |
|
|
01:00 | binding and many of you may I'm these use trail environments. But |
|
|
01:07 | most of the time, students are not familiar with it. So |
|
|
01:13 | talk quite a bit about resource managers, in particular one known as Slurm. And then |
|
|
01:23 | will go through some of it slowly, some of it quickly, and then |
|
|
01:29 | . Josh will do a demo that , I guess hopefully give you some |
|
|
01:40 | feeling for how things actually work in shared environments. Then I |
|
|
01:47 | either I or so I will talk about module commands, which is |
|
|
01:54 | common, is used in the shared . It's a command that best will |
|
|
02:00 | you to load software environment or a of software that is useful for |
|
|
02:08 | but it's not necessarily accessible unless you this module command. So things will |
|
|
02:15 | the paths to the various software . And I will talk about timers because |
|
|
02:22 | of the assignments you will do I need you to figure out what |
|
|
02:30 | the code, because this class is about performance. So for performance, |
|
|
02:35 | need to understand how resource is still , one of the most basic |
|
|
02:41 | is timers. And it turns out not to be all that simple to do |
|
|
02:45 | good job in timing code, and then simple hints about how to collect |
|
|
02:54 | and how to report it for the assignments. And again, for |
|
|
03:00 | course all this time, this is on development understanding. Have a |
|
|
03:07 | behaves on the detector platforms. You're more soul than for you too cold |
|
|
03:13 | the platform sounds game. The incidence you know what to do to get |
|
|
03:18 | good performance. All right. Delancy all right. So some of you |
|
|
03:33 | or ever gone ahead, a sports last lecture to get yourselves accounts. |
|
|
03:43 | , do the platform so I think may have omitted attack on the horse |
|
|
03:48 | , but it is on this So three of you before costs already |
|
|
03:58 | been used for a council. Both and, uh, I have confirmed |
|
|
04:03 | you have account access to the camp attacked Account for the class force approved |
|
|
04:10 | . So you remain this offer that that useless to the account may not |
|
|
04:18 | Run yet. So But tomorrow I'm sure you will have access to the |
|
|
04:23 | . Can't, um, in a so in conventional last time that the |
|
|
04:35 | Pittsburgh Supercomputing Center has its computer or cluster known as Bridges. And |
|
|
04:40 | the one it would have account? , they kind of portal or than |
|
|
04:48 | through which you get access. system is known as XSEDE, for the |
|
|
04:53 | Extreme Science and Engineering Discovery Environment, which is an NSF-funded entity. |
|
|
05:00 | TACC is, in fact, also part of XSEDE. Um, so XSEDE is |
|
|
05:06 | an umbrella organization, with a list of centers participating in XSEDE. And when |
|
|
05:16 | comes to TACC, that is the Texas Advanced Computing Center. We kind |
|
|
05:21 | have a little bit of the privilege in Texas so we can get access |
|
|
05:31 | the Texas institution separately from its So part of the reason why I'm |
|
|
05:37 | it's not listed as an XSEDE center is that we use the Texas privilege to get |
|
|
05:42 | accounts, you know, through the XSEDE organization. Um, so this |
|
|
05:53 | something on a pretty much already sent . Works that, um you just |
|
|
06:03 | to have their own account and then use of them get linked to possibly |
|
|
06:11 | different projects in the North. you. I will link you to |
|
|
06:15 | accounts. And some students may already have accounts from these various institutions, and |
|
|
06:21 | all fine. They should not be . And the one thing I do |
|
|
06:27 | for kind of honesty used the class , four class assignments and projects, |
|
|
06:35 | not for whatever signs you normally be institution. No problem in getting accounts |
|
|
06:41 | that as well. But we get TV educational cool and more fairness. |
|
|
06:50 | should not abuse it. Anyone that not have seen clusters are put in |
|
|
07:00 | kind of picture of both homegrown and professionally built clusters. And I'll talk |
|
|
07:06 | little bit more about that. Just nobody would never seen. Wanted for |
|
|
07:12 | to look long when it turned Just log in remotely. Someone doesn't |
|
|
07:17 | know what the physical thing may look . It is gonna live with more |
|
|
07:24 | all and again on the left side is a kind of homegrown thing, whereas |
|
|
07:28 | on the right side you kind of see one that is professionally put together |
|
|
07:34 | . And we'll talk more about actually put together this just to give |
|
|
07:39 | guys a notion of what it might in real life. So |
|
|
07:47 | nomenclature that is important for using the resource manager as well as |
|
|
07:57 | understanding your codes. And some of it is captured in the figures. |
|
|
08:06 | Unfortunately, some of it is somewhat . So I try to be consistent in |
|
|
08:14 | class, Um, and this slides of going from the top Proton for |
|
|
08:24 | bottom. It is a little bit a bottom up. You off what |
|
|
08:30 | going to use. So what you in the upper left hand corner is |
|
|
08:35 | a photo of what's known as a . This is a piece of silicon that has all the |
|
|
08:44 | logic and caches and whatnot on it. Uh, that's what |
|
|
08:56 | I will refer to as a processor, or it used to be, in |
|
|
09:03 | good notion. CPU Central Processing Unit back when there was only one processing |
|
|
09:09 | at most in the beginning, that not in abundance portrait that was put |
|
|
09:16 | on the circuit board. Nowadays, have many cores. Even things in |
|
|
09:23 | phone or laptop tend to have a few cores, even if they may not |
|
|
09:28 | that many. So this piece of that is more in the company's like |
|
|
09:39 | , Intel and IBM design and produce. They package it |
|
|
09:47 | And what's kind of this process is you let you see in the |
|
|
09:52 | that thing then gets plugged into what's known as a socket on the circuit |
|
|
10:01 | So on the right hand side in top row, you see sort of |
|
|
10:06 | instance on the circuit boards and somewhere the red green years as under upped |
|
|
10:12 | basically the processor Hyzy and they when comes to clusters, one returned to |
|
|
10:23 | to these individual PC's, which could or servers just a normal things when |
|
|
10:30 | go to a website and trying to something you can order a PC over |
|
|
10:33 | server on. That's usually then something the circuit board, possibly packaged in |
|
|
10:44 | couple of different ways, as so-called rack units or blades. Rack units and blades are |
|
|
10:53 | really just for your information, it's something that you will actually need to |
|
|
10:59 | for your class. But understanding cores and nodes is essential. Then, |
|
|
11:10 | when it comes to the things that are being put together to form clusters, |
|
|
11:15 | one either uses rack units or blades. Rack units, they get |
|
|
11:23 | mounted into racks, in the lower left corner, whereas blades get first |
|
|
11:32 | in what's known as the chassis, and then the chassis goes into the |
|
|
11:39 | We won't go into too much, we'll talk a little bit more about |
|
|
11:45 | . Many let Chris down there into glass on the reason for blazes |
|
|
11:52 | They tend to be more and efficient than just using rack units |
|
|
11:59 | though it's the same processors being . It has to do with |
|
|
12:03 | packaging and part of The reason that a more efficient is that come in |
|
|
12:10 | as your point of view is that chefs is allowing it to her if |
|
|
12:18 | , because then the cooling infrastructure is for all blades at once, and |
|
|
12:24 | fans tend to be more efficient than small fans. The Finn fits in |
|
|
12:29 | rack units. Anyway, the take-home message is that you need to |
|
|
12:35 | these processors, cores and nodes; the rest of it is just general |
|
|
12:41 | for how things are put together. is what kind of more functionally a |
|
|
12:49 | looks like. So, um, kind of a schematics, I |
|
|
12:58 | on the right hand side of this , um, so clusters on up |
|
|
13:04 | put the homogeneous toe have different All knows. So there is a |
|
|
13:11 | login node, the one you connect to when you try to use a cluster, and |
|
|
13:19 | it's not just a single login . It may be single if it's |
|
|
13:23 | crossed over a few users, but ah, cluster, that sort of |
|
|
13:28 | of users, like Stampede2 at TACC or . There's usually a few of |
|
|
13:35 | but not that many. So the thing the login nodes are supposed |
|
|
13:41 | do is basically to allow you to to the cluster, so you're not |
|
|
13:50 | supposed to do anything on and the mistake off. I was saying new |
|
|
14:01 | , which is the case for many . They tend to, Jin |
|
|
14:04 | Sir, Computer and in principle you do anything you want on it. |
|
|
14:10 | you shouldn't because it's not configured and not, um, there's not enough |
|
|
14:18 | them to actually support other things. this administration of uses from a logging |
|
|
14:26 | . So one thing: never compile or run codes on login nodes. Then clusters have |
|
|
14:36 | nodes that you're not really going to be, uh, exposed to. That's something |
|
|
14:41 | the system admin uses to manage the cluster. And there are compute nodes. Those |
|
|
14:49 | the ones they're going to use for and given other kinds of things. |
|
|
14:54 | your colds on the first and compute tends to not be one of a |
|
|
15:02 | , either. There are usually a few kinds in a cluster, and they |
|
|
15:10 | different in terms of them out the on the note on and as you |
|
|
15:14 | see on the Bridges cluster, they have three or four classes of |
|
|
15:22 | nowadays, with respect to how much memory is on the nodes. So they |
|
|
15:27 | regular memory nodes, which is what you have for the class, then there are what |
|
|
15:32 | call large memory nodes, and there are extra memory nodes, and then they have |
|
|
15:40 | also nodes that have GPUs on them, when it comes to |
|
|
15:48 | . Other sites may have FPGAs in their nodes as well, and then |
|
|
15:54 | separate I/O nodes. So it's to understand that login nodes, |
|
|
16:05 | they are just for login purposes. There are other nodes that you use |
|
|
16:11 | compiling and running codes. I/O, that's something again that is part of how |
|
|
16:18 | you actually build or configure your But usually users, at least for |
|
|
16:24 | class, don't need to worry about I/O nodes as well. |
|
|
16:30 | yes. This is something you can season, particularly. Some things flavor |
|
|
16:36 | what you will get on Bridges. It is the regular memory nodes I'm |
|
|
16:43 | focusing on up to 752 Children um, they Each of them |
|
|
16:54 | uh, the somewhat old, what's called Intel Haswell processors, which |
|
|
17:00 | 14 cores on them. Sorry, I got confused. That's what |
|
|
17:09 | uh, sorry. I would backtrack for a second, because these are |
|
|
17:13 | things like that I brought up at . Um, slight. So let |
|
|
17:18 | go back and try to get the slide. Ah, so for home |
|
|
17:41 | from this business. Very. You see in this lineup? |
|
|
18:57 | Okay. Sorry about that. So is There's no more specified that |
|
|
19:04 | This regular member knows their 14 uh, safety here. And it |
|
|
19:13 | what's known as a dual-socket type node. So there's two CPUs |
|
|
19:18 | in each node, and each CPU has 14 cores. Then there's |
|
|
19:25 | 128 gigabytes of memory in each node. That means the two CPUs, or an |
|
|
19:34 | implied 28 cores in the node, they do share the same 128 gigabytes |
|
|
19:40 | them, right, in the node. , then there are the GPU nodes |
|
|
19:49 | you will use for the exercise are you, and potentially in projects on |
|
|
19:56 | . If that's what you do, on they have two sets of nodes |
|
|
20:04 | GPUs. They have 16 nodes with an version of NVIDIA GPUs, known as |
|
|
20:12 | Kepler, the K80 model, and they have 32 nodes for the |
|
|
20:18 | recent GPU, that is the P100 nodes. It's still not the |
|
|
20:25 | recent GPUs from NVIDIA, they are fairly recent. On the Stampede2 |
|
|
20:35 | system there is a more recent version of Xeon than is available on Bridges, |
|
|
20:46 | it's known as Skylake. That's what they called it, Skylake, |
|
|
20:52 | like it's typical interviews. All comes naming on their processors. People look |
|
|
20:59 | but a camp alongside have basically I would say numbers the It's good |
|
|
21:09 | know for actually exactly what it but the name Skylake, or |
|
|
21:15 | full tells you a little bit more . How old than what generation technologies |
|
|
21:19 | used anyway. So these are also soften nos. On. In this |
|
|
21:26 | , there are 24 cores per socket. Um, it's a little |
|
|
21:33 | more memory on each one of them , and in this case 192 instead |
|
|
21:37 | 1 28. Um, so I that's wasa little bit just arm |
|
|
21:48 | And next I'll talk about the management software. Any questions on the |
|
|
21:57 | clusters in general or the particular processors and nodes you will be using? |
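The node figures quoted above (dual socket, 14 cores per socket and 128 GB on the Bridges regular memory nodes; 24 cores per socket and 192 GB on the Stampede2 Skylake nodes) can be turned into per-node totals with a little arithmetic. The following is only a sketch: the `NodeSpec` class and its field names are made up for illustration, and the dual-socket layout of the Skylake nodes is an assumption carried over from the Bridges description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node hardware figures as quoted in the lecture (illustrative type)."""
    name: str
    sockets: int
    cores_per_socket: int
    memory_gb: int

    @property
    def cores(self) -> int:
        # All cores in a node, summed over its sockets.
        return self.sockets * self.cores_per_socket

    @property
    def memory_per_core_gb(self) -> float:
        # Memory is shared by all cores in the node; this is only an average.
        return self.memory_gb / self.cores


bridges_rm = NodeSpec("Bridges regular memory", sockets=2, cores_per_socket=14, memory_gb=128)
# Assumption: the Skylake nodes are dual socket as well.
stampede2_skx = NodeSpec("Stampede2 Skylake", sockets=2, cores_per_socket=24, memory_gb=192)

for node in (bridges_rm, stampede2_skx):
    print(f"{node.name}: {node.cores} cores, {node.memory_per_core_gb:.2f} GB per core")
```

The memory-per-core number is a rough guide when deciding how many tasks to place on a node, since every core on the node draws on the same memory.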
|
|
22:13 | , um, so, um, is commonly used on lots of |
|
|
22:24 | not just academics or this this type but also in industry is something called |
|
|
22:33 | Slurm, which stands for Simple Linux Utility for Resource Management. It's an |
|
|
22:39 | open source piece of software on, so the little bit that works. |
|
|
22:44 | then I will fairly quickly go through a few slides about how to use Slurm. |
|
|
22:55 | this nicely basically a way for you remember the demo that so yes will |
|
|
23:03 | so and look trying to spent too time on the sites because that's reliving |
|
|
23:10 | documentation of the demo of them and other purpose so serious all these resource |
|
|
23:19 | type of things work. Um, haven't used it before. Floor |
|
|
23:26 | actually, and the one that uses puncher or Maxwell August. Your visual |
|
|
23:33 | Irma's forests and remember, but in the way you worked is you |
|
|
23:40 | to the login node over the Internet when it comes to these centers, |
|
|
23:48 | from the login nodes you then launch or submit your jobs to the |
|
|
23:56 | manager. Um, and of course, in submitting the job, you need |
|
|
24:04 | tell the resource manager what your job needs. So that means you need |
|
|
24:12 | tell it how many nodes you want, how many cores you |
|
|
24:16 | how much memory, and for how long. All of that information then |
|
|
24:23 | handed to the resource manager, which then manages the submission and |
|
|
24:34 | the job to the actual clusters. , things are not sitting idle, |
|
|
24:43 | that means jobs end up in the and get queued for some time. And |
|
|
24:50 | most situations, and certainly that is the case for both Bridges and Stampede2, |
|
|
25:00 | more than one queue, because the institutions set them up so that |
|
|
25:09 | there may be queues, as I think in Bridges, for nodes that have regular |
|
|
25:16 | memory, large memory, and another queue for GPUs. There may be |
|
|
25:22 | for short running jobs and the separate for very long running jobs. Never |
|
|
25:28 | yet another queue for jobs on a very large number of nodes, et cetera. |
|
|
25:34 | all of that is handled by the resource manager, um, and |
|
|
25:44 | never run jobs on the login node. , and sorry for harping on that |
|
|
25:50 | much, but it tends out. a common mistake than if both I |
|
|
25:56 | us tend to stress not sitting And I'm very, very quickly. |
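The resource request described above (how many nodes, how many cores, how long, which queue) is usually written down as `#SBATCH` lines at the top of a batch script. Here is a minimal sketch that only builds such a script as text. The helper name `build_batch_script`, the partition name `RM`, the time limit and the program `./my_program` are assumptions for illustration, not values confirmed by the lecture; `--partition`, `--nodes`, `--ntasks-per-node`, `--time` and `--exclusive` are standard Slurm options.

```python
def build_batch_script(partition, nodes, tasks_per_node, time_limit, command,
                       exclusive=False):
    """Return the text of a Slurm batch script for the given resource request."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",            # which queue/partition
        f"#SBATCH --nodes={nodes}",                    # how many nodes
        f"#SBATCH --ntasks-per-node={tasks_per_node}", # tasks placed on each node
        f"#SBATCH --time={time_limit}",                # wall-clock limit, HH:MM:SS
    ]
    if exclusive:
        # Ask Slurm not to share the allocated nodes with other jobs.
        lines.append("#SBATCH --exclusive")
    lines.append("")
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical request: one 28-core node for ten minutes.
script = build_batch_script("RM", 1, 28, "00:10:00", "srun ./my_program",
                            exclusive=True)
print(script)
```

The resulting text would be saved to a file and handed to the resource manager from a login node; the job itself then runs on compute nodes, never on the login node.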
|
|
26:03 | , bits here about Slurm. It's a fairly substantial piece of software. |
|
|
26:10 | Last time I looked in the tense , it was again over half a |
|
|
26:16 | lines of code that then manages the and the next lie open oil a |
|
|
26:24 | bit ahead of myself. But it's for run on potentially and has is |
|
|
26:32 | on very large systems, with easily more than 100,000 nodes. And |
|
|
26:42 | , you know, millions of threads. If you're not familiar, we'll talk more about threads |
|
|
26:46 | but they are basically execution streams, and parallel codes tend to have potentially a very |
|
|
26:53 | number, and then it manages a large number of jobs as |
|
|
26:59 | . Um, it's open source, is , you know, for Linux and |
|
|
27:06 | in most versions of Linux. , so you know, a little |
|
|
27:17 | that is critical to know you when request resource is Yes, sir. |
|
|
27:31 | have a way off the question that are the only use it all the |
|
|
27:40 | is you request. That's not necessarily case. Otherwise, um, the |
|
|
27:49 | system may choose to, for , have other jobs working on some cores |
|
|
28:00 | the node you are using, because, remember, nodes have a lot of cores. So, |
|
|
28:07 | terms of Bridges, there are 14 cores on each and a total of 28 cores, and |
|
|
28:14 | , you know, and they always decide to share some of these course |
|
|
28:21 | other jobs. So when you do timing of codes and you want reasonably good |
|
|
28:32 | , it, um yeah, stable timings. Then the timer will |
|
|
28:41 | count time that is used by jobs. But there is still a lot |
|
|
28:45 | resource management that goes on if there are several jobs running on the same node |
|
|
28:51 | , then potentially even on the same . So bunion, it's not the |
|
|
28:58 | , and you're doing code development or on some flavor. But if you |
|
|
29:02 | the timing, something you should use is what's known as the exclusive option. Um, that |
|
|
29:15 | . The rest of the thing on , like this Gibbs wants pretty apparent |
|
|
29:19 | resource management that it manager shared It is a little bit of this |
|
|
29:27 | Slurm. So there are various ways of requesting things from |
|
|
29:35 | resource manager, so on and suggestible them or some of them. |
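Returning to the point about timers and stable timings on shared nodes: one common way to get more trustworthy numbers is to time the same piece of code several times and report the spread rather than a single run. This is a minimal sketch under that assumption. The function names and the toy workload are invented for illustration; `time.perf_counter` is Python's standard high-resolution wall-clock timer.

```python
import statistics
import time


def time_function(func, repeats=5):
    """Run func() `repeats` times and return the wall-clock time of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Minimum, median and maximum of the timing samples, in seconds."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def work():
    # Toy workload standing in for the code being measured.
    return sum(i * i for i in range(200_000))


samples = time_function(work, repeats=5)
summary = summarize(samples)
print(f"runs={len(samples)} min={summary['min']:.6f}s "
      f"median={summary['median']:.6f}s max={summary['max']:.6f}s")
```

A large gap between the minimum and the maximum is a hint that something else was competing for the node during the measurement, which is the situation the exclusive option is meant to avoid.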
|
|
29:44 | useful thing is to request information s differs. Slur himself in front of |
|
|
29:51 | so you can get, with sinfo, the information about queues. You can |
|
|
29:55 | the information, but the count's and they kind of stand it. |
|
|
30:01 | You'll definitely use the srun command; that is the way, basically, to submit |
|
|
30:07 | job to Slurm. To manage there is a kind of control daemon that runs |
|
|
30:17 | on some management node. And on each one of the nodes that |
|
|
30:21 | being used for the job, there's a local, or compute node, daemon |
|
|
30:26 | possible controls Eamon about what's going on the job on. There is a |
|
|
30:36 | coming back to this notion of processors and nodes. And actually, |
|
|
30:46 | morning and this slow. There's something as partitions, so starting with the |
|
|
30:53 | right side: a partition is a grouping of nodes, as I mentioned, that there |
|
|
31:01 | then for the regular memory and olds keep you nose and it's also knows |
|
|
31:07 | grouped into partitions. So when you submit the job, you submit the job to a given |
|
|
31:16 | . There's a fair amount of flexibility in how to partition or configure partitions, but |
|
|
31:24 | not something you will be doing in class. But it's gonna be useful |
|
|
31:29 | know that that is so. Partitions need not necessarily be distinct, so some |
|
|
31:35 | may be part of more than one partition. That matters a bit in terms of, again, |
|
|
31:41 | training, son curing times and the on the left hand side of the |
|
|
31:50 | have these notions of threads, cores, and sockets or processors. So threads |
|
|
32:01 | are kind of the unit of execution that is managed by the operating system. So it is |
|
|
32:09 | sequence of instructions that then has their dedicated registers. Azaz, for |
|
|
32:21 | So Intel sometimes refers to threads in this context as hardware threads; it's not |
|
|
32:31 | me. It's an is normal because no particular harbor, um, connection |
|
|
32:40 | gets kind of petition up to some among friends for that, since it |
|
|
32:47 | resources. Some of them are unique, but not all resources are unique to |
|
|
32:57 | . So threads, they do actually execute on the cores, which is the physical |
|
|
33:04 | . And, of course, the resources. What it is, regardless |
|
|
33:10 | of how many threads you want to on a particular core. When it comes |
|
|
33:18 | to Intel, uh, I think still, all of the current products |
|
|
33:28 | support up to two threads. Sites configure if they allow what's known as |
|
|
33:37 | hyperthreading. That means more than one thread on a core, |
|
|
33:43 | but the maximum is to when it to internal. The trip for was |
|
|
33:48 | , that's nice landing that I think love for threats. Other silicon |
|
|
33:56 | AMD also uses a maximum of two threads per core. IBM, for their Power |
|
|
34:04 | processes. And now more. this was on the micro photograph that |
|
|
34:12 | early on today. Several, of , and are cooked, um, |
|
|
34:19 | common piece of silicon that is, processors are there many course, but |
|
|
34:23 | 14 from the comfortable GIs and whatever number was 24 or something on stampede |
|
|
34:35 | Socket is a name again for that thing on the circuit board into which |
|
|
34:44 | process of package, uh, gets . Now things are someone than being |
|
|
34:53 | . That is the meaning of CPU. When it comes to Slurm, a CPU |
|
|
35:00 | is kind of the smallest schedulable unit. So when it comes to Slurm, |
|
|
35:10 | a CPU is effectively a thread, whereas typically people refer to |
|
|
35:20 | a processor or CPU as the physical entity that houses the large |
|
|
35:29 | of cores. But when it comes to Slurm, a CPU is a thread |
|
|
35:40 | . So again, one needs to keep track of these notions of cores, processors or |
|
|
35:52 | and notes on petitions toe with Some of the jobs on this picture |
|
|
35:57 | or less just says forever. it is a little better and suggest |
|
|
36:07 | them of this. So that one how many knows one wants. |
|
|
36:15 | Specify how maney the hospitals this one in this job want to specify. |
|
|
36:27 | wants the tasks allocated to cores, sockets, and nodes. And not today, |
|
|
36:36 | much later in the course, we'll about why you may want to control |
|
|
36:41 | the various threads are allocated to sockets and nodes, because there are many shared resources |
|
|
36:51 | that effects the performance off your and that's already said that they're |
|
|
37:02 | So, shared resources, starting with the chip or the processor: the cores, as we'll |
|
|
37:12 | talk about later on, too. tend to have their always have their |
|
|
37:19 | execution unit. They always have their registers and then depending on the |
|
|
37:31 | Some caches are private to the but not all of them tend to |
|
|
37:37 | private to cores. Some of the caches are shared among all the |
|
|
37:43 | on the processor. Now, when you run single-processor jobs, then |
|
|
37:52 | may not matter. Exactly. The capped. I can't but normal it |
|
|
38:00 | . But when you run jobs that use many nodes, most schedulers today |
|
|
38:11 | you know care involved how the nodes connected? But the connection between knows |
|
|
38:19 | network that is used has an impact on the performance when you use multiple nodes. |
|
|
38:29 | in that case, if you Johnson is harsh number and your nose |
|
|
38:34 | demon, a few number of . The performance may vary depending upon where |
|
|
38:40 | in the network those nodes are located, and for that purpose you can specify |
|
|
38:47 | nose and want. Jemaine no has very good computation capability between the |
|
|
38:59 | the networked and also Fisher. So , flu cold has significant dependence on |
|
|
39:09 | between nodes, it may be affected by other jobs running on totally different |
|
|
39:14 | because the packets that are used for may interfere with this job. |
|
|
39:24 | then this pointed out, really course notes election there is. And the |
|
|
39:30 | day that shouldn't be. But it turns out that it's not at all uncommon |
|
|
39:36 | that even if all the nodes are supposed to be identical, the same processors, |
|
|
39:43 | the same amount of memory, same systems and same everything it happens that |
|
|
39:51 | still run at different clock rates. you will get different performance, even |
|
|
39:57 | you shouldn't expect it to be the . And someone always has to |
|
|
40:01 | ah, conscientious that if there are hard behavior, you may not necessarily |
|
|
40:07 | using a code that maybe something Um, no. So this is |
|
|
40:17 | a little bit of commands, and will go through a few slides |
|
|
40:23 | and then I will left hand. is just to try to Endemol. |
|
|
40:29 | , um yes. I already said policies and all this other kinds |
|
|
40:37 | So this had been talked about. is a few of the floor commencing |
|
|
40:42 | years in particular. I was me , maybe the council command and chasing |
|
|
40:50 | messed up a little bit and one killing the job. Uh, there's |
|
|
40:54 | info on the to command is strong . Um, there's an issue. |
|
|
41:00 | , we will not. Then we'll of them. But against that, |
|
|
41:03 | on this run and in full definitely. Um IHS again. Information |
|
|
41:14 | can get out of the sinfo command: it tends to have the number of |
|
|
41:20 | you got and this ice off the . Physical memory. This Andi, |
|
|
41:29 | of these things have thank you with also. I will not just flip |
|
|
41:35 | STAIs here because you will see these real life on But just is a |
|
|
41:43 | precepts on the left hand side in dark screenshot that you see in |
|
|
41:49 | middle of this life and attempted what patrician name is the tendency weather the |
|
|
41:55 | of that petition in this case for event there was up than attention. |
|
|
42:00 | the time limit has been given to jobs in terms of hours, minutes |
|
|
42:04 | seconds. Um tells you also ah list that has been allocated or reserved |
|
|
42:12 | the job on the ritzy news um, Best more tells you whether |
|
|
42:23 | running or something in the human. is another one from another side, |
|
|
42:32 | command that tells you that's pretty much same thing notices. The different frustrates |
|
|
42:38 | . The acronym is no reflecting in cluster and zero um, there are |
|
|
42:48 | running things interactively, and that is useful for code development. This is |
|
|
42:56 | for other things I personally would encourage to do batch submission. Write the |
|
|
43:04 | and let the resource manager handle , instead of sitting and waiting until |
|
|
43:11 | job runs. Andi thinks gets and then you can go on look |
|
|
43:16 | the but suggestible from comment on uh, so so then necessary. |
|
|
43:26 | are best documentation. Slide voice on video should also captured there. If |
|
|
43:37 | and the devil that substantial make so on time, stop here. And |
|
|
43:47 | so yes, do the demo, then I can resume one suitable. |
|
|
43:55 | I'll figure out where Bristled wants to us just down the devil, |
|
|
44:03 | Should I just go ahead and start my screen? Yeah, it will |
|
|
44:07 | Stop shirt. Okay. Okay. can everyone see my screen? |
|
|
44:18 | For the moment, it's a blank . Now I can see something. |
|
|
44:26 | , So in order to connect to clusters, you will need an |
|
|
44:33 | SSH client. On Windows, you can a client such as PuTTY or another |
|
|
44:41 | white for of necessity on Mac, can you pretty much have the message |
|
|
44:48 | mineral already on the consoles off even that now to connect to the regis |
|
|
44:56 | that you are all that you need put in using me. All |
|
|
45:05 | which is you see a seed. this will directly connect you through a |
|
|
45:17 | in north on the edges. Closer you are connecting to in the corn |
|
|
45:25 | for the Stampede2 cluster. So , in that case, you're you |
|
|
45:36 | Look, something like this. We'll a name and stamping. Dude attacked |
|
|
45:40 | utexas So these are the two ways can get connected to a login node on |
|
|
45:48 | clusters. For this demo, I'll be connecting to, in fact, the Bridges |
|
|
45:55 | So just go ahead and the biggest the a C. Now, when |
|
|
46:17 | are connected, just putting password and be connected to a lot. Clears |
|
|
46:29 | font size, right? So, as you all know, these clusters |
|
|
46:42 | based. So all the cluster, the commands that you would generally run |
|
|
46:47 | UNIX operating system you can run them . The first thing to notice is |
|
|
46:55 | then used and you noticed the You would see that if you don't |
|
|
47:00 | walking notice so that circles making nature now the simplest remind that you can |
|
|
47:07 | to get details self more see for is what the processors, processors on |
|
|
47:15 | particular node, you can use lscpu. That will give you details of what |
|
|
47:24 | available. So I'd say until it , if you that has 14 cores |
|
|
47:33 | per socket, there are two sockets, as we just saw in the slides, on this |
|
|
47:40 | and yeah, the CPU itself is from the Haswell microarchitecture from |
|
|
47:48 | And then you can see details is like the cache sizes and notes are |
|
|
47:55 | for how the course? I can it on this particular. So that |
|
|
48:00 | be part of one of your first to query the CPU. So you |
|
|
48:07 | to know what kind of hardware you're on. Great. There are a |
|
|
48:14 | more commands that you can use for the amount of memory that's available. So |
|
|
48:21 | can use cat /proc/meminfo; that will give you the amount of |
|
|
48:28 | as the DDR memory. So as can see, there's almost 128 |
|
|
48:35 | , gigabytes of memory available on the . And then there's the partitions. |
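The hardware queries from the demo (sockets and cores from the CPU listing, total memory from `/proc/meminfo`) can also be read programmatically. The sketch below parses text in the usual `key: value` layout of those two sources. The field names `Socket(s)`, `Core(s) per socket`, `Thread(s) per core` and `MemTotal` are the ones `lscpu` and `/proc/meminfo` normally print on Linux; the function names and the sample text are made up for illustration and are not output captured from Bridges.

```python
def parse_key_value(text):
    """Turn 'key: value' lines into a dictionary, ignoring other lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def cpu_summary(lscpu_text):
    """Sockets, physical cores and logical CPUs (hardware threads) of a node."""
    fields = parse_key_value(lscpu_text)
    sockets = int(fields["Socket(s)"])
    cores_per_socket = int(fields["Core(s) per socket"])
    threads_per_core = int(fields["Thread(s) per core"])
    cores = sockets * cores_per_socket
    return {"sockets": sockets, "cores": cores,
            "logical_cpus": cores * threads_per_core}


def mem_total_gib(meminfo_text):
    """Total memory in GiB from the MemTotal line, which is reported in kB."""
    value = parse_key_value(meminfo_text)["MemTotal"]  # e.g. "131072000 kB"
    return int(value.split()[0]) / (1024 * 1024)


# Illustrative sample text only, shaped like a dual-socket, 14-core node.
SAMPLE_LSCPU = """\
Thread(s) per core:  1
Core(s) per socket:  14
Socket(s):           2
"""
SAMPLE_MEMINFO = "MemTotal:       131072000 kB\nMemFree:         1024000 kB\n"

print(cpu_summary(SAMPLE_LSCPU))
print(f"{mem_total_gib(SAMPLE_MEMINFO):.1f} GiB")
```

On a real node the inputs would come from running `lscpu` and reading `/proc/meminfo`. The logical CPU count is the number that corresponds to Slurm's notion of a CPU when more than one thread per core is enabled.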
|
|
48:41 | mentioned there. You can also use command lsb_release -a. |
|
|
48:51 | will give you information about what the system this node is running. So that's |
|
|
49:02 | pretty much all that come on steady usually here, apart from the commands |
|
|
49:10 | require administrator rights. You won't be to run those because it's a shared |
|
|
49:18 | , okay, so that's the comment it is not for this particular you |
|
|
49:25 | the screen, but in general and when you are putting together a report |
|
|
49:33 | paper for publication, you should always be clear on exactly what processor model |
|
|
49:43 | this table data as well as the environment in terms of operating system and |
|
|
49:48 | and center that was used for from experiment or for your project because things |
|
|
49:56 | different also on the different operating So this type of information should really |
|
|
50:04 | every paper that tells us anything about performance. Unfortunately, it's not |
|
|
50:11 | the case, but mining. Keep in mind. Always find out what |
|
|
50:19 | versions you have and put it into reports. Papers that also cost |
|
|
50:28 | Yes, uh, right. So were general commands that you can run |
|
|
50:38 | the machine now coming through this long that has disorder, there are quite |
|
|
50:44 | few of them will show a few them that, like you will be |
|
|
50:49 | for the most assignments. And obviously the first assignment as well Eso The |
|
|
50:55 | one is the sinfo command, which is the Slurm command for getting information about |
|
|
51:01 | the partitions. So when you run this is a sort of for |
|
|
51:10 | So now on the left side, will see all the partitions. |
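The partition listing shown in the demo can be summarized with a few lines of code, for instance to count nodes per partition and state. This is only a sketch. It assumes the default column layout that `sinfo` prints (PARTITION, AVAIL, TIMELIMIT, NODES, STATE, NODELIST), and the sample rows, partition names and node names are invented for illustration rather than copied from the cluster.

```python
from collections import defaultdict


def nodes_by_partition_and_state(sinfo_text):
    """Sum the NODES column of sinfo-style output per (partition, state)."""
    lines = sinfo_text.strip().splitlines()
    # Map column names from the header row to their positions.
    column = {name: index for index, name in enumerate(lines[0].split())}
    counts = defaultdict(int)
    for line in lines[1:]:
        fields = line.split()
        # sinfo marks the default partition with a trailing '*'.
        partition = fields[column["PARTITION"]].rstrip("*")
        state = fields[column["STATE"]]
        counts[(partition, state)] += int(fields[column["NODES"]])
    return dict(counts)


# Invented sample in the default sinfo layout.
SAMPLE_SINFO = """\
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
RM*          up 2-00:00:00    700  alloc r[001-700]
RM*          up 2-00:00:00     52   idle r[701-752]
GPU          up 2-00:00:00     16    mix gpu[001-016]
"""

summary = nodes_by_partition_and_state(SAMPLE_SINFO)
for (partition, state), count in sorted(summary.items()):
    print(f"{partition:4s} {state:6s} {count}")
```

A summary like this gives a quick feel for how busy a partition is before submitting, since a job asking for more nodes than are idle will wait in the queue.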
|
|
51:14 | So these are regular memory nodes, some are large memory nodes, which have |
|
|
51:21 | memory than hurt. Those, the GPU notes and some notes with |
|
|
51:28 | are Did you use that also support . A. Here is a |
|
|
51:37 | There are the large memory nodes, but are mostly allocated for more scientific, |
|
|
51:43 | computational intensive jobs, difficult lots of and lots of processing. Follow this |
|
|
51:51 | . You can also see the time , again as we saw in the screenshot on |
|
|
51:56 | as well as what are the states of each of these nodes, if it's |
|
|
52:01 | or if it's draining or if it's guest. One of these nodes is |
|
|
52:06 | as well as a foul, so can get all the information about the |
|
|
52:12 | that are available on this cluster. the next command you can use a |
|
|
52:19 | is the squeue command. So if I run this, it's going to |
|
|
52:23 | be a long list. This is an answer to the question asked earlier. |
|
|
52:29 | Uh, look at this particular job on the 700 nodes: you |
|
|
52:40 | can see that the node number ranges are quite different, so they're not likely to |
|
|
52:45 | be next to each other. But the resource manager tries to find enough nodes, wherever |
|
|
52:50 | they might be free when the job starts. Again, unless you steer it, the nodes |
|
|
52:56 | may be, quote unquote, far apart in the network. As for the question: Slurm |
|
|
53:08 | is probably not running on the login node. It is running on a separate cluster of |
|
|
53:14 | nodes, so it doesn't interfere with user interactions directly, because it's kind of |
|
|
53:25 | monitoring and scheduling and managing the queue of potentially hundreds of users at the same |
|
|
53:34 | time. So it's not usually configured to run on a login node; |
|
|
53:39 | it runs on separate nodes. So that was the question in chat, right? |
|
|
53:50 | Yes. Okay, this is answered. If it's not, just |
|
|
53:54 | unmute yourself and speak. For some reason I can't open the chat. |
|
|
54:03 | So, okay, trying to. Um, I see. So the question is also |
|
|
54:20 | how do we communicate with Slurm? More precisely, you communicate using Slurm |
|
|
54:37 | commands. Josh, you may want to add to it, but, um, a |
|
|
54:44 | command is what you use to submit a job to Slurm, and then you |
|
|
54:50 | can use the sinfo and squeue commands. But Josh may want to talk more to the |
|
|
54:55 | point of how you interact with Slurm. Any more questions? Yeah, I |
|
|
55:03 | think you pretty much answered it. There are dedicated nodes that are |
|
|
55:09 | running Slurm, continuously monitoring the state of each and every one of the other compute |
|
|
55:16 | nodes, and then these commands are submitted to those nodes, |
|
|
55:22 | which provide you with all this information and with how to submit the jobs. We'll |
|
|
55:27 | get to that in a minute, in the next part. Okay, that was a |
|
|
55:36 | good, quick enough answer at this point, and feel free to come back |
|
|
55:41 | with more questions as Josh continues. So yes, just to give everyone an idea of |
|
|
55:50 | how many jobs there are currently: as soon as I enter the |
|
|
55:55 | command, you will see quite a long list of the jobs that are running |
|
|
56:01 | on the cluster right now. So obviously, when you run squeue, |
|
|
56:06 | you don't want to see all of the jobs that are running on the cluster; |
|
|
56:13 | you may want to sort them in some manner. So there are quite |
|
|
56:17 | a few flags that squeue has, a couple of them that you may want to use |
|
|
56:22 | often. The first one is the -p flag, which stands for the partition, in |
|
|
56:29 | case you want to see which jobs are running on a particular partition that you |
|
|
56:34 | saw using the sinfo command. So you can just provide the name |
|
|
56:39 | of that partition and see what jobs are running, so you can see the user name, |
|
|
56:45 | uh, what nodes are being used for that particular job, and how |
|
|
56:49 | long those, uh, jobs have been running. You can also see |
|
|
57:00 | jobs from a particular user as well. As of now, I don't have any |
|
|
57:04 | jobs, so I'll just take any of these user names and, uh, |
|
|
57:12 | provide that with the -u flag to squeue, and that will give you |
|
|
57:16 | all the jobs that are running for that particular user. This is useful when you |
|
|
57:21 | submit a bunch of jobs, like for your assignments, to basically track |
|
|
57:29 | the progress of all your jobs and see if any of those have failed or |
|
|
57:33 | not, or what state they are in: whether they have been allocated the resources, whether they |
|
|
57:38 | have completed or not. So this command will be very useful later |
|
|
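The two squeue filters from the demo can be sketched as a small helper that only builds the command line and does not contact the scheduler. The -p (partition) and -u (user) flags are the ones described above; the function name squeue_command and the partition and user names in the usage lines are placeholders for illustration.

```python
import shlex


def squeue_command(partition=None, user=None):
    """Build an squeue argument list with optional partition and user filters."""
    args = ["squeue"]
    if partition:
        args += ["-p", partition]   # only jobs in this partition
    if user:
        args += ["-u", user]        # only jobs owned by this user
    return args


if __name__ == "__main__":
    # Placeholder names; substitute a partition listed by sinfo and your own user name.
    print(shlex.join(squeue_command(partition="some-partition")))
    print(shlex.join(squeue_command(user="some-user")))
```

On a login node the resulting list could be handed to subprocess.run to execute the query; that step is left out so the sketch also runs on a machine without Slurm.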
57:45 | on. Uh, right. So when you have to run jobs, in |
|
|
57:53 | the most general terms, you have three ways to submit your jobs to the |
|
|
57:59 | compute nodes. So the first thing to notice is that I'm still on the login node. |
|
|
58:04 | I have a simple hello world program that just prints hello world to the screen. |
|
|
58:15 | Uh, I already have compiled it using GCC. I used the GCC compiler |
|
|
58:22 | that's available here. You can also use the Intel compiler. If you happen to |
|
|
58:27 | choose to use the Intel compiler, you just need to change gcc to icc |
|
|
58:32 | in the command, and you're done. All right. As of now, |
|
|
58:42 | I don't quite, uh. So, right. So, as I |
|
|
58:53 | said, there are three ways of running your jobs. Uh, the first is |
|
|
58:58 | using the srun command. So when you are on a login |
|
|
59:04 | node, what the srun command does is submit your job to the compute nodes based |
|
|
59:10 | on what parameters you pass to it. So in this case, what I |
|
|
59:14 | do is pass ntasks, which stands for the number of tasks, or the number of |
|
|
59:23 | instances of the job that you want to run, and give it the executable. |
|
|
59:31 | Now when I do that, as you can see, Slurm |
|
|
59:35 | has given it a job ID, and it's waiting for the resources |
|
|
59:39 | to be allocated. Now, as you can see, when you use |
|
|
59:43 | srun, you have to wait until the job has been allocated |
|
|
59:49 | the resources and the job has finished executing, which is not very useful if |
|
|
59:56 | you're trying to do a lot of debugging, or making sure your code is |
|
|
60:01 | working fine, or doing a bunch of testing and that sort of thing. |
|
|
60:10 | Hopefully it gets done quickly. And this is also a good example to |
|
|
60:23 | show you that the resources are shared, so you may not get access to the resources |
|
|
60:30 | that you want instantly. So start working on your stuff early. Don't wait |
|
|
60:36 | until the end, because it's a shared cluster. We're not controlling what's going on on |
|
|
60:42 | it. So it may take a while to get your resources. Uh, no, I |
|
|
60:49 | think I'll just skip it, because, uh, as soon as the |
|
|
60:56 | resources get allocated, you will see another message that your job has been allocated |
|
|
61:00 | the resources, and you will see the output of your program right here on the |
|
|
61:05 | console. So there, it's completed, it's done. Uh, so that was the first |
|
|
61:16 | way you can run your jobs. The second is to run your jobs |
|
|
61:21 | by getting interactive access to the nodes, which you can do by running the interact |
|
|
61:28 | command, which is what I'm trying to type now. I think |
|
|
61:37 | there was a screenshot of it on the slides as well, so you can |
|
|
61:40 | take a look at that later as well. But the most important parameters |
|
|
61:46 | that you would want to pass to the interact command are the number of nodes that you |
|
|
61:51 | want, which goes by the capital -N parameter. So if you |
|
|
61:57 | remember, a node is one full entity that has the two CPUs, the motherboard, and so |
|
|
62:04 | on in it. So that's the node count. Now, as we also saw on the |
|
|
62:10 | slides, each of these nodes has two CPUs, which have |
|
|
62:16 | 14 cores each. So the number of cores that you want to get access to |
|
|
62:23 | you will pass using the -n parameter. And for one node |
|
|
62:28 | you can at most give it 28, since there are 28 cores on each node, instead |
|
|
62:36 | of 26 years or four cores, uh, on Stampede. If you |
|
|
62:41 | try to do that there, because there is hyper-threading enabled on Stampede nodes, you can |
|
|
62:48 | give double the number of physical cores that are available. So that way you get |
|
|
62:54 | access to the cores. For now, I'll just get access to one node and |
|
|
63:04 | one core of it. It doesn't look busy, so as you saw, |
|
|
63:11 | as soon as I did that, it said it was waiting for resources, and then that it had been |
|
|
63:16 | allocated resources. But the important thing to notice is that now our console has changed |
|
|
63:21 | from the login node to a different host, which is one of the nodes in |
|
|
63:27 | the partition, the RM-small one that we saw in the listing from sinfo. So that's the |
|
|
63:36 | simplest way of checking whether you're on a login node or if you're |
|
|
63:40 | on a compute node. Now, one thing to remember is when you use the interact |
|
|
63:47 | command, let's say you gave the number of nodes as two and the number of |
|
|
63:56 | cores as eight, let's say. So in total, you will be getting |
|
|
64:01 | access to eight cores across the nodes, not 16 cores. It will result |
|
|
64:09 | in a total of eight cores. So if you happen to do |
|
|
64:14 | something like this, then you will only get access to a few of the cores. |
|
|
64:23 | Still, you will only get access to those cores, even if you get two |
|
|
64:26 | nodes. Remember that this is the total number of cores, even if you're getting access |
|
|
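The core-count rule just described (the -n value is the total across all requested nodes, not a per-node figure) can be checked with a few lines of arithmetic. This is only a sketch: check_request is a hypothetical helper, and the default of 28 cores per node is the figure quoted in the demo for this cluster's regular nodes.

```python
def check_request(nodes, total_cores, cores_per_node=28):
    """Return the number of cores a request yields, or raise if it cannot fit.

    total_cores is the value passed with -n and is counted across all nodes.
    """
    if nodes < 1 or total_cores < 1:
        raise ValueError("nodes and total_cores must both be at least 1")
    capacity = nodes * cores_per_node
    if total_cores > capacity:
        raise ValueError(
            f"{total_cores} cores requested, but {nodes} node(s) provide only {capacity}"
        )
    return total_cores


if __name__ == "__main__":
    # Two nodes with -n 8 gives 8 cores in total, not 2 * 8 = 16.
    print(check_request(nodes=2, total_cores=8))
```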
64:32 | to two nodes here. Once you're on a compute node, you can still |
|
|
64:39 | use srun as before, and you will get the |
|
|
64:49 | output right away. Now let's see what happens when I increase the number of tasks to more than the |
|
|
64:55 | number of cores that I asked for. You will get an error |
|
|
65:00 | from Slurm that you're requesting more resources than you have an allocation for. So |
|
|
65:06 | srun, or Slurm, will allow you to run jobs only on the resources that |
|
|
65:11 | you have been allocated. You cannot run jobs on more than that. Now, |
|
|
65:17 | interact is very useful when you're trying to debug your code and do |
|
|
65:23 | a bunch of testing. What happens when you are done with all the debugging |
|
|
65:29 | and you're comfortable with your code, that everything is working fine? Now you |
|
|
65:34 | have to get a bunch of performance numbers or any kind of measurements from your |
|
|
65:39 | code. So in that case, you use the third way of submitting your |
|
|
65:45 | jobs, using the sbatch command. So let me just exit out of the interactive |
|
|
65:53 | session. So again, as you see, I'm back on a login |
|
|
65:58 | node. So again, this way of submitting your jobs uses a batch |
|
|
66:05 | script. You will submit your script using the sbatch command. Now, these batch |
|
|
66:09 | scripts have all the parameters that are required for the execution |
|
|
66:16 | of your job, and the script will also have the command, |
|
|
66:21 | commands, or the sequence of commands that you want to run. Now, when you want |
|
|
66:29 | to submit this script, you can simply use the sbatch command and give it |
|
|
66:34 | the name of your batch script. That does not necessarily have to be batch.sh; |
|
|
66:38 | you can name it whatever you want. And just simply do |
|
|
66:44 | that. If you go and check using squeue, you will see that this is my |
|
|
66:51 | job. Its state is set as PD, which stands for pending. |
|
|
66:57 | I requested only one node, as you can see in the batch script. This |
|
|
67:02 | was the job name, the name for the job which we provided here, and |
|
|
67:06 | so on, and which partition we used. So let me check again: |
|
|
67:11 | it's still pending. Uh, but it's still running. But whenever |
|
|
67:22 | it is done, what you will see in the same directory that you submitted |
|
|
67:27 | your job from is another file here, which will be named something like slurm, |
|
|
67:35 | most likely with the job number or something. It will be named something like |
|
|
67:40 | that. So when you open that file, you will have the output of your |
|
|
67:44 | program. So what that means is, as I said, after interact, once |
|
|
67:48 | you're done debugging your code, you submit your jobs using sbatch. You don't |
|
|
67:52 | have to wait for your jobs to finish. You just submit it and go away and |
|
|
67:57 | do something else. And when you're back, hopefully your job will be done |
|
|
68:00 | and the output will be in one of the files. So, uh, I'll pause |
|
|
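A rough sketch of what a batch script like the one in the demo might contain, generated from Python so the parameters are easy to vary between runs. The #SBATCH options shown (job name, partition, nodes, ntasks, time) are standard sbatch options, but the actual script from the demo was not captured in the transcript, so the values, the make_batch_script helper, and the ./hello executable are assumptions for illustration.

```python
def make_batch_script(job_name, partition, nodes, ntasks, time_limit, executable):
    """Return the text of a small Slurm batch script as a string."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",    # name shown in the squeue listing
        f"#SBATCH --partition={partition}",  # a partition name as reported by sinfo
        f"#SBATCH --nodes={nodes}",          # number of nodes
        f"#SBATCH --ntasks={ntasks}",        # total number of tasks
        f"#SBATCH --time={time_limit}",      # wall-clock limit, HH:MM:SS
        "",
        f"srun {executable}",                # launch one copy of the program per task
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values; the partition name must exist on your cluster.
    print(make_batch_script("hello", "some-partition", 1, 4, "00:05:00", "./hello"))
```

The text would be saved to a file and submitted with sbatch followed by the file name. By default Slurm writes the job's output to a file named slurm-<jobid>.out in the submission directory, which matches the output file seen in the demo.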
68:07 | for a second and see if there's questions. Okay, Now, that |
|
|
68:15 | for me. Okay? A few is everyone wants last depression. |
|
|
68:36 | So, uh, here are the contents of the batch file. |
|
|
68:42 | Someone asked, can you show the contents of the batch file? So yes, and |
|
|
68:48 | also, I will post, along with the slides, some samples on Blackboard, |
|
|
68:55 | so you don't have to worry about copying everything; everything will be posted. |
|
|
69:08 | Okay, right. So the final command that I would want to show. |
|
|
69:16 | Oh, there we go. That job finished. So as you can see here, |
|
|
69:20 | this file here is the one that came out, and if you just open it, you |
|
|
69:26 | will have hello world four times, since we asked for four tasks and just one |
|
|
69:32 | job to run. We used the srun command, and that's the output. |
|
|
69:37 | Now let's say your job has been running for too long, and now you realize |
|
|
69:42 | you want to make some changes to your code and don't want that job to run |
|
|
69:45 | any more. So you can use the scancel command and just provide the job ID |
|
|
69:53 | that you get using the squeue command, provide that, and your job will be cancelled. |
|
|
70:03 | All right. So I believe those are most of the commands that everyone will be |
|
|
70:09 | using in pretty much all the assignments. Now, coming to the module command: |
|
|
70:16 | module is a package manager in some sense. So when you |
|
|
70:24 | have, let's say, a library that's installed on the cluster that you want to use, |
|
|
70:29 | you need the module command, like a package manager, to load it into |
|
|
70:36 | your environment. So first, to see what modules or packages are available on the |
|
|
70:42 | cluster, you simply use module avail. When you do that, you can see |
|
|
70:49 | all of the packages that are available on the cluster. Here you can see Python |
|
|
70:53 | packages. You can also see the compilers over here. Also extension for |
|
|
71:02 | , just raid and all different kinds of packages you can see here. |
|
|
71:09 | Now, if you want to check what modules your account already has loaded by |
|
|
71:17 | default, or has already loaded, you can use the command module |
|
|
71:25 | list, and that will show you all the currently loaded modules for your account. For some |
|
|
71:32 | of the assignments, you may need to load some new modules. Let's say |
|
|
71:38 | one that we use is VTune, which is a profiler from |
|
|
71:44 | Intel. If you want to load that module, you just go ahead and type |
|
|
71:48 | module load vtune. If you want to be more specific, |
|
|
71:52 | you can also give the version. But if you just give the name of the module, you get the |
|
|
72:00 | most updated version; you'll notice that it was 2019 06. So in the |
|
|
72:06 | same way you can also unload a module by removing it. Now |
|
|
72:15 | note that many times, if you happen to unload a module, uh, on which there are |
|
|
72:26 | dependencies, there are other modules, the dependent modules, that will also be |
|
|
72:31 | unloaded. So make sure you don't unload the modules that you actually want to use |
|
|
72:37 | and need. Because if you happen to unload modules that you want to |
|
|
72:43 | use and go ahead and try to run your code, they would most likely |
|
|
72:46 | not be found. Now, this is kind of a common mistake that happens: one goes ahead |
|
|
72:57 | and forgets to make sure that the right modules are loaded, so things don't work |
|
|
73:05 | properly. So I will try to provide the correct versions for most things, what is needed |
|
|
73:11 | to begin to run the codes for the assignments. But yeah, if you |
|
|
73:16 | see any weird errors while running, make sure you have the right modules loaded for |
|
|
73:21 | this work. That's pretty much it, I guess. Any questions? |
|
|
73:42 | Okay. So I'll stop sharing now. Yes. So, no more |
|
|
74:10 | questions. So I have some more slides that are not covered by the |
|
|
74:20 | demo. So these slides... There's one question in the chat. |
|
|
74:27 | Now it has disappeared. Do we have access to the cluster after creating the account on |
|
|
74:34 | XSEDE? No. So you have to create your account on XSEDE, and then |
|
|
74:39 | send your user name to the professor and me, and we will add you to the |
|
|
74:45 | class allocation on the clusters. You can't run jobs until you have been added to the |
|
|
74:51 | allocation for the class. So there is one more step after getting the account that |
|
|
75:05 | needs to be taken before you can start to run codes. All right, |
|
|
75:20 | so think of one's given the but out of time. So which was |
|
|
75:26 | fine, because most of this was covered. Josh already did a demo of it. And I don't |
|
|
75:34 | know if you wanted to say something about the scontrol command, but there is interact |
|
|
75:45 | and srun, you know, for when you're trying out new scripts. Once you've got |
|
|
75:51 | the debugging kind of done, use the sbatch command to submit things, |
|
|
75:57 | as Josh touched on. This is just if you want to |
|
|
76:01 | learn more about Slurm; it's not something you need for, um, the |
|
|
76:07 | assignments. Um, and here there is a slide that has a |
|
|
76:17 | number of useful URLs. Slurm is widely used on clusters and is open source software. |
|
|
76:23 | It's Slurm, um, that has been around for a while. But there are also a lot |
|
|
76:30 | of large data centers with very nice sites that provide kind of good, useful |
|
|
76:39 | information about Slurm. So there is a slide that gives you links, and, uh, one that |
|
|
76:49 | talks about the module command. And a few slides again that support what Josh |
|
|
76:55 | talked about already. And there's also one showing a little bit about what would be |
|
|
77:01 | useful for assignment one or two. remember, um, the best. |
|
|
77:08 | It shows how you get the processor and software information that Josh demoed, |
|
|
77:17 | so I think these are again things that were captured by what you talked about. So |
|
|
77:27 | that was it. Yeah. Maybe I want to make some comments about timers. As |
|
|
77:33 | I said, that's a tricky thing. Usually, for your assignments, the best thing |
|
|
77:45 | is to use a timer that counts clock cycles, sometimes called ticks. That's |
|
|
77:55 | the most precise measurement you can get, whereas time of day or wall clock time is |
|
|
78:03 | not quite as exact. I think there was one slide on this, so I'll see if it's here |
|
|
78:12 | or not. But one of the things that is here about timers is this: |
|
|
78:20 | clock ticks should be fine. If you do something else, you need to |
|
|
78:27 | be aware of the timer resolution of your timer. Many of |
|
|
78:40 | the examples you will do are sufficiently small that the execution time may be less |
|
|
78:50 | than the resolution of the clock. So all the time information that the timer you |
|
|
78:56 | use reports is effectively nonsense. So the resolution of the timer compared to the execution |
|
|
79:06 | time of the code is something you really need to have a good understanding of. |
|
|
79:13 | If you use clock ticks or a cycle count, then you don't need to worry |
|
|
79:19 | about it. But anything else is likely to give you a lot of |
|
|
79:26 | kind of misinformation. So, Josh, I don't know. |
|
|
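To make the point about timer resolution concrete, here is a small sketch that asks Python what resolution a clock claims to have and also measures the smallest step it can actually observe. It uses Python's time module rather than the C timers used in the course, purely for illustration; reported_resolution and observed_tick are names invented for this sketch.

```python
import time


def reported_resolution(clock="perf_counter"):
    """Resolution in seconds that the platform reports for a named clock."""
    return time.get_clock_info(clock).resolution


def observed_tick(clock=time.perf_counter, samples=1000):
    """Smallest nonzero difference seen between successive clock readings."""
    smallest = float("inf")
    for _ in range(samples):
        start = clock()
        now = clock()
        while now == start:          # spin until the clock value changes
            now = clock()
        smallest = min(smallest, now - start)
    return smallest


if __name__ == "__main__":
    print("reported:", reported_resolution())
    print("observed:", observed_tick())
```

If the code being timed runs for less than either of these values, the measured time mostly reflects the clock, not the code.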
79:30 | You may want to comment more on this, right? There are a few timing libraries for |
|
|
79:43 | C; I will likely provide some samples on Blackboard as well. But make sure you |
|
|
79:54 | use, as the professor said, a timer that has enough resolution. The other thing to |
|
|
80:03 | keep in mind is, as he was saying, that the wall clock time |
|
|
80:10 | may also include time when your code is not actually running. Wall clock timers, for you, |
|
|
80:18 | may also include the time that the process was waiting for resources, because you will be |
|
|
80:24 | running jobs in a shared environment. So you should use timers that report the |
|
|
80:31 | actual CPU time for your code. I have seen a couple of cases |
|
|
80:40 | like that, I believe last year, where some students' jobs had |
|
|
80:46 | really small execution times, so they didn't get the correct timing measurements. The simplest way |
|
|
80:54 | to get around that is to have your program run for quite a few iterations. |
|
|
81:01 | At least that gets the total time for multiple iterations of your job |
|
|
81:10 | within what the clock can resolve, and then try to calculate the average time per iteration of your |
|
|
81:18 | job. Right. So I guess that was a couple of slides down. |
|
|
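The repeat-and-average advice can be sketched as follows. The example times a deliberately tiny kernel many times and reports both wall clock time and CPU time per iteration. It is a Python illustration of the idea, not the course's C timing code; time_per_iteration and small_kernel are names made up for the sketch.

```python
import time


def time_per_iteration(kernel, iterations):
    """Run kernel repeatedly; return (wall seconds, CPU seconds) per iteration."""
    wall_start = time.perf_counter()   # wall clock: includes any waiting
    cpu_start = time.process_time()    # CPU time used by this process only
    for _ in range(iterations):
        kernel()
    cpu_total = time.process_time() - cpu_start
    wall_total = time.perf_counter() - wall_start
    return wall_total / iterations, cpu_total / iterations


def small_kernel():
    """A workload that is too short to time reliably with a single call."""
    return sum(i * i for i in range(1000))


if __name__ == "__main__":
    wall, cpu = time_per_iteration(small_kernel, iterations=10_000)
    print(f"wall per iteration: {wall:.3e} s, cpu per iteration: {cpu:.3e} s")
```

On a busy shared node the wall clock figure can be noticeably larger than the CPU figure, which is the effect described above.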
81:24 | Hopefully you'll see this slide; I will point this out throughout the course. |
|
|
81:29 | It will be stressed: you need to figure out how to get an estimate |
|
|
81:34 | of what the running time should be if you have a good code. So for |
|
|
81:40 | that you need two things. You need to understand what the workload is, |
|
|
81:46 | that is, how much work there is to execute, whether it is memory references |
|
|
81:51 | or logic operations or arithmetic operations, some notion of the workload. And then you |
|
|
81:59 | need to have a good understanding of the capabilities of the platform you're using. So |
|
|
82:06 | that's what one needs to have a reasonable expectation for how long a time it is going |
|
|
82:11 | to take. And that also helps avoid some of the pitfalls we talked about, |
|
|
82:16 | that, you know, you may have a matrix that is 100, well, a 1000 |
|
|
82:22 | by 1000 matrix, and you multiply two of them, and that may not be enough work |
|
|
82:31 | to be resolved by a clock if you don't use, again, something that counts cycles or |
|
|
82:38 | clock ticks. Current processors are pretty fast, so something like a 1000 by 1000 |
|
|
82:46 | problem is a very small problem. So this is what I'm trying to do on |
|
|
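A back-of-the-envelope version of the estimate described above, for the 1000 by 1000 matrix multiplication example. The workload count of roughly 2n^3 floating point operations is the usual figure for a dense matrix product; the sustained rate of 100 GFLOP/s is purely an assumed number for illustration and should be replaced with what your platform can realistically deliver.

```python
def matmul_flops(n):
    """Approximate floating point operations for an n-by-n dense matrix product."""
    return 2 * n ** 3   # about n^3 multiplications plus n^3 additions


def estimated_seconds(flops, flops_per_second):
    """Expected run time if the code sustained the given rate."""
    return flops / flops_per_second


ASSUMED_RATE = 1.0e11   # hypothetical 100 GFLOP/s; not a measured value

if __name__ == "__main__":
    work = matmul_flops(1000)
    print(f"{work:.1e} operations, about {estimated_seconds(work, ASSUMED_RATE):.3f} s")
```

Under that assumption the whole product takes a few hundredths of a second, which is why a coarse timer can report nonsense for a single run of a problem this small.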
82:52 | the slide too, again, to tell you what to expect when measuring your |
|
|
83:03 | typical time. But I will point out what to use again when it |
|
|
83:07 | comes to the assignments, when the time comes. But the rest of this deck |
|
|
83:13 | today has more. It's not required; it's basically advice on what you should do |
|
|
83:19 | to put together a report. Since you do timing, your work will probably produce quite |
|
|
83:24 | a few numbers, and you should set something up, have some script that |
|
|
83:32 | launches runs, collects the numbers, and organizes them. So try to make sure that you |
|
|
83:38 | can use tools to process the data. There are some tips in these slides that follow |
|
|
83:48 | the discussion of the clusters and how to use the clusters in this deck for |
|
|
83:57 | today. So, Josh has been exposed to it both in courses |
|
|
84:04 | and in his own projects: you are going to need tools to process the timings, just because |
|
|
84:09 | of the amount of data. So I don't know if you have suggestions. Well, as |
|
|
84:16 | I said, when you're first submitting jobs, to make sure that everything is |
|
|
84:22 | running correctly, just do some naive runs. Once you're confident, figure out how to |
|
|
84:30 | use scripts to automate data collection, because you don't want to sit for |
|
|
84:37 | multiple hours just collecting data manually. Have a script that gets all the data |
|
|
84:43 | that you need. I think when my disappeared. So the various unused yourself |
|
|
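One possible shape for the kind of collection script suggested above: loop over problem sizes, repeat each measurement, and write everything to a CSV file that a spreadsheet or plotting tool can read. The kernel here is a stand-in and all function names are invented for the sketch; in the assignments the timed step would be your own program.

```python
import csv
import io
import time


def example_kernel(n):
    """Stand-in workload; replace with the code being measured."""
    return sum(i * i for i in range(n))


def collect_timings(kernel, sizes, repeats=3):
    """Time kernel(n) several times for each size and return one row per run."""
    rows = []
    for n in sizes:
        for r in range(repeats):
            start = time.perf_counter()
            kernel(n)
            elapsed = time.perf_counter() - start
            rows.append({"size": n, "repeat": r, "seconds": elapsed})
    return rows


def write_csv(rows, stream):
    """Write the collected rows with a header line to any text stream."""
    writer = csv.DictWriter(stream, fieldnames=["size", "repeat", "seconds"])
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(collect_timings(example_kernel, [1_000, 10_000, 100_000]), buffer)
    print(buffer.getvalue())
```

Keeping every individual run in the file, not only the averages, makes it possible to spot outliers caused by other jobs sharing the machine.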
84:59 | as questions also, and some shrapnel that was your time is up. |
|
|
85:09 | closed the session himself. Shares That's what. Okay, now, |
|
|
85:35 | stuff. According so |
|