© Distribution of this video is restricted by its owner
00:00 My name is [...], and I'm a professor of computer science. And this is

00:05 my mini talk to tell you about my research. So my research area is

00:12 formally known as natural language processing. Basically, what we do is design and

00:18 develop computer programs that take human language as input and perform some hopefully meaningful

00:27 task. And I think right now it is becoming increasingly commonplace for us to interact

00:35 with technology developed in our field. If you think about smart devices like your [...]

00:42 or your Google Home, these devices are operated by voice. The field of

00:49 natural language processing, together with advances in machine learning and high-performance computing,

00:57 has made all of this come together, making it possible that we now deploy natural language

01:05 processing systems or technology into everyday life. I want to start by discussing my

01:13 research group. Here I show some of my PhD students. I have a

01:18 new PhD student, but I still don't have a picture of her. So hopefully

01:23 I'll be able to add her here. And I have one master's student as

01:26 well, who are currently helping me make progress on the research agenda. So since

01:34 we have very little time, I'll give you a brief overview of all the types

01:38 of projects I've been working on. And hopefully, if you're interested and want to

01:44 learn more, you can shoot me an email and we can talk about it. A lot of

01:49 the research that I've done in the last 10 years or so, or even more,

01:54 has to do with how to automatically process user-generated data. What

02:02 do we mean by that? It is data generated on social media platforms.

02:07 We know users are pouring their lives into the social media sphere, on

02:14 YouTube, you name it. And so there's great value in being

02:22 able to automatically process this type of data. There's also a lot of potential for

02:27 good, relevant uses of this technology. Some of the challenges that we have when

02:36 we try to do that are that, because it's social media, it's

02:41 very informal communication in general, which means that we will see a lot of nonstandard language.

02:47 There's a large vocabulary, there's variation, there's, you know, this

02:52 type of data represents basically a large part of the population, and therefore it represents a

02:59 large number of topics. And so these make it difficult for our technology to be

03:03 able to work with this type of data. So I work with my group on

03:07 trying to develop technology that can process this type of data.

03:15 Also, I started working on multimedia-type projects. What I mean by multimedia is

03:21 that this is a project that does not only take text as input but also another

03:27 modality, for example audio or, in this case, video. And this is a project

03:33 that was recently funded by NSF, in collaboration with the research group of Dr. [...]. And what

03:42 we care about there is: can we help users make the decision on whether they

03:46 want to watch the content that is provided in a specific video, for

03:55 example, by automatically detecting the type of content that could be considered objectionable, the type of

04:02 content that, according to research or cultural norms, could be questionable. So we

04:10 have approached that, and where the research angle, or the difficulty, lies is in

04:17 how we combine the evidence that you get from the multiple sources. So we have the

04:23 text that is being provided from the [...] on the videos. We have the audio,

04:29 right, that is being provided, we have the images as well, and all of

04:32 this contributes to determining whether a specific video can be classified as objectionable or not.

04:38 So this is, again, a recent project we're working on. And on the other

04:43 hand, I also have research on user reviews. So again, we know

04:50 that before we proceed with an online purchase, or book a hotel, or

05:00 go to a specific restaurant, some of us read the product reviews; they have

05:04 a lot of useful information. And for the longest time we have been working

05:09 in the field on sentiment classification, right? So that we can understand whether it's

05:14 really positive or negative, and you have the star classification. But going into more

05:19 detail with that type of data, it's helpful if we can also distinguish what

05:26 the users are talking about, because for some people, some aspects of a particular place may

05:33 be more relevant. In the space of restaurants, for example, some people may care

05:39 more about the price, some others about the food, and some others may be more

05:44 interested in the type of service they provide. Is it fast

05:48 service? Is it high-quality service? And so on. But to understand which review is talking

05:53 about these things is a lot of work. And so we have this

05:59 area of research where, given an online review like the one you see here,

06:06 we want to be able to extract aspect terms, in this case the value, dumplings,

06:12 sushi, and service that are mentioned here, as well as associate an aspect category with each of

06:19 these terms: for example, value to the price of the food, dumplings to

06:22 the food, sushi to the food, service to service. We also want to

06:26 attach a polarity to each of these categories. Right? Because a single review

06:33 is not necessarily always entirely positive or all negative. Most

06:40 likely there are things that the person liked and there are

06:44 things that the person didn't like. So we want to disentangle and be able to

06:48 process all this. So this is an interesting project in my group.
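
The task described here (aspect terms, an aspect category for each term, and a polarity) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the group's actual method: the names `AspectOpinion` and `extract_aspects`, the small lexicons, and the sample review are all hypothetical, since the review shown on the slide is not part of the transcript. A real system would learn these mappings from annotated data rather than look them up in hand-written word lists.

```python
from dataclasses import dataclass

# Hypothetical lexicons, invented for illustration only.
ASPECT_CATEGORIES = {
    "value": "price",
    "dumplings": "food",
    "sushi": "food",
    "service": "service",
}
POSITIVE_WORDS = {"great", "fresh", "good"}
NEGATIVE_WORDS = {"slow", "bland", "poor"}


@dataclass(frozen=True)
class AspectOpinion:
    term: str       # aspect term as it appears in the review
    category: str   # aspect category associated with the term
    polarity: str   # "positive", "negative", or "neutral"


def extract_aspects(review, window=2):
    """Toy extractor: find known aspect terms, then read polarity from nearby words."""
    tokens = [t.strip(".,!?;:").lower() for t in review.split()]
    results = []
    for i, token in enumerate(tokens):
        if token not in ASPECT_CATEGORIES:
            continue
        # Look at a few tokens on each side of the aspect term.
        context = tokens[max(0, i - window): i + window + 1]
        if any(word in POSITIVE_WORDS for word in context):
            polarity = "positive"
        elif any(word in NEGATIVE_WORDS for word in context):
            polarity = "negative"
        else:
            polarity = "neutral"
        results.append(AspectOpinion(token, ASPECT_CATEGORIES[token], polarity))
    return results


# Made-up review with mixed sentiment, standing in for the slide example.
review = "Great value and fresh sushi, but the dumplings were bland and the service was slow."
for opinion in extract_aspects(review):
    print(opinion)
```

The point of the sketch is the shape of the output: one review yields several (term, category, polarity) triples, some positive and some negative, which is why a single star rating is not enough.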

06:56 I just want to mention a few comments about the field.

07:02 NLP and machine learning are great fields to be working in right now. The

07:06 boom that we're seeing, thanks to the great progress of deep learning approaches, has

07:12 created a lot of enthusiasm and a lot of need for people that have expertise in these

07:17 types of fields. So if you're working on this, you're studying this, you will

07:22 have no lack of job opportunities. Your job prospects are very good.

07:31 And obviously going after the bigger model is very exciting and interesting, and we keep seeing

07:38 that big tech companies are going after bigger, bigger, bigger models, but the bigger

07:43 models are not necessarily the silver bullet for all our NLP problems. So there are

07:49 a lot of research angles that are important that we can work on

07:54 that are not necessarily trying to go after the bigger model. There are many cases

07:58 where these models are not solving the problem to satisfaction, and where we

08:03 cannot address this with a big model. We have, for example, under-resourced

08:08 languages, right? Where a multilingual model, as big as it may be, if

08:14 the model has not seen this language, the model will not perform well.

08:18 So we still have work to do. In specialized domains, the type of task that

08:23 we care about in the real world, you are most likely not going to be

08:29 able to use your big deep learning model off the shelf. So there's a

08:33 lot of work on how we tune or how we prompt these models

08:40 to work. And recently our community has become increasingly aware of the dangers of

08:46 this technology. So we care about better understanding biases, biases in our

08:51 data, biases in our methods, as well as what the ethical implications are of

08:56 the research. And lastly, there's a lot of need for explainability because these deep

09:02 learning models tend to be very complex. Understanding them tends to be equally complex.

09:09 And so there's a lot of need for explainability. How can we understand why

09:15 the model is coming up with this decision or this specific prediction?

09:22 In general, if you care about any one of these or a combination of the

09:27 topics I mentioned, here is what I think are key elements for success. If you're

09:33 working on an under-researched language, domain, or task, you know, there's no way around it other

09:37 than creating resources, resources that will help you motivate the research community, but also

09:43 resources that will help you develop your own research. There's a lot of need

09:50 for creative people, so that we can find the interesting angle for a

09:58 research problem but also creatively come up with a solution that successfully and efficiently solves the

10:05 problem. There's a need for [...] as well, because the field is moving at

10:10 an increasingly fast pace and it needs people that can stay on top of

10:18 everything, on top of how the field is changing, and a lot of self-[...].

10:23 Because doing research takes effort; doing research takes a lot of

10:29 reading, right, on what is happening, reading papers. And so you

10:36 will find you need to be so if you want to succeed in this

10:40 field. I always have some type of opportunity for talented undergraduate students and master's

10:47 students. Not always, unfortunately, but I always try to get funds. And

10:52 so if you're interested in joining my group or learning what we do,

10:57 just shoot me an email; I'll be happy to talk to you.

11:00 Thank you for watching the video. If you have more questions, you can visit

11:04 the group website or contact me.
