© Distribution of this video is restricted by its owner
00:03 Hello, I am Professor Rakesh Verma. Today I'm going to give you an overview

00:11 of the ReDAS lab research. ReDAS stands for the Reasoning and Data

00:22 Analytics for Security laboratory, and a cool thing about the University of Houston is that we

00:31 are a Department of Homeland Security and National Security Agency certified Center of Academic Excellence in

00:40 cyber defense research as well as education, which is a rare honor granted to

00:46 about 50 or so universities and colleges in the US. Some of the recent

00:53 of ReDAS are listed here: several students, a master's student and a couple of

01:00 bachelor's students also. These are some of the current members of the ReDAS lab,

01:07 and all the members are listed here on this slide. The ReDAS lab has

01:17 built up over the years a lot of expertise in algorithms, design and

01:24 cybersecurity, natural language processing, and symbolic and logical reasoning. And I will describe

01:35 a few of our current projects as examples of the kind of work that we are

01:40 doing. The first one is on automatic deception detection. The second

01:47 is the natural language processing work. And I will also talk a little bit

01:52 about data quality and augmentation and an analytics for cybersecurity project that we're working on.

02:02 We started with phishing emails, and we did both text analysis as well as link analysis,

02:10 and so the link analysis was published in 2017 and the text analysis in

02:17 And then from there on, we started looking at phishing website detection: automatically detecting

02:24 a phishing website. And most recently we have worked on a broader class of

02:32 attacks such as fake news, spear phishing, misinformation, disinformation and so on.

02:42 And the latest work on automatic detection appeared in the ACM CODASPY conference

02:49 this year. In natural language processing, we started as consumers of basic natural

02:59 language processing operations such as part-of-speech tagging, named entity tagging and so on.

03:05 So our very first project was on authorship detection, when we looked at a

03:13 set of articles and books which were supposed to have been authored by Daniel

03:20 Defoe, an 18th century British author. And then a couple of professors from Oxford

03:25 removed several of his works, about of his works, from the list of works

03:30 attributed to Daniel Defoe, and thus began our project on whether those works were de-attributed

03:36 correctly or not, that is, whether they were authored by Daniel Defoe or not.

03:40 From authorship detection, we moved on to automatic summarization: given a set of documents or

03:46 a single document, construct a summary of the information, the useful information in

03:52 the document. We also looked at question answering, and we have systems for both summarization

03:58 and question answering. In question answering, you are given a set of documents and a

04:02 list of questions, and then you're supposed to answer the questions based on the information in

04:08 the documents. We started looking at our scores in the competitions organized by the National Institute

04:17 of Standards and Technology, or NIST as it's called, and

04:21 we found that we were lagging behind human experts by almost a point on the

04:27 question answering task. And so we started looking at why this was the case.

04:32 And we found out that the fundamental operations such as part-of-speech tagging

04:37 and named entity recognition, etcetera, were not working correctly. And that's when we moved to

04:43 the producer side, and we did some work on idiom detection and collocation detection. Idioms

04:50 and collocations are special phrases: when two or more words

04:56 come together the meaning changes, in things like "high school" for example.

05:01 And so we also received a best paper award for our work on

05:07 collocation detection at CICLing 2016. So that's a little bit about our

05:12 natural language processing workbench, and the motivation for the workbench is somewhat obvious because we

05:19 have huge amounts of unstructured data on the World Wide Web. Moving on to our work

05:25 on data collection, quality and augmentation: even though there is lots of data, machine

05:32 learning models require a lot of data for training. And also data augmentation is useful

05:40 for determining whether the machine learning models are robust or not. And so we have

05:46 been working on several projects in this area, which are organized around quality and augmentation. And

05:55 the last example project that I will mention is adapting data science for security, and you

06:01 may say, why do we need to adapt data science for security? And there are several

06:06 reasons for that. Some of them are listed on the right, and the main one

06:10 is of course the active attacker who is constantly trying to defeat the machine learning models.

06:16 And so we really need to adapt data science, not just apply it to

06:21 security. And if you're interested in exploring some of the work that I described

06:28 today, you can take a look at our book, Cybersecurity Analytics, co-authored with

06:34 Dr. David Marchette. The work that we are doing in the ReDAS lab is receiving

06:42 recognition. So I mentioned the best paper award for collocation detection research. Our work

06:48 also got an outstanding paper award from the American Educational Research Association, and I will mention a

06:56 couple of the awards that went to our students: the best PhD student award to

07:02 Baki and the Computing Research Association honorable mention to Bhutan Faridi, who did research with me

07:11 as an undergraduate at UH, and I was recognized with the mentoring award for undergraduate

07:20 research. And if you need more information or if you have questions, please feel free

07:27 to reach out to me at my office or via email. And I have

07:34 given you some links to take a look at and dig deeper into the work

07:39 that we are doing. And I will be happy to discuss more with you.

07:43 Thank you for
