Coding 101 59 (Transcript)

Apr 5th 2015

Netcasts you love, from people you trust, this is TWIT! Bandwidth for Coding 101 is provided by CacheFly, at cachefly.com.

It’s time for TWIT’s annual audience survey and we want to hear from you. Please visit twit.tv/survey and let us know what you think. It only takes a few minutes and your anonymous feedback will help us make TWIT even better. We thank you so much for your continued support. TWIT.tv/survey.

This episode of Coding 101 is brought to you by Lynda.com. Invest in yourself for 2015. Lynda.com has thousands of courses to help you learn new tech, business and creative skills. For a free 10-day trial visit Lynda.com/c101. That’s Lynda.com/c101.

Father Robert Ballecer: Today it’s time to sort. Coding 101 is next. Welcome to Coding 101. It’s the TWIT show where we let you into the wonderful world of the code monkey. I’m Father Robert Ballecer, the digital Jesuit and joining me is our super special guest co-host mister Lou Maresca. A lead software developer from Microsoft, Lou, thanks for coming back on to be our guest host.

Lou Maresca: Thanks for having me again. I love that title, super special guest co-host.

Fr. Robert: We had to come up with something that actually made sense in the doc, but actually it does stick. You are a super special guest co-host.

Lou: Love it.

Fr. Robert: You may notice I’m not in my normal spot. Normally I’m over in radio corner. We’re going to start shaking things up a little bit in Coding 101. You may have noticed we’re starting to bring you a variety of topics, including things that cross over into hardware. That’s not the only format change you’re going to see. We’re going to bring you the same good content, but with more guests and more explody things. Lou, you’ve got an interesting story here that you queued up for us to start off with, and it kind of builds on some of the sentiment that came out of last week’s episode of Coding 101. There were people who were defending JavaScript, saying wait a minute, JavaScript gets a bad rap, and that got us thinking. There are a lot of different languages that get bad raps as either insecure or too hard or too easy or too easy to mess up. You actually took a look at which languages are good at what.

Lou: There have been a lot of interesting articles out there the last couple of weeks around how safe is this language, etc., and it’s kind of an interesting topic if you think about it. In any language you can really write a bunch of insecure code. Especially if you have access to low level functions in an operating system, you can be a menace. It doesn’t matter if you write it in Ruby or Perl or C Sharp or C++, you can pretty much be a menace in any type of environment. And this article specifically looks at Perl, PHP and Ruby.

Fr. Robert: Why is it though? Because we’ve all heard it. Everyone picks on JavaScript. And they all say the same thing, which is, the sooner that JavaScript dies, the better the internet will be. Or JavaScript is a natural language for people who want to program poorly. And as you pointed out, any language can be used to write bad code. So why is it that certain languages seem to be the focus of ire?

Lou: I think one reason is that there are a lot more people using things like JavaScript. If you think about it, Node.js is one of those big buzzwords that’s running around the web right now. People are using Node.js all over the place, and that’s pretty much coded in JavaScript. And if you do things like getting data from users and you’re not sanitizing that data before you put it into your database, you could be injecting data into your database that could open up vulnerabilities in your system. And with these kinds of things, when a lot of people are coding in JavaScript, they like to kind of put the burden on JavaScript, because that’s really what people are using nowadays.
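The point about unsanitized user data reaching the database can be sketched in a few lines. This is a minimal illustration in Python with the standard sqlite3 module rather than the Node.js stack Lou mentions; the table name, the add_user helper and the hostile input string are all assumptions made up for the example. The idea is to pass user input as a bound parameter so the database driver treats it as data and never as SQL text.

```python
import sqlite3

def add_user(conn, name):
    # Hypothetical helper: the ? placeholder binds name as a value,
    # so quotes or SQL keywords inside it are stored, not executed.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

conn = sqlite3.connect(":memory:")          # throwaway in-memory database
conn.execute("CREATE TABLE users (name TEXT)")

hostile = "Robert'); DROP TABLE users;--"   # made-up malicious input
add_user(conn, hostile)

# The table still exists and holds the input exactly as typed.
rows = conn.execute("SELECT name FROM users").fetchall()
```

Building the same statement by gluing strings together would hand the quote characters to the SQL parser, which is the kind of injection being described.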

Fr. Robert: Bad programmers are bad programmers no matter what language they’re using. But let’s talk specifically about Perl, PHP and Ruby. What’s the general consensus about how easy is it to make code that can be exploited?

Lou: So I think that with Perl, in this article the guy was saying that he feels that Perl is a very safe language. He feels like there’s memory management in the language and there’s not really much you can do to kind of cause issues. But again, he points out that you can pretty much, in any language, be a menace if you put bad data in your database or don’t sanitize data that’s coming in or that kind of thing. And then in a PHP sense, PHP doesn’t really allow you to do your own memory management, but it can also cause problems if you don’t sanitize your data. But in PHP’s case, there was actually a vulnerability just recently called GHOST. And it was a system level call that you were allowed to make from PHP that causes what they call a buffer overflow in your memory. And that can actually cause applications to crash, it can cause your machine to blue screen in some operating systems. So it’s actually a bad vulnerability. So again, that language, normally dubbed pretty safe as a language because it doesn’t have access to operating systems and low level functions, actually can cause a big problem.

Fr. Robert: Right. Ruby is interesting when you compare it to other languages. First, it’s relatively new when you compare it to Perl or PHP. But also, one of the things I’ve always liked about Ruby was that it actually does boundary checks before it accesses memory. Which, it’s one of those things where you wonder why doesn’t every language do that? Doesn’t it make sense that before you access, before you write, before you do anything with memory, you’d check to make sure that there’s nothing in there that’s active? And that’s what Ruby does.

Lou: Exactly. I think the key to Ruby, especially, is that some of these newer languages run in a virtual machine. For instance, C Sharp will run in what we call the CLR, which is the common language runtime. And Ruby has its own VM. Java runs on its own Java virtual machine. So I think these virtual machines allow you to not have to worry about memory management, and then when you do need some low level functioning, some of these languages allow you to do low level type things in the operating system. And what they do, and especially in Ruby’s case, like you said, is check boundaries and prevent buffer overflows. Which are a big security risk in things like browsers and that kind of thing. So I think it’s all dependent on whatever language you’re in. Some languages are riskier than others. Like if you’re in Assembly, I could probably crash your machine in 30 seconds. Less than that. And with C++, the same thing. Because you have low level access to things that can really cause craziness in your machine. But I love this quote in this article, it says “the seductive power in any new language is an assurance. It will provide better, faster, and more inherently insecure or secure solutions.” So it’s really seductive to say hey, I really want to try this new language because it could be faster and better. But you don’t really know if it’s secure or not. You don’t really know if they put that on the top of the stack as a priority. So just be careful with whatever language you’re using and just kind of understand how you’re using it.
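What a runtime bounds check buys you can be shown with a very small sketch. It is written in Python rather than Ruby (an assumption for illustration, and the two languages differ in detail), and the list contents are made up. The managed runtime refuses a write past the end of the list instead of letting it land in neighbouring memory, which is the failure a buffer overflow exploits.

```python
cards = [8, 7, 10, 4, 3]   # five slots, valid indexes 0 through 4

try:
    cards[5] = 99          # one position past the end of the list
    outcome = "wrote past the end"
except IndexError:
    # The runtime checked the index against the list length and refused.
    outcome = "rejected"
```

In a language without that check, the same write would simply go to whatever happens to sit after the array.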

Fr. Robert: Yeah, that’s absolutely true. The way that you get people to switch away from a language is to tell them it’s insecure, or wow, that’s just slow. The way it handles instructions is just ridiculous. It doesn’t communicate properly with low level hardware. And the way that you sell a language is to say oh, this is so much better because of X feature. And X feature was the one thing that was wrong with Y language. I’ve got to ask you, as a person who does this day in and day out, you are a programmer, you watch other people programming, you guide teams as they head into projects, what do you look for when you choose languages for different projects? Because you’ve mentioned several times on the show that it’s not like you stick to one language. You don’t write everything in C Sharp, you will use the best language for any given project. So how do you decide?

Lou: Yeah, for instance, I wanted to create an app that worked across platforms like iOS, Android, Windows Store, that kind of thing. So I chose to build the app in HTML 5 and JavaScript, but then in order to get myself into the stores I built what we call a native shim. Meaning I built a little kind of wrapper that’s built in something like Swift or Java, and it just hosted a web browser control, and then inside the control I would render this HTML 5 app so that it looked like a native app. And this way I could really kind of use the same app code across different platforms. And it was helpful, but then I found out that it actually causes performance problems, because sometimes HTML 5 rendering and browsers are slow and they have problems. So it’s all a learning experience, right? You choose it based off of your needs. So if you wanted to get to market really fast with an app, that’s a really good way of doing it, because you can get across all these different platforms really quickly by doing something like that. But then you run into other problems like performance problems and JavaScript issues and all these different types of things. So it’s all dependent on your need. If you’re just doing a desktop app and you want to build it quickly, sometimes it’s a little easier to use a managed language like C Sharp or Python. And it gets you there much faster. So I think it’s all dependent on the need at that point. But no matter what language you’re using, you still have to really understand how you’re using it and how you’re handling the data that’s coming through the system.

Fr. Robert: Fantastic. And thank you again for bringing us a perspective from someone who is actively involved in the field. When we come back we’re doing something a little bit different. We want to, every once in a while, along with our wild card episodes where we interview people in the industry, we want to do a couple of episodes that bring back basic knowledge. Now this often comes from the Google Plus group. It comes from suggestions that I get on Twitter. It comes from people who may visit and say you know what, I really wish you would cover XYZ. Things that maybe we don’t think about because we just don’t want to do that. So we’re going to be bringing Patrick Delahanty, a fan favorite of the show, back on to talk about sorting. But before we do that, let’s take a moment to thank the sponsor of this episode of Coding 101, and it’s Lynda. What is Lynda.com? Lynda.com is a one stop shop. A repository for knowledge. Both new knowledge and knowledge that you just need a refresher course on. Lynda.com is an easy and affordable way to help you learn. You can instantly stream thousands of courses created by experts on software, web development, graphic design, and more. Lynda.com works directly with industry experts and software companies to provide timely training, often the same day the new releases or the new versions hit the street. You’ll find new courses on Lynda. So you’re always up to speed. All courses are produced at the highest quality. Which means it’s not going to be like a YouTube video with shaky video or bad lighting or bad audio. They take all that away because they don’t want you to focus on the production, they want you to focus on the knowledge. They include tools like searchable transcripts, playlists and certificates of course completion, which you can publish to your LinkedIn profile. Which is great if you’re a professional in the field and you want your future employers to know what you’re doing.
Whether you’re a beginner or advanced, Lynda has courses for all experience levels, which means they’re going to be able to give you that reference that place to go back to when you get stumped by one of our assignments. You can learn while you’re on the go with the Lynda.com apps for iOS and Android and they’ve got classes for all experience levels. One low monthly price of $25 gives you unlimited access to over 100,000 video tutorials, plus premium plan members can download project files and practice along with the instructor. If you’ve got an annual plan, you can download the courses to watch offline. Making it the ultimate source of information. Whether you’re completely new to coding or you want to learn a new programming language, or just sharpen your development skills, Lynda.com is the perfect place to go. They’ve got you covered. They’ve got new programming courses right now including the Programming the Internet of Things with iOS, Building a Note taking app for iOS 8, and Building Android and iOS apps with Dreamweaver CC and Phone Gap. For any software you rely on, Lynda.com can help you stay current with all software updates and learn the ins and outs to be more efficient and productive. Right now we’ve got a special offer for all of you to access the courses free for 10 days. Visit Lynda.com/c101 to try Lynda.com free for 10 days. That’s Lynda.com/c101. Lynda.com/c101. And we thank Lynda for their support of Coding 101. Let’s get back to the magic, we welcome back to the show a now married Patrick Delahanty. Patrick, that’s some bling. So first, congratulations. You are here to bring us some foundational knowledge. And we got this idea from the chatroom. From Steve Gibson who said, let’s learn some of the concepts that go underneath the high level languages that we’re learning. And one of these basic concepts is this idea of sorting. For the people at home scribbling down notes, what is sorting and why is it important?

Patrick Delahanty: Sorting is when you have a list of items and you want to put it into an order. So for example a list of names and you want them in alphabetical order or a list of numbers and you want them in numerical order. And so this is how you would get that list, it’s all jumbled, and then put it in the order you want.

Fr. Robert: And this is something that we covered in a very early module in C Sharp. And we were showing how you could take memory cells that had different numbers, integer values, and you could use a very simple algorithm to repeat over and over again until everything was sorted out. And if you look through the history of computer science, there have been many different theories as to the most efficient way to sort a random list of numbers. In fact, let me ask Lou about that. Lou, I know there’s a lot of automated features. Every language has some sort of sorting feature, but some of the most hardcore programmers that I know take it upon themselves to write more efficient ways to sort any sort of data. Are you one of those?

Lou: Yeah, it really all depends on the type of data you’re using too. Like I know a lot of people who maybe have some numerical data, or they want some pixel generation or sorting of pixels or something like that. They actually might write it in assembly because it is a lot faster to do it like that, so you don’t have these big memory managers that are in C Sharp or Java kind of getting in the way of doing things quickly. So I think it depends on the type of data that you’re trying to sort. If it’s large objects, like an object representing a library book, then it might be okay to say let’s build a function in Ruby or C Sharp or something like that. You use some of the internal sorting algorithms that are already there to be able to sort it.

Fr. Robert: And to be clear, we’re talking about any sort of data. It doesn’t have to be numbers, it can be anything you need to sort, and you just have to figure out how you want to sort it. Patrick, I assume we’re going to start with some sort of small demonstration.

Patrick: One sorting method that everyone learns in computer science is the bubble sort. And you can do this in any language, so we’re not focusing on a specific language, we’re just talking about the concept. And a bubble sort, in its simplest fashion, compares two numbers. Let’s say this is an array, and we’ve got values 2, 1, 4, 3. It compares the first two, 2 or 1, which is more. In this case 2 is greater, so we reverse these. So then our list becomes 1, 2, 4, 3. We move on to the next, compare 2 and 4, that stays the same, move on to the last position, 4 and 3, we reverse those, so now we’ve got 1, 2, 3, 4. And 4 is locked in because it’s in the last position. It’s locked in as the highest number. And then we have to go back through and do it for the first three and do it again for the first two. And so you have to do multiple iterations of this. So I can do a demo here. I’ve got playing cards. We’ve got a random order here. 8, 7, 10, 4, 3. So the first two, flip those because 7 is less than 8. 8 and 10, that’s correct. Switch these around, 4 and 10, then we’ve got 3 and 10, so 10 is now locked in as the highest number. We start over. 7 and 8, that’s correct. 8 and 4, we move 4 down. 8 and 3, and that’s locked in, then we do the same thing. And now we’ve got just 4 and 3, so there. That’s how a bubble sort would work.
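The card demo translates fairly directly into code. What follows is a minimal sketch in Python, not the C Sharp version from module 1; the function name bubble_sort and the choice to return a new list are assumptions made for the example. Each pass compares neighbours, swaps them if they are out of order, and leaves the largest remaining value locked in at the end.

```python
def bubble_sort(values):
    items = list(values)                      # work on a copy
    # After each pass the value at position `end` is locked in,
    # so the next pass can stop one position earlier.
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:       # neighbours out of order
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

sorted_cards = bubble_sort([8, 7, 10, 4, 3])  # the cards from the demo
```

For a list of five cards the outer loop runs four times, one pass per element minus one.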

Fr. Robert: Of course, for something like that, you need to do as many passes as you have elements in that array, minus one. That doesn’t sound all that efficient really.

Patrick: It’s not your most efficient method, that’s for sure. But it is a very basic method that everybody learns and it’s easy to understand.

Fr. Robert: Right. And actually, if you want to see a code example of how the bubble sort works, go back to module 1 of Coding 101, we’re using C Sharp and we actually give you a way to program that algorithm. Again, not efficient, but very easy to understand because you’re just doing the same math over and over again and the math is basically greater than or less than over again. If the numbers need to be switched, it switches it. If not, it moves on to the next cell. But I would gather that there’s probably a better way to do it in computer science, otherwise you wouldn’t be here right?

Patrick: There are plenty of other ways to do that. Bubble sort is probably not the worst sorting method, but it’s probably not the best. Another method we’ve got is called insertion sort. This also isn’t the best, but it’s fairly basic and easy to explain here on the show. With this one we pull out a value and then we insert it where it belongs. So we have this list, 2, 1, 4, and 3. We can’t compare the first to anything, so we go to the second position, and we compare that. In this case 1 is lower than 2, so we move the 2 from the first to the second position and put the 1 back in the first position. And then we’ve got 1, 2, 4, and 3. We move to the 3rd position and we compare that to the 2. And that’s correct. So we move on to this next one. We’ve got 1, 2, 4, and 3. We pull the 3 out and compare it to the 4. The 4 moves over, we compare it to the 2, 2 is less than 3, so we insert the 3 where it belongs and we have it in the correct order.
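The pull-out-and-insert steps can be sketched as follows. This is an illustrative Python version, with the function name insertion_sort and the copy-then-return design chosen for the example rather than taken from the show. Each value is lifted out, larger values to its left slide one slot to the right, and the value drops into the gap that opens up.

```python
def insertion_sort(values):
    items = list(values)                  # work on a copy
    for i in range(1, len(items)):        # the first item has nothing to compare to
        current = items[i]                # pull the value out
        j = i - 1
        # Slide larger values one slot to the right.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current            # insert it where it belongs
    return items

result = insertion_sort([2, 1, 4, 3])     # the list from the walkthrough
```

On the list 2, 1, 4, 3 the only values that move are the 1 (past the 2) and the 3 (past the 4), matching the steps described.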

Fr. Robert: Right. And then there’s also a combination of these two. I remember they made me learn this in college. The shell sort which is kind of bubble meets insertion. Did you ever have to do that Lou?

Lou: A shell sort? Oh yeah. The shell sort and one of the biggest ones I think is being able to do- I think the insertion sort and heap sorts are the ones that we use the most- but we have done shell sorts in the past.
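For reference, one common formulation of shell sort runs an insertion sort over elements that sit a fixed gap apart, then shrinks the gap until it reaches 1. The sketch below is in Python and uses the simple halve-the-gap sequence; that sequence, like the function name shell_sort, is an assumption for illustration and other gap sequences are in use.

```python
def shell_sort(values):
    items = list(values)                  # work on a copy
    gap = len(items) // 2
    while gap > 0:
        # Insertion sort, but comparing items that are `gap` apart.
        for i in range(gap, len(items)):
            current = items[i]
            j = i
            while j >= gap and items[j - gap] > current:
                items[j] = items[j - gap]
                j -= gap
            items[j] = current
        gap //= 2                         # the final pass has gap == 1
    return items

result = shell_sort([8, 7, 10, 4, 3])
```

The early wide-gap passes move values long distances cheaply, so the final gap-1 pass, which is a plain insertion sort, has little left to do.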

Fr. Robert: How do you choose which sort that you’re going to use? Patrick, let me throw this over to you. Because this looks fine when we’re dealing with very finite data sets. So you’ve got a couple of cards on the table, we can understand what a bubble sort looks like. We can understand what an insertion sort looks like. But when we start dealing with huge databases, which is where this comes into play, this is why these algorithms are powerful, because we can sort huge data sets. I’ve got to figure that programmers need a leg up on which one they’re going to program.

Patrick: Yeah. Well, to be honest, in most cases I use the sorting functions that are built into the language. Because it’s the easiest. But if I had to do something that was such a massive scale that time made a huge difference, I would definitely research the one that would work best for the type of data I need. I’m pretty sure it wouldn’t be either of these. But there are dozens and dozens of different sorting methods.

Fr. Robert: Right. And I think this is actually one of the reasons why we were a little reluctant to cover sorting. This was one of the very first topics that was suggested to us when we started Coding 101, and we kind of didn’t want people to start programming customized algorithms. We wanted them to use the sort functions that were included in whatever language they were programming in. And as Patrick mentioned, it’s often not a good use of your resources to redefine your sort. Especially since most high level languages have an efficient sort built in. Lou, has that been your experience?
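As an illustration of leaning on the language instead of writing your own, here is what the built-in route looks like in Python (chosen for the example; the show does not single out a language here, and the sample values are made up). The built-in sorted function returns a new sorted list and accepts reverse and key arguments.

```python
numbers = [8, 7, 10, 4, 3]

ascending = sorted(numbers)                  # new list; numbers is left untouched
descending = sorted(numbers, reverse=True)   # largest first

names = ["Patrick", "lou", "Robert"]
# key=str.lower compares names without regard to capitalisation.
alphabetical = sorted(names, key=str.lower)
```

There is also a list.sort() method that sorts in place when a copy is not needed.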

Lou: Yeah, like Patrick pointed out, it all depends on the amount of data. So most of these languages will have really efficient ways to sort data of all types. Whether it’s large objects or small numbers. But again, once you start to get really large data sets, then sometimes you have to kind of combine different sorting methods or use your own. So I think it all depends on the amount of data. And there’s actually a term for that. They call it big O notation. And we won’t go into it, but basically it’s a way to describe how efficient your sorting algorithms are based on how much data is in the set that you have.

Fr. Robert: Can we go into that? I am not an expert in the big O notation at all, I’ve heard of it and normally I score incredibly low. But how do you judge how efficient a particular algorithm is? Because isn’t that going to depend on how randomized the data is?

Lou: So yes. Some algorithms depend on the type of data, but it’s all kind of dependent on how much data you have too. So the big O notation takes into account what we call the variable N, and it’s the amount of data that’s in there. And then there’s different versions of that. Like O(1) means that I can find the data and sort it very, very quickly. So it’s a search and a sort algorithm that can be done very, very quickly on small sets of data. But O(n) means that I need to basically travel through every element in that set to be able to sort it all out. So for example, a shell sort or something like that is O(1), but bubble sort would be a little bit more, it can adapt in a way so that it’s more O(n). So there’s different ways to kind of define it based off the amount of data that you have. Again, it gets a little bit more into the math side of things.

Patrick: And there would be a best case and a worst case. Like in bubble sort, the best case would be O(n) but the worst case might be O(n²). If it were all totally reversed.
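The best-case and worst-case gap for bubble sort can be checked by counting comparisons. The sketch below is in Python and assumes the usual early-exit variant, which stops as soon as a full pass makes no swaps (that variant is what gives bubble sort a linear best case); the function name and the choice of 100 elements are illustrative.

```python
def bubble_sort_comparisons(values):
    """Return (sorted list, number of comparisons made)."""
    items = list(values)
    comparisons = 0
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:          # a clean pass means the list is sorted
            break
    return items, comparisons

n = 100
_, best = bubble_sort_comparisons(range(n))          # already in order
_, worst = bubble_sort_comparisons(range(n, 0, -1))  # totally reversed
# best is n - 1 = 99 comparisons; worst is n * (n - 1) / 2 = 4950.
```

Doubling n roughly doubles the best-case count but roughly quadruples the worst-case count, which is what the O(n) and O(n²) labels summarise.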

Fr. Robert: Right, exactly. And again, depending on what kind of data that you are anticipating, and yes, you can actually know that based on what kind of data sets you’re pulling into your program, that’s when you determine how long it’s going to take for you to do your sort. Patrick, I know you did a lot of programming for a company that we don’t want to bring into this because it brings in really bad memories, but I’ve got to think that this sort of sorting actually does come in handy when you’re doing parsing of say, employment records and such.

Patrick: Yeah. Any sort of record. Whether you’re doing employment records or skill sets or work history. Or if you have a list of conventions and you want them in order alphabetically by name or location or zip code, there’s always something, especially if you’re presenting it to the user that you want sorted.
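Sorting records by one field or another is usually a matter of telling the built-in sort which field to look at. A minimal Python sketch follows; the convention records, field names and values are invented for the example and do not come from any real data set.

```python
from operator import itemgetter

# Hypothetical records, made up for illustration.
conventions = [
    {"name": "Robot Expo", "city": "Boston", "zip": "02110"},
    {"name": "Anime Fest", "city": "Dallas", "zip": "75201"},
    {"name": "Code Camp", "city": "Boston", "zip": "02108"},
]

by_name = sorted(conventions, key=itemgetter("name"))
by_zip = sorted(conventions, key=itemgetter("zip"))
# A tuple key sorts by city first and breaks ties by name.
by_city_then_name = sorted(conventions, key=itemgetter("city", "name"))

names_in_order = [c["name"] for c in by_name]
```

Zip codes are kept as strings here so that a leading zero is not lost and the ordering stays character by character.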

Fr. Robert: Right. I do want to give our audience a couple of links. A couple of resource pages where they can go when they want to start turning this into actual code. Again, we’re not showing you any code because, as Patrick mentioned, every language has sorting algorithms. Every one has a built in function and every one can be used to program the algorithm that you want. This is a decent site. It’s called sorting-algorithm.com. And this will allow you to look at different problem set sizes. So how much data are you looking at, how randomized is it going to be, and then it’s going to let you choose some of the different sorting algorithms that are popular. Like insertion, selection, shell, merge, bubble, heap, quick, and then you can also define what kind of data you’re going to start off with. And it will do a sort. It’ll show you exactly how it’s going to work. For example, this is what a shell sort would look like starting from a randomized data set. Here is what a bubble sort would look like from a randomized data set. Again, between the two, bubble is taking a lot longer than a shell sort. We can compare that against the insertion sort which Patrick just played with, and we can see that works faster than a bubble sort but slower than a shell sort. So if you want to see these algorithms in action, this is a really good page to go to. Lou, I think you had another resource page, right?

Lou: Yes, we kind of glossed over the big O notation really quickly, but there’s a site called bigOcheatsheet.com and it actually talks about searching algorithms and sorting algorithms, and it breaks things down based off of the algorithm, the type of structure that the data is in, and then what they call the time complexity. So based off of how much data is in there, the best, average, and worst case scenario in that case. And again, it has some links to some Wikipedia pages around how to implement each type of algorithm. So it gives you a better understanding. It’s a little bit math heavy, but if you look down at the bottom there is also a complexity chart, and it gives you a better understanding of what those big O notations kind of mean. For instance, O(n) means that with 100 elements you run through pretty much every element in that sort. So this is kind of a good site to go to if you want to get a basic understanding of the efficiency of these algorithms.

Fr. Robert: Right. I remember from my college days one of the humbling experiences was where we had a course where we were designing sorting algorithms. You would optimize your algorithm for a type of data set that you were going to get. And your instructor would come by and give you a completely different data set. And you would see your algorithm choke. So this is one of those things that is so easy to visualize, so easy to conceptualize, but in practice, different algorithms will die at different points of the data set. Patrick, any last words on sorting?

Patrick: Researching this and reminding myself about how to do the sorting, I encountered the spaghetti sort and was most entertained by that. I don’t know if you’re familiar with that, but if you imagine yourself holding a bunch of spaghetti and then you just put your hand on top, you can tell which one is the largest. You pull that out, do it again, and you can tell which one is next largest. And they compared it to a parallel processing sort machine. So I thought that was pretty cool.
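The spaghetti sort can only be simulated on ordinary hardware, since the trick depends on the hand touching every rod at once. The Python sketch below is such a simulation, with the function name and sample lengths made up for the example. Because finding the tallest rod here takes a full scan each time, this sequential version behaves like a selection sort rather than like the parallel machine being described.

```python
def spaghetti_sort(lengths):
    rods = list(lengths)              # the bundle held upright on the table
    tallest_first = []
    while rods:
        tallest = max(rods)           # the lowered hand touches this rod first
        rods.remove(tallest)          # pull it out of the bundle
        tallest_first.append(tallest)
    return tallest_first

order = spaghetti_sort([8, 7, 10, 4, 3])   # tallest to shortest
```

With real spaghetti the max step is a single physical motion no matter how many rods there are, which is where the comparison to parallel processing comes from.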

Fr. Robert: Okay, so that’s an algorithm that would work well if you had a system that could do many tasks at the same time.

Patrick: Yeah. And it just says okay, which one is the tallest, boom, pull it out. Next!

Fr. Robert: Alright. Well, again, we’re going to be including the links to the resources. We wanted this to be a foundational episode so we’re not going to be giving you any code examples beyond the ones that you’re going to find inside those resource pages. But we want to do this every once in a while. And the way that you’re going to suggest topics is by following any of us on twitter or in Google + or our Know How group, and telling us what you want to see. This was sorting, what do you want to see next? Some people would really like us to go into some advanced mathematics. Some other people want us to back up quite a bit and talk about things like how do I determine whether or not I want to use an interpreted language here? These are the kind of questions that we want you to bring to Coding 101. Because when we’re in a module, it’s very difficult for us to alter our trajectory. But when we’re in this wild card arena between modules, it’s all about you. Gentlemen, I want to thank you for being here for this episode of Coding 101. Of course we’re going to find both of you on the TWiT TV network. But Patrick, if they want to find more about you, where do people go?

Patrick: More about me, twitter @PDelahanty, my other endeavors, I did backtothepredictions.com. Where every week day I’m posting a prediction from Back to the Future 2, and judging if they got it right or wrong. And boy did they get a lot wrong.

Fr. Robert: Folks, when we say that TWIT TV is populated by geeks, we’re not lying to you. We are a bunch of geeks. Also special thanks to our super special co-host, Lou Maresca. Lou, it’s great to have you back every week. This is the start of something fun and love talking to you every week because every once in a while you bring up a topic that I’ve totally forgotten. Could you tell the folks where they can find you and your work on the internet?

Lou: On Twitter, @LouMM, and about me, Lou MM as well. And all of my work during my day job is at crm.dynamics.com.

Fr. Robert: Fantastic. We know that this is a lot of information and we want to make it easy for you to follow along on the projects. So we’re going to make sure that in our show notes we’ll have links to the places you can buy the individual pieces as well as where you get to download the Arduino IDE and a few helpful hint sites so if you want to move ahead in the class, you can do that. But in order to get that information you can download the old modules of Coding 101, drop by our show notes page, at twit.tv/code. You can find our entire back catalogue of episodes which is important because it’ll let you download entire modules if you want to learn what we did in C Sharp or Perl or PHP, it’s all right there. It also gives you a place where you can use that little dropdown menu to get every episode of Coding 101 automatically downloaded into your device of choice. We make it easy because we love you. Also, we do this show live every week Mondays at 2:30 pacific time. You can join us at live.twit.tv. And as long as you’re watching live, jump into our chatroom at irc.twit.tv. You can follow me on twitter, @padresj. If you go there you can find out what I’m doing for all my shows. Coding 101, Know How, Padre’s Corner, Before You Buy, and This Week in Enterprise Tech. I make sure to list my episodes and topics there so if you want to see what I’m doing on TWIT that’s a great place to go. Thanks to all the people who make this show possible, to Lisa, to Leo. To my super special TD. Bryan Burnett, could you tell the folks at home where they can find you?

Bryan Burnett: You can find me and Padre doing Know How, Thursdays at 11. You can follow me on twitter @Cranky_Hippo. I also do BYB reviews. And now we’re doing that show together.

Fr. Robert: Until next time, I’m Robert Ballecer, end of line!

Coding 101 #59
Mar 16 2015 - Bubble Sorting with Patrick
How safe are Perl, PHP, and Ruby?
