Plagiarism can be defined as “the practice of taking someone else’s work or ideas and passing them off as one’s own”. It is a serious matter. In this school, students can be removed from courses if they are found plagiarizing work. Post-secondary institutions such as universities, colleges, and trade schools take plagiarism so seriously that you can be expelled (removed) from them, with no refund of your tuition. According to a recent New York Times article, more than half of the violations of the academic code at Brown University involved cheating in computer science classes. Similarly, at Stanford, 20% of the students in one computer science class were flagged for cheating. If you go on to work in the software development industry and are found to be plagiarizing code, you can be fired and even sued (for example, the seven-plus year lawsuit between Google and Oracle over Google’s use of Java code to make Java applications run on Android). So it is important that we have a discussion about this topic!
But what is plagiarism in Computer Science?
When dealing with coding, what constitutes plagiarism can be a debated topic. Here are some things to first consider:
- Is using outside code plagiarism? (e.g. code from the internet, from a teacher, another student, etc.)
- What about copy and paste?
- Do I cite my sources? (like an essay, do I make citations, and if so how?)
I will address these questions in more detail later in this article. From my perspective, here are some clues that make me suspect plagiarism has occurred:
- Multiple students turn in the same file (e.g. a PDF) and just change the filename. This is easy to see by examining the metadata stored in the file’s header information. When I see that the files were made by the same user, at the exact same date and time (down to the second), and have the same number of bytes of storage – PLAGIARISM
- Two PDF files of code have similar looking lines of code, including the same logic, same variable names, same indenting, same number of spaces, and other clues within the code. I have a piece of software I wrote myself (although there are many commercial software packages and websites like “TurnItIn.com”) that does a text parse analysis of code to flag these consistencies, but after 20+ years of marking Computer Science, I often just “see” it when marking – PLAGIARISM
- Two code files are similar (like above), but the comments, the spacing, etc. have been changed just enough to make them look different. In this case, I would look holistically and decide – UNDETERMINED
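To make the clues above more concrete, here is a minimal sketch of how such checks could work. It is not the software I use for marking. It assumes the submissions are Python source files, and the function names (`fingerprint`, `normalize`, `similarity`) are made up for this illustration. The first function detects byte-for-byte identical files regardless of filename; the other two compare code after comments and spacing have been thrown away.

```python
import difflib
import hashlib
import io
import tokenize


def fingerprint(data: bytes) -> str:
    """Hash a file's raw bytes; a renamed copy has the same fingerprint."""
    return hashlib.sha256(data).hexdigest()


def normalize(source: str) -> list[str]:
    """Return the code's tokens, ignoring comments, blank lines and indentation.

    Assumption: the source is valid Python (tokenize raises an error otherwise).
    """
    skip = {
        tokenize.COMMENT,
        tokenize.NL,
        tokenize.NEWLINE,
        tokenize.INDENT,
        tokenize.DEDENT,
        tokenize.ENDMARKER,
    }
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skip]


def similarity(source_a: str, source_b: str) -> float:
    """Score from 0.0 to 1.0 of how alike two programs are after normalizing."""
    matcher = difflib.SequenceMatcher(None, normalize(source_a), normalize(source_b))
    return matcher.ratio()


original = '''
# add up the marks
def total(marks):
    result = 0
    for m in marks:
        result += m
    return result
'''

# Same code with different comments and spacing.
disguised = '''
def total(marks):
        # my own solution
        result = 0

        for m in marks:
                result += m   # sum
        return result
'''

# A genuinely different solution to the same problem.
different = '''
def total(marks):
    return sum(marks)
'''

print(similarity(original, disguised))  # 1.0, the disguise changes nothing
print(similarity(original, different))  # noticeably lower
```

A high score is only a flag for a closer look, not a verdict. Short programs that solve the same small problem can look alike for honest reasons, which is why the final decision is still made holistically.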
So what would make me decide in that last case if it was or wasn’t plagiarism? The decision is certainly situational, so let me describe some hypothetical situations that are in the “grey” area like the third situation from above:
- Student A is having trouble solving a problem, so has a discussion with student B. He asks student B how he solved it, he even looks at his code, he asks questions about the solution. He then goes to his own workstation and writes his own solution, using some of the techniques he saw in student B’s work, but uses his own code style, variable names, etc. NOT PLAGIARISM (this is still student A’s work, using student B as a learning tool)
- Student A is having trouble solving a problem, so has a discussion with student B. He asks student B how he solved it, he even looks at his code, he asks questions about the solution. He then asks student B for a copy of his code on a USB, takes it to his station, copies and pastes it and then changes the comments and a few other details. PLAGIARISM (this is not student A’s work, it is student B’s work; changing those features in the code is not learning)
- Student A is solving a problem and gets stuck. They go into the example code learned in class as shown by the teacher and use “snippets” of that code to get to a solution. NOT PLAGIARISM (as instructors, we expect students to use the code we lay out in examples ~ although you should ask every instructor to confirm this ~ and it can be used as a learning tool to demonstrate what you learned)
- Student A is working on a final project and gets stuck. They go online and find some code that helps with parts of the problem solving. UNDETERMINED (it depends on how they present that code. If it is presented as the entire solution to the project, and they expect it to count for marks as their code, then yes, it is plagiarism because they are representing it as their own solution. If the code is used with a comment showing where it was found, and it is not considered in the overall marking, then it would not be plagiarism)
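As an illustration of that last scenario, here is a minimal sketch of what “a comment showing where it was found” could look like. The comment format, the placeholder link and the function names are assumptions made up for this example, not a required standard, so check what each instructor expects.

```python
# ---- BORROWED CODE: start ----
# Source: an online tutorial on the Euclidean algorithm
#         (hypothetical placeholder link: https://example.com/gcd-tutorial)
# This function is NOT my own work and should not be marked as mine.
def gcd(a: int, b: int) -> int:
    while b:
        a, b = b, a % b
    return a
# ---- BORROWED CODE: end ----


def simplify(numerator: int, denominator: int) -> tuple[int, int]:
    """My own code: reduce a fraction using the borrowed gcd helper."""
    divisor = gcd(numerator, denominator)
    return numerator // divisor, denominator // divisor


print(simplify(6, 8))  # (3, 4)
```

The markers make it obvious which lines were found elsewhere and which lines are the student’s own, so nothing is being represented as original work when it is not.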
I could continue to add scenarios, but I hope you are starting to get the point. A US Supreme Court justice, writing about how hard it is to define “obscene” material, famously said “I know it when I see it”. For you, it should be simple: if you are coding a solution yourself with your own style, variable names, spacing, logic, etc., then generally it is NOT plagiarism. You CAN reference my code, use the internet, discuss solutions and even look at code from other students, BUT when you take that work and represent it as your own, you are plagiarizing. You may all be wondering if I found some examples of plagiarism in the work I have been marking this semester and if that is what sparked this discussion. Well, that is a good guess, and it is also why I have written this long announcement as a warning to everyone (my online students and my in-class students), so that hopefully I don’t have to address this again!
- Mr. Wachs