Thursday, January 14, 2010

Another Instance of Plagiarism

This morning, quite a few of the people I follow on Twitter and my blogroll discovered that some of their blog posts had been reposted without permission on another site.  The person that had set up this blog has visited numerous respected bloggers' feeds, as well as Microsoft's TechNet content, and reposted it on his site.  This person didn't even bother to remove things inside the content that clearly indicated where it came from - making his transgression insanely easy to catch.  It seems that absolutely nothing on his site was created by himself, and it's simply an "aggregation" of other people's work.
The general consensus within five minutes of discovering this "rip-off artist" was to:
  • Flood his site with comments calling him a thief.
  • Send him "DMCA" style takedown nastygrams.
  • Post information on your own site with his name on it, branding him a plagiarist, so his future employers' searches would find out.
  • Get lawyers on his case, to shut down his site permanently.
Stop, or I'll Shoot!
Whoa.  Scary sh*t.  Anything more than the first option up there probably has him filling his pants with bio-degradable material.  I know I would.
But I'd ask you all to take a step back for a minute here and get some perspective.  Take a deep breath.  I know I'm talking about sensitive stuff - this is material that a lot of people I respect very much have put a lot of effort into.
Examine the Evidence
First, by looking directly at the posts, it's obvious that this person is either completely incompetent at plagiarism, or has no concern for having the content thought of as something he made.  It's obvious because he hasn't removed glaring attributions inside the posts, or links to the original authors' other posts.  If making people think these posts were generated by him was the intent, he fails 99%.  In fact, he fails so badly at it that it raises the very real probability that there was no intent to do that.  (Sadly, I didn't take a screen picture showing examples of the obviousness of the scraping.)
Second, look at his bio page (which is probably taken down or changed by now - so here's a picture - I did have that page left open):
It seems quite clear to me that he's done exactly what he's said - he's scraped material that he finds very useful from other sites and reposted it here.  And he's very thankful for the existence of that information.
Inside the Criminal Mind
So, from a (hopefully) objective perspective, here's my analysis of the situation.
  • This guy loves what he's read online elsewhere - it helped him.
  • He wants other people to see it, and get that same help.
  • He's very uneducated in copyright law.  (He has his own copyright notice in the site footer!)
  • He is unaware of how it makes authors feel when their content is scraped and reposted without consultation.
Instead of unleashing ungodly fury on this kid for doing something he clearly should not be doing, I think he should be gently informed of the right way to go about satisfying what he wants to accomplish, lest he expose himself to sphincter-relaxing outcomes.  The "right way" is to attribute properly, with the permission of the original author.  That way, he gets what he wants - a collection of work he likes, and a way to share with the world his collection.  And the original authors get what they want - exposure.
The Victims' Reaction
Here's why I think that the "victims" went about this in completely the wrong way - and maybe I read too much TechDirt.
Ignorance Of the Law Is No Defense!
First and foremost, the people who had their work reposted are "educators" within the SQL community.  And yet their first reaction wasn't to educate this person, it was to draw and quarter them.  Shameful.  You're acting by assuming something that's done incorrectly by this guy has been intentionally crafted that way despite full knowledge of the correct way.  That's like seeing somebody's database with AutoShrink turned on and assuming they're doing it solely to piss off their users and SAN administrator.  You guys would never react that way in that case - so why do you in this case?
He's a Thief!
Secondly, you went to the extreme by accusing the guy of "stealing".  This is a highly controversial and sensitive subject, and I'm definitely influenced by TechDirt on this one - but the logic is irrefutable.  You can't "steal" content, period.  There is nothing physical to "take away".  Just because someone else "has it" doesn't mean you no longer "have yours".  Mind you, I do look on this argument as splitting hairs - because it sure does feel like stealing, doesn't it?  But think about it.  It's not stealing - it's the possibility that he's receiving credit for your work that's "wrong".  In that respect, I can understand characterizing it as "stealing" - but not with attribution, whether expressly given or not.
Attribution Without Permission Is Still Stealing!
Third, would you feel any different if the content he'd "stolen" was attributed properly?  From the comments on Twitter this morning, most of you thought absolutely not.  And that's where I think you're absolutely wrong and short-sighted.  And this is the point I'd really like you to think about and take away from this post.  Think about the gains and losses of having someone re-post your content WITH attribution, but WITHOUT permission.  Stop reading now for a second, and please do that.  If you've been honest with yourself, I think you'll find there aren't any losses, and there are only gains. 
The "Loss Myths"
Did you "lose readers"?  No - because people reading your stuff on his site didn't find yours, so they weren't your readers to begin with.
Ah - but he's stolen search engine hits from you!  Maybe - but probably not.  If your blog/site is really as good as you (and I) think it is, then your site almost certainly ranks higher than his, so you should be getting those hits.  He's "earning" his own traffic through other means.  (And this guy, I'm sure, isn't out there trying to flog his site.)  Balance that with the fact that the attribution links to your site, so he's driving traffic and "respect" your way.  Think of your recent searches online.  I'm sure you've found hits (as I have) on "aggregator" sites - which are stupidly easy to recognize, aren't they?  Don't you click through to the source site?  Maybe not to read the original article, but definitely to see the author's other content?  Of course you do.
He's stolen ad revenue from you!  Again, I doubt it - same arguments as in the previous paragraph.  At the very worst, he's contributed to the devaluation of CPM rates across the internet by the watering down of ad clickthroughs.  But again, that shouldn't be a loss to you, since your site attracts knowledgeable SQL people - people advertisers want.
The Real and True Gains
Anyone who does see your stuff on his site and likes it will surely know where to go to get more.  And no - not his site.  Please be serious and think about how outrageous it would be for this guy to actually pass himself off as someone who could have produced that content.  Could he really land a job commensurate with that knowledge?  Get a book deal?  A speaking engagement?
You get validation that the great content you're giving away for free into the boundless internet is actually getting read and appreciated.  Sure, comments on your blog do that too - but this is one step further, IMO.
What The F*** Do I Know?
How can I comment on this, if I've never been plagiarized before?
That's not entirely true.  I recently discovered that SSAS-Info.com, run by Vidas Matelis (blog|twitter) - a very respected and knowledgeable SQL Server MVP (at least by me) had been reposting some of my content without permission on SSAS-Info.  I discovered this via Google Analytics of my own blog's activity showing hits coming from there, wondering why, and digging deeper into it.  Turns out that Vidas (just like many of us) scours the internet looking for good material.  When he finds it, he posts it on SSAS-Info.com - with a few important caveats.  He identifies the author and source on every post (not as clearly as I'd like), and only shows a paragraph of the article followed by a click-through to your site that hosts the complete article.
To be perfectly honest, my initial reaction was "WTF?  Where does he get off copying my stuff, making it look like he made it?"  (Of course, I didn't see my name there to start.)  However, after a little poking around, that initial feeling disappeared entirely. 
Was it that he only posted some of my article, and not all of it?  No - on reflection, I always hate it when I get RSS feeds that way.  I don't (intentionally) write my articles to have a teaser in the first paragraph to induce a click-through.  I want readers to read the whole thing.  They'll only read the whole thing if they like it, of course - and then my (and I think their) first reaction is to find out who wrote it so I can find more.
Was it because my name was there?  No - because to be honest, it wasn't very prominent, AND one could argue it made it look like I was a "columnist" there, and people could come back to SSAS-Info for more of my content.  (Never mind that fact is true - Vidas would have kept reposting new stuff from me, regardless of whether I found out.)
Was it because, by clicking through two other links (All Articles By "X" in the Latest Author Articles box, then the "read this" link) I discovered what Vidas was intentionally doing with his repostings?  Not exactly, although it reinforced the good feeling that I was getting.
My initial feeling disappeared because I realized I was getting exposure.  And that's one of the specific things I wanted by blogging.  Who cares if Vidas makes some money off ads on his site?  I don't have ads on mine at all.  Should I begrudge him for making money off me if I didn't want to make money off me?  Just because he's more aware, on-the-ball, or adept at doing that, I should try to stop him - even if doing so does nothing to increase my ability to do what he's doing?  Stupid, isn't it?
As soon as I read his "procedure" I sent him an email asking him to post full versions of whatever he wanted to on his site.  I am now insanely happy with this arrangement - more so because Vidas decided to pick out my stuff all on his own without any lobbying from me.
The Verdict
As you've read, I'm sorely disappointed by the SQL Server "community leaders'" reaction to this.  I find it hard to accept that those who have put in a lot of effort to educate the technical community about how to best use SQL Server would fly off the handle and jump to threats before considering a course of action that should be second-nature: education and understanding.  I do hope the site does change some things - attributions and permission would be a good start.  But more than that, I hope this guy hasn't written off the SQL speaker/blogger community as a bunch of selfish people.  I hope that he continues to "steal" content much like Vidas does to help promote the community that I'm a part of, and expand awareness that there is such a community.
No doubt there will be those who will plagiarize "better" than this guy - and intend to do it maliciously for personal gain.  Or some that refuse to attribute properly when asked.  For those, I don't mind taking advantage of the protections that copyright affords.  But rounding up a lynch mob to string up a guy on first contact is not the right thing to do.
Think.  Before you act.

20 comments:

  1. Having gone through a very similar thing recently I can say I agree with you about opening up with both barrels before having a civil dialog with the person. My situation was I sent the site operator an email and was basically told to go pound sand. My biggest problem wasn't with the fact he was reprinting everything I had written, it looked like I was part of his "company" and was writing for him. To me that is theft. He is taking my reputation and trying to enhance his business with it.

    Every situation is different. I don't know the whole story with this particular instance, just what I read on Twitter like you. I can't say 100% what was done was right or wrong on either end of this issue.

  2. I agree Todd. Well said. The guy is a DBA and the worldwide community just chased him with pitchforks and noose. He'll be feeling like he's just ruined his career.

    I'm convinced it was a well-intentioned oversight on his part.

  3. I'm giving him a chance to let us know his side. I started keeping quiet on the topic after the morning mostly because I want to hear what he has to say before making any more stabs at him personally and professionally.

    His comment area is still up and posted the following...

    Quite a bit has happened from what we (SQL Community) found today on your site. I'd like to ask a few questions now that at least I have taken a breather from knowing my content was abused here as well.

    Can you honestly tell us you have learned what the problem was? Many of us work very hard to get the content up for everyone to use and grow from. Copying and pasting it to your site takes a huge piece of respect away from us and from the work that we have put into the community. That work is only at our own cost. In that, I hope you can see how it feels that you have completely disrespected us as people who truly are trying to help and build the SQL Community.

    I hope you can openly come forth and let us know that this was an honest mistake and you just had no clue what you were doing. Doing just that is what makes us who we are and why people call us one of the greatest communities in IT.

    Ted Krueger

  4. I can't get behind you on your hand wringing on this one. Plagiarism isn't some arcane copyright law, as you make it seem. All through our education, at least in the US, we've had it drilled into us what appropriate quotation, footnoting, and paraphrasing should look like when referencing other people's work.

    This wasn't the first time I've witnessed that plagiarism has been discussed in the SQL community, and it won't be the last. And comparing those reactions to their specific cases to the reaction to this one, I have to say that it was a pretty measured response compared to the egregiousness of the plagiarism.

    As for your experience with Vidas, this same group of people you're chastising as being hanging judges have had conversations about that exact scenario, and while I don't want to put words in their mouth, based on this blog post I'm pretty sure you would be surprised by their opinions on that matter.

  5. For me the whole thing's a bit over the top. It's not a witch hunt. I see why people are upset but give the guy a break.

  6. I would like to point out that this guy holds a Master's degree from the University of Minnesota, and a Bachelor's from Kenyatta University (Graduated Summa Cum Laude) [Google for his LinkedIn account.]

    So, I don't believe any further 'education' is necessary.
    I noticed that VERY quickly after this started his website changed so that the first page basically stated:
    'this is a collection of RSS feeds'.

    Did we over react? Yes we did.

    Why? I believe due to the valid, 100% plagiarism identified on other sites within the past few weeks, emotions are still a bit high on this topic.

    This was my first time seeing the Twitter community get upset. I'll be sure to remember ‘Today’ the next time the community gets upset.

  7. I believe the community is over-reacting significantly.

    Could he acknowledge the sources better? Maybe. He could make it more obvious, and from the content it appears he did (NOTE: I couldn't easily find his blog).

    My real question is, how many people have bookmarked GREAT content, only to go back later and find it gone - the website or blog is either missing (blank) or a 404 error comes up because the site is completely gone? What do you do then?

    In a sense, he is saving the information so that others can view it and providing a link or reference back to the original author (to show it wasn't his original work).

    Is that really so wrong? Or am I just thinking of it differently? Yes, a lot of people have put in a lot of work to develop original content. And it's not meant to minimize their contribution. But let's make sure we have the reference material we want or need to help ourselves. (Now, where was that great site that lists DOS commands with some great explanations? Oh, yeah. That was one site that disappeared - RH Watson!)

    Tim
    kb0odu

  8. My take on this: http://sqlchicken.com/2010/01/the-plague-of-plagiarism/

  9. My take on today can be found http://itknowledgeexchange.techtarget.com/sql-server/todays-plagiarism-incident/.

  10. Thank you all for your comments to date. I too am impressed by the professional level of debate on the issue that no doubt evokes visceral emotions. I'm not done on the subject - every comment here, and on Twitter today has given me (and I hope everyone else) a lot to think about.

    I'm sure my position will change somewhat as time goes on - due to more time for reflection, and the possibility of this happening to me - we all learn as we go. However much my post may appear to be "going easy" on this guy - that's really not the message I want to send. The message I want people to take away is that I think they should try to turn an apparent negative into a positive, if at all possible. The clichéd "lemons into lemonade" type of thing.

    Thanks again for the comments.

  11. It's an interesting take, but I don't think that the people involved so much over-reacted as swarmed. Over-reaction would have been immediate calls to lawyers. Instead there was a "cease & desist, or tomorrow I call the lawyer." While the language was, shall we say, blunt, the message was clear and accurate. If one guy had posted that message and that was it, this wouldn't be a big deal.

    It's just because five or six people posted the message that it seems like over-reaction. It's not. It's just a byproduct of what's possible with the communication mechanisms available to us. You get a post on Twitter that says "Hey, someone is stealing from me and they stole from Jimmy Joe Bob as well." That hits, let's assume, 500 followers, one of whom is Jimmy. Jimmy looks and notes that Beth-Anne was also stolen from and he posts that. It goes to another, say, 200 distinct followers. Now something like 700 people are browsing through the poor shleb's site, gawking at the theft, minor & silly though it may have been. Again, not an over-reaction, but a swarm. It's the swarm that's the problem, if there is one, not the reaction itself. I still think that was reasoned and appropriate.

  12. Yeah, all of the people who had their content stolen have a right to protest its theft (whether the individual reposting it without permission meant it or even knew that he was violating copyright laws). However, I personally cringed when I saw threats hurled at the guy's career future. I know for a fact that 99% of all authors make little to no money from writing and I have turned down opportunities to write more than once because the effort didn't justify the reward.

    I am not saying the plagiarist is in any way to be defended. He is not. I am just wondering where the grace and compassion are for a fellow IT guy in an economy approaching 11% unemployment and where authorship earns you little to no income.

    Proportionality is all I'm suggesting.

    No one likes to see a Giant beat up a little guy.

  13. I'm with Tim Benninghoff on this one. There isn't much grey area in copyright law and the DMCA. You don't just require attribution to post something online in its entirety, you also need permission. By posting something online, I am not giving implied consent for it to be copied. If someone approached me about putting my content in an RSS aggregator with appropriate attribution, I would be okay with that.

    I don't buy the personal collection of useful articles argument. There was a fairly sophisticated nested menu bar on the site. Flat hierarchies are easy to put together from tags and meta keywords. Tree taxonomies require thought and effort. To make the effort to pull from RSS aggregators, as well as build and update a taxonomy, requires effort.

    Personally, I keep a collection of my favorite links. I call it delicious.com. I also store them in my bookmarks folder and sync those files through Dropbox.

    When I went to college and got my B.A. in English we had signs in every classroom informing us in no uncertain terms what constituted plagiarism and just how fast the university would throw us into the street if we plagiarized and were caught. I find it difficult to believe that anyone could get a bachelor's degree and believe that copying content with neither permission nor attribution is acceptable.

    If I wanted anyone to be able to republish my blog posts, I would use a Creative Commons license. That's why it exists. There's also a reason I don't use it and why I use a restrictive, traditional, copyright.

  14. I agree with most of your comments and thoughts on this subject; however, there is one thing that got my blood boiling a bit.

    I enjoy your site and often check for any new articles from you. With that said, I cannot believe the way you more or less ripped on Vidas for your articles showing up on his site. If you really knew him or his reputation, you might actually be honored at some level to have your articles show up on his site. He always notes who the article is from and never implies that he is the author. He is very well respected in the SSAS\SQL community and his site has very high visibility.

  15. @Anonymous - I'm really not sure how I "ripped" Vidas. I probably don't know him as well as you, but I have met him, and we did have a very good email conversation. Did you miss the part where I said I was "insanely happy" with the outcome? If you're interpreting my "WTF?" comment as "ripping" - it wasn't intended to be that way. I was just being completely honest about my initial, visceral reaction - just as I was being honest about how that reaction changed quite quickly. Feel free to talk to Vidas about this.

  16. @Anonymous - Hey, I can't believe I forgot to say thanks! I appreciate that you find something interesting here from time to time. Thanks for letting me know.

  17. @Anonymous,

    Thank you very much for defending me and for the nice comments about my website and me! When I first read Todd's post I was also kind of shocked that Todd could even think that I tried to "steal" his content. But as I read further, I understood that this was just his initial reaction. So I am happy that Todd blogged about what he thought. Based on his feedback last night I made the author's name more visible.
    Todd is actually one of the few bloggers who gave me permission to re-post the whole content of his SSAS-related posts, and I am thankful for this.

  18. I really enjoyed your post, Todd. I think your well-reasoned approach makes a lot of sense and is much better than crowd-sourcing a mob.

    One point I'd like to mention (which I also made on Jorge's and BrentO's blogs), and which I haven't seen anyone else make, is that we now have an opportunity to decide how to react to similar occurrences in the future AS A GROUP. And believe me, they will definitely occur.

    Maybe you should lead the charge - via a new blog post - about your opinion of "what to expect from THIS community" when this happens again.

    Best regards,

    -Kev

  19. Well said. I am glad at least SOME people are rational, calm, and think before they blow up.

    lots of good points :-)

  20. I agree. Plagiarism can be really annoying, especially when you have put so much hard work into your content. In this technological age it happens very often. I use PlagTracker.com to check my content and to find out whether it is being used somewhere else.
    http://www.plagtracker.com/#
