IRC Analysis - 99.9% Illegal?

Hacker's Playground or Researcher's Dream?

In response to those Slashdot readers who obviously didn't bother to read this article properly: This was a lighthearted, totally unscientific article and is not meant to be at all serious. You probably believe this is real, right? I am not claiming 99.9% of IRC is illegal, I am stating that, based on the keywords observed, 99.9% of the messages sent to the 60 largest IRC channels were in an illegal context (fact). I do not believe that applies to IRC as a whole, nor would it be sensible to claim so. I don't think I'd have written an O'Reilly book about IRC and written lots of open source software for IRC if I believed that was true. I know hundreds of people who use IRC in productive and legal ways, but I guess you skipped that bit and just went looking for the conclusions at the end of the article.

It seems reasonable to assume that a journalist researching IRC for the first time would be more inclined to visit one of the larger channels, and thus be more likely to conclude that it is all about illegal file sharing. This is one of the reasons why IRC gets an unfair bad press. That's what this article was trying to show, in a roundabout way.

There are no "lies" or "bullshit" in this article, just people who can't read and interpret things sensibly for themselves.

"Internet Relay Chat (IRC) is a system that allows groups of people to collaborate and chat from anywhere in the world"

I wrote the above in the introduction to some published material about the PieSpy IRC Bot which forms a small part of my Ph.D. thesis.

I have always thought of IRC as a useful medium for communication, and even felt compelled to write an O'Reilly book on the topic ("IRC Hacks"). So it always gets me riled when journalists and news reporters persistently portray IRC as some kind of hacker's playground, where illegal software and credit card numbers are used as a virtual currency.

I set myself the mission of disproving this portrayal.

Academic Use of IRC

I like IRC - I use it a lot. During my student days, I found it extremely useful for keeping in touch with friends and colleagues from all around the world. I even used it to collaborate on a paper with somebody I had never met before, on the other side of the planet, who I subsequently met when the paper was successfully presented at a conference.

Not all IRC networks are visible to the entire world. Some universities run internal IRC networks, designed to be accessed only by staff and students. In a small world where everybody has access to a computer, this is a natural way to communicate and avoids running up telephone bills. Such networks are typically used for both academic work and general chit-chat.

Open Source and IRC

The IRC protocol is publicly documented in several RFCs. This has led to the creation of lots of IRC clients, many of which are open source. The open source involvement of IRC does not end there - many programmers use IRC to collaborate on development of open source software, as it is inherently suited to large conversations involving multiple participants.

Take for example the freenode IRC network, which depends on donations from its thousands of users to stay running. A casual look around the channels it has to offer reveals a plethora of activity and involvement. In #Gallery, you'll find dozens of users and developers of the open source PHP Gallery application; most of the Java developers will hang out in #java; general chat about mobile telephones will be going down in #mobitopia, and the Semantic Web Interest Group can be found lurking in #swig.

freenode is the network that I spend most of my time on, so it's no wonder I get defensive about people stereotyping IRC as a haven for bad guys and script kiddies. However, it would not be sensible to form an opinion based on just one IRC network, so I decided to take a look at the most popular channels on other IRC networks.

Analysis of the top 60 IRC channels

I fired up my IRC client and connected it to the ten largest IRC networks in the world.

When you are connected to an IRC network, it is usually possible to query the server for a list of the channels it provides, along with the number of users in each one. This list will only include public channels, but it is reasonable to assume that secret or hidden channels will not contain a significant number of users.

I joined the six largest channels on each IRC network. Some of these channels contained more than one thousand users. It was immediately obvious that most of the conversation going on revolved around illegal file sharing.
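As a rough illustration of how the six largest channels could be picked out of a server's channel list, here is a minimal Python sketch. It assumes the replies arrive as raw RPL_LIST lines (numeric 322, of the form ":server 322 nick #channel usercount :topic") and that they have already been read from the connection. The function names and the sample server name are made up for the example; this is not the tool I actually used.

```python
import heapq


def parse_rpl_list(line):
    """Parse one RPL_LIST (numeric 322) reply into (channel, user_count).

    Assumed shape: ":<server> 322 <nick> <channel> <visible users> :<topic>"
    Any other line (PING, end-of-list, and so on) yields None.
    """
    parts = line.split(" ", 5)
    if len(parts) < 5 or parts[1] != "322":
        return None
    try:
        return parts[3], int(parts[4])
    except ValueError:
        return None


def largest_channels(lines, n=6):
    """Return the n channels with the most visible users, largest first."""
    parsed = (parse_rpl_list(line) for line in lines)
    return heapq.nlargest(n, (p for p in parsed if p), key=lambda p: p[1])


if __name__ == "__main__":
    # Hypothetical replies, for illustration only.
    sample = [
        ":irc.example.net 322 me #bigchannel 1200 :some topic",
        ":irc.example.net 322 me #quiet 30 :",
        ":irc.example.net 323 me :End of /LIST",
    ]
    for channel, users in largest_channels(sample):
        print(channel, users)
```

Only public channels appear in these replies, so the sketch inherits the same blind spot as the manual approach: secret or hidden channels are never counted.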

Aside: Carding and Phishing

It is worth noting that fraudsters dealing in "carding" and "phishing" are most likely to operate in small, secret channels. Sometimes these groups encrypt their messages so they are totally unintelligible to anyone who should happen to stumble upon their conversations or sniff their network traffic. Using the above approach, it was thus deemed unlikely that any fraudulent activities such as carding and phishing would be uncovered. It would not be possible to detect the majority of fraud without having some kind of special access to the servers comprising the IRC network. This layman's analysis will therefore concentrate on illegal file sharing which is easier to detect on IRC.

Illegal File Sharing

I left the IRC client running for a period of 36 hours in these 60 channels - the largest public channels in the world. Upon entering each channel, more often than not I was greeted with messages similar to the following:

If you are affiliated with any government, police. ANTI-Piracy Group, RIAA, MPAA, FBI, movie production company/distribution company or related groups you are violating code 431.322.12 of the Internet Privacy Act signed by Bill Clinton in 1995,Therefore you CANNOT threaten our ISP(s),person(s) or company storing these file and cannot :prosecute anyone. And you must LEAVE NOW.

This is an obvious indicator that something illegal is going on. I have no involvement with the government, FBI, etc. so I was apparently free to stay.

Within a few seconds, my IRC client was bombarded by a stream of messages advertising illegal unlicensed software, mp3s and the latest movies - items known collectively by illegal file sharers as "warez".

Armies of Autonomous Agents

The majority of the messages were being sent by autonomous agents known as IRC bots. These are computer programs that connect to an IRC network and offer to send files to other users. These are often referred to as "XDCC" bots, because of the protocol that they use to announce the list of files they are offering to send.

Each warez channel contains several XDCC bots, typically with identical names followed by some numbers (the IRC protocol does not permit two users to share the same nickname). These bots usually run on compromised machines with lots of bandwidth, such as poorly configured university servers or business web sites. Files can be served from any port number, so detecting these intrusions is obviously difficult enough to keep the hackers in business, for want of a better expression. The XDCC bots sit in the channel, periodically announcing the files they have to offer:

<[XDCC-Bot]01> *#1 6x [118.8M] Microsoft.Money.Deluxe.2005.ISO-RESET.rar
<[XDCC-Bot]01> *#2 7x [10M] A2Soft.CD.Creator.v1.0.WinALL.Incl.Keymaker-NiTROUS.rar
<[XDCC-Bot]01> *#3 36x [1.3M] aa-Trojan_Cleaner.zip
<[XDCC-Bot]01> *#4 20x [3.47M] Acoustica.CD.Label.Maker.v2.03.WinALL.Incl.Keymaker-EMBRACE.rar
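Announcements like these are regular enough to pick apart with a pattern. The following Python sketch is one way it might be done, assuming every announcement follows the layout seen above: nickname in angle brackets, an optional asterisk, a pack number, a count followed by "x", a size in square brackets and a file name without spaces. Reading the "6x" field as a download count is my assumption, and the field names are invented for the example.

```python
import re

# Assumed layout: <nick> *#<pack> <count>x [<size><unit>] <filename>
ANNOUNCE = re.compile(
    r"^<(?P<nick>[^>]+)>\s+\*?#(?P<pack>\d+)\s+(?P<gets>\d+)x\s+"
    r"\[\s*(?P<size>[\d.]+)(?P<unit>[KMG])\]\s+(?P<filename>\S+)$"
)


def parse_announcement(line):
    """Return a dict describing one XDCC pack announcement, or None."""
    match = ANNOUNCE.match(line.strip())
    if match is None:
        return None
    return {
        "nick": match.group("nick"),
        "pack": int(match.group("pack")),
        "gets": int(match.group("gets")),  # assumed to be a download count
        "size": float(match.group("size")),
        "unit": match.group("unit"),
        "filename": match.group("filename"),
    }


if __name__ == "__main__":
    line = "<[XDCC-Bot]01> *#3 36x [1.3M] aa-Trojan_Cleaner.zip"
    print(parse_announcement(line))
```

Ordinary chat lines do not match the pattern and come back as None, which makes it easy to separate bot announcements from human conversation.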

The XDCC bot instructs users how to download each file it is offering. There does not appear to be any need to offer anything in return, which has to make you wonder why the hackers are doing this.

Viruses and Trojans

Occasionally, a user will attempt to send a file to other users in a channel without being asked. These files may be offered under the description of free porn, or passwords to adult web sites, but are usually executable files - most likely viruses or trojans of some sort. It is not uncommon for such malware to spread through IRC, and infected users are liable to end up being used as a platform to run another XDCC bot and provide more storage space for future warez.

Monitoring for Warez

During the 36 hour monitoring period, four keywords were monitored: Norton, Symantec, Jasc and Microsoft.

Norton and Symantec produce anti-virus software and personal firewalls. Jasc produce the popular Paint Shop Pro paint package. Microsoft probably don't need introducing.

Results

Monitoring all 60 channels, I counted the frequency of each keyword over the 36 hour period. Each occurrence was manually verified as being in either a legal or illegal context. For example, two people discussing the new features in the latest version of Microsoft Word would be regarded as a legal context.
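The counting step could be automated along the following lines. This Python sketch is only an assumption about how one might do it: it counts case-insensitive occurrences of each keyword in a list of logged messages, treating dots, underscores and digits as separators because warez file names rarely use spaces. Deciding whether an occurrence is legal or illegal is not attempted here; that part was done by hand.

```python
import re
from collections import Counter

KEYWORDS = ("Norton", "Microsoft", "Symantec", "Jasc")


def count_keywords(messages, keywords=KEYWORDS):
    """Count occurrences of each keyword across all messages.

    A keyword matches when it is not directly surrounded by other letters,
    so "Norton_AntiVirus" and "Microsoft.Money" both count.
    """
    patterns = {
        keyword: re.compile(
            r"(?<![A-Za-z])" + re.escape(keyword) + r"(?![A-Za-z])",
            re.IGNORECASE,
        )
        for keyword in keywords
    }
    counts = Counter({keyword: 0 for keyword in keywords})
    for message in messages:
        for keyword, pattern in patterns.items():
            counts[keyword] += len(pattern.findall(message))
    return counts


if __name__ == "__main__":
    log = [
        "*#1 6x [118.8M] Microsoft.Money.Deluxe.2005.ISO-RESET.rar",
        "has anyone tried the new features in Microsoft Word?",
    ]
    print(count_keywords(log))
```

Note that both sample lines count towards "Microsoft", even though only the first is in an illegal context. That is exactly why each occurrence still needs a human to look at it.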

Keyword      Occurrences   Legal Contexts   Illegal Contexts
Norton       4431          4                4427
Microsoft    3227          6                3221
Symantec     2568          0                2568
Jasc         372           0                372

Conclusions

Two rather surprising observations can be made from this ad-hoc analysis of the 60 largest IRC channels:

  1. Based on those keywords being monitored, 99.9% of IRC traffic to the top 60 channels is "illegal".
  2. Norton products are more popular than Microsoft products (perhaps IRC users have more need for virus scanners?)
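For anyone who wants to check the arithmetic behind the 99.9% figure, it can be reproduced from the results table in a few lines of Python. The numbers are the ones reported above; the variable names are just for illustration.

```python
# (occurrences, legal contexts) per keyword, taken from the results table.
results = {
    "Norton": (4431, 4),
    "Microsoft": (3227, 6),
    "Symantec": (2568, 0),
    "Jasc": (372, 0),
}

total = sum(occurrences for occurrences, _ in results.values())  # 10598
legal = sum(legal_count for _, legal_count in results.values())  # 10
illegal = total - legal                                          # 10588

illegal_share = 100 * illegal / total
print(f"{illegal} of {total} occurrences: {illegal_share:.1f}%")  # 99.9%
```

Only 10 of the 10598 keyword occurrences were in a legal context, which is where the headline figure comes from. It says nothing about messages that did not contain one of the four keywords.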

As much as it goes against my desires, based on the above figures, I can only concede to the correctness of the traditional stereotypical view of IRC:

It is a haven for warez and trojans.

Nonetheless, it is unfair to tar everyone with the same brush. Sure, it looks like IRC is used to perform a lot of illegal file sharing, but as I have also tried to show, it is used for lots of constructive purposes, too. I know that these 60 channels are unlikely to be representative of the entire IRC population.

The whole argument of whether IRC is bad or not is rather moot - in the strictest sense of the word. IRC is a protocol - just like HTTP is a protocol. Do we hear journalists and news reporters blaming HTTP for widespread proliferation of warez? I don't think we do, yet the much larger user base of HTTP suggests that it is undoubtedly used to transport a greater volume of warez than IRC does.

IRC is a big, dangerous city full of crime. I just happen to live with a bunch of people in one of the nice streets in the suburbs. There are lots of suburbs.

Paul Mutton is an expert on Internet Relay Chat (IRC). He has authored the book "IRC Hacks" for O'Reilly, the PircBot IRC API, several open source IRC bots and several other publications featuring IRC. Paul can be contacted at the address below.

 


Copyright Paul Mutton 2001-2013
http://www.jibble.org/
Feedback welcomed
email
