MySpace Band Statistics

Lately I’ve been researching social networks that are centered, at least partially, on music. Along the way, I began gathering statistics on bands that use MySpace, to get a ballpark figure of how active bands are in the MySpace network. My friend Dan collaborated with me, and we wrote three scripts to gather the data.

Script #1 is a PHP script that looked at MySpace profiles and saved only the band profiles to my hard drive. Once I had chosen a range of profile IDs to analyze, I found it faster to split the range among several copies of the script and run them in parallel. I also tried making my requests through a proxy (Tor) but found it too slow.

<?php
$start_time = time();
$total_music_profile_count = 0;
$friendID_start = 100000000;     //Start at 100 million
$friendID_end =   101000000;    //101 million
$PATH = ""; //insert path here
$ch = curl_init();           
 
/* You can use Tor to surf anonymously... I found this too slow */
//$tor_address = '127.0.0.1:9050';  //Tor's default SOCKS port, to match CURLPROXY_SOCKS5 below (8118 is Privoxy's HTTP port)
 
for($friendID=$friendID_start; $friendID<$friendID_end; ++$friendID){
     //Uncomment these to surf anonymously
     /*
     curl_setopt ($ch, CURLOPT_PROXY, $tor_address);
     curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
     curl_setopt ($ch, CURLOPT_HTTPPROXYTUNNEL, true);
     curl_setopt ($ch, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);
     */           
 
     curl_setopt ($ch, CURLOPT_URL,
"http://profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendID=$friendID");
     curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
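     $result = curl_exec($ch);
     if($result === false){
          //Skip profiles that could not be fetched
          print "Request failed for friend ID $friendID: " . curl_error($ch) . "\n";
          continue;
     }
     $pattern = '/MySpace music profile/i';

     if(preg_match ($pattern, $result ,$matches)){
          ++$total_music_profile_count;
          $filename = "$PATH/mp_$friendID.html";
          $file_HANDLE = fopen($filename, "wb");
          if($file_HANDLE === false){
               print "Could not open $filename for writing\n";
               continue;
          }
          $bytes_written = fwrite($file_HANDLE, $result);
          fclose($file_HANDLE);
     }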
     print "Processed MySpace profile with friend ID: $friendID\n";
}           
 
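curl_close($ch);
$end_time = time();
$time_elapsed = $end_time - $start_time;
print "Found $total_music_profile_count band profiles in $time_elapsed seconds\n";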
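
The range-splitting mentioned above can be pictured with a minimal Ruby sketch. This is not the code we actually ran: the helper name `split_range` and the worker count of 4 are assumptions made for illustration. It cuts the friend ID range into contiguous, non-overlapping chunks, each of which could be handed to its own copy of script #1.

```ruby
# Hypothetical helper (not part of the original scripts): split the half-open
# ID range [first_id, last_id) into `workers` contiguous chunks.
def split_range(first_id, last_id, workers)
  total = last_id - first_id
  base, extra = total.divmod(workers)
  chunks = []
  start = first_id
  workers.times do |i|
    size = base + (i < extra ? 1 : 0) # spread any remainder over the first chunks
    chunks << [start, start + size]
    start += size
  end
  chunks
end

# Same range as script #1; the worker count of 4 is an assumption.
chunks = split_range(100_000_000, 101_000_000, 4)
chunks.each { |from, to| puts "friendID #{from} up to (not including) #{to}" }
```

Each pair would become the $friendID_start and $friendID_end of one running copy of the script.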

Script #2 is a Ruby script that parsed the saved profiles and wrote the band ID, band/artist name, profile view count, friend count, last login date, and member-since date to a MySQL database. You will notice that the band_member_text regex is commented out. We wanted to know the number of band members in each band, but the content was too variable to evaluate. *I apologize for the syntax highlighting issue you see below. WP-Syntax (powered by GeSHi) seems to have a problem understanding quotes in regular expressions.

#!/usr/bin/ruby
require 'rubygems'
 
require 'mysql'
require 'date'         
 
def main           
 
     db = Mysql.new(hostname, user, password, database) #insert db info here
     path = ""    #insert path here           
 
     entries = Dir.entries(path).reject { |e| e == "." || e == ".." }
     total_profile_views=0
     total_friends=0
     today = Date.today
     max_avg_friends_per_day = 0
     max_avg_profile_views_per_day = 0
     n = 0           
 
     entries.each do |e|
          text = IO.read(path + "/" + e)
          band_id_regex = /^mp_([0-9]+)\.html/i
          band_name_regex = /<title>\s*MySpace\.com\s*\-\s*([^\-]+)/
          profile_views_regex = /Profile Views: (\s*)([0-9]+)/i
          num_friends_regex = /has <span class="redbtext">(\d+)<\/span> friends/
          last_login_regex = /last login: ( )?(\s*)([0-1]?[0-9]\/[0-3]?[0-9]\/[1-2][0-9]{3})/i
          member_since_regex = /Member Since<\/span><\/td><td id="ProfileMember Since" width="175" bgcolor="#d5e8fb" style="WORD-WRAP: break-word">((\d|\/){8,10})<\/td>/
 
          if e =~ band_id_regex
               band_id = $1.to_i
          else
               band_id = 0
          end           
 
          if text =~ band_name_regex
               band_name = $1
               band_name.gsub!(/'+/, " ")
          else
               band_name = ""
          end           
 
          if text =~ profile_views_regex
               profile_views = $2.to_i
               total_profile_views += profile_views
          else
               profile_views = 0
          end           
 
          if text =~ num_friends_regex
               num_friends = $1.to_i
               total_friends += num_friends
          else
               num_friends = 0
          end           
 
          if text =~ last_login_regex
               # The regex expects month/day/year, so parse it explicitly
               # and store the ISO form (same format as the dummy date).
               ll_date = Date.strptime($3, "%m/%d/%Y")
               last_login = ll_date.to_s
          else
               last_login = "1900-01-01" #dummy date
               puts "last login parsing error for file: " + e
          end

          member_since_found = true
          if text =~ member_since_regex
               member_since = $1
               ms_date = Date.strptime(member_since, "%m/%d/%Y")
          else
               member_since = "1900-01-01" #dummy date
               member_since_found = false
               ms_date = Date.parse(member_since)
               puts "member_since parsing error for file: " + e
          end

          # Convert to a float up front and skip profiles with no elapsed
          # days, which would otherwise divide by zero.
          days_active = (today - 3 - ms_date).to_f
          if member_since_found && days_active > 0
               avg_profile_views_per_day = profile_views / days_active
               avg_friends_per_day = num_friends / days_active
          else
               avg_friends_per_day = 0.0
               avg_profile_views_per_day = 0.0
          end
 
          if max_avg_friends_per_day < avg_friends_per_day
               max_avg_friends_per_day = avg_friends_per_day
          end           
 
          if max_avg_profile_views_per_day < avg_profile_views_per_day
               max_avg_profile_views_per_day = avg_profile_views_per_day
          end           
 
          mysql_query = "INSERT INTO music_profile_stats (band_id, band_name,
profile_views, num_friends, last_login) VALUES (#{band_id}, '#{band_name}',
#{profile_views}, #{num_friends}, '#{last_login}')"
          db.query(mysql_query)
     end
end
main
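
To make the parsing and the per-day averages concrete, here is a small self-contained sketch. The HTML fragment, the band, the dates, and the reference date `as_of` are all made up for illustration. Only the two regular expressions are taken from script #2, and the script itself uses three days before the current date rather than a fixed date.

```ruby
require 'date'

# Made-up fragment standing in for a saved profile page (an assumption; real
# pages were much longer).
sample_html = <<~HTML
  Profile Views: 3650
  Example Band has <span class="redbtext">730</span> friends.
HTML

# Two of the patterns from script #2.
profile_views_regex = /Profile Views: (\s*)([0-9]+)/i
num_friends_regex   = /has <span class="redbtext">(\d+)<\/span> friends/

views_match   = sample_html.match(profile_views_regex)
friends_match = sample_html.match(num_friends_regex)
profile_views = views_match ? views_match[2].to_i : 0
num_friends   = friends_match ? friends_match[1].to_i : 0

# Hypothetical dates: a "member since" value in month/day/year form and a
# fixed reference date chosen so the example is reproducible.
member_since = Date.strptime("4/11/2007", "%m/%d/%Y")
as_of        = Date.new(2008, 4, 10)
days_active  = (as_of - member_since).to_f

avg_profile_views_per_day = profile_views / days_active
avg_friends_per_day       = num_friends / days_active

puts "views/day: #{avg_profile_views_per_day}, friends/day: #{avg_friends_per_day}"
```

With these made-up numbers the band has been a member for 365 days, so the averages come out to 10 profile views and 2 new friends per day.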

Results thus far:

  • Total MySpace profiles analyzed: 2,132,917
  • Total band profiles found: 87,710 (out of the 13.5 million on MySpace - count taken in March of 2008)
  • Average number of profile views per band: 3,403
  • Average number of friends per band: 201

For each band, I recorded the average number of profile views it received per day, as well as the average number of friends it acquired per day. Script #3 is a PHP script that compiled results about band activity by looking at this data.

<?php
require_once('./mysql_connect.php');
 
bucketize("music_profile_stats", "avg_friends_per_day", 0.01, 4, 1500);
bucketize("music_profile_stats", "avg_prof_views_per_day", 0.03, 5, 12500);           
 
function bucketize($table, $column, $bucket_start, $bucket_multiplier, $max_val){           
 
     $query = "SELECT * FROM $table WHERE 1=1 ORDER BY $column";
     $result = mysql_query($query);
     $bucket = array();
     $num = 0;
     array_push($bucket, $num);
     $num = $bucket_start;           
 
     while($num < $max_val){
 	array_push($bucket, $num);
 	$num *= $bucket_multiplier;
     }           
 
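     $stats = array();
     $bucket_step = 0;
     while($row = mysql_fetch_array($result)){
 	//Advance past every bucket this row has outgrown, not just one
 	while(($bucket_step < (count($bucket) -1))&&($row[$column] >= $bucket[$bucket_step+1])) $bucket_step += 1;

 	$temp = $bucket[$bucket_step];
 	if(!isset($stats["$temp"])) $stats["$temp"] = 0;
 	$stats["$temp"] += 1;
     }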
     print "**********************************************\n";           
 
     foreach($stats as $k=>$v){
 	//normalize data
 	$temp = $k*365;
 	print "$temp per year => $v\n";
     }
     print "**********************************************\n";
}
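
The bucketing idea in script #3 can also be sketched in Ruby. This is an illustrative rewrite under assumptions, not the script we ran: the sample values are made up, and I use a bucket start of 1 instead of 0.01 so the boundaries are whole numbers. The lower bounds grow geometrically (each one is the previous bound times the multiplier), and every value is counted under the largest bound it reaches.

```ruby
# Build the bucket lower bounds the way script #3 does: 0 first, then
# bucket_start multiplied repeatedly until max_val is reached.
def bucket_bounds(bucket_start, bucket_multiplier, max_val)
  bounds = [0]
  num = bucket_start
  while num < max_val
    bounds << num
    num *= bucket_multiplier
  end
  bounds
end

# Count each value under the largest lower bound that does not exceed it.
def bucketize(values, bounds)
  counts = Hash.new(0)
  values.each do |v|
    lower = bounds.select { |b| b <= v }.max
    counts[lower] += 1
  end
  counts
end

bounds = bucket_bounds(1, 4, 100)        # => [0, 1, 4, 16, 64]
sample = [0, 0.5, 1, 3, 4, 20, 70, 500]  # made-up per-day averages
counts = bucketize(sample, bounds)
counts.sort.each { |lower, n| puts "#{lower} and up: #{n}" }
```

Unlike the PHP version, this sketch does not rely on the values being sorted, at the cost of rescanning the bounds for every value.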

The results from this script are graphed below.

[Figure: Average Friends Acquired Per Year]

[Figure: Average Profile Views Per Year]

Conclusion:

  • 54% of the bands in my sample have, on average, fewer than 1 profile view per day and acquire no more than 4 friends per year (very inactive)
  • 8% of the bands in my sample have, on average, between 1 and 10 profile views per day and acquire between 4 and 15 friends per year (somewhat inactive)

I consider the remaining 38% of my sample to be at least reasonably active. Extrapolating these results to the 13.5 million bands on MySpace, I estimate that approximately 5 million of them (38% of 13.5 million) are active in the MySpace network.
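
The extrapolation is simple arithmetic, spelled out here as a short Ruby sketch. It assumes that the sampled range of profile IDs is representative of all bands on MySpace, which I have not verified.

```ruby
# Figures quoted in this post.
very_inactive_pct     = 54
somewhat_inactive_pct = 8
total_bands           = 13_500_000 # MySpace-wide count from March 2008

# Whatever is left after the two inactive groups counts as active.
active_pct   = 100 - very_inactive_pct - somewhat_inactive_pct
active_bands = total_bands * active_pct / 100

puts "Active share of sample: #{active_pct}%"
puts "Estimated active bands: #{active_bands}"
```

This gives 38% and 5,130,000 bands, which I round to roughly 5 million.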



17 Comments

  1. matt
    Posted April 11, 2008 at 5:02 am | Permalink

    wow, talk about coming out of the blog-gate fast and hard. nice first post, i’m anxious to see what’s next.

  2. Posted October 27, 2008 at 6:28 am | Permalink

    Well written article.

  3. Posted November 24, 2008 at 1:01 pm | Permalink

    hello–great article. do you know out of the 5 million or so active bands- how many would you think are unsigned and need funding. and how many of them are based in the USA. Thanks

  4. Posted November 28, 2008 at 10:21 pm | Permalink

    Jeremy,

    There is no way to tell for sure how many of these bands are unsigned and need funding, but you could certainly use this data to make a reasonable guess. I did not include functionality to determine band location in these scripts, but I am sure it is possible to do with some degree of accuracy.

  5. felix
    Posted December 8, 2008 at 7:21 am | Permalink

    really cool job! but where are the profile informations saved? i cant find anything after running it. i gave the php script the complete path, but nothing there :(

    thx, felix

  6. felix
    Posted December 9, 2008 at 7:56 am | Permalink

    atually, i m looking for a solution to get all bands
    - from germany
    - genre rock
    - signed
    - more than 150.000 profile views.

    is that possible?
    yours, felix

  7. Posted December 9, 2008 at 8:33 am | Permalink

    Felix,

    To answer your first question, the data is first saved in files in the path you specify. Notice these lines in script #1:

    $filename = "$PATH/mp_$friendID.html";
    $file_HANDLE = fopen($filename, "wb");
    $bytes_written = fwrite($file_HANDLE, $result);
    fclose($file_HANDLE);

    The files are stored in the path described by $filename. Make sure fopen and fwrite are working.

    To answer your second question,
    - from Germany: probably possible but not implemented in this script
    - genre rock: again, probably possible but not implemented in this script
    - signed: i think you would have to guess here --> if((profile views > 50,000)&&(num_friends > 2000)) band = signed; …something like that
    - more than 150,000 profile views: this data is accessible from these script results

  8. Alex
    Posted December 30, 2008 at 1:29 am | Permalink

    Hi,

    This is really a great article. :)

    Tell me, is there a way for you to compare the relation #friends/#profile views/#plays? A daily average, or even a yearly one would present valuable source of info, I’m sure, since the number of plays can be dominant when determining a band’s popularity on Myspace.

    Great job, anyway! :D

  9. anton
    Posted February 5, 2009 at 6:50 am | Permalink

    NERD

  10. Posted April 29, 2009 at 1:53 pm | Permalink

    Great info!

  11. Nick
    Posted June 18, 2009 at 5:45 pm | Permalink

    Great article, Tony! Quick question – how did you count or calculate the 13.5 million bands number? What does that include? I’m doing some research on this stuff and would really love to know.

    Thanks a lot,

    Nick

  12. Posted June 18, 2009 at 6:22 pm | Permalink

    I just added up the number of bands in each genre shown at the bottom of this page: http://music.myspace.com/

    Obviously 13.5 million is no longer accurate, but it is certainly not off by an order of magnitude…it’s still close

  13. Nick
    Posted June 19, 2009 at 3:55 pm | Permalink

    Hi Tony,

    I’m not sure “Top Genres” numbers listed at the bottom of http://music.myspace.com/ actually count bands on MySpace. Here’s the count from today, adding up to ~21MM bands:

    Hip Hop 2,682,753
    Rap 2,534,305
    Rock 1,898,020
    R&B 1,672,610
    Other 1,130,460
    Alternative 907,461
    Acoustic 788,026
    Experimental 642,357
    Pop 773,596
    Metal 645,195
    Indie 586,409
    Punk 487,151
    Hardcore 472,478
    Electronica 252,457
    Crunk 502,483
    Emo 252,457
    Techno 356,667
    Reggae 333,455
    Two-Step 336,664
    Electro 304,379
    DeathMetal 293,604
    Club 282,235
    Country 277,381
    Latin 270,929
    Reggaeton 264,176
    Jazz 239,784
    Classic Rock 153,111
    House 221,237
    Soul 221,145
    Funk 211,247
    Blues 204,349
    Folk 204,349
    Comedy 195,909
    TOTAL 20,598,839

    Three problems I see with this count are that it doesn’t include all genres (there’s a bunch more listed if you click “Show More Genres”), many artists show up multiple times, and each artist can list up to three genres that they are a part of. If you go here: http://topartists.myspace.com/index.cfm?fuseaction=music.topBands and search for “2-step”, you can see that there are multiple pages for Adam Lambert, and some of them list him as 2-step/2-step/2-step. I’m not sure whether each of these listings is counted in MySpace’s “Top Genres” table (haven’t had time to test this, perhaps by signing up as a new band myself).

    Do you or others have any thoughts on this stuff? Am I missing something here? In trying to figure out the number of active unique bands on MySpace, your analysis of who’s active is great – how can we figure out the number of uniques?

    Thanks,

    Nick

  14. Posted June 20, 2009 at 10:25 am | Permalink

    Since this was meant to be very ballpark, I am not so worried about the other genres. The second point is important and would change some of these numbers. I agree that my analysis of active bands is much better than my count of uniques, based on the information you have brought to my attention.

    One way you could account for the multiple genre issue is to take a random sample of MySpace band pages and check how many genres each band has. Then you can divide that average into the total. So if you find the average to be 2.3 genres per band, ( 22 million / 2.3 ) would give you uniques. If you want to be really accurate, then I agree you should take other genres into consideration.

    If you want to be REALLY accurate, you could probably write the script to parse profile ids from 0 to some really large number. I would guess that every time a MySpace user is added, the friendID increments by 1 so you may be able to cover most of the range by creating a MySpace account and looking at your friendID. Ideally, a newly created MySpace account would have one of the highest friendID’s so you could use that as your upper limit. I cannot guarantee this will work without looking into it more, but that is my best guess right now.

  15. Posted September 30, 2009 at 10:59 pm | Permalink

    great and interesting read, i’m really curious how i can apply this to my band page and gather statistics on daily plays along with the types of data you presented, since myspace daily plays resets at midnight each night. maybe you can offer some insight or even write something for the heck of it since i can’t program. you could then offer a web service for people to sign up and that could be lucrative for you.

    it would be great to show a growth trend of both plays and friends to someone who may be interested in my band or anyone’s band. we currently get about 100 per day with only 500 friends, which i think is impressive. i notice other bands with 20,000 friends do about the same. thanks for your time and info.

    thanks, ~yod

  16. Posted October 1, 2009 at 7:38 am | Permalink

    Yod,

    Daily song plays might be difficult to get because they are in the Flash player (so they cannot be easily scraped). I will have to look into their API to see if that data is offered.

    I also agree that those statistics are very nice to have, and I am actually working on a web marketing project for bands, so I will certainly look into this.

    Thanks for your comments and congratulations on your high rate of “active users.” 100 plays per day with 500 friends is great, as is your music.

  17. Posted January 10, 2011 at 11:51 am | Permalink

    Been sometime since the post or last comment but wondered if you looked at the average number of plays per active band.