The tools
First you will need to install a couple of programs. The programs described below have both Windows and Linux/*nix versions. Below is where to get the Windows versions. If you are partial to Linux, you can use your favorite package manager to download and install what you need for your distribution. For the *BSDs, you can use the ports collection to install the utilities.
I will describe how to use these tools to capture the packets I want. If you are a programmer, then you can take what I describe and write a script or program in the language of your choice. I wrote mine in Ruby and also ported it to Perl. In Ruby at least it is trivial.
Wireshark - This is the former Ethereal packet sniffer. Wireshark not only has a GUI front end, it also has a command line interface (tshark).
WinPcap - If you install the newest version of Wireshark it will install WinPcap as well.
HTTP Sniffer - Optional if you don't want to mess with wireshark or tshark. In some ways, HTTP Sniffer is better suited but a little harder to work with because it is still under development.
wget or curl - Either of these command line programs is great to use. My preference for most tasks is wget. Curl has features wget does not have. For our purposes however, wget will work fine.
VideoLan - VideoLan is a client-side Audio-Video player that plays .flv files on your computer as well as straight off a video website.
Very brief howto on packet sniffing
After installing the above programs you will need to determine which interface you are using. On Linux or Unix, tshark (or tcpdump, if you use it) will pick an interface automatically. On Windows you should issue the following command from the C:\> prompt:
tshark -D
You will see something like the following:
1. \Device\NPF_GenericDialupAdapter (Adapter for generic dialup and VPN capture)
2. \Device\NPF_{EB5F7518-B4A0-4D10-B795-ED9744D59228} (VMware Virtual Ethernet Adapter)
3. \Device\NPF_{15A058C5-FA9F-4FA5-A38C-74D322C36EA4} (Broadcom 802.11b (Microsoft's Packet Scheduler) )
4. \Device\NPF_{026B866A-79B3-43C7-A0F0-6809D6C7C4E9} (VMware Virtual Ethernet Adapter)
5. \Device\NPF_{AAD42C94-85EC-47A6-B4D7-EC374D0628A2} (National Semiconductor Corp. DP83815 10/100 MacPhyter3v PCI Adapter (Microsoft's Packet Scheduler) )
I had my ethernet interface disconnected and was using the Broadcom 802.11b interface (number 3). Once you have determined the active interface, check that it is correct by issuing the following from the command line:
C:\> tshark -i 3
You will see immediately something similar to the following:
17.846631 192.168.1.100 -> 192.168.1.104 TCP informer > netbios-ssn [FIN, ACK] Seq=1008 Ack=811 Win=64725 Len=0
17.846871 192.168.1.104 -> 192.168.1.100 TCP netbios-ssn > informer [ACK] Seq=811 Ack=1009 Win=65535 Len=0
17.849765 192.168.1.104 -> 192.168.1.100 TCP netbios-ssn > informer [FIN, ACK] Seq=811 Ack=1009 Win=65535 Len=0
17.849825 192.168.1.100 -> 192.168.1.104 TCP informer > netbios-ssn [ACK] Seq=1009 Ack=812 Win=64725 Len=0
If you don't see anything scrolling on your screen you chose the wrong interface and you should attempt to rediscover the correct interface using the:
C:\> tshark -D
command again.
Here is a brief explanation of the above lines. The first field on the left is the timestamp, in seconds since the capture started. The next field is the source ip address and the third field is the destination ip address. The fourth field is the protocol (TCP or UDP) and the rest of the line describes the event: source port, destination port and other packet information not relevant to the task at hand.
Most if not all of this session is of no use for our purposes. You can see that my laptop is making netbios requests to a Samba server.
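If you want to script this, the field layout described above can be picked apart with a few lines of code. Here is a minimal Python sketch, assuming the tshark output looks exactly like the sample lines shown earlier; other tshark versions may lay out the columns differently. The function name parse_tshark_line is my own invention, not part of any tool.

```python
def parse_tshark_line(line):
    """Split one tshark summary line (in the layout shown above) into fields."""
    # Expected layout: timestamp, source, "->", destination, protocol, free text.
    parts = line.split(None, 5)
    if len(parts) < 6 or parts[2] != "->":
        return None  # not a line in the assumed format
    timestamp, src, _arrow, dst, proto, info = parts
    return {
        "time": float(timestamp),
        "src": src,
        "dst": dst,
        "proto": proto,
        "info": info.strip(),
    }

sample = ("17.846631 192.168.1.100 -> 192.168.1.104 TCP "
          "informer > netbios-ssn [FIN, ACK] Seq=1008 Ack=811 Win=64725 Len=0")
fields = parse_tshark_line(sample)
print(fields["src"], fields["dst"], fields["proto"])
```

Lines that do not match the assumed layout come back as None, so they are easy to skip.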
Looking for the correct packets
Now that we know how to capture packets, we should move on to how to find the one that will do something useful. We want to capture packets that will tell us where youtube's videos are stored so that we can download them.
All the bits and bytes swarming around the Internet are visible. Some packets are encrypted and not humanly readable but they are there nonetheless. HTTP traffic is very visible and public. Sometimes it is encrypted and sometimes it is obscured either by necessity or on purpose. Youtube .flv files are not encrypted so we won't need to worry about those. Youtube .flv files are somewhat obscured and that is what we want to discover. The information we want will be a TCP stream that carries the HTTP header payload that will tell us exactly where the flv file is located.
Capturing the packets
First start up wireshark or tshark but don't start capturing traffic yet. Next, in your browser, search for a video you want to download. In this example I am going to download the Snowbound video by Donald Fagen. Once you have searched for and found the video, start capturing in wireshark. The URL for this particular video is: Snowbound - Donald Fagen. Play the video while capturing packets with wireshark. It is OK to stop the video once it starts streaming.
After the video is stopped, switch to the wireshark program and stop capturing packets. In the filter box type http and click on 'Apply' and wireshark will filter out everything but the http protocol. This is what we want.
Once you have done this you will notice that there are a lot of GET requests. This is what we are looking for. They are in the HTTP headers and possess the information we need. Next to one of these GETs you should see the following:
get_video?video_id=0MGtr121fFI
Now highlight and right-click on the line and choose 'Follow TCP Stream' from the menu items. A pop-up will appear. Scroll down to near the bottom of the pop-up and find the word 'Location'. You should see something like the following:
http://chi-v9.chi.youtube.com/get_video?video_id=0MGtr121fFI
You will notice that it is a typical URL. The host part of the URL (chi-v9.chi.youtube.com here) will vary. This is probably due to load balancing of the youtube servers. Now all you need to do in order to confirm that this works is to copy this URL and paste it into your browser. When you hit the Enter key in Firefox a pop-up will appear asking if you want to save it or not. You want to save it to disk and replace the default name, get_video, with another name. It can be anything you want but make sure you save it with the .flv extension for flash video. If you have installed VideoLan or another flash video player, you can then click on that file and play it from your hard drive.
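Since the host changes but the rest of the URL keeps the same shape, it can help to split a captured URL into its parts. The following is a small Python sketch using the standard urllib.parse module; the function name describe_video_url and the dictionary keys are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def describe_video_url(url):
    """Break a captured URL into host, path and the video_id parameter."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,  # the part that varies between servers
        "path": parts.path,
        "video_id": query.get("video_id", [None])[0],
    }

info = describe_video_url(
    "http://chi-v9.chi.youtube.com/get_video?video_id=0MGtr121fFI")
print(info["host"], info["video_id"])
```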
Another example: redtube.com
Redtube.com is another flash video sharing site. Unlike youtube.com it is NSFW so you are warned.
Again using wireshark, find the video you want to download and before you click on the thumbnail image, start capturing packets. Click on the thumbnail and it will take you to the flash video stream. The http header we are looking for is:
GET /_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv?start=0
The above GET is what we need, but we must chop off the parameter '?start=0'. Otherwise the download will fail.
Next, we need to find the host server. We do that by searching for "host" and we will find:
dl.redtube.com
Putting the host and the path together into a URL we get:
http://dl.redtube.com/_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv
This will now get us the video. If you again put this in your browser's address bar a pop-up will appear and ask if you want to save the file. Go ahead and save it and give it a meaningful name. WARNING! and all that stuff...this video is not safe for work.
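The cut-and-paste steps for this example (take the path from the GET line, drop the '?start=0' parameter, prepend the host) can be sketched in Python as follows. This is only an illustration under the assumption that the request line and Host value have already been copied out of the capture; build_download_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

def build_download_url(request_line, host):
    """Combine a 'GET <path>' line and a Host value, dropping any query string."""
    parts = request_line.split()
    if len(parts) < 2 or parts[0] != "GET":
        raise ValueError("not a GET request line")
    target = urlsplit(parts[1])
    # Keep only the path; the '?start=0' parameter is discarded here.
    return urlunsplit(("http", host, target.path, "", ""))

url = build_download_url(
    "GET /_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv?start=0",
    "dl.redtube.com")
print(url)
```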
Google videos (googlevideo.com)
Now that Google owns Youtube they are using those servers. Things get a little confusing and grabbing a Google video is different. This time I will use the HTTP Sniffer utility. This utility fetches not only the headers we want but also the streamed video data. Start the utility like you did with tshark or wireshark and 'debug' the browser you are using. HTTP Sniffer uses the term 'debug' instead of 'capture'. This was confusing for me at least. Then start streaming the Google video of your choice. Then go back to the HTTP Sniffer utility and you will see data. Right-click on one of the rows and select Save All. You may want to stop the Google video stream because it can produce quite a large log file. Remember where you save the log file.
Now you want to open the log file in your favorite text editor (vim or gvim is well suited for this) and search for something like the following:
HTTP/1.1 302 Found
Location: http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com
Connection: close
GET /get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com HTTP/1.1
Host: 74.125.1.80
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
At this point you can put together the URIs again or use the 'Location' field and you should come up with the following:
http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com
Put this in the Address field of your browser and once again a pop-up will appear asking if you want to save the file.
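Searching the log by hand works, but the same search is easy to sketch in code. The Python below looks for the first '302' response in a saved log and returns its Location value. It assumes the log is plain text with one header per line (a URL that your editor shows wrapped onto two lines must be on a single line in the file); find_redirect_target is a made-up name, and the sample text is just the first three lines of the log excerpt above.

```python
def find_redirect_target(log_text):
    """Return the Location value of the first '302' response in the log, or None."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        parts = line.split()
        is_302 = (len(parts) >= 2 and parts[0].startswith("HTTP/")
                  and parts[1] == "302")
        if not is_302:
            continue
        # Scan the headers that follow the status line.
        for header in lines[i + 1:]:
            if not header.strip():
                break  # a blank line ends the header block
            name, sep, value = header.partition(":")
            if sep and name.strip().lower() == "location":
                return value.strip()
    return None

sample_log = "\n".join([
    "HTTP/1.1 302 Found",
    "Location: http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com",
    "Connection: close",
])
print(find_redirect_target(sample_log))
```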
Using wget
First download wget. It is included on many *nix and Linux distributions. For Windows you can download a pre-compiled version here. I use cygwin on my Windows boxes so for me it was easy to install.
Once you have confirmed that you can save youtube videos it might be easier to use the wget or curl utilities to grab them. Wget has the ability to download files in many different ways. Wget runs from the command line in either Windows or *nix/Linux. A typical command line is shown below. Put the URL in double quotes if it contains an '&', because the command shell treats that character specially:
C:\> wget -O MyPr0n.flv "<OneOfTheURLsBelow>"
http://chi-v9.chi.youtube.com/get_video?video_id=0MGtr121fFI
http://dl.redtube.com/_videos_t4vn23s9jc5498tgj49icfj4678/0000003/J3FXZI1SQ.flv
http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com
The -O option gives you the ability to save the file to a more human readable name.
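If you drive wget from a script, passing the arguments as a list sidesteps shell quoting problems with characters such as '&' in the URL. Here is a rough Python sketch; the helper name wget_command and the output name video.flv are placeholders of my own, and the commented-out last line assumes wget is installed and on your PATH.

```python
import shlex
import subprocess

def wget_command(url, output_name):
    """Return the argument list for: wget -O <output_name> <url>."""
    return ["wget", "-O", output_name, url]

cmd = wget_command(
    "http://74.125.1.80/get_video?video_id=dcLMH8pwusw&origin=mia-v5.mia.youtube.com",
    "video.flv")

# Show the command as it would have to be typed in a POSIX shell (URL quoted).
print(shlex.join(cmd))

# To actually download, uncomment the next line:
# subprocess.run(cmd, check=True)
```

Note that shlex.join needs Python 3.8 or newer and produces POSIX-style quoting; at the Windows C:\> prompt you would use double quotes instead.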
Other hard ways to get flash video files
1. Search through the javascript on the web page by 'Viewing Page Source'. I started out doing it this way; it is tedious but worth doing if you want to learn how it is coded.
2. Decompile the .swf file. I read about this one but haven't tried it. Some swf files are compressed and you will first have to decompress them, then decompile them. There are several free and commercial programs that do this.
OK, so here's the easy way
Now that we know what to look for you can write a script or program to automate all this. I was too lazy to do this and besides this article is getting too long. If all this is a bit much then just download the Orbit Downloader 2.0. There are any number of video/audio downloaders but Orbit is my favorite.
A final sidenote
I saw this comment by mr strange and used the above technique to download and save the montage about trhurler he refers to. I always enjoyed trhurler's comments and diaries since 2001 when I first became a Kuron. If you are interested the following link will download the flash video from myspacetv.com or paste it into a flash player...enjoy and Rest In Peace trhurler.