

To generate a representative dataset of real-world traffic in ISCX, we defined a set of tasks, ensuring that our dataset is rich enough in diversity and quantity. We created accounts for users Alice and Bob in order to use services like Skype, Facebook, etc. Below we provide the complete list of the different types of traffic and applications considered in our dataset for each traffic type (VoIP, P2P, etc.).

We captured a regular session and a session over VPN; therefore we have a total of 14 traffic categories: VOIP, VPN-VOIP, P2P, VPN-P2P, etc. We also give a detailed description of the different types of traffic generated:

Browsing: Under this label we have HTTPS traffic generated by users while browsing or performing any task that includes the use of a browser. For instance, when we captured voice calls using Hangouts, even though browsing is not the main activity, we captured several browsing flows.

Email: The traffic samples were generated using a Thunderbird client and Alice and Bob's Gmail accounts. The clients were configured to deliver mail through SMTP/S, and to receive it using POP3/SSL in one client and IMAP/SSL in the other.

Chat: The chat label identifies instant-messaging applications. Under this label we have Facebook and Hangouts via web browsers, Skype, and AIM and ICQ using an application called Pidgin.

Streaming: The streaming label identifies multimedia applications that require a continuous and steady stream of data. We captured traffic from YouTube (HTML5 and Flash versions) and Vimeo services using Chrome and Firefox.

File Transfer: This label identifies traffic applications whose main purpose is to send or receive files and documents. For our dataset we captured Skype file transfers, FTP over SSH (SFTP) and FTP over SSL (FTPS) traffic sessions.

VoIP: The Voice over IP label groups all traffic generated by voice applications. Within this label we captured voice calls using Facebook, Hangouts and Skype.

P2P: This label is used to identify file-sharing protocols like BitTorrent. To generate this traffic we downloaded different .torrent files from a public repository and captured traffic sessions using the uTorrent and Transmission applications.

The traffic was captured using Wireshark and tcpdump, generating a total of 28 GB of data.
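
The labelling scheme described in this section (seven traffic types, each captured once in a regular session and once over VPN) can be sketched in a few lines of Python. This is only an illustration, not code shipped with the dataset: the source spells out just the labels VOIP, VPN-VOIP, P2P and VPN-P2P, so the remaining label spellings, as well as the names `TRAFFIC_TYPES`, `category_labels` and `split_label`, are assumptions.

```python
# The seven traffic types described in the text. Only the spellings "VOIP"
# and "P2P" appear in the source; the other spellings are assumed here.
TRAFFIC_TYPES = [
    "BROWSING",
    "EMAIL",
    "CHAT",
    "STREAMING",
    "FILE-TRANSFER",
    "VOIP",
    "P2P",
]

# Prefix used for sessions captured over VPN, as in "VPN-VOIP".
VPN_PREFIX = "VPN-"


def category_labels(traffic_types=TRAFFIC_TYPES):
    """Return one label per (traffic type, session kind) pair."""
    labels = []
    for traffic_type in traffic_types:
        labels.append(traffic_type)               # regular session
        labels.append(VPN_PREFIX + traffic_type)  # session over VPN
    return labels


def split_label(label):
    """Split a category label into (traffic_type, is_vpn)."""
    if label.startswith(VPN_PREFIX):
        return label[len(VPN_PREFIX):], True
    return label, False


if __name__ == "__main__":
    for label in category_labels():
        print(label, split_label(label))
```

Keeping the VPN flag separate from the traffic type, as `split_label` does, makes it straightforward to group flows either by application type or by whether they were tunnelled.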
